It Seemed Like a Good Idea at the Time
Coding, Mostly

29 Oct 2008

YACCs are Large and Unwieldy, as a Rule

A while back, I decided to try to write an implementation of Scheme, or at least a mostly Scheme-like Lisp.  After six tries, I was finally forced to admit that a hand-coded parser just wasn't going to cut it.  So I decided that I would make things easier on myself by using Lex and Yacc*.

Now I had two problems.

Now don't get me wrong: Lex and Yacc are probably great tools for certain problems.  Unfortunately, they are also a big pain in the ass to use.

My first issue has to do with input.  Lex is a sort of prima donna, refusing to handle any input that isn't handed to it on a gold-plated FILE* handle.  The most reliable method I can find for feeding it arbitrary strings is basically "don't bother".  Instead, I am told to write out my text to a temp file and read it back in through Lex.
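For what it's worth, on POSIX systems a string can be dressed up as a FILE* without going through a temp file.  What follows is a minimal sketch, assuming fmemopen() is available (it is POSIX.1-2008, not ISO C); open_string is a made-up helper name, and handing the result to a generated lexer through yyin is the assumed use, not something shown here.  The Flex manual also documents yy_scan_string() for scanning a string directly, which sidesteps FILE* altogether.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX.1-2008 declarations */
#include <stdio.h>
#include <string.h>

/* Wrap a NUL-terminated string in a read-only FILE*.
 * Returns NULL on failure.  The caller must fclose() the stream.
 * open_string is a hypothetical helper, not part of Lex or Flex. */
FILE *open_string(const char *text)
{
    /* fmemopen takes a non-const pointer, but mode "r" never writes to it. */
    return fmemopen((void *)text, strlen(text), "r");
}
```

The string has to stay alive until the stream is closed, since the stream reads straight out of the caller's buffer.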

But that isn't so bad.  It's certainly possible to work around finicky input conventions.  Hell, that's half of what programming seems to be about most of the time.

So then I come to my next issue.  I ask myself "What if, during the lexing of one file, I want to go lex another one instead?"  As best I can tell, the answer is again "don't bother".  The only facility I can see for opening another file is the yywrap() function, and I don't even want to begin to contemplate what would happen if I switched file pointers mid-lexing.

Now I'm beginning to get uneasy.  Are these really the tools that I want to use?

But so far, at least my code is working.  It's taken some trial and error, but I have a working program.  It reads input, tokenizes it, and even parses it a little.  Then something breaks.

Now I run into the next problem.  The code output by Lex and Yacc is opaque.  It may just barely be possible to understand what's going on if you have a very good understanding of finite-state automata and LR parsing theory, but for mere mortals such as myself, no dice.  I spend about four hours trying to figure out where in my six parser rules the error lies, but have no luck.

So I decide to go work on something else for a while.  I really don't care about parsing.  I want to write a Scheme interpreter, and parsing Scheme code is just a messy prerequisite.  To my mind, the lexer and parser should be subordinate to the main program.  But again, Yacc screws me over.  I can't tell it "hey, take this bit of text, lex and parse it, then give me back the results."  No, instead I have to write stuff to a file, hope that all the relevant variables (global of course, no encapsulation for this library) are cleared, and then let Yacc take over.
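The kind of interface I mean looks something like this.  It is an illustrative sketch rather than a finished design: the names Lexer, Token, lexer_init and lexer_next are all hypothetical, and it only knows about parentheses and whitespace-delimited atoms (no strings, comments, or quote syntax).  The point is that all state lives in a struct the caller owns, and the input is just a string.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_LPAREN, TOK_RPAREN, TOK_ATOM, TOK_EOF } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* points into the caller's text, not a copy */
    size_t length;      /* number of characters in the token */
} Token;

typedef struct {
    const char *pos;    /* next unread character */
} Lexer;

/* Point the lexer at a NUL-terminated string.  No globals involved. */
void lexer_init(Lexer *lx, const char *text)
{
    lx->pos = text;
}

/* Return the next token and advance past it. */
Token lexer_next(Lexer *lx)
{
    Token tok;

    /* Skip leading whitespace; isspace('\0') is false, so this stops at the end. */
    while (isspace((unsigned char)*lx->pos))
        lx->pos++;

    tok.start = lx->pos;
    tok.length = 0;

    if (*lx->pos == '\0') {
        tok.kind = TOK_EOF;
        return tok;
    }
    if (*lx->pos == '(' || *lx->pos == ')') {
        tok.kind = (*lx->pos == '(') ? TOK_LPAREN : TOK_RPAREN;
        tok.length = 1;
        lx->pos++;
        return tok;
    }

    /* Anything else is an atom: read up to whitespace, a paren, or the end. */
    tok.kind = TOK_ATOM;
    while (*lx->pos != '\0' && *lx->pos != '(' && *lx->pos != ')' &&
           !isspace((unsigned char)*lx->pos))
        lx->pos++;
    tok.length = (size_t)(lx->pos - tok.start);
    return tok;
}
```

Because each Lexer is an ordinary value, two of them can run side by side over different strings, and nothing needs to be reset between runs.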

So I take a step back.  In times like this, there's a quote that it pays to remember: "When you're up to your ass in alligators, it's hard to remember that your initial objective was to drain the swamp."

I had just spent four hours of my (very limited) free time trying to achieve a goal radically different from what I had set out to do.  I couldn't care less about the complexities of Yacc; I just wanted to parse some Scheme code.

So now I had a decision to make.  Do I continue muddling along with Lex and Yacc, hoping against hope that I'll get something useful out of them, or do I toss the whole thing and hope I can figure out a better way instead?

From where I stood, the wheel under consideration was old, and chipped, and looked more like a square than a circle.  So I decided to reinvent it.  Using the state-of-the-art technology of C (Hey, it's one letter better than the original Lex/Yacc), I decided to write my own lexer and parser libraries.

It should be interesting.

* Okay, technically I was using Flex and Bison, but I never got far enough to use the non-backwards-compatible bits.

2 Comments

14 Oct 2008

Long Time, No Post

Until last Sunday, I hadn't posted anything in a while.  Mostly, this is because I got distracted by Super Mario 64 two weeks ago, and went directly on to Paper Mario 64 the week after that.  Partly, it's because my homework load suddenly surged two weeks ago, and it took me until yesterday to finish all of it.  A little bit can be blamed on college applications and the many other little things that ate into my time.  But I think a large part of the problem was that the first few posts I wrote on this blog were rather long, and I felt something of an obligation to keep up the quantity.

So. Change.  Shorter posts, but posts nonetheless.  Hopefully now I'll be able to get myself to write more often.

Filed under: Uncategorized.  No Comments

11 Oct 2008

I ♥ Unit Testing

I have recently been converted.  For a while I've been aware that I should probably be doing unit tests to verify that my code is doing what I expect it to, but I could never find a framework that was unobtrusive enough to work for me.  Since most of my code is currently in C, I needed a good C unit testing framework to go with it.

Finally, I found one.  MinUnit is possibly the simplest unit testing framework imaginable.  It weighs in at three lines, and one of those just declares a counter for the number of tests run, so I figured it was probably simple enough to start using without any hassle.  I was right.
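To give a flavor of it, here is a minimal sketch.  The two macros and the tests_run counter are MinUnit as I understand it from its published header (worth checking against the original before copying); add() and the two test functions are hypothetical stand-ins for real code.

```c
#include <stddef.h>

/* The whole of MinUnit: two macros and a counter. */
#define mu_assert(message, test) do { if (!(test)) return message; } while (0)
#define mu_run_test(test) do { char *message = test(); tests_run++; \
                               if (message) return message; } while (0)
extern int tests_run;

int tests_run = 0;

/* Hypothetical function under test. */
static int add(int a, int b)
{
    return a + b;
}

/* Each test returns NULL on success or an error message on failure. */
static char *test_add_small(void)
{
    mu_assert("error, add(2, 2) != 4", add(2, 2) == 4);
    return NULL;
}

static char *test_add_negative(void)
{
    mu_assert("error, add(-1, 1) != 0", add(-1, 1) == 0);
    return NULL;
}

/* Runs every test in order and stops at the first failure. */
char *all_tests(void)
{
    mu_run_test(test_add_small);
    mu_run_test(test_add_negative);
    return NULL;
}
```

A main() then only needs to call all_tests(), print the returned message if it isn't NULL, and report tests_run.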

So I started using it.  It's a really nice feeling not having to worry that a change here will break functionality somewhere else.  It's also become a lot easier to write code in small chunks.  I can sit down and tell myself "Okay, I'm just going to write tests for this and this case, and when they pass, I'll go do something else."  This comes in very handy when you only have a 20-minute break between homework assignments to work in.

Unfortunately, about a week after I started using unit tests, my workload drastically increased, and I got distracted by playing Super Mario 64 and Paper Mario 64.  But now things seem to have quieted down again with my completion of Paper Mario 64 yesterday.

And thanks to unit testing, I was able to pick up my coding right where I left off.
