On Thu, Aug 19, 2010 at 8:05 PM, Michael Litchard <michael@schmong.org> wrote:
I'd like the community to give me feedback on the difficulty level of
implementing an awk interpreter. What language features would be
required? Specifically I'm hoping that TH is not necessary because I'm
nowhere near that skill level.

I'd love to have portable, pure Haskell implementations of the traditional Unix tools. Done well, that would let you 'cabal install' your way into a usable dev environment on Windows :) I'd much rather do that than deal with Cygwin/MinGW.

Someone (was it Stephen Hicks?) was writing (or finished writing?) an sh parser and I got really excited for the same reason.  It would be a cool project, but I'm not sure I can justify to myself spending my spare cycles on it.
 


An outline of a possible approach would be appreciated. I am using
http://www.math.utah.edu/docs/info/gawk_toc.html
as a guide to the language description.

I think this is a good opportunity for you to learn about monad transformers.  To that end, I think you will like this paper (quite easy for beginners to pick up):
http://www.grabmueller.de/martin/www/pub/Transformers.en.html

At least, that's how I first learned about them, and I thought it was easy to read at the time :)
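
To give a flavor of why transformers fit an interpreter, here is a minimal sketch. It is an illustration only, not taken from the paper: the Expr type is a hypothetical toy language rather than real awk syntax, and all names (Eval, eval, runEval) are made up for the example. It stacks StateT (the variable table) on top of Except (runtime errors), using only the transformers and containers packages.

```haskell
import qualified Data.Map as Map
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (Except, runExcept, throwE)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, gets, modify)

-- Hypothetical toy expression language (an assumption, not awk's grammar).
data Expr
  = Num Double
  | Var String
  | Add Expr Expr
  | Div Expr Expr
  | Assign String Expr

type Env = Map.Map String Double

-- StateT carries the variable table; Except reports runtime errors.
type Eval a = StateT Env (Except String) a

eval :: Expr -> Eval Double
eval (Num n) = return n
-- Unset variables read as 0, mirroring awk's numeric default.
eval (Var v) = gets (Map.findWithDefault 0 v)
eval (Add a b) = (+) <$> eval a <*> eval b
eval (Div a b) = do
  x <- eval a
  y <- eval b
  if y == 0
    then lift (throwE "division by zero")
    else return (x / y)
eval (Assign v e) = do
  x <- eval e
  modify (Map.insert v x)
  return x

-- Run a computation starting from an empty variable table.
runEval :: Eval a -> Either String a
runEval m = runExcept (evalStateT m Map.empty)
```

For example, runEval (eval (Assign "x" (Num 4)) >> eval (Div (Var "x") (Num 2))) gives Right 2.0, while runEval (eval (Div (Num 1) (Var "unset"))) gives Left "division by zero". Adding another effect later (say, output or the current input record) means adding another layer to the stack rather than rewriting eval.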

You might also want to read (and try) some of the tutorials that focus on creating interpreters just to sort of get some practice in that area.  I haven't read it, but I've heard good things about this one:
http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours

You might also focus on the 'core' of awk first. Work out what the minimal language is and start from there, growing your implementation by adding features bit by bit. It's also a good opportunity to practice testing: you have a reference implementation, so you can write lots of tests for each feature as you add it.
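
As a rough idea of how small such a core could start, here is a sketch that assumes the core is just pattern/action rules applied to each input line. The Pattern and Action constructors and the runAwk function are hypothetical names chosen for illustration, and the field handling is deliberately simplified (whitespace splitting only, no FS variable). It needs nothing beyond base.

```haskell
-- Hypothetical minimal core: a program is a list of (pattern, action) rules.
data Pattern
  = Always                  -- a rule with no pattern
  | FieldEquals Int String  -- like  $n == "s"

data Action
  = PrintLine               -- like  { print }
  | PrintField Int          -- like  { print $n }

type Rule = (Pattern, Action)

-- $0 is the whole record; $1, $2, ... are whitespace-separated fields.
field :: Int -> String -> String
field n line
  | n == 0    = line
  | n < 0     = ""  -- simplification: real awk reports an error here
  | otherwise = case drop (n - 1) (words line) of
      (f : _) -> f
      []      -> ""  -- missing fields are the empty string

matches :: Pattern -> String -> Bool
matches Always _ = True
matches (FieldEquals n s) line = field n line == s

perform :: Action -> String -> String
perform PrintLine line = line
perform (PrintField n) line = field n line

-- For each input line, run every rule whose pattern matches, in order.
runAwk :: [Rule] -> String -> String
runAwk rules input =
  unlines
    [ perform act line
    | line <- lines input
    , (pat, act) <- rules
    , matches pat line
    ]
```

For example, runAwk [(FieldEquals 1 "b", PrintField 2)] "a 1\nb 2\n" gives "2\n", which corresponds to the real awk program $1 == "b" { print $2 }. Because runAwk is a pure function from input text to output text, each test case can be the same input fed to both the reference awk and this function, with the two outputs compared.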

I hope that helps,
Jason