Syntax programming with lexemes rather than trees?

Hello All Modern functional programming languages give you algebraic data types that are mighty convenient for programming with syntax trees. However, I'm working in a domain (music typesetting) where modelling syntax with trees can be problematic and I'm wondering whether I should work at a lower level - essentially a list / stream of lexemes and some notion of a context stack for processing, tracking when I'm inside a tuplet and the metrical calculation is scaled, for example. Does anyone know of any previous work that takes a "lexical" view of syntax rather than an abstract syntax tree view? Any domain is good - I can't imagine there's any prior work on music typesetting. Pointers to papers would be more digestible than code but either is fine. Similarly, implementation in any functional language is fine. Thanks Stephen

I'm working in a domain (music typesetting) where modelling syntax with trees can be problematic and I'm wondering whether I should work at a lower level - essentially a list / stream of lexemes and some notion of a context stack for processing, tracking when I'm inside a tuplet and the metrical calculation is scaled, for example.
It sounds like your domain is generation of syntax, rather than parsing. It also sounds rather similar to the difference between the high-level language accepted by a compiler (represented by an AST) and the target language produced by a compiler (generally a flat list of almost-atomic instructions and labels). So maybe techniques for dealing with low-level assembly-like languages would be worth investigating. Regards, Malcolm

Hi Malcolm Thanks - particularly I don't want to go to an AST because its I'm finding it too convoluted 'shape wise' - processing beam groups inside tuplets etc. is a nightmare - music representations have had at least eight centuries of ad hoc extension. I know Norman Ramsey and colleagues papers on low-level representations - I'll give them a re-reading. Thanks again Stephen

Hi Stephen,
It actually sounds like your representation has structure, but you
dislike structure because it's hard to work with. The solution is a
generics library, and I recommend Uniplate:
http://community.haskell.org/~ndm/uniplate
Imagine you have a ridiculously complex AST, with beam groups under
tuplets etc, and you want to delete the first item from every beam
group:
transform f
where f (Beam (x:xs)) = Beam xs
f x = x
Uniplate moves through tuplets and any other hierarchical syntax, so
you just specify the transformation. My guess is if you throw away the
structure then you'll need it back for some operations.
Thanks, Neil
2010/3/22 Stephen Tetley
Hi Malcolm
Thanks - particularly I don't want to go to an AST because its I'm finding it too convoluted 'shape wise' - processing beam groups inside tuplets etc. is a nightmare - music representations have had at least eight centuries of ad hoc extension.
I know Norman Ramsey and colleagues papers on low-level representations - I'll give them a re-reading.
Thanks again
Stephen _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi Neil Thanks - music has has lots of structure, true, but unfortunately the structure is often arbitrary, sometimes even conflicting (with respect to a nested representation). I've tried various representations: plain algebraic data types, HOAS, generics with KURE, tagless, attribute grammar with UUAG. From all this, it turns out that the transformations I want to are pretty simple: - 'rename' pitch - two versions - one context free so I can use fmap; the other context sensitive, so I use stmap - a variation of mapAccumL coded as a typeclass - its surprising pleasant to use as an alternative to a state monad when you can live with left-to-right traversals only: class StateMap f where stmap :: (st -> a -> (b, st)) -> st -> f a -> (f b, st) - 'rename' duration - same thing, two different traversals one context sensitive, one context free. - metrical splitting - more complicated - segmenting note lists into bars and beams groups (which must be communicated to LilyPond and ABC) somewhat reminiscent of groupBy in Data.List but more involved. I've rewritten the algorithm about 5 times, each time a slight improvement though it still isn't pretty (not a big deal, however, so long as it works). At the moment I can do quite a lot with what I've got, but its too unwieldy for real use - recently adding support for above-staff fret diagrams has caused a lot of duplicated code even though it should be superficially simple (fret diagrams don't need tuplets or beam groups in their note list syntax). So I need a more different representation. Although Malcolm didn't mention it, his message reminded me of streaming XML processing. I've found a few papers since by Alain Frisch, Keisuke Nakano and colleagues that cast streaming in a functional light, so I'll see how far I can get with a similar approach. Thanks again Stephen

Stephen Tetley schrieb:
Hello All
Modern functional programming languages give you algebraic data types that are mighty convenient for programming with syntax trees. However, I'm working in a domain (music typesetting) where modelling syntax with trees can be problematic and I'm wondering whether I should work at a lower level - essentially a list / stream of lexemes and some notion of a context stack for processing, tracking when I'm inside a tuplet and the metrical calculation is scaled, for example.
Does anyone know of any previous work that takes a "lexical" view of syntax rather than an abstract syntax tree view? Any domain is good - I can't imagine there's any prior work on music typesetting.
Pointers to papers would be more digestible than code but either is fine. Similarly, implementation in any functional language is fine.
There was some unpublished work to emit Lilypond code for Haskore songs.

Hi Henning Thanks - yes, there is a report by Matt Munz, a student of Paul Hudak's. Last year I tried to get my library to work with Haskore, but Haskore has numerical durations - for scores you need symbolic ones, and its a lot of work deriving symbolic durations from numeric ones. There are few other problems such as they way Haskore handles overlays that don't lend well to score output, at least not for LilyPond (originally Haskore worked with the Lisp score system CNM I believe). There's an example here of how far I got with Haskore - I don't think Chick Corea's publishers will be getting in touch... http://www.flickr.com/photos/44929957@N03/ Best wishes Stephen
participants (4)
-
Henning Thielemann
-
Malcolm Wallace
-
Neil Mitchell
-
Stephen Tetley