
Justin Bailey wrote:
2008/9/8 Daryoush Mehrtash
Thanks.
Pattern matching and memory management in Haskell (or maybe GHC's implementation of it) are somewhat of a mystery to me. Are there any references that explain the underlying implementation?
Be careful what you ask for. This paper is 16 years old but fairly relevant. Click the "view or download" link at the bottom:
"Implementing Lazy Functional Languages on Stock Hardware: The Spineless Tagless G-Machine" http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.3729
That's an excellent paper for getting down to the gritty details of what's going on under the covers. However, I think it's not clear that that's what you're really looking for; you needn't know anything about the STG in order to understand pattern matching well enough to use it.

In short, a pattern is a (free) variable, or a data constructor applied to patterns. So if we have:
data MyList = Nil | Cons Int MyList
Then we can have the patterns: Nil, (Cons x xs), (Cons 0 xs), ..., (Cons x Nil), (Cons 0 Nil), ..., (Cons x (Cons x2 xs)), etc.

Other notes:

* As demonstrated above, numeric literals count as "data constructors".

* Since a data constructor can be an infix operator (either spelled with backticks or a symbolic name beginning with ':') we can also write our patterns with infix notation.

* Even though there is an intentional homoiconicity between patterns and an expression of data constructors, you can't use arbitrary expressions. This falls out from only allowing data constructors in patterns, rather than any arbitrary function.

** In particular, you can't use partial application. You also can't use anything like (.), ($), flip, ...

** (While n+k patterns exist for legacy reasons, they are an abomination. They should not be used and are slated for removal in Haskell Prime.)

** For record syntax this homoiconicity means that you get Foo{xpart=x} as a pattern binding the variable x. This follows because that's what the expression would look like to construct a Foo setting the xpart to a variable x. Perhaps confusingly, the '=' involved here is the one from record syntax, not the one from let bindings.

The homoiconicity generally makes code easier to read, though it can be somewhat confusing when discussing theoretical concerns. The reason is that a single lexeme, e.g. 'Cons', is being used both as a data constructor (in expressions) and as a data *de*structor (in patterns). Identically, a field name in a record is used both as an injector and as a projector. This conceptual overloading is perfectly valid, but it sometimes leads to people conflating the ideas, which is invalid.

There are sometimes reasons to want to throw a wrench into the works, breaking up the homoiconicity. One particular example (which I believe will be available in 6.10 though it's not approved for Haskell Prime) is to allow "view patterns".
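To make the pattern forms listed above concrete, here is a minimal sketch. MyList is the type from this message; IList, its (:+) constructor, the ypart field, and all the function names are hypothetical, made up only for illustration.

```haskell
-- The list type from the message.
data MyList = Nil | Cons Int MyList

-- Hypothetical type with a symbolic (infix) constructor beginning with ':'.
infixr 5 :+
data IList = INil | Int :+ IList

-- Hypothetical record type; Foo and xpart echo the names used in the message.
data Foo = Foo { xpart :: Int, ypart :: Int }

-- Constructor patterns, nested patterns, and a numeric literal pattern.
-- Clauses are tried from top to bottom.
describe :: MyList -> String
describe Nil                  = "empty"
describe (Cons 0 Nil)         = "just a zero"
describe (Cons x Nil)         = "singleton " ++ show x
describe (Cons x (Cons x2 _)) = "starts with " ++ show x ++ " then " ++ show x2

-- Infix pattern using a symbolic constructor.
sumI :: IList -> Int
sumI INil      = 0
sumI (x :+ xs) = x + sumI xs

-- Infix pattern using backticks around an ordinary constructor.
sumL :: MyList -> Int
sumL Nil           = 0
sumL (x `Cons` xs) = x + sumL xs

-- Record pattern: binds the variable x to the xpart field.
getX :: Foo -> Int
getX Foo{xpart = x} = x
```

Each left-hand side mirrors the expression that would build the matching value, e.g. `Cons 0 Nil` or `Foo{xpart = x, ypart = 0}`. Something like `describe (flip Cons Nil x)` would be rejected, since flip is a function and not a data constructor.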
The idea behind view patterns is to allow functions to be called behind the scenes in order to convert the in-memory representation into a view type, and then do pattern matching on that view of the value rather than on the value itself. There are two primary uses of this: (1) improving legibility of pattern matching for complex datastructures, (2) allowing multiple types to all be pattern matched interchangeably, e.g. association lists, Maps, HashMaps, ...

There are drawbacks to views (and anything else that breaks homoiconicity). First off is that it greatly complicates the story of what's going on during pattern matching. More importantly, however, is that it means that pattern matches are no longer in correspondence with the in-memory representations of values. This means that there is a hidden performance cost which can get quite high for deep patterns.

--
Live well,
~wren
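As a rough sketch of the view pattern idea described above, assuming GHC's ViewPatterns extension and Data.Map from the containers package: the View type and the functions viewAssoc, viewMap, firstKeyAssoc, and firstKeyMap are hypothetical names invented for this example.

```haskell
{-# LANGUAGE ViewPatterns #-}

import qualified Data.Map as Map

-- Hypothetical view type: the first entry of a key/value container, if any.
data View k v = Empty | Entry k v

-- View function for association lists.
viewAssoc :: [(k, v)] -> View k v
viewAssoc []           = Empty
viewAssoc ((k, v) : _) = Entry k v

-- View function for Maps; the first entry is the one with the smallest key.
viewMap :: Map.Map k v -> View k v
viewMap m = case Map.toAscList m of
  []         -> Empty
  (k, v) : _ -> Entry k v

-- The pattern (f -> p) applies f to the argument and matches the result
-- against p. Both functions below match on the same view constructors,
-- even though the underlying representations differ.
firstKeyAssoc :: [(String, Int)] -> String
firstKeyAssoc (viewAssoc -> Entry k _) = k
firstKeyAssoc _                        = "<none>"

firstKeyMap :: Map.Map String Int -> String
firstKeyMap (viewMap -> Entry k _) = k
firstKeyMap _                      = "<none>"
```

Note that `Entry k _` here is matched against the result of calling the view function, not against the stored list or Map, which is where the hidden cost mentioned above comes from.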