looking for alex+happy examples that uses location annotated tokens and location information in err msgs

Does anyone have any example alex+happy programs that uses location location annotated tokens and location information of token while reporting parse errors? I already wrote lexer and my function type is something like:
runLexer :: String -> [(Token, AlexPosn)]
where AlexPosn type contains location information. But I have no idea how can I use that in happy to parse and use location information in parse error messages. There are lots of examples in interwebs but as far as I can see none of them works on location-annotated tokens. Thanks.. --- Ömer Sinan Ağacan http://osa1.net

On 08/03/14 21:47, Ömer Sinan Ağacan wrote:
Does anyone have any example alex+happy programs that uses location location annotated tokens and location information of token while reporting parse errors?
I already wrote lexer and my function type is something like:
runLexer :: String -> [(Token, AlexPosn)]
where AlexPosn type contains location information. But I have no idea how can I use that in happy to parse and use location information in parse error messages. There are lots of examples in interwebs but as far as I can see none of them works on location-annotated tokens.
Thanks..
--- Ömer Sinan Ağacan http://osa1.net _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
I believe GHC does this. See the compiler/parser directory of the GHC repo. Perhaps it's not the most lightweight example but it's the only readily available one that I can think of. -- Mateusz K.

I think GHC uses alex+happy in very different way than explained in
the docs.. For example it doesn't use "posn", "monadUserState" etc.
wrappers as explained in alex documentations. I also can't figure out
how does it keep track of token locations.. I guess It's just too
complicated for me to understand without spending several hours.
---
Ömer Sinan Ağacan
http://osa1.net
2014-03-09 9:49 GMT+02:00 Mateusz Kowalczyk
On 08/03/14 21:47, Ömer Sinan Ağacan wrote:
Does anyone have any example alex+happy programs that uses location location annotated tokens and location information of token while reporting parse errors?
I already wrote lexer and my function type is something like:
runLexer :: String -> [(Token, AlexPosn)]
where AlexPosn type contains location information. But I have no idea how can I use that in happy to parse and use location information in parse error messages. There are lots of examples in interwebs but as far as I can see none of them works on location-annotated tokens.
Thanks..
--- Ömer Sinan Ağacan http://osa1.net _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
I believe GHC does this. See the compiler/parser directory of the GHC repo. Perhaps it's not the most lightweight example but it's the only readily available one that I can think of.
-- Mateusz K. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On 09/03/14 11:15, Ömer Sinan Ağacan wrote:
I think GHC uses alex+happy in very different way than explained in the docs.. For example it doesn't use "posn", "monadUserState" etc. wrappers as explained in alex documentations. I also can't figure out how does it keep track of token locations.. I guess It's just too complicated for me to understand without spending several hours.
--- Ömer Sinan Ağacan http://osa1.net
It doesn't use the wrappers provided, it instead uses its own data structures to do what the wrappers let you do and more. You can look for the data types defined in those modules to try and shed some light on how it's all done. I did mention it's not the most lightweight example ;) -- Mateusz K.

Ha! I finally managed to use any kinds of tokens in Happy parsers ...
Will post the code soon.
---
Ömer Sinan Ağacan
http://osa1.net
2014-03-09 13:44 GMT+02:00 Mateusz Kowalczyk
On 09/03/14 11:15, Ömer Sinan Ağacan wrote:
I think GHC uses alex+happy in very different way than explained in the docs.. For example it doesn't use "posn", "monadUserState" etc. wrappers as explained in alex documentations. I also can't figure out how does it keep track of token locations.. I guess It's just too complicated for me to understand without spending several hours.
--- Ömer Sinan Ağacan http://osa1.net
It doesn't use the wrappers provided, it instead uses its own data structures to do what the wrappers let you do and more. You can look for the data types defined in those modules to try and shed some light on how it's all done. I did mention it's not the most lightweight example ;)
-- Mateusz K.

Okay guys, I finally did it and it turned out that Happy is actually
very easy to use .. only problem is that IMO documentation is not
great(or maybe I didn't pay enough attention reading it .. reading
docs is boring). Anyway, here's a short explanation(link for code is
also below):
Let's say you already wrote the lexer that returns `[TokPos]` where
data Token = ... your tokens ...
type TokPos = (Token, AlexPosn)
(AlexPosn contains location information)
and after telling Happy your token types, all you need to do is to
declare how tokens used in left-hand side of productions are matched
against tokens. Let's say I have this tokens:
data Token = A String | B Int | C
I'll need to add this token declarations in Happy file:
%token
a { (A $$, _) }
b { (B $$, _) }
c { (C, _) }
notice how I'm using ordinary pattern syntax. I didn't know this part
before. After that I can use a, b and c in left-hand side parts of
productions, like:
Exp : a b { SomeConstructor $1 $2 }
| c { SomeOtherConstructor }
and that's it. I hope that helps somebody.
Working code:
* Lexer(Alex): https://github.com/osa1/minCaml.hs/blob/master/src/Lexer.x
* Parser(Happy): https://github.com/osa1/minCaml.hs/blob/master/src/Parser.y
---
Ömer Sinan Ağacan
http://osa1.net
2014-03-09 18:45 GMT+02:00 Ömer Sinan Ağacan
Ha! I finally managed to use any kinds of tokens in Happy parsers ... Will post the code soon.
--- Ömer Sinan Ağacan http://osa1.net
2014-03-09 13:44 GMT+02:00 Mateusz Kowalczyk
: On 09/03/14 11:15, Ömer Sinan Ağacan wrote:
I think GHC uses alex+happy in very different way than explained in the docs.. For example it doesn't use "posn", "monadUserState" etc. wrappers as explained in alex documentations. I also can't figure out how does it keep track of token locations.. I guess It's just too complicated for me to understand without spending several hours.
--- Ömer Sinan Ağacan http://osa1.net
It doesn't use the wrappers provided, it instead uses its own data structures to do what the wrappers let you do and more. You can look for the data types defined in those modules to try and shed some light on how it's all done. I did mention it's not the most lightweight example ;)
-- Mateusz K.

Hi Ömer
Are you rewriting Eijiro Sumii's MinCaml in Haskell?
I did the same a couple of years ago, though I used Parsec for parsing
and only implemented the higher transformations (beta reduction,
inlining, constant folding, useless variable elimination). As I have
no access to a Sparc machine, I didn't go as far as closure conversion
and code generation.
The code was only for my personal learning as MinCaml was the nicest /
smallest real compiler I could find. Code is in my Google repository -
it might be useful for reference, obviously for a learning exercise
you'd want to do-it-yourself:
https://code.google.com/p/copperbox/source/browse/ >> trunk >>
compiler >> HMinCaml
Best wishes
Stephen
On 9 March 2014 19:18, Ömer Sinan Ağacan
Working code: * Lexer(Alex): https://github.com/osa1/minCaml.hs/blob/master/src/Lexer.x * Parser(Happy): https://github.com/osa1/minCaml.hs/blob/master/src/Parser.y

Hi Stephen,
Are you rewriting Eijiro Sumii's MinCaml in Haskell?
Yes!
I did the same a couple of years ago, though I used Parsec for parsing and only implemented the higher transformations (beta reduction, inlining, constant folding, useless variable elimination). As I have no access to a Sparc machine, I didn't go as far as closure conversion and code generation.
At first I also used Parsec, but I wasn't happy with the parser. The original grammar has 4 left-recursive productions and I had to manually eliminate them(I guess you also did that). But after that the grammar was still not LL(1) so I had to place some `try`s very carefully. It worked fine, but resulting parser was parsing nested lets in exponential time. For example, this program took about a minute to parse: https://github.com/esumii/min-caml/blob/master/test/spill.ml . So I rewrote it in Happy and now it works great. (I still need to work on error messages though. Also, I did not resolve all ambiguities so Happy reports lots of shift-reduce conficts) I guess that problem with Parsec parser could also be fixed but I just prefer implementation to be similar to original definitions. (also, I always wanted to learn Happy since GHC is using it and I want to start contributing to GHC in the future) I love starting with some tutorials and then improving final product by adding more features etc. I started learning Haskell 3 years ago with "write yourself a Scheme" tutorial and then I added continuations/call-cc, exceptions and several other features. It helped me a lot and it was very fun way to learn. I'm currently planning to do something similar, I want to implement more features once I have compiler for the original language working. I also don't have a Sparc machine, so I'm planning to compile to x86-64. I only know basics of x86-64, so I want to first compile to C. After I have that working correctly, by looking to generated ASM from the generated C sources, I want to implement x86-64 code generator. So I also aim to learn x86-64 assembly. At some point I also want to implement a garbage collector.. AFAIK no heap allocation is done in the original language, so I may need some more data types.. Maybe sum-types or records like in Haskell.
The code was only for my personal learning as MinCaml was the nicest / smallest real compiler I could find. Code is in my Google repository - it might be useful for reference, obviously for a learning exercise you'd want to do-it-yourself:
https://code.google.com/p/copperbox/source/browse/ >> trunk >> compiler >> HMinCaml
Thanks! I may give it a look if I get stuck at some point. --- Ömer Sinan Ağacan http://osa1.net

Hi Ömer
Sounds good, good luck with the project.
One thing I struggled with was re-implementing the type checker as the
Caml implementation uses ref cells. Fortunately there are a lot of
tutorials on the web covering type checking with a good few using
Haskell and greater or lesser degree of assignment.
On 9 March 2014 22:07, Ömer Sinan Ağacan
Hi Stephen,
Are you rewriting Eijiro Sumii's MinCaml in Haskell?
Yes!
[Snip]

Hi Stephen,
Sounds good, good luck with the project.
Thanks!
One thing I struggled with was re-implementing the type checker as the Caml implementation uses ref cells. Fortunately there are a lot of tutorials on the web covering type checking with a good few using Haskell and greater or lesser degree of assignment.
I think ref cells used in OCaml code can be emulated in Haskell using a map from ints to types. This more-or-less corresponds to managing your own heap, where pointers point to types. I have something like this in mind: type Unifications = M.Map TyVar Ty -- TyVar is basically an int data UnificationError = OccursCheck Ty Ty -- circular definition | UnificationError Ty Ty -- can't unify types | StrErr String -- required for Error instance deriving (Show) instance Error UnificationError where strMsg = StrErr -- | Unification monad that keeps track of unifications. newtype Unify a = Unify { unwrapUnify :: StateT Unifications (ErrorT UnificationError Identity) a } deriving (Functor, Applicative, Monad, MonadState Unifications, MonadError UnificationError) type TyEnv = M.Map Id Ty -- map from identifiers to types -- | Follow chains of type variables in the heap and remove type variables -- by connecting type variables in the type to final types in the chain. -- e.g. 1 |-> TyVar 2 -- 2 |-> TyVar 3 -- 3 |-> TyBool -- after `prune (TyVar 1)`, heap should be like: -- 1 |-> TyBool -- 2 |-> TyVar 3 -- 3 |-> TyBool prune :: Ty -> Unify Ty prune = ... -- | Unify two types. unify :: TyEnv -> Ty -> Ty -> Unify () unify = ... -- | Infer type of an expression. typeCheck :: TyEnv -> Exp -> Unify Ty typeCheck = ... I didn't implement it yet but it seems to me that this shouuld work. Another approach might be using STRefs. --- Ömer Sinan Ağacan http://osa1.net

Hi Stephen, I just finished implementing the type checker. I used the first method I mentioned in my previous mail, and it worked great. It may still contain some bugs but it works fine on current test suite. (test programs in original min-caml implementation). If you're still interested, source code is here https://github.com/osa1/minCaml.hs/blob/master/src/MinCaml/Typing.hs --- Ömer Sinan Ağacan http://osa1.net

Ömer Sinan Ağacan wrote:
Does anyone have any example alex+happy programs that uses location location annotated tokens and location information of token while reporting parse errors?
I already wrote lexer and my function type is something like:
runLexer :: String -> [(Token, AlexPosn)]
where AlexPosn type contains location information. But I have no idea how can I use that in happy to parse and use location information in parse error messages. There are lots of examples in interwebs but as far as I can see none of them works on location-annotated tokens.
Also look at the DDC codebase: https://github.com/DDCSF/ddc Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/

I don't think DDC uses alex/happy for parsing:
```
➜ ddc git:(master) find . -iname "*.y"
➜ ddc git:(master) find . -iname "*.x"
```
and there are some Parsec parsers in `packages/ddc-core/DDC/Core/Parser`.
---
Ömer Sinan Ağacan
http://osa1.net
2014-03-09 10:47 GMT+02:00 Erik de Castro Lopo
Ömer Sinan Ağacan wrote:
Does anyone have any example alex+happy programs that uses location location annotated tokens and location information of token while reporting parse errors?
I already wrote lexer and my function type is something like:
runLexer :: String -> [(Token, AlexPosn)]
where AlexPosn type contains location information. But I have no idea how can I use that in happy to parse and use location information in parse error messages. There are lots of examples in interwebs but as far as I can see none of them works on location-annotated tokens.
Also look at the DDC codebase:
Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/ _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (4)
-
Erik de Castro Lopo
-
Mateusz Kowalczyk
-
Stephen Tetley
-
Ömer Sinan Ağacan