Unicode support for Haddock

Hi GHCers,
I recently ran into a problem where Haddock does not correctly handle Unicode in doc comments. So for example with this file:
"""
module Example where

-- | 好
ok :: Int -> Int
ok x = x

-- | 个
misinterp :: Int -> Int
misinterp _ = (-1)

-- | 漢
failure :: Int -> Int
failure x = x-1
"""
Current versions of Haddock will output the documentation for "ok" correctly, will output an empty bulleted list as the documentation for "misinterp" and not output any documentation at all for "failure" (echoing a warning to stderr instead).
This is kind of sad. There is a very old open ticket about this issue: http://trac.haskell.org/haddock/ticket/20. The patches I've attached to that ticket fix the problem by using the native Unicode support in Alex 3. I've also attached to the ticket a patch which makes the necessary changes to GHC's build system required to build this new Haddock correctly.
Do these patches seem OK? Is it fine to insist on Alex 3? I think it was released in 2011 so I think by now we can assume that it is available on all machines that will want to build GHC.
If this patch is accepted, at some point we might want to think about switching to Alex 3's unicode support in GHC's own lexer rather than relying on the current hacks. My patches do not make any change along those lines.
Cheers,
Max
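
The three characters in the example file look chosen to hit different failure modes. The sketch below is only an illustration of one possible reading, an assumption that the thread itself does not state: that a byte-oriented lexer keeps just the low 8 bits of each code point. The names `lowByteChar` and `truncations` are hypothetical and are not part of Haddock or Alex.

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, ord)

-- | Hypothetical helper: keep only the low 8 bits of a code point,
-- as a lexer might if it truncated every 'Char' to a single byte.
-- This is an assumed model, not Haddock's actual code.
lowByteChar :: Char -> Char
lowByteChar c = chr (ord c .&. 0xFF)

-- | Each doc-comment character from the example file, paired with
-- the ASCII character its low byte corresponds to.
--   好 is U+597D, low byte 0x7D
--   个 is U+4E2A, low byte 0x2A
--   漢 is U+6F22, low byte 0x22
truncations :: [(Char, Char)]
truncations = [ (c, lowByteChar c) | c <- "好个漢" ]
```

Evaluating `truncations` in GHCi pairs 好 with `}`, 个 with `*`, and 漢 with `"`. In Haddock markup `*` starts a bulleted list item and double quotes delimit a module name, so under this assumed model the reported symptoms (an empty bulleted list for "misinterp", a failure for "failure") would at least be consistent with truncation. Whether that is the real cause is not confirmed anywhere in this thread.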

[CCing the haddock list]
On Sun, Feb 03, 2013 at 07:07:54PM +0000, Max Bolingbroke wrote:
> Do these patches seem OK? Is it fine to insist on Alex 3? I think it was released in 2011 so I think by now we can assume that it is available on all machines that will want to build GHC.
I'll leave looking at the patches to the haddock guys, but I think that it's reasonable to require that GHC developers have alex 3 now. If I understand the Haskell Platform pages correctly, the last release included alex 3.0.2, and the one before that included 3.0.1.
> If this patch is accepted, at some point we might want to think about switching to Alex 3's unicode support in GHC's own lexer rather than relying on the current hacks. My patches do not make any change along those lines.
Yes; if haddock requires alex 3.0, then GHC effectively will too, so we may as well make use of it.
Thanks
Ian

2013/2/10 Ian Lynagh
>> Do these patches seem OK? Is it fine to insist on Alex 3? I think it was released in 2011 so I think by now we can assume that it is available on all machines that will want to build GHC.
> I'll leave looking at the patches to the haddock guys, but I think that it's reasonable to require that GHC developers have alex 3 now. If I understand the Haskell Platform pages correctly, the last release included alex 3.0.2, and the one before that included 3.0.1.
Thanks Max for the patches! We're happy to accept them and require alex 3.
David

2013/2/11 David Waern
> Thanks Max for the patches! We're happy to accept them and require alex 3.
> David
Patches applied. Sorry for the delay.
David