Non-ASCII characters in .cabal files break Haddock?

I recently uploaded a package to Hackage[1], which contains a non-ASCII character in the .cabal file. That file is encoded with UTF-8, and it displays fine in Vim, Firefox, and even the Hackage package landing page. However, Haddock is failing to build it[2] -- apparently it can't handle UTF-8 encoded text properly.

In documentation sections, Haddock supports special syntax for escaping non-ASCII. Is there any equivalent syntax for Cabal files?

[1] http://hackage.haskell.org/package/natural-sort
[2] http://hackage.haskell.org/packages/archive/natural-sort/0.1/logs/failure/gh...

> I recently uploaded a package to Hackage[1], which contains a non-ASCII character in the .cabal file. That file is encoded with UTF-8, and it displays fine in Vim, Firefox, and even the Hackage package landing page. However, Haddock is failing to build it[2] -- apparently it can't handle UTF-8 encoded text properly.
> In documentation sections, Haddock supports special syntax for escaping non-ASCII. Is there any equivalent syntax for Cabal files?
What happens if you use the Haddock syntax in the cabal file (description field)?

It's not even certain that Haddock is choking on UTF-8 - it might be some other encoding. The description from the cabal file is written to a temp file, and this is passed to haddock with the --prologue option. In your log output we can see:

  /usr/local/bin/haddock --use-contents=/package/natural-sort-0.1 --prologue=dist/doc/html/natural-sort/haddock-prolog17177.txt ...
  haddock: internal Haddock or GHC error: dist/doc/html/natural-sort/haddock-prolog17177.txt: hGetContents: invalid argument (Invalid or incomplete multibyte or wide character)

So it is actually hGetContents that is choking on the haddock-prolog17177.txt file. That might well depend on the default encoding for the shell that runs the build. What happens on your own machine?

Alistair
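
To illustrate the point about hGetContents: in GHC's System.IO, a text-mode handle decodes bytes using the handle's encoding, which by default comes from the locale of the process. The sketch below is not Haddock's or Cabal's code. The helpers readFileUtf8 and writeFileUtf8, the file name prologue-demo.txt, and the sample text are all hypothetical; it only shows how setting the encoding explicitly makes a read independent of the shell environment.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: read a file as UTF-8 regardless of the locale.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  s <- hGetContents h
  -- hGetContents is lazy; force the whole string before withFile closes h.
  _ <- evaluate (length s)
  return s

-- Hypothetical helper: write a file as UTF-8 regardless of the locale.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path text = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h text

main :: IO ()
main = do
  -- The encoding that handles get by default; it depends on the environment.
  print localeEncoding
  let path = "prologue-demo.txt"   -- assumed scratch file name
  writeFileUtf8 path "na\239ve sort"
  s <- readFileUtf8 path
  -- print shows the string with escapes, so stdout stays ASCII-only.
  print s
```

Running the same program under different locale settings should change only the first line of output; without the hSetEncoding call in the reader, a non-UTF-8 locale could fail with an error like the one in the log.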

On Tue, Jan 19, 2010 at 01:00:07PM -0800, John Millikin wrote:
> I recently uploaded a package to Hackage[1], which contains a non-ASCII character in the .cabal file. That file is encoded with UTF-8, and it displays fine in Vim, Firefox, and even the Hackage package landing page. However, Haddock is failing to build it[2] -- apparently it can't handle UTF-8 encoded text properly.
It seems that openTempFile writes the text without any encoding (#3832). Haddock quite reasonably fails because it expects encoded text.
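
A minimal sketch of the writer side, assuming the goal is simply that the temp file's bytes are UTF-8 no matter how the build environment is configured. This is not Cabal's actual code and not the fix for #3832; writePrologue and the file name template are hypothetical, and getTemporaryDirectory and removeFile come from the directory package.

```haskell
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical helper: write text to a fresh temp file, encoded as UTF-8.
writePrologue :: FilePath -> String -> IO FilePath
writePrologue dir text = do
  (path, h) <- openTempFile dir "haddock-prolog.txt"
  -- Set the encoding explicitly instead of relying on the handle's default.
  hSetEncoding h utf8
  hPutStr h text
  hClose h
  return path

main :: IO ()
main = do
  dir <- getTemporaryDirectory
  path <- writePrologue dir "na\239ve"
  -- Inspect the raw size: U+00EF takes two bytes in UTF-8, so 5 characters
  -- become 6 bytes.
  size <- withBinaryFile path ReadMode hFileSize
  print size
  removeFile path
```

The reader still has to decode with the same encoding, so writing UTF-8 only helps if the consuming program reads UTF-8 as well.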
> In documentation sections, Haddock supports special syntax for escaping non-ASCII. Is there any equivalent syntax for Cabal files?
The Description field is haddock markup, so that should be a workaround. (or you could use more naive spelling)
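
As a sketch of that workaround: Haddock markup accepts SGML-style numeric character references of the form &#D; (decimal) and &#xH; (hexadecimal), so a description can be kept ASCII-only by replacing each non-ASCII character with its reference. The function escapeNonAscii and the sample word are hypothetical, not taken from the package in question.

```haskell
import Data.Char (isAscii, ord)

-- Hypothetical helper: replace every non-ASCII character with a decimal
-- numeric character reference that Haddock markup understands.
escapeNonAscii :: String -> String
escapeNonAscii = concatMap escape
  where
    escape c
      | isAscii c = [c]
      | otherwise = "&#" ++ show (ord c) ++ ";"

main :: IO ()
main = putStrLn (escapeNonAscii "na\239ve")  -- prints: na&#239;ve
```

The escaped text can then be pasted into the description field, leaving the .cabal file pure ASCII.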
participants (3)
- Bayley, Alistair
- John Millikin
- Ross Paterson