
#10907: GHC fails to read file with byte-order mark when LANG=C
--------------------------------------------+--------------------------------------------
        Reporter:  RyanGlScott              |                   Owner:
            Type:  bug                      |                  Status:  new
        Priority:  normal                   |               Milestone:
       Component:  Compiler (Parser)        |                 Version:  7.10.2
      Resolution:                           |                Keywords:
Operating System:  Linux                    |            Architecture:  x86_64 (amd64)
 Type of failure:  GHC doesn't work at all  |               Test Case:
      Blocked By:                           |                Blocking:
 Related Tickets:  #6016, #6037             |  Differential Revisions:
--------------------------------------------+--------------------------------------------

Comment (by nomeata):

 The problem seems to be `skipBOM` in `StringUtils.hs`, which switches the
 handle to text mode so that `hLookAhead` can consume the whole BOM instead
 of just its first byte. But in text mode, decoding depends on the locale.

 At first I thought it would make sense to stay in binary mode, but then
 `hLookAhead` returns just one byte, which is not enough to detect a BOM.
 Using `hGetChar` twice would help, but if there is no BOM, we’d have to
 rewind. Are we sure we can `hSeek` on every handle we need to support?

 A `Word16` encoding would help. Or maybe it works well enough to force
 UTF-8 for this single `hLookAhead`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10907#comment:6
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
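
For illustration, the last idea in the comment above (forcing UTF-8 for the single `hLookAhead`) might look roughly like the sketch below. This is not GHC's actual `skipBOM`. The name `skipBOMSketch` is hypothetical, and the code assumes the handle was opened in binary mode and is still positioned at the start of the file. It uses `mkUTF8 IgnoreCodingFailure` from `base` so that a file starting with bytes that are not valid UTF-8 does not make the lookahead throw a decoding error, and it puts the handle back into binary mode afterwards.

```haskell
import Control.Exception (bracket_)
import GHC.IO.Encoding.Failure (CodingFailureMode (IgnoreCodingFailure))
import GHC.IO.Encoding.UTF8 (mkUTF8)
import System.IO

-- Hypothetical helper, not GHC's real skipBOM.
-- Assumption: the handle is in binary mode and at offset 0.
-- Returns True if a UTF-8 byte-order mark was found and consumed.
skipBOMSketch :: Handle -> IO Bool
skipBOMSketch h = do
  eof <- hIsEOF h
  if eof
    then pure False
    else
      -- Force a locale-independent UTF-8 decoder for the lookahead only,
      -- then restore binary mode whether or not a BOM was present.
      bracket_ (hSetEncoding h lenientUtf8) (hSetBinaryMode h True) $ do
        c <- hLookAhead h
        if c == '\xFEFF'
          then True <$ hGetChar h -- consume the BOM (bytes EF BB BF)
          else pure False
  where
    -- UTF-8 decoder that skips invalid bytes instead of throwing.
    lenientUtf8 = mkUTF8 IgnoreCodingFailure
```

Because the encoding is set explicitly, the result should not depend on `LANG`. If no BOM is present, nothing is consumed, so no `hSeek` is needed to rewind.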