Int-based lexer flag bitmask exhausted

Hello Simon (et al.),

While doing #9224[1] as a finger-exercise to extend the lexer to support base-2 integer literals, I got stuck on the lexer extension map being represented as an 'Int', which (in GHC) is only guaranteed to hold at least 32 bits.

    -- for reasons of efficiency, flags indicating language extensions (eg,
    -- -fglasgow-exts or -XParallelArrays) are represented by a bitmap
    -- stored in an unboxed Int

However, as all 32 bits are already taken up by language extensions, and I'd need a 33rd bit, I'm wondering how to proceed. Can we replace the 'Int' by an 'Int64' (or even better a 'Word64', ideally with a newtype or at least a type-synonym around it?), which would give us a bit more headroom while being semantically sound even for 'bitSize Int == 32'?

[1]: https://ghc.haskell.org/trac/ghc/ticket/9224

Cheers,
hvr
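To make the problem and the proposed fix concrete, here is a minimal sketch. The names ExtsBitmap, emptyBitmap, enableBit, bitEnabled and lostIn32 are hypothetical and chosen for illustration only; they are not taken from GHC's lexer. It shows a Word64 wrapped in a newtype, plus what happens to a 33rd flag (bit index 32) when the representation is only 32 bits wide.

```haskell
import Data.Bits (setBit, shiftL, testBit)
import Data.Int (Int32)
import Data.Word (Word64)

-- Hypothetical wrapper: the bitmap is always 64 bits wide,
-- independent of the platform's 'Int' size.
newtype ExtsBitmap = ExtsBitmap Word64
  deriving (Eq, Show)

-- A bitmap with no flags set.
emptyBitmap :: ExtsBitmap
emptyBitmap = ExtsBitmap 0

-- Set the flag at the given bit index (0..63).
enableBit :: Int -> ExtsBitmap -> ExtsBitmap
enableBit i (ExtsBitmap w) = ExtsBitmap (setBit w i)

-- Test the flag at the given bit index (0..63).
bitEnabled :: Int -> ExtsBitmap -> Bool
bitEnabled i (ExtsBitmap w) = testBit w i

-- For comparison: in a 32-bit representation (Int32 stands in here for
-- a 32-bit 'Int'), shifting a bit to index 32 pushes it out of range,
-- so the flag is silently lost.
lostIn32 :: Int32
lostIn32 = 1 `shiftL` 32
```

With these definitions, bitEnabled 32 (enableBit 32 emptyBitmap) holds on any platform, whereas lostIn32 is zero. The newtype also keeps bitmap values from being mixed up with ordinary integers.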

As long as you check the performance impact on 32-bit, sounds good to me.

Edward

Excerpts from Herbert Valerio Riedel's message of 2014-06-21 11:09:44 +0100:
> [...]

On 2014-06-21 at 13:35:28 +0200, Herbert Valerio Riedel wrote:
> On 2014-06-21 at 13:09:31 +0200, Edward Z. Yang wrote:
> > As long as you check the performance impact on 32-bit, sounds good to me.
>
> What's the currently accepted/canonical lexer/parser benchmark to evaluate that?

PS: I've submitted a preparatory refactoring as https://phabricator.haskell.org/D23 for review, which allows upgrading to Word64 by simply modifying one source line.
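
One way such a one-line upgrade can work is sketched below. This is an assumption about the general shape of the refactoring, not the contents of D23: the representation type is named in a single type synonym, and all other code goes through small helpers. The helper names (xbit, xtest, mkBitmap) and the three example flags are illustrative.

```haskell
import Data.Bits (bit, testBit, (.|.))
import Data.Word (Word64)

-- The only line that names the representation; widening or narrowing
-- the bitmap means editing just this synonym.
type ExtsBitmap = Word64

-- Illustrative flags; the constructor's position is its bit index.
data ExtBits
  = FfiBit
  | ParrBit
  | BinaryLiteralsBit
  deriving (Eq, Show, Enum, Bounded)

-- Bitmap with exactly one flag set.
xbit :: ExtBits -> ExtsBitmap
xbit = bit . fromEnum

-- Is the given flag set in the bitmap?
xtest :: ExtBits -> ExtsBitmap -> Bool
xtest ext bm = testBit bm (fromEnum ext)

-- Build a bitmap from a list of enabled flags.
mkBitmap :: [ExtBits] -> ExtsBitmap
mkBitmap = foldr (\ext acc -> xbit ext .|. acc) 0
```

Because the helpers only rely on the Bits and Num instances, no caller needs to change when the synonym does.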

I'm all for it. Just need to watch for performance regressions.

Simon

| -----Original Message-----
| From: Herbert Valerio Riedel [mailto:hvriedel@gmail.com]
| Sent: 21 June 2014 11:10
| To: Simon Peyton Jones
| Cc: ghc-devs
| Subject: Int-based lexer flag bitmask exhausted
|
| [...]

participants (3)
- Edward Z. Yang
- Herbert Valerio Riedel
- Simon Peyton Jones