
#15646: ghci takes super long time to find the type of large fractional number -------------------------------------+------------------------------------- Reporter: Johannkokos | Owner: | JulianLeviston Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: GHCi | Version: 8.4.3 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): So, to summarize the problem here: - Semantics of a floating point literal (which is in our case in `XeY` form) is defined as `fromRational (n % d)`. - So we need to find and `n` and `d` such that `fromRational (n % d)` gives us our float. - As per Haskell 2010 chapter 6.4.1 (note that section 2.5 only gives the syntax, not the type) `fromRational` has type `Fractional a => Rational -> a` so `fromRational (n % d)` has type `Fractional a => a` - So `XeY` form always has type `Fractional a => a` ! - Currently we tokenize an `XeY` we call `readRational__` on the string `"XeY"`, which returns this: `(n%1)*(10^^(k-d))`. `n`, `k` and `d` are parsed from the input so this is very cheap (in our example `n` is `X`, `k` is `Y`, `d` is `0`). - However, evaluating `(10^^_) :: Rational` part of the expression takes long time and uses a lot of memory. Indeed, as comment:10 says, if you have 1000000000 as the exponent then representation of this number takes GBs of memory so this is expected. - comment:12 asks why rendering `Infinity` takes this long: that's because we generate the `Infinity` value during desugaring (after parsing), and for this we have to first generate the `Rational` value for the literal first (which is what is taking all the time and using all the memory). You can observe this if you run ghci with `-fdump-parsed -fdump-ds` and evaluate `1e100000 :: Float`. Perhaps it makes sense to distinguish these two problems: - Type checking is too slow - Generating `Infinity :: Float` is too slow (a) in comment:10 fixes (1) but not (2). (c) in comment:10 fixes (1), and it may also fix (2) if we could somehow use this new constructor fields to return `Infinity` without calculating `10^^e :: Integer` first (as far as I understand the new consturctor will look like: `ExpRational n e -- stands for (n%1)*(10^^e)`). However, (c) means a change in a public type, and we'd need to evaluate performance changes etc. caused by turning a product type to a sum type (more optimizations apply to product types than sum types! at least until we improve demand analysis for sum types and use unboxed sums in WW). Also, I'm not sure if adding one more constructor to a library type just for this is a good idea. We'd need to think about how/whether to expose this to users (perhaps we do want to expose it in `Data.Ratio` somehow). This means long discussions and bikeshedding etc. What to do? I think delaying evaluation `readRational` until it's needed by making some code lazier would work. Looking at the lexer, we generate a `ITrational` for this literal like this: {{{ tok_float str = ITrational $! readFractionalLit str readFractionalLit :: String -> FractionalLit readFractionalLit str = ((FL $! (SourceText str)) $! is_neg) $! readRational str where is_neg = case str of ('-':_) -> True }}} Notice how we're very deliberately strict here. My guess is that simply removing strictness around `readRational` (e.g. the `$!` in `readFractionalLit`) may fix this. If it doesn't then some other code down the line needs to be made lazier too. However I'm not sure if this causes other problems (the fact that the thunk for `readRational str` will keep the `str` alive is not a problem because the `str` only holds the string for the literal). We'll have to evaluate this change in strictness somehow. If we follow this route then we'd also need to add a test to make sure typing this expression remains fast (strictness properties are hard to maintain ...). Here's another idea that is like (c) but does not need changes in a public type: duplicate the `Ratio` type so that only the version used by GHC has the special constructor. The code gen and `Data.Ratio` will still use the current type. Now we can use the special constructor in lexing to avoid evaluating a huge `Integer`, and only when desugaring we do some kind of evaluation (this is where we can perhaps be smart and generate `Infinity` efficiently, or in the worst case we end up with the current situation, but with fast type checking). None of these solutions strike me as particularly satisfying though... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15646#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler