
Hi Daniel,

Thank you very much for the explanation of this issue.

While I understand the parts about rewrite rules and the big thunk, it is still not clear to me why it is the way it is.

Could you please explain which Nums are not strict? The ones I am aware of are all strict.

Also, why doesn't it require building the full thunk for non-strict Nums? Even if they are not strict, an addition requires both parts to be evaluated. This means the thunk will have to be pre-built, doesn't it?

With kind regards,
Denys

On Monday 14 June 2010 16:25:06, Serge D. Mechveliani wrote:
Dear people and GHC team,
I have a naive question about the compiler and library of ghc-6.12.3. Consider the program

  import List (genericLength)
  main = putStr $ shows (genericLength [1 .. n]) "\n"
         where
         n = -- 10^6, 10^7, 10^8 ...

(1) When it is compiled under -O, it runs in a small constant space in n and in a time approximately proportional to n.
(2) When it is compiled without -O, it uses stack proportional to n at run time, and it takes an enormously long time for n >= 10^7.
(3) In the interpreter mode ghci, `genericLength [1 .. n]' takes as much resource as (2).

Are the points (2) and (3) natural for a Haskell implementation?
Independently of whether lng is inlined or not, its lazy evaluation is, probably, like this:

  lng [1 .. n] =
  lng (1 : (list 2 n)) =  1 + (lng $ list 2 n) =
  1 + (lng (2: (list 3 n))) = 1 + 1 + (lng $ list 3 n) =
  2 + (lng (3: (list 4 n)))   -- because this "+" is of Integer
  = 2 + 1 + (lng $ list 4 n) =
  3 + (lng $ list 4 n)
  ...

And this takes a small constant space.
Unfortunately, it would be

  lng [1 .. n]
  ~> 1 + (lng [2 .. n])
  ~> 1 + (1 + (lng [3 .. n]))
  ~> 1 + (1 + (1 + (lng [4 .. n])))
  ~>

and that builds a thunk of size O(n).
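
For a strict type such as Integer, the nested chain can be avoided by hand with an accumulator that is forced at every step. The following is only a minimal sketch: the name strictCount is made up for illustration and is not part of the library, and foldl' from Data.List is swapped in for a hand-written loop.

```haskell
import Data.List (foldl')

-- | Count the elements of a list with a strict accumulator.
-- 'strictCount' is a hypothetical name used only for this sketch.
-- foldl' forces the accumulator to weak head normal form at each step,
-- which for Int and Integer means the running total is fully evaluated,
-- so no chain of pending additions is built.
strictCount :: Num i => [b] -> i
strictCount = foldl' (\acc _ -> acc + 1) 0
```

In ghci, something like `strictCount [1 .. 10^6] :: Integer` should then run without the O(n) stack, since it does not rely on any optimisation pass.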
The thing is, genericLength is written so that for lazy number types, the construction of the result can begin before the entire list has been traversed. This means however, that for strict number types, like Int or Integer, it is woefully inefficient.
In the code above, the result type of genericLength (and the type of list elements) is defaulted to Integer. When you compile with optimisations, a rewrite rule fires:

-- | The 'genericLength' function is an overloaded version of 'length'.  In
-- particular, instead of returning an 'Int', it returns any type which is
-- an instance of 'Num'.  It is, however, less efficient than 'length'.
genericLength           :: (Num i) => [b] -> i
genericLength []        =  0
genericLength (_:l)     =  1 + genericLength l

{-# RULES
  "genericLengthInt"     genericLength = (strictGenericLength :: [a] -> Int);
  "genericLengthInteger" genericLength = (strictGenericLength :: [a] -> Integer);
 #-}

strictGenericLength     :: (Num i) => [b] -> i
strictGenericLength l   =  gl l 0
                        where
                           gl [] a     = a
                           gl (_:xs) a = let a' = a + 1 in a' `seq` gl xs a'

which gives a reasonably efficient constant-space calculation.

Without optimisations and in ghci, you get the generic code, which is slow and takes O(n) space.
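
To illustrate what a lazy number type can look like, here is a small sketch with unary (Peano) naturals. The type Nat, its Num instance and the helper atLeast are all made up for this example and do not come from the library. The point is that (+) on Nat can return a Succ constructor after looking only at its first argument, so the consumer can start inspecting the result of genericLength before the rest of the list has been traversed.

```haskell
import Data.List (genericLength)

-- Hypothetical lazy natural numbers, for illustration only.
data Nat = Zero | Succ Nat

instance Num Nat where
  Zero   + n = n
  Succ m + n = Succ (m + n)   -- yields a constructor without forcing n
  Zero   * _ = Zero
  Succ m * n = n + m * n
  fromInteger k
    | k <= 0    = Zero
    | otherwise = Succ (fromInteger (k - 1))
  abs = id
  signum Zero = Zero
  signum _    = Succ Zero
  negate _ = error "Nat: negate is not defined"

-- | True if the number is at least k; inspects at most k constructors.
atLeast :: Int -> Nat -> Bool
atLeast k _ | k <= 0 = True
atLeast _ Zero       = False
atLeast k (Succ n)   = atLeast (k - 1) n

-- Terminates although the list is infinite: only the first three
-- cells of the list are ever demanded.
longerThanTwo :: Bool
longerThanTwo = atLeast 3 (genericLength [1 :: Integer ..])
```

With Int or Integer the same question could only be answered after walking the whole list, because (+) on those types needs both arguments fully evaluated before it can return anything.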
Thank you in advance for your explanation,
-----------------
Serge Mechveliani
mechvel@botik.ru
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users