
Hi, This is somewhat related to an earlier post about streaming by Jorge, and to a similar post by me a bunch of months back regarding multiple ways of writing a similar function. Many times when processing data which you need to process in more than once way, you have two options: (1) process it in one go and store intermediate results as, eg, tuples; (2) process it in more than one go. I've been told that the latter is the more "haskelly" way to do it, but I don't completely buy that. Especially in cases like Jorge's, since the data will not be able to be garbage collected between runs and thus the whole program will require a much larger stack/heap/whatever. In either case, consider the following (semantically equivalent) functions:
sumMinusOnce l = foldr sumMinus (0,0) l where sumMinus c (a,b) = (a+c, c-b)
sumMinusTwice l = (foldr (+) 0 l, foldr (-) 0 l)
The first takes one pass over the data, while the second takes two. Now, we can give these functions various type signatures, Int, Integer, Float or Double (more, of course, but these are the basics). When the list is [1..250000], the timing results look like: | ONCE | TWICE --------+------------------------+------------------------ INT | 1.81u 0.18s 0:02.07 | 1.84u 0.15s 0:01.99 INTEGER | 9.77u 0.28s 0:10.29 | 5.07u 0.07s 0:05.23 FLOAT | 8.60u 0.30s 0:09.08 | 3.79u 0.16s 0:04.07 DOUBLE | 9.50u 0.25s 0:09.82 | 4.25u 0.16s 0:04.44 Here, we see that for everything but INT, the ONCE version is about two times slower. If we pump the input size up to [1..1000000], then the times for INT become INT | 16.38u 0.66s 0:17.22 | 20.75u 0.34s 0:21.33 Here, there's a much smaller difference in speed. For the other versions (say DOUBLE), the ONCE version is consistently two times slower. This raises the question as to why. Looking at the -ddump-simpl output, we can see why. In the sumMinusOnce version, when given an Int type signature, the tuple gets unboxed and the primitive GHC.Prim.+# and GHC.Prim.-# functions are used. In the one where we've given a Double type signature, we don't get this. The tuple still gets unboxed, but the non-unboxed functions GHC.Float.plusDouble and GHC.Float.minusDouble functions are used instead. Perhaps more importantly, the Double version is left as a recursive call, looking something like:
Main.$wgo = \w -> case w of (y:ys) -> case Main.$wgo ys of (# ww1, ww2 #) -> (# plusDouble ww1 y, minusDouble y ww2) #) [] -> (# 0, 0 #)
(converted by hand from simpl-core to a more Haskelly syntax) However, the Int version becomes a very tight loop in which the list is deforested. My understanding was that using list comprehensions like [1..100000] together with foldr you would always get build/fold deforestation, but this only seems to happen when the list is a list of Ints, not a list of Doubles. If we change the Once function to:
i2d :: Int -> Double i2d = fromInteger.toInteger
sumMinusOnce :: [Int] -> (Double, Double) sumMinusOnce l = foldr sumMinus (0,0) l where sumMinus c (a,b) = (a+i2d c, (i2d c)-b)
then we do get full deforestation as we want. This new version, when run with [1..250000], yields a time of 4.12u 0.23s 0:04.44 which is actually faster than the TWICE/DOUBLE version. So...why is GHC more aggressive with Ints than with Doubles and what are the additional conditions under which fold/build deforestation occurs? - Hal -- Hal Daume III "Computer science is no more about computers | hdaume@isi.edu than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume
participants (1)
-
Hal Daume III