
Hi, I'm just a beginner trying to learn a little about Haskell, and as such write some toy programs (e.g. for projecteuler.net) in Haskell. Currently, I'm experiencing what I would call "strange behaviour": I've got a data-type data Fraction = Fraction Int Int to hold rational numbers (maybe there's already some built-in type for this in Haskell, much like for instance Scheme has a rational type?), and then I compute a list of pairs of those numbers, that is [(Fraction, Fraction)]. Fraction is declared an instance of Ord. This list has up to 3 million elements. If I do main = print $ length $ points then the program prints out the length fine and takes around 6s to finish (compiled with GHC -O3). Ok, but I acknowledge that length isn't quite an expensive function, as I can imagine that Haskell does not compute and hold the entire list for this but instead each element at once and discards it afterwards. Doing main = print $ length $ map (\(x, _) -> x == Fraction 1 2) points instead, gives a slightly longer runtime (6.6s), but in this case I'm sure that at least each element is computed; right? main = print $ length $ reverse points gives 11.9s, and here I guess (?) that for this to work, the entire list is computed and hold in memory. However, trying to do import List main = print $ length $ sort points makes memory usage go up and the program does not finish in 15m, also spending most time waiting for swapped out memory. What am I doing wrong, why is sort this expensive in this case? I would assume that computing and holding the whole list does not take too much memory, given its size and data type; doing the very same calculation in C should be straight forward. And sort should be O(n * log n) for time and also not much more expensive in memory, right? Am I running into a problem with lazyness? What can I do to avoid it? As far as I see it though, the reverse or map call above should do nearly the same as sort, except maybe that the list needs to be stored in memory as a whole and sort has an additional *log n factor, but neither of those should matter. What's the problem here? Is this something known with sort or similar functions? I couldn't find anything useful on Google, though. My code is below, and while I would of course welcome critics, I do not want to persuade anyone to read through it. Thanks a lot, Daniel ---------------------------------------------------------- -- Problem 165: Intersections of lines. import List -- The random number generator. seeds :: [Integer] seeds = 290797 : [ mod (x * x) 50515093 | x <- seeds ] numbers :: [Int] numbers = map fromInteger (map (\x -> mod x 500) (tail seeds)) -- Line segments, vectors and fractions. data Segment = Segment Int Int Int Int data Vector = Vector Int Int data Fraction = Fraction Int Int instance Eq Fraction where (Fraction a b) == (Fraction c d) = (a == c && b == d) instance Ord Fraction where compare (Fraction a b) (Fraction c d) | (a == c && b == d) = EQ | b > 0 = if a * d < b * c then LT else GT | otherwise = if a * d > b * c then LT else GT -- Build a normalized fraction and get its value. normalize (Fraction a b) = let g = gcd a b; aa = div a g; bb = div b g in if bb < 0 then Fraction (-aa) (-bb) else Fraction aa bb -- Find the inner product of two vectors. innerProduct (Vector a b) (Vector c d) = a * c + b * d -- Find the normal vector and direction of a line segment, as well as -- the constant in straight-normal form for a given normal vector. normalVector (Segment x1 y1 x2 y2) = Vector (y1 - y2) (x2 - x1) direction (Segment x1 y1 x2 y2) = Vector (x2 - x1) (y2 - y1) nfConstant (Segment x1 y1 x2 y2) n = innerProduct n (Vector x1 y1) -- Check if a point is between the ends of the segment times D. betweenEndsTimes d (Segment x1 y1 x2 y2) xD yD = let x1D = x1 * d; x2D = x2 * d; y1D = y1 * d; y2D = y2 * d; xDMin = min x1D x2D; xDMax = max x1D x2D; yDMin = min y1D y2D; yDMax = max y1D y2D in (xDMin <= xD && xDMax >= xD && yDMin <= yD && yDMax >= yD && (x1D /= xD || y1D /= yD) && (x2D /= xD || y2D /= yD)) -- If they are not parallel, we can find their intersection point (at least, -- the one it would be if both were straights). Then it is easy to check if -- it is between the endpoints for both. -- -- n1 * x + m1 * y = c1 -- n2 * x + m2 * y = c2 -- -- => x = (c1 * m2 - x2 * m1) / (n1 * m2 - n2 * m1) -- => y = (n1 * c2 - n2 * c1) / (n1 * m2 - n2 * m1) -- -- (Iff they are parallel, the determinant will be 0.) trueIntersect s1 s2 = let (Vector n1 m1) = normalVector s1; (Vector n2 m2) = normalVector s2; c1 = nfConstant s1 (Vector n1 m1); c2 = nfConstant s2 (Vector n2 m2); d = n1 * m2 - n2 * m1; xD = c1 * m2 - c2 * m1; yD = n1 * c2 - n2 * c1 in if d == 0 then Nothing else if (betweenEndsTimes d s1 xD yD) && (betweenEndsTimes d s2 xD yD) then Just ((normalize $ Fraction xD d), (normalize $ Fraction yD d)) else Nothing -- Build list of segments. takeEveryForth :: [Int] -> [Int] takeEveryForth (a:_:_:_:t) = a : (takeEveryForth t) n1 = numbers n2 = tail n1 n3 = tail n2 n4 = tail n3 segments = [ Segment a b c d | ((a, b), (c, d)) <- zip (zip (takeEveryForth n1) (takeEveryForth n2)) (zip (takeEveryForth n3) (takeEveryForth n4)) ] -- For the first 5000 segments, calculate intersections. firstSegments = take 5000 segments intersects :: [Maybe (Fraction, Fraction)] intersects = findInters firstSegments [] where findInters [] l = l findInters (h:t) l = findInters t (addInters h t l) where addInters _ [] l = l addInters e (h:t) l = addInters e t ((trueIntersect e h) : l) getPoints :: [Maybe (Fraction, Fraction)] -> [(Fraction, Fraction)] getPoints [] = [] getPoints (Nothing : t) = getPoints t getPoints ((Just v) : t) = v : (getPoints t) points = getPoints intersects -- Main program. main :: IO () main = print $ length $ reverse points --main = print $ length $ map (\(x, _) -> x == Fraction 1 2) points --main = print $ length $ sort points