Newbie Haskell optimization question

Hello, I am new to Haskell and I am having fun with it writing L-system generator. I was very pleased with concise syntax. It's an order of magnitude better than any C++, Lisp, etc program I've ever written or read. Doing algorithmic optimizations on my generator I ended up with 4 different versions (you can see the other 3 in the file attached). The most optimal of them is: import qualified Data.Map type LSystemElement = (Char, [Int]) type LSystem = [LSystemElement] type LSystemRules = Data.Map.Map Char (LSystemElement -> LSystem) generate'' :: LSystemRules -> LSystem -> Int -> LSystem generate'' rules axiom steps = concatMap (iterate f axiom !!) (ind !! steps) where ind = [0] : [g x | x <- ind] where g [] = [] g [x] = [x, x + 1] g xs = xs ++ g (drop (length xs `div` 2) xs) f = concatMap $ \elem -> Data.Map.findWithDefault (\x -> [x]) (fst elem) rules elem -- Tests a (_, [x, y]) | y <= 2 = [('A', [x + 2, y + 2])] | otherwise = [('B', [2]), ('A', [x - 1, y - 1])] b (_, [x]) | x <= 2 = [('C', [])] | otherwise = [('B', [x - 1])] rules = Data.Map.fromList [('A', a), ('B', b)] axiom = [('A', [2, 2])] Now I am thinking of doing some language level optimizations. Can you give me any (random) directions? Any style recommendations are welcome as well. Cheers. -- Slavomir Kaslev

Slavomir Kaslev wrote:
... g xs = xs ++ g (drop (length xs `div` 2) xs) ...
Now I am thinking of doing some language level optimizations.
As a starter, the argument of g in the above right-hand side traverses xs twice, once to compute the length, then to drop an appropriate number of elements (so actually traverses xs once and a half times). The fullowing funny function could be used instead: h (_:_:xs) (_:ys) = h xs ys h _ ys = ys Then "h xs xs" is the same as "drop (length xs `div` 2) xs". It also kind of traverses xs once and a half times, but in one go. It might also be better in terms of space consumption behavior, because elements in the first half of xs can be reclaimed earlier than in the "drop (length xs `div` 2) xs" case, where the full list xs needs to be kept around until its length is computed and divided by two. A nice exercise: try to provide a function h' in a similar style as above, but in the same traversal computing also the "xs ++" part of the above right-hand side for g, i.e., "h' xs xs" gives "xs ++ g (drop (length xs `div` 2) xs)" without a third traversal of xs being necessary. Have fun, Janis. -- Dr. Janis Voigtlaender http://wwwtcs.inf.tu-dresden.de/~voigt/ mailto:voigt@tcs.inf.tu-dresden.de

generate'' :: LSystemRules -> LSystem -> Int -> LSystem generate'' rules axiom steps = concatMap (iterate f axiom !!) (ind !! steps) where ind = [0] : [g x | x <- ind] where g [] = [] g [x] = [x, x + 1] g xs = xs ++ g (drop (length xs `div` 2) xs) f = concatMap $ \elem -> Data.Map.findWithDefault (\x -> [x]) (fst elem) rules elem
I absolutely don't know what (ind) (hereby called the "index list") means, but it can be improved (slightly, see note at the end). Currently, you define ind = iterate g [0] (this is exactly the same as [0] : [g x | x <- ind]) and the first iterations of g yield [0] [0, 1] [0, 1, 1,2] [0, 1, 1,2, 1,2,2,3] [0, 1, 1,2, 1,2,2,3, 1,2,2,3,2,3,3,4] [0, 1, 1,2, 1,2,2,3, 1,2,2,3,2,3,3,4, 1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5] ... Clearly, g xs = xs ++ map (+1) xs but your recursion scheme is more clever and exploits the fact that g is iterated and most of the additions (+1) already have been calculated. Yet it is unnecessary to generate all intermediate lists: by starting the iteration at [1], you can generate all "differences": iterate g [1] == [[1], [1,2], [1,2,2,3], [1,2,2,3,2,3,3,4], ... and get the equivalent to (ind !! steps): ind steps = 0 : (concat . take steps . iterate g $ [1]) Concerning (drop (length xs `div` 2)), you can carry (n = length xs) around so that you don't have to recalculate the length everytime. Alternatively, you can even make the recursion structure explicit and avoid the rescan involved with (drop) entirely data List a = Zero | One a | List a :++: List a g x@(One k) = x :++: One (k + 1) g xs@(_ :++: xs') = xs :++: g xs' flatten' Zero = id flatten' (One x) = (x :) flatten' (xs :++: ys) = (xs ++) . (ys ++) flatten = flip flatten' [] ind steps = 0 : (flatten . foldr1 (:++:) . take steps . iterate g $ One 1) Going even further, you can fuse the lindenmaier iteration into this (starting the iteration of g at 0 again): generate rules axiom steps = flatten . flip (!!) (steps + 1) . iterate g $ Zero where g Zero = One axiom g x@(One lsys) = x :++: One (f lsys) g xs@(_ :++: xs') = xs :++: g xs' f = ... Look, the Ints are gone! (k + 1) only meant (f k)! The data structures are lightweight and now perfectly fit the structure of the calculation. This is the best you can do, for all languages: In C, the most native data structure is (int) and pointer arithmetic is the way to win. In LISP, lists are favored above all else. And in Haskell, all algebraic data types are native and together with laziness, they pave the road to speed. As a last note, all these things are only able to improve a constant factor (measured relative to the exponential length of the outcoming index list) and only benchmarks can show which ones really improve things. Constant factors are quite sensitive on how the code looks like. Of course, "ghc -O2" is the best constant factor improver known :) Regards, apfelmus PS: You can even improve upon the logarithmic factor that comes from repeatedly using Data.Map by decorating each LSystemElement with its rule: data LSystemElement' = LSE LSystemElement LSystem' type LSystem' = [LSystemElement'] a l@[x,y] = LSE ('A',l) $ if y <= 2 then [a [x + 2, y + 2]] else [b [2], a [x - 1, y - 1]] b l@[x] = LSE ('B',l) $ if x <= 2 then [c []] else [b [x - 1]] c [] = LSE ('C',[]) [c] axiom = [a [2,2]] generate = ... where ... f = concatMap (\(LSE _ s) -> s) This decoration can be derived from LSystemRules if you want to keep your traditional rules.
participants (3)
-
apfelmus@quantentunnel.de
-
Janis Voigtlaender
-
Slavomir Kaslev