
Hello Dinh, Friday, July 08, 2005, 9:12:22 PM, you wrote: DTTA> Another question, it's said in the book that using cyclic structure (like DTTA> ones = 1:ones) , the list would be represented by a fixed amount of memory. DTTA> Does it mean [1,1,1......] only occupy one cell of memory ? DTTA> How about in " take 100 [1,1,...] " ? in order to understand how Haskell datastructures uses memory, you must remember that Haskell does LAZY evaluation. it means that any data represented as CLOSURE - pointer to function which upon call will compute this data, or VALUE - computed data itself. LAZY avaluation means that all data computed only when needed - in the time of USING them instead of time of ASSIGNMENT as in all other STRICT languages (Pascal, C++, ML and so on). when the data used FIRST TIME, function stored in its closure is evaluated and computed result replaces pointer to function if computed result is not a plain value (Int/Double/Char), but a complex structure (list, tree and so on) it will be not computed entirely, but only the highest level of this structure. all the references to substructures will be represented by the closures - again. so one function call (CLOSURE) representing all the datastructure after evaluation will be replaced by cell which represents highest level of structure and contains references to closures representing next level and these closures will remain unevaluated until the time they are actually used when you use (consume) data in list, you can do it only sequentially. each evaluated list cell contains two links - to value in this list element and to tail of list (it is the list again). when you consume next list element, closure that represents it is evaluated and replaced by cell containing two abovementioned links. when you define "ones=1:ones" this variable will contain unevaluated closure, which would call function "(:)" with two arguments - "1" and "ones". when you start consuming "ones", this closure will be evaluated and replaced by its result - cell containing links to "1" and to closure representing "ones", i.e. this cell itself! so, subsequent consuming of next list elements will goes thorough this cyclic structure without any additional evaluations "take 100 ones" itself will occupy just one memory cell (of 3 elementary values - link to function and to its 2 arguments) - because it, like anything else, represented as CLOSURE. when you goes to consume this value, you will do it sequentially. let's see at take definition: take 0 xs = [] take n (x:xs) = x : take (n-1) xs when you consume first element of "take 100 ones", this function call will be replaced by "1:take 99 ones", i.e. cell which contain reference to "1" and to closure "take 99 ones". as you see, we can't create cycle here just because "take 99 ones" is not the same as "take 100 ones" :-) more thorough, when we compute "take 100 somelist", matching "somelist" to "x:xs" will DECOMPOSE first element of list. evaluating "x : take (n-1) xs" will COMPOSE new cell which contains links to "x" and unevaluated closure "take (n-1) xs". so, recursive evaluation of "take" will build new list - element-by-element, but this list will contain links to the same elements (represented as closures) as original list. even if you define take 0 xs = xs take n (x:xs) = x : take (n-1) xs so that "take 100 xs" will return just the same list, Haskell on evaluation of "take" will rebuild first 100 elements of list and only after this will share the list remainder (of couse, i don't count for possible smart compiler's optimization) give attention to that while "take" will rebuild structure of list, it will use links to list elements as is. this list can contain complex elements - for example, another lists, but links to this sublists (again represented with evaluated values or unevaluated closures) will be just copied to new list. for example, "take 3 [ones, [1..1000000], map (2*) ones, ones]" will only create 3 new cells which will share pointers to sublists with original list. it is possible because in Haskell data value never changed - it can be only represented as unevaluated or evaluted, so we can share subelements of datastructure without any risk to encounter a side behaviour next, "take 100 ones" will require memory for all 100 elements only when it is fully evaluated. but not all data necessarily goes to this state. for example, this data may be completely ignored in computation, or just a few elements will be evaluated as in "take 3 [1..100]". moreover, evaluated elements of this list may be immediately consumed without creating actual cells, as in "sum [1..100]". on the other side, Haskell compilers try to not re-evaluate datastructures and when your usage of variable don't goes into a "producer-consumer" pattern, evaluated closures are replaced with their values, so on the next use of same variable it will be already evaluated (to degree that was needed for previous usage). run the following program and see at delays between numbers printed: main = do let n = 12345678 let xs = [1, sum [1..n], 1, sum [1..n-1]] print$ sum (take 1 xs) print$ sum (take 2 xs) print$ sum (take 3 xs) print$ sum (take 4 xs) you can also see how many memmory this program require to run - under "GHC -O2" it allocated 2.7 gb in heap but all this memory was immediately reused so maximum memory usage will be only 5 kb so i can say that possibility to create cyclic datastructures in Haskell is basically the same as in C, Pascal or any other language having poiners/references/whatever. the real difference is what you do this not by ASSIGNMENTS but by declaration of structure itself without any problems with referring to structures that will be declared later: let a = 1:b b = 2:c c = 3:a powers_of_2 = 1 : map (2*) powers_of_2 fibonachi = 1 : 1 : zipWith (+) fibonachi (tail fibonachi) sharing of parts of datastructures are also possible (and easy) in imperative languages. but SAFE sharing is possible only in Haskell just because datastructures never change their values, so we never need to copy datastructure just to give the same value to another variable. as a result, it is widely used in function definitions and even in compiler optimisations hope this will help :) -- Best regards, Bulat mailto:bulatz@HotPOP.com