what to do about excess memory usage

I finally got serious about learning Haskell and have started by trying to write as efficient and clear a program to solve the Code Jam 2013 "Fair and Square" problem as I can. (Since the qualifying round is long over, I have "cheated" some; I peeked at the analysis to see what I had missed when working on it by myself, and borrowed some very nice semilattice tree search code by Twan van Laarhoven and an integer square root routine.) You can see the result at http://pastebin.com/JuyaNLRp (apologies for what pastebin's Haskell source highlighting does when you have an identifier with an apostrophe). The result now processes the test input with values up to a googol in about 0.2 seconds on my not particularly high-end computer. I'd say that's as good as I can do and go on to the next problem, but profiling turns up that * it's spending 45% of its time garbage collecting. (ouch!) * one function, pairSum, is consuming a lot of RAM--about two megabytes at the end. It seems like a simple function. Here it is, along with most of the things it uses. **powersOfTen = map (10 ^) [0..] halfN = n `div` 2 shell = pair 0 memoPair = zipWith (+) powersOfTen (take halfN (reverse (take n powersOfTen))) pair i = memoPair !! i pairSum xs = foldl' (+) shell (map pair xs) (pair i gives the matching 1s in the top and bottom half of a palindrome that has only ones and zeroes; pairsum takes a list of positions of ones and generates the corresponding palindrome.) I suspect thunks are accumulating, but I don't know how to get rid of them. I've tried bang patterns and $!, to no avail. I still haven't wrapped my brain around seq. There are other things that should be improved about the code, but I think I can figure those out myself. I've leaned heavily on Chapter 25 of Real World Haskell to get the code to this point, but it's not getting through to me how to apply its techniques here. Any suggestions or pointers would be greatly appreciated.

First 2MB isn't a lot of RAM nowadays, do you mean 2GB or is that just compared to the rest of the program ? Second, your powersOfTen should probably be :
powersOfTen = iterate (10*) 1
Or maybe even a Vector (if you can guess the maximum value asked of it) or a MemoTrie (if you can't) since list indexing is slow as hell. That could help with memoPair which should definitely be a Vector and not a list. Good luck (on the other hand, maybe your program is already "good enough" and you could just switch to another project) -- Jedai

On 06/27/2013 11:23 AM, Chaddaï Fouché wrote:
First 2MB isn't a lot of RAM nowadays, do you mean 2GB or is that just compared to the rest of the program ?
It's a lot compared to the rest of the program... not to mention that I'm a fossil from the days of 8-bit microprocessors, so 2 MB seems like a lot of RAM to me. :)
Second, your powersOfTen should probably be :
powersOfTen = iterate (10*) 1
Or maybe even a Vector (if you can guess the maximum value asked of it) or a MemoTrie (if you can't) since list indexing is slow as hell. That could help with memoPair which should definitely be a Vector and not a list.
Thanks!
Good luck (on the other hand, maybe your program is already "good enough" and you could just switch to another project) -- Jedai
I do want to find a better way to keep the list of positions for ones around than a [Int], and I want to save them only as long as I need to, i.e. until I have both the 2 * k and 2 * k + 1 digit palindromes. Once that's done, I will move on. Thanks again!
participants (2)
-
Chaddaï Fouché
-
James Jones