On 8/29/07, David Frey <dpfrey@shaw.ca> wrote:
Hello Haskellers,

I have been trying to learn a bit about Haskell by solving Project Euler
problems.  For those of you who don't know what Project Euler is, see
http://projecteuler.net

After solving problem 21, which is related to amicable pairs, I decided
to attempt problem 95 since it is an extension of the same topic.

The full description of problem 95 is here:
http://projecteuler.net/index.php?section=problems&id=95

This is the summary:
"Find the smallest member of the longest amicable chain with no element
exceeding one million."

I have attempted to solve this problem, but my solution is too resource
hungry to search the full set of 1000000 numbers.

I am hoping someone reading this list can suggest :
- How I can improve my algorithm
- An alternative algorithm that will be more efficient
- Ways to profile a Haskell program to help me understand why my
   implementation is performing poorly.

In addition to the question above, I would also be interested in
comments on the style of the code.  If there is a more idiomatic way to
write portions of the code, I would like to see what it is.

My solution is at the bottom of this e-mail.  The program will probably
run obscenely slow  or die from stack overflow unless you replace
[1..999999] with [1..somethingSmaller]  in main.

Thanks,
David Frey

Hi David,

Project Euler is a good way to learn some Haskell, although it does tend to give you a crash course in understanding laziness, efficiency and such in Haskell (whether that's good or bad depends on your point of view!).

All you need to do to stop your program from overflowing the stack is to change foldl to foldl' in the definition of main (foldl' is in Data.List so you'll also have to import that).  The problem with foldl is that since Haskell is lazy, instead of evaluating things as it goes along, it builds up a huge string of thunks (=unevaluated expressions) and doesn't even start evaluating anything until it reaches the end of the list, which can easily blow the stack.  foldl' is a strict version of foldl which forces evaluation to take place as it goes along.  For an excellent explanation of all of this, I suggest you read http://haskell.org/haskellwiki/Stack_overflow.

As soon as I replaced foldl with foldl' in your code, it no longer overflows the stack -- however, it's still quite slow!  I didn't wait long enough for it to finish when I ran it up to a million.  I don't have any advice on that front at the moment, perhaps someone else will come along with a suggestion or two.  In the meantime, you can poke around http://haskell.org/haskellwiki/Performance to see if it gives you any ideas.

Hope this helps,
-Brent