
On Wednesday, 5 July 2006 15:50, Simon Marlow wrote:
Malcolm Wallace wrote:
Daniel Fischer wrote:
Cool, though the problem of exploding runtime remains; it's only pushed a little further. Now I get a 5x5 magic square in 1 s and a 6x6 in 5.4 s, but 7x7 segfaulted after about 2 1/2 hours - out of memory.
I note that your solution uses Arrays. I have recently discovered that the standard array implementations in GHC introduce non-linear performance profiles (with respect to the size of the array). None of the ordinary variations of arrays seemed to make any significant difference, but replacing Array with the new ByteString from fps brought my application's performance back down to the expected linear complexity.
Here are some figures, timings all in seconds:
dataset         size (Mb)   Array   ByteString
-------         ---------   -----   ----------
marschnerlobb       0.069    0.67         0.57
silicium            0.113    1.37         1.09
neghip              0.26     2.68         2.18
hydrogenAtom        2.10     31.6         17.6
lobster             5.46     137          49.3
engine              8.39     286          83.2
statueLeg          10.8      420          95.8
BostonTeapot       11.8      488          107
skull              16.7      924          152
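To illustrate the substitution Malcolm describes, here is a minimal, hypothetical sketch of reading a byte of a dataset through fps's ByteString instead of an Array (the `sample` data is made up; the real application's loading code is not shown in this thread):

```haskell
import qualified Data.ByteString as B

-- Hypothetical stand-in for a loaded dataset; B.pack builds a
-- ByteString from a list of Word8 values.
sample :: B.ByteString
sample = B.pack [10, 20, 30]

main :: IO ()
main = print (B.index sample 1)  -- O(1) indexing, prints 20
```

The point of the switch is not the indexing syntax but the representation: a ByteString is a single flat, pinned byte buffer, so it carries none of the per-element boxing or GC-scanning cost that the figures above attribute to Array.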
Mutable, boxed arrays have a linear GC overhead in GHC, unfortunately. This is partially fixed in GHC 6.6.
You can work around it by using either immutable or unboxed arrays, or both (if you already are, then something else is amiss, and I'd be interested in taking a look). However, I doubt that IOUArray would beat ByteString.
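A minimal sketch of the workaround Simon suggests, using an immutable unboxed array (the `grid` contents are a made-up example, not the magic-square code from this thread):

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))

-- An immutable, unboxed array: the Ints are stored flat in one heap
-- object, so the GC never scans per-element pointers, avoiding the
-- linear overhead of a mutable boxed array.
grid :: UArray Int Int
grid = listArray (0, 9) [0 .. 9]

main :: IO ()
main = print (sum [grid ! i | i <- [0 .. 9]])  -- prints 45
```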
The code uses UArray (Int,Int) Int, but I'm not convinced that using ByteString would make so much of a difference. It might reduce memory consumption (even significantly), but I think the problem is the algorithm. bestFirst produces a long list of partially filled squares (for a 7x7 square, the queue's length quickly rises above 100,000; 5x5 gives a queue of roughly 500 and 6x6 one of roughly 4,150), and it continually inserts new ones (though not far down the list), so the sheer amount of data the algorithm produces forbids all dreams of linear complexity. I don't know the algorithm's complexity; it's better than O((n^2)!), but from my measurements I gained the impression that it's worse than O(n!), so even with an optimal data representation and no overhead, I expect to run out of memory sooner rather than later. If anybody produces a 10x10 magic square with this algorithm (and whatever representation of the grid), I'll be very impressed (even 8x8 would be impressive, but 10x10 is beyond what I deem possible).
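The best-first loop Daniel describes can be sketched generically; this is a hypothetical reconstruction, not his actual code, with `expand` and `isComplete` standing in for the real magic-square operations and the frontier kept as a list sorted by score (lower is better), matching his remark that new states are inserted "not far down the list":

```haskell
-- Generic best-first search over a scored frontier.
bestFirst :: Ord score
          => (state -> [(score, state)])  -- expand a partial square
          -> (state -> Bool)              -- is the square complete?
          -> [(score, state)]             -- initial frontier
          -> Maybe state
bestFirst expand isComplete = go
  where
    go [] = Nothing
    go ((_, s) : rest)
      | isComplete s = Just s
      | otherwise    = go (foldr ins rest (expand s))
    -- Insert one scored state into the sorted frontier.  The frontier
    -- is never pruned, which is why its length explodes with n.
    ins x [] = [x]
    ins x@(k, _) (y@(k', _) : ys)
      | k <= k'   = x : y : ys
      | otherwise = y : ins x ys

-- Toy usage: reach 5 by steps of +1 or +2, scored by distance to 5.
main :: IO ()
main = print (bestFirst step (== 5) [(5, 0 :: Int)])  -- Just 5
  where
    step n = [(abs (5 - m), m) | m <- [n + 1, n + 2]]
```

The memory behaviour Daniel measures follows directly from the structure: every expansion adds states to the frontier and none are ever discarded, so the frontier's growth, not the grid representation, dominates.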
Cheers, Simon
Cheers, Daniel -- "In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton