Re: Comments from Brent Fulgham on Haskell and the shootout

John Meacham wrote:
On Tue, Jun 27, 2006 at 02:58:05PM +0100, Simon Marlow wrote:
.. (and jhc already generates native C code, so it will have at least one substantial advantage over GHC) ...
Compiling via C is a dead end. We realised this about 5 years ago, and yet we still haven't managed to shake off the C backend from GHC, but I remain hopeful that one day we will. /me heads for the cafe...
Out of curiosity, compiling via C as opposed to what? c--? Parrot? JVM?
C-- ultimately, but in the meantime GHC's built in native-code generator. Our NCG is quite reasonable, and we plan to push it forward in various ways to improve the generated code. There's a wiki page with some ideas: http://hackage.haskell.org/trac/ghc/wiki/BackEndNotes Cheers, Simon

We recently began considering another benchmark for the shootout, namely a Magic Square via best-first search. This is fairly inefficient, and we may need to shift to another approach due to the extremely large times required to find a solution for larger squares. I thought the Haskell community might be interested in the performance we have measured so far (see "http:// shootout.alioth.debian.org/sandbox/fulldata.php? test=magicsquares&p1=java-0&p2=javaclient-0&p3=ghc-0&p4=psyco-0" Interestingly, Java actually beats the tar out of GHC and Python for N=5x5 (and I assume higher, though this already takes on the order of 2 hours to solve on the benchmark machine). Memory use in GHC stays nice and low, but the time to find the result rapidly grows. I was hoping for an order of magnitude increase with each increase in N, but discovered that it is more like an exponential... Thanks, -Brent

Hello Brent, Sunday, July 2, 2006, 3:58:11 AM, you wrote:
We recently began considering another benchmark for the shootout, namely a Magic Square via best-first search. This is fairly
i've slightly beautified your printMatrix code: ..... where showMatrix n grid = join "\n" [ showRow y | y<-[1..n] ] where showRow y = join " " [ show $ grid!(x,y) | x<-[1..n] ] join filler pss = concat (intersperse filler pss)
inefficient, and we may need to shift to another approach due to the extremely large times required to find a solution for larger squares.
it's interesting to see one more compiler-dependent (as opposite to libraries-dependent) benchmark in shootout. It seems that the devil hides in the last function, possibleMoves. i tried to replace using of Data.Set with Data.Set.Enum by David F. Place, but got only 5% improvement. This procedure heavily uses lists and that is not the fastest data structure, especially in Haskell where lists are lazy. One possible solution may be to use lists of strict (and automatically unboxed) elements and/or lists that are strict in their links. Another possible solution will be to use unboxed arrays and implement all the required List routines for them. About the overall algorithm - it tends to recompute data that is almost not changed, such as list of already used numbers. It resembles me sudoku solvers that was discussed here several months ago - its highly possible that optimization tricks developed for this task will be appropriate to speed up magic squares too. -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Am Sonntag, 2. Juli 2006 01:58 schrieb Brent Fulgham:
We recently began considering another benchmark for the shootout, namely a Magic Square via best-first search. This is fairly inefficient, and we may need to shift to another approach due to the extremely large times required to find a solution for larger squares.
A slightly less naive approach to determining the possible moves dramatically reduces the effort, while Josh Goldfoot's code did not finish within 4 1/2 hours on my machine, a simple modification (see below) reduced runtime for N = 5 to 4.3 s, for N = 6 to 86.5 s. Unfortunately, the squares are now delivered in a different order, so my programme would probably be rejected :-(
I thought the Haskell community might be interested in the performance we have measured so far (see "http:// shootout.alioth.debian.org/sandbox/fulldata.php? test=magicsquares&p1=java-0&p2=javaclient-0&p3=ghc-0&p4=psyco-0"
Interestingly, Java actually beats the tar out of GHC and Python for N=5x5 (and I assume higher, though this already takes on the order of 2 hours to solve on the benchmark machine). Memory use in GHC stays nice and low, but the time to find the result rapidly grows.
I was hoping for an order of magnitude increase with each increase in N, but discovered that it is more like an exponential...
Thanks,
-Brent
Modified code, still best-first search: import Data.Array.Unboxed import Data.List import System.Environment (getArgs) main :: IO () main = getArgs >>= return . read . head >>= msquare msquare :: Int -> IO () msquare n = let mn = (n*(n*n+1)) `quot` 2 grd = listArray ((1,1),(n,n)) (repeat 0) unus = [1 .. n*n] ff = findFewestMoves n mn grd unus ini = Square grd unus ff (2*n*n) allSquares = bestFirst (successorNodes n mn) [ini] in putStrLn $ showGrid n . grid $ head allSquares data Square = Square { grid :: UArray (Int,Int) Int , unused :: [Int] , ffm :: ([Int], Int, Int, Int) , priority :: !Int } deriving Eq instance Ord Square where compare (Square g1 _ _ p1) (Square g2 _ _ p2) = case compare p1 p2 of EQ -> compare g1 g2 ot -> ot showMat :: [[Int]] -> ShowS showMat lns = foldr1 ((.) . (. showChar '\n')) $ showLns where showLns = map (foldr1 ((.) . (. showChar ' ')) . map shows) lns showGrid :: Int -> UArray (Int,Int) Int -> String showGrid n g = showMat [[g ! (r,c) | c <- [1 .. n]] | r <- [1 .. n]] "" bestFirst :: (Square -> [Square]) -> [Square] -> [Square] bestFirst _ [] = [] bestFirst successors (front:queue) | priority front == 0 = front : bestFirst successors queue | otherwise = bestFirst successors $ foldr insert queue (successors front) successorNodes n mn sq = map (place sq n mn (r,c)) possibilities where (possibilities,_,r,c) = ffm sq place :: Square -> Int -> Int -> (Int,Int) -> Int -> Square place (Square grd unus _ _) n mn (r,c) k = Square grd' uns moveChoices p where grd' = grd//[((r,c),k)] moveChoices@(_,len,_,_) = findFewestMoves n mn grd' uns uns = delete k unus p = length uns + len findFewestMoves n mn grid unus | null unus = ([],0,0,0) | otherwise = (movelist, length movelist, mr, mc) where openSquares = [(r,c) | r <- [1 .. n], c <- [1 .. n], grid ! (r,c) == 0] pm = possibleMoves n mn grid unus openMap = map (\(x,y) -> (pm x y,x,y)) openSquares mycompare (a,_,_) (b,_,_) = compare (length a) (length b) (movelist,mr,mc) = minimumBy mycompare openMap possibleMoves n mn grid unus r c | grid ! (r,c) /= 0 = [] | otherwise = intersect [mi .. ma] unus -- this is the difference that -- does it: better bounds where cellGroups | r == c && r + c == n + 1 = [d1, d2, theRow, theCol] | r == c = [d1, theRow, theCol] | r + c == n + 1 = [d2, theRow, theCol] | otherwise = [theRow, theCol] d1 = diag1 grid n d2 = diag2 grid n theRow = gridRow grid n r theCol = gridCol grid n c lows = scanl (+) 0 unus higs = scanl (+) 0 $ reverse unus rge cg = let k = count0s cg - 1 lft = mn - sum cg in (lft - (higs!!k),lft - (lows!!k)) (mi,ma) = foldr1 mima $ map rge cellGroups mima (a,b) (c,d) = (max a c, min b d) gridRow grid n r = [grid ! (r,i) | i <- [1 .. n]] gridCol grid n c = [grid ! (i,c) | i <- [1 .. n]] diag1 grid n = [grid ! (i,i) | i <- [1 .. n]] diag2 grid n = [grid ! (i,n+1-i) | i <- [1 .. n]] count0s = length . filter (== 0) Cheers, Daniel -- "In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton

Perhaps you could post a new entry page on our shootout wiki? http://www.haskell.org/hawiki/ShootoutEntry This makes it easier for people to keep contributing. Cheers, Don daniel.is.fischer:
Am Sonntag, 2. Juli 2006 01:58 schrieb Brent Fulgham:
We recently began considering another benchmark for the shootout, namely a Magic Square via best-first search. This is fairly inefficient, and we may need to shift to another approach due to the extremely large times required to find a solution for larger squares.
A slightly less naive approach to determining the possible moves dramatically reduces the effort, while Josh Goldfoot's code did not finish within 4 1/2 hours on my machine, a simple modification (see below) reduced runtime for N = 5 to 4.3 s, for N = 6 to 86.5 s. Unfortunately, the squares are now delivered in a different order, so my programme would probably be rejected :-(
I thought the Haskell community might be interested in the performance we have measured so far (see "http:// shootout.alioth.debian.org/sandbox/fulldata.php? test=magicsquares&p1=java-0&p2=javaclient-0&p3=ghc-0&p4=psyco-0"
Interestingly, Java actually beats the tar out of GHC and Python for N=5x5 (and I assume higher, though this already takes on the order of 2 hours to solve on the benchmark machine). Memory use in GHC stays nice and low, but the time to find the result rapidly grows.
I was hoping for an order of magnitude increase with each increase in N, but discovered that it is more like an exponential...
Thanks,
-Brent
Modified code, still best-first search:
import Data.Array.Unboxed import Data.List import System.Environment (getArgs)
main :: IO () main = getArgs >>= return . read . head >>= msquare
msquare :: Int -> IO () msquare n = let mn = (n*(n*n+1)) `quot` 2 grd = listArray ((1,1),(n,n)) (repeat 0) unus = [1 .. n*n] ff = findFewestMoves n mn grd unus ini = Square grd unus ff (2*n*n) allSquares = bestFirst (successorNodes n mn) [ini] in putStrLn $ showGrid n . grid $ head allSquares
data Square = Square { grid :: UArray (Int,Int) Int , unused :: [Int] , ffm :: ([Int], Int, Int, Int) , priority :: !Int } deriving Eq
instance Ord Square where compare (Square g1 _ _ p1) (Square g2 _ _ p2) = case compare p1 p2 of EQ -> compare g1 g2 ot -> ot
showMat :: [[Int]] -> ShowS showMat lns = foldr1 ((.) . (. showChar '\n')) $ showLns where showLns = map (foldr1 ((.) . (. showChar ' ')) . map shows) lns
showGrid :: Int -> UArray (Int,Int) Int -> String showGrid n g = showMat [[g ! (r,c) | c <- [1 .. n]] | r <- [1 .. n]] ""
bestFirst :: (Square -> [Square]) -> [Square] -> [Square] bestFirst _ [] = [] bestFirst successors (front:queue) | priority front == 0 = front : bestFirst successors queue | otherwise = bestFirst successors $ foldr insert queue (successors front)
successorNodes n mn sq = map (place sq n mn (r,c)) possibilities where (possibilities,_,r,c) = ffm sq
place :: Square -> Int -> Int -> (Int,Int) -> Int -> Square place (Square grd unus _ _) n mn (r,c) k = Square grd' uns moveChoices p where grd' = grd//[((r,c),k)] moveChoices@(_,len,_,_) = findFewestMoves n mn grd' uns uns = delete k unus p = length uns + len
findFewestMoves n mn grid unus | null unus = ([],0,0,0) | otherwise = (movelist, length movelist, mr, mc) where openSquares = [(r,c) | r <- [1 .. n], c <- [1 .. n], grid ! (r,c) == 0] pm = possibleMoves n mn grid unus openMap = map (\(x,y) -> (pm x y,x,y)) openSquares mycompare (a,_,_) (b,_,_) = compare (length a) (length b) (movelist,mr,mc) = minimumBy mycompare openMap
possibleMoves n mn grid unus r c | grid ! (r,c) /= 0 = [] | otherwise = intersect [mi .. ma] unus -- this is the difference that -- does it: better bounds where cellGroups | r == c && r + c == n + 1 = [d1, d2, theRow, theCol] | r == c = [d1, theRow, theCol] | r + c == n + 1 = [d2, theRow, theCol] | otherwise = [theRow, theCol] d1 = diag1 grid n d2 = diag2 grid n theRow = gridRow grid n r theCol = gridCol grid n c lows = scanl (+) 0 unus higs = scanl (+) 0 $ reverse unus rge cg = let k = count0s cg - 1 lft = mn - sum cg in (lft - (higs!!k),lft - (lows!!k)) (mi,ma) = foldr1 mima $ map rge cellGroups mima (a,b) (c,d) = (max a c, min b d)
gridRow grid n r = [grid ! (r,i) | i <- [1 .. n]] gridCol grid n c = [grid ! (i,c) | i <- [1 .. n]] diag1 grid n = [grid ! (i,i) | i <- [1 .. n]] diag2 grid n = [grid ! (i,n+1-i) | i <- [1 .. n]] count0s = length . filter (== 0)
Cheers, Daniel
--
"In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi, I have now tuned Josh Goldfoot's code without changing the order in which the magic squares are produced, for a 5x5 magic square, my machine took about 1 1/2 hours and used 2Mb memory (considering that the original code did not finish within 4 1/2 hours here, that should push time on the benchmarking machine under 3000s and put us in the lead, I hope). However, with the improved bounds for the possibilities, I can now get a 5x5 square in 1s, a 6x6 square in 5.5s (replacing intersect by takeWhile & dropWhile), so it's still sloooowwww. Brent, can I informally submit the code thus, or what formalities would I have to perform to submit my code? --------------------------------------------------------------------------------------------------- {- The Computer Language Shootout http://shootout.alioth.debian.org/ benchmark implementation contributed by Josh Goldfoot modified by Daniel Fischer to improve performance -} {- An implementation using Data.Graph would be much faster. This implementation is designed to demonstrate the benchmark algorithm. -} import Data.Array import Data.List import System (getArgs) main = do n <- getArgs >>= return . read . head let mn = (n * (1 + n * n)) `div` 2 -- the magic number initialNode = makeSquare n mn (listArray ((1,1),(n,n)) (repeat 0), [1 .. n^2]) allSquares = bestFirst (successorNodes n mn) (initialNode:[]) putStrLn $ printMatrix n $ grid $ head allSquares where printMatrix n grid = unlines [ (rowlist grid n y) | y <- [1..n]] where rowlist grid n y = unwords [show $ grid ! (x,y) | x <- [1..n]] data Square = Square { grid :: Array (Int,Int) Int, ffm :: ([Int], Int, Int), unused :: [Int], priority :: !Int } {- bestFirst: Given a queue with one initial node and a function, successors, that takes a node and returns a list of nodes that are created by making all possible moves in a single cell, implements the Best-First algorithm, and returns a list of all nodes that end up with priority zero. In this implementation we only ever use the first node. -} bestFirst _ [] = [] bestFirst successors (frontnode:priorityq) | priority frontnode == 0 = frontnode:bestFirst successors priorityq | otherwise = bestFirst successors $ foldr (insertBy compSquare) priorityq (successors frontnode) where {- The priority queue is sorted first by the node's calculated priority; then, if the priorities are equal, by whichever node has the lowest numbers in the top-left of the array (or the next cell over, and so on). -} compSquare a b = case compare (priority a) (priority b) of EQ -> compare (grid a) (grid b) ot -> ot {- successorNodes: Find the cell with the fewest possible moves left, and then creates a new node for each possible move in that cell. -} successorNodes n mn squarenode = map (makeSquare n mn) [(thegrid//[((x, y), i)], delete i un) | i <- possibilities] where thegrid = grid squarenode un = unused squarenode (possibilities, x, y) = ffm squarenode {- makeSquare: Creates a node for the priority queue. In the process, this calculates the cell with the fewest possible moves, and also calculates this node's priority. The priority function is: (number of zeros in the grid) plus (number of possible moves in the cell with the fewest possible moves) the lower the priority, the sooner the node will be popped from the queue. -} makeSquare n mn (thegrid,un) = Square { grid = thegrid, ffm = moveChoices, unused = un, priority = calcPriority } where moveChoices@(poss,_,_) = findFewestMoves n mn thegrid un calcPriority = length un + length poss {- findFewestMoves: Go through the grid (starting at the top-left, and moving right and down), checking all 0 cells to find the cell with the fewest possible moves. -} findFewestMoves n mn grid un | null un = ([],0,0) | otherwise = (movelist, mx, my) where openSquares = [ (x,y) | y <- [1..n], x <- [1..n], (grid ! (x,y)) == 0] pm = possibleMoves n mn grid un openMap = map (\(x,y) -> (pm (x,y), (x,y))) openSquares mycompare f g = compare ((length . fst) f) ((length . fst) g) (movelist, (mx, my)) = minimumBy mycompare openMap {- possibleMoves: Return all moves that can go in the cell x,y for a given grid. A move is possible if the move (number) is not already in the grid, and if, after making that move, it is still possible to satisfy the magic square conditions (all rows, columns, diagonals adding up to mn, the magic number) -} possibleMoves n mn grid un (x,y) | grid ! (x,y) /= 0 = [] | null oneZeroGroups = takeWhile (<= highest) un -- [1 .. highest] `intersect` un | otherwise = case onePossible of [p] | p `elem` un -> [p] _ -> [] where cellGroups | x + y == n + 1 && x == y = [diag1 grid n, diag2 grid n, theRow, theCol] | x == y = [diag1 grid n, theRow, theCol] | x + y == n + 1 = [diag2 grid n, theRow, theCol ] | otherwise = [theRow, theCol] theRow = gridRow grid n x y theCol = gridCol grid n x y oneZeroGroups = filter (\x -> count 0 x == 1) cellGroups onePossible = nub ( [mn - (sum g) | g <- oneZeroGroups ] ) highest = minimum ( (n*n):[mn - (sum g) | g <- cellGroups] ) {- Utility functions to extract a single row, column, or diagonal. -} gridRow grid n _ y = [grid ! (xx, y) | xx <- [1..n]] gridCol grid n x _ = [grid ! (x, yy) | yy <- [1..n]] diag1 grid n = [grid ! (i, i) | i <- [1..n]] diag2 grid n = [grid ! (i, n - i + 1) | i <- [1..n]] {- Returns the number of times n appears n list xs -} count n xs = length $ filter ((==) n) xs --------------------------------------------------------------------------------------------------- Am Sonntag, 2. Juli 2006 01:58 schrieb Brent Fulgham:
We recently began considering another benchmark for the shootout, namely a Magic Square via best-first search. This is fairly inefficient, and we may need to shift to another approach due to the extremely large times required to find a solution for larger squares.
I thought the Haskell community might be interested in the performance we have measured so far (see "http:// shootout.alioth.debian.org/sandbox/fulldata.php? test=magicsquares&p1=java-0&p2=javaclient-0&p3=ghc-0&p4=psyco-0"
Interestingly, Java actually beats the tar out of GHC and Python for N=5x5 (and I assume higher, though this already takes on the order of 2 hours to solve on the benchmark machine). Memory use in GHC stays nice and low, but the time to find the result rapidly grows.
I was hoping for an order of magnitude increase with each increase in N, but discovered that it is more like an exponential...
Thanks,
-Brent
Cheers, Daniel -- "In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Daniel,
I have now tuned Josh Goldfoot's code without changing the order in which the magic squares are produced, for a 5x5 magic square, my machine took about 1 1/2 hours and used 2Mb memory (considering that the original code did not finish within 4 1/2 hours here, that should push time on the benchmarking machine under 3000s and put us in the lead, I hope).
Thanks for your efforts on this project. I'm actually more interested in using your earlier solution, since it is so much faster. Right now, the magic square code rises in runtime from 1.5 seconds to 4 hours with an increase of 1 in the square's dimension. I would much rather use a technique that had a more linear (or even exponential) increase! I would propose modifying the other entries (since there are only a handful) to match the output of your original solution. What do you think? - -Brent -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (Darwin) iD8DBQFEqpVmzGDdrzfvUpURAkPpAJ9oKTwzmUyTAoA6yQdOo7APKnXCqACghJEV id5EqEyVKrvSlJlLH9JZTN0= =jNXB -----END PGP SIGNATURE-----

Am Dienstag, 4. Juli 2006 18:20 schrieben Sie:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Daniel,
I have now tuned Josh Goldfoot's code without changing the order in which the magic squares are produced, for a 5x5 magic square, my machine took about 1 1/2 hours and used 2Mb memory (considering that the original code did not finish within 4 1/2 hours here, that should push time on the benchmarking machine under 3000s and put us in the lead, I hope).
Thanks for your efforts on this project. I'm actually more interested in using your earlier solution, since it is so much faster. Right now, the magic square code rises in runtime from 1.5 seconds to 4 hours with an increase of 1 in the square's dimension. I would much rather use a technique that had a more linear (or even exponential) increase!
I would propose modifying the other entries (since there are only a handful) to match the output of your original solution.
What do you think?
Cool, though the problem of exploding runtime remains, it's only pushed a little further. Now I get a 5x5 magig square in 1 s, a 6x6 in 5.4 s, but 7x7 segfaulted after about 2 1/2 hours - out of memory, I believe. And, as mentioned in passing, using 'intersect' in the first version is slowing things down, so here is my currently fastest (undoubtedly, the experts could still make it faster by clever unboxing): import Data.Array.Unboxed import Data.List import System.Environment (getArgs) main :: IO () main = getArgs >>= return . read . head >>= msquare msquare :: Int -> IO () msquare n = let mn = (n*(n*n+1)) `quot` 2 grd = listArray ((1,1),(n,n)) (repeat 0) unus = [1 .. n*n] ff = findFewestMoves n mn grd unus ini = Square grd unus ff (2*n*n) allSquares = bestFirst (successorNodes n mn) [ini] in putStrLn $ showGrid n . grid $ head allSquares data Square = Square { grid :: UArray (Int,Int) Int , unused :: [Int] , ffm :: ([Int], Int, Int, Int) , priority :: !Int } deriving Eq instance Ord Square where compare (Square g1 _ _ p1) (Square g2 _ _ p2) = case compare p1 p2 of EQ -> compare g1 g2 ot -> ot showMat :: [[Int]] -> ShowS showMat lns = foldr1 ((.) . (. showChar '\n')) $ showLns where showLns = map (foldr1 ((.) . (. showChar ' ')) . map shows) lns showGrid :: Int -> UArray (Int,Int) Int -> String showGrid n g = showMat [[g ! (r,c) | c <- [1 .. n]] | r <- [1 .. n]] "" bestFirst :: (Square -> [Square]) -> [Square] -> [Square] bestFirst _ [] = [] bestFirst successors (front:queue) | priority front == 0 = front : bestFirst successors queue | otherwise = bestFirst successors $ foldr insert queue (successors front) successorNodes n mn sq = map (place sq n mn (r,c)) possibilities where (possibilities,_,r,c) = ffm sq place :: Square -> Int -> Int -> (Int,Int) -> Int -> Square place (Square grd unus _ _) n mn (r,c) k = Square grd' uns moveChoices p where grd' = grd//[((r,c),k)] moveChoices@(_,len,_,_) = findFewestMoves n mn grd' uns uns = delete k unus p = length uns + len findFewestMoves :: Int -> Int -> UArray (Int,Int) Int -> [Int] -> ([Int],Int,Int,Int) findFewestMoves n mn grid unus | null unus = ([],0,0,0) | otherwise = (movelist, length movelist, mr, mc) where openSquares = [(r,c) | r <- [1 .. n], c <- [1 .. n], grid ! (r,c) == 0] pm = possibleMoves n mn grid unus openMap = map (\(x,y) -> (pm x y,x,y)) openSquares mycompare (a,_,_) (b,_,_) = compare (length a) (length b) (movelist,mr,mc) = minimumBy mycompare openMap possibleMoves :: Int -> Int -> UArray (Int,Int) Int -> [Int] -> Int -> Int -> [Int] possibleMoves n mn grid unus r c | grid ! (r,c) /= 0 = [] | otherwise = takeWhile (<= ma) $ dropWhile (< mi) unus where cellGroups | r == c && r + c == n + 1 = [d1, d2, theRow, theCol] | r == c = [d1, theRow, theCol] | r + c == n + 1 = [d2, theRow, theCol] | otherwise = [theRow, theCol] d1 = diag1 grid n d2 = diag2 grid n theRow = gridRow grid n r theCol = gridCol grid n c lows = scanl (+) 0 unus higs = scanl (+) 0 $ reverse unus rge :: [Int] -> (Int,Int) rge cg = let k = count0s cg - 1 lft = mn - sum cg in (lft - (higs!!k),lft - (lows!!k)) (mi,ma) = foldr1 mima $ map rge cellGroups mima (a,b) (c,d) = (max a c, min b d) gridRow, gridCol :: UArray (Int,Int) Int -> Int -> Int -> [Int] diag1, diag2 :: UArray (Int,Int) Int -> Int -> [Int] gridRow grid n r = [grid ! (r,i) | i <- [1 .. n]] gridCol grid n c = [grid ! (i,c) | i <- [1 .. n]] diag1 grid n = [grid ! (i,i) | i <- [1 .. n]] diag2 grid n = [grid ! (i,n+1-i) | i <- [1 .. n]] count0s :: [Int] -> Int count0s = length . filter (== 0)
- -Brent -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (Darwin)
iD8DBQFEqpVmzGDdrzfvUpURAkPpAJ9oKTwzmUyTAoA6yQdOo7APKnXCqACghJEV id5EqEyVKrvSlJlLH9JZTN0= =jNXB -----END PGP SIGNATURE-----
Cheers, Daniel -- "In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jul 4, 2006, at 5:20 PM, Daniel Fischer wrote:
I would propose modifying the other entries (since there are only a handful) to match the output of your original solution.
What do you think?
Cool, though the problem of exploding runtime remains, it's only pushed a little further. Now I get a 5x5 magig square in 1 s, a 6x6 in 5.4 s, but 7x7 segfaulted after about 2 1/2 hours - out of memory, I believe.
Hrm. Well, I still prefer the growth of search space in your version over the original, since it *was* going from 0.01 s (3x3) to 0.10 (4x4) to 4 hours (5x5). Going 1s->5.4s ->x hours is at least a bit more controlled..... I wonder if anyone can propose a slightly smaller problem, or a better algorithm? Thanks, - -Brent -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (Darwin) iD8DBQFEq0b7zGDdrzfvUpURAumWAJ4itR4eayB3mj5hYEtxbK630mF4IgCeO3PA qFF7cLTW4xk36J/nQOON+F4= =C1xL -----END PGP SIGNATURE-----

Daniel Fischer
Cool, though the problem of exploding runtime remains, it's only pushed a little further. Now I get a 5x5 magig square in 1 s, a 6x6 in 5.4 s, but 7x7 segfaulted after about 2 1/2 hours - out of memory,
I note that your solution uses Arrays. I have recently discovered that the standard array implementations in GHC introduce non-linear performance profiles (wrt to the size of the array). None of the ordinary variations of arrays seemed to make any significant difference, but replacing Array with the new ByteString from fps brought my application's performance back down to the expected linear complexity. Here are some figures, timings all in seconds: dataset size (Mb) Array ByteString ------ ---- ----- ---------- marschnerlobb 0.069 0.67 0.57 silicium 0.113 1.37 1.09 neghip 0.26 2.68 2.18 hydrogenAtom 2.10 31.6 17.6 lobster 5.46 137 49.3 engine 8.39 286 83.2 statueLeg 10.8 420 95.8 BostonTeapot 11.8 488 107 skull 16.7 924 152 Regards, Malcolm

Hello Malcolm, Wednesday, July 5, 2006, 4:30:43 PM, you wrote:
I note that your solution uses Arrays. I have recently discovered that the standard array implementations in GHC introduce non-linear performance profiles (wrt to the size of the array). None of the ordinary variations of arrays seemed to make any significant difference, but replacing Array with the new ByteString from fps brought my application's performance back down to the expected linear complexity.
are you give a chance to UArray? boxed arrays in ghc 6.4 may sufficiently increase GC times -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Malcolm Wallace wrote:
Daniel Fischer
wrote: Cool, though the problem of exploding runtime remains, it's only pushed a little further. Now I get a 5x5 magig square in 1 s, a 6x6 in 5.4 s, but 7x7 segfaulted after about 2 1/2 hours - out of memory,
I note that your solution uses Arrays. I have recently discovered that the standard array implementations in GHC introduce non-linear performance profiles (wrt to the size of the array). None of the ordinary variations of arrays seemed to make any significant difference, but replacing Array with the new ByteString from fps brought my application's performance back down to the expected linear complexity.
Here are some figures, timings all in seconds:
dataset size (Mb) Array ByteString ------ ---- ----- ---------- marschnerlobb 0.069 0.67 0.57 silicium 0.113 1.37 1.09 neghip 0.26 2.68 2.18 hydrogenAtom 2.10 31.6 17.6 lobster 5.46 137 49.3 engine 8.39 286 83.2 statueLeg 10.8 420 95.8 BostonTeapot 11.8 488 107 skull 16.7 924 152
Mutable, boxed arrays in GHC have a linear GC overhead in GHC unfortunately. This is partially fixed in GHC 6.6. You can work around it by using either immutable or unboxed arrays, or both (if you already are, then something else is amiss, and I'd be interested in taking a look). However, I doubt that IOUArray would beat ByteString. Cheers, Simon

Hello Simon, Wednesday, July 5, 2006, 5:50:58 PM, you wrote:
Mutable, boxed arrays in GHC have a linear GC overhead in GHC unfortunately. This is partially fixed in GHC 6.6.
You can work around it by using either immutable or unboxed arrays, or
immutable boxed array can still have large overhead because they are created via mutable ones -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Am Mittwoch, 5. Juli 2006 15:50 schrieb Simon Marlow:
Malcolm Wallace wrote:
Daniel Fischer
wrote: Cool, though the problem of exploding runtime remains, it's only pushed a little further. Now I get a 5x5 magig square in 1 s, a 6x6 in 5.4 s, but 7x7 segfaulted after about 2 1/2 hours - out of memory,
I note that your solution uses Arrays. I have recently discovered that the standard array implementations in GHC introduce non-linear performance profiles (wrt to the size of the array). None of the ordinary variations of arrays seemed to make any significant difference, but replacing Array with the new ByteString from fps brought my application's performance back down to the expected linear complexity.
Here are some figures, timings all in seconds:
dataset size (Mb) Array ByteString ------ ---- ----- ---------- marschnerlobb 0.069 0.67 0.57 silicium 0.113 1.37 1.09 neghip 0.26 2.68 2.18 hydrogenAtom 2.10 31.6 17.6 lobster 5.46 137 49.3 engine 8.39 286 83.2 statueLeg 10.8 420 95.8 BostonTeapot 11.8 488 107 skull 16.7 924 152
Mutable, boxed arrays in GHC have a linear GC overhead in GHC unfortunately. This is partially fixed in GHC 6.6.
You can work around it by using either immutable or unboxed arrays, or both (if you already are, then something else is amiss, and I'd be interested in taking a look). However, I doubt that IOUArray would beat ByteString.
The code uses UArray (Int,Int) Int, but I'm not convinced that using ByteString would make so much of a difference, it might reduce memory consumption (even significantly), but I think, the problem is the algorithm. bestFirst produces a long list of partially filled squares (for a 7x7 square, the queue's length rises quickly to over 100,000; 5x5 gives a ~500 long queue and 6x6 a ~4,150 queue) and it continually inserts new ones (though not far down the list), so the sheer amount of data the algorithm produces will forbid all dreams of linear complexity. I don't know the algorithm's complexity, it's better than O( (n^2)! ), but from my measurements I gained the impression it's worse than O( n! ), so even with optimal data representation and no overhead, I expect memory consumption rather sooner than later. If anybody produces a 10x10 magic square with this algorithm (and whatever representation of the grid), I'll be very impressed (even 8x8 will be impressive, but 10x10 is beyond what I deem possible).
Cheers, Simon
Cheers, Daniel -- "In My Egotistical Opinion, most people's C programs should be indented six feet downward and covered with dirt." -- Blair P. Houghton
participants (6)
-
Brent Fulgham
-
Bulat Ziganshin
-
Daniel Fischer
-
dons@cse.unsw.edu.au
-
Malcolm Wallace
-
Simon Marlow