Performance of Knight's Tour

Hi! I'm learning Haskell, and now I'm trying to make a framework for solving search problems, such as the Knight's Tour. For small boards it answers instantly. For a 7x8 board it takes 23 seconds. For an 8x8 board, more than 30 minutes (it hasn't finished yet). Where is the root of the evil?

--program
module Main where

import Data.List
import Data.Array.Unboxed
import qualified Data.Array.IArray as IArr
import Data.Ix

data SResult = Good | GoodProcess | Process | Bad

data SDPSearch a p r = SDPSearch (a -> p -> [a])      --expand
                                 (p -> p)             --update
                                 (a -> p -> SResult)  --sort
                                 ([a] -> r)           --result

runSDPSearch :: SDPSearch a c b -> [a] -> c -> b
runSDPSearch (SDPSearch e u s r) list p = r (rec list params)
  where
    params = iterate u p
    rec [] _ = []
    rec (l:lp) pr@(n:np) = case s l n of
        Good        -> l : rec lp pr
        GoodProcess -> l : (rec (e l n) np) ++ (rec lp pr)
        Process     -> (rec (e l n) np) ++ (rec lp pr)
        Bad         -> rec lp pr

main = do
    (a, b) <- (break (== ' ')) `fmap` getLine
    print (knightTour (read a) (read b))

knightTour :: Int -> Int -> UArray (Int, Int) Int
knightTour a b = runSDPSearch (SDPSearch e u s r) [((1, 1), sArray)] 2
  where
    size   = a * b
    range  = ((1, 1), (a, b))
    sArray = listArray range (1 : (replicate (size - 1) 0))

    allTurns :: Array (Int, Int) [(Int, Int)]
    allTurns = IArr.listArray range [turns x y | x <- [1..a], y <- [1..b]]
      where
        shifts = [(1, 2), (1, -2), (2, 1), (2, -1), (-1, 2), (-1, -2), (-2, 1), (-2, -1)]
        turns x y = [(x+i, y+j) | (i, j) <- shifts, inRange range (x+i, y+j)]

    e ((x, y), arr) p = [(t, arr // [(t, p)]) | t <- changes]
      where
        changes = [t | t <- allTurns ! (x, y), arr ! t == 0]

    s el p | p == size = Good
           | otherwise = Process

    u = (+ 1)

    r l | not (null l) = snd (head l)
        | otherwise    = error "No solutions!"
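As an illustration of how the four components fit together, here is a minimal toy instantiation of the framework (a sketch only, assuming the SResult, SDPSearch, and runSDPSearch definitions above are in scope, e.g. pasted into the same module; bitStrings is a made-up name, not part of the original program). It enumerates all bit strings of a given length, with the parameter acting as a depth counter just as the move number does in knightTour:

bitStrings :: Int -> [[Int]]
bitStrings k = runSDPSearch (SDPSearch e u s r) [[]] 1
  where
    e bits _ = [bits ++ [0], bits ++ [1]]   -- expand: extend by one bit
    u        = (+ 1)                        -- update: bump the depth counter
    s _ p | p > k     = Good                -- long enough: keep, stop expanding
          | otherwise = Process             -- keep expanding, discard nothing
    r = id                                  -- result: return all finished strings

-- ghci> bitStrings 2
-- [[0,0],[0,1],[1,0],[1,1]]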

On Monday, 01 March 2010, 17:07:46, Artyom Kazak wrote:
Hi! I'm learning Haskell, and now I'm trying to make a framework for solving search problems, such as the Knight's Tour. For small boards it answers instantly. For a 7x8 board it takes 23 seconds. For an 8x8 board, more than 30 minutes (it hasn't finished yet). Where is the root of the evil?
In the algorithm. You investigate far too many dead ends. Since the number of dead ends increases fast for larger boards, larger boards take much, much longer. With one little change, I get

$ echo "59 59" | ./knights +RTS -s > /dev/null
./knights +RTS -s
      68,243,720 bytes allocated in the heap
       5,914,848 bytes copied during GC
      36,436,628 bytes maximum residency (6 sample(s))
       8,486,604 bytes maximum slop
              58 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0:   109 collections,     0 parallel,  0.03s,  0.03s elapsed
  Generation 1:     6 collections,     0 parallel,  0.02s,  0.02s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.05s  (  0.10s elapsed)
  GC    time    0.05s  (  0.05s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.10s  (  0.15s elapsed)

  %GC time      50.0%  (32.2% elapsed)

  Alloc rate    1,421,744,166 bytes per MUT second

  Productivity  50.0% of total user, 31.3% of total elapsed

For a reason I don't understand, if the second dimension is 60 and the first is > 18, it takes much longer,

$ echo "19 60" | ./knights +RTS -A8M -H64M -s > /dev/null
./knights +RTS -A8M -H64M -s
   2,374,198,988 bytes allocated in the heap
       1,873,412 bytes copied during GC
       5,611,132 bytes maximum residency (2 sample(s))
       4,934,352 bytes maximum slop
              70 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0:   281 collections,     0 parallel,  0.15s,  0.15s elapsed
  Generation 1:     2 collections,     0 parallel,  0.00s,  0.01s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.17s  (  1.21s elapsed)
  GC    time    0.15s  (  0.16s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    1.32s  (  1.37s elapsed)

  %GC time      11.2%  (11.6% elapsed)

  Alloc rate    2,032,579,317 bytes per MUT second

  Productivity  88.8% of total user, 85.5% of total elapsed

The magic change:

e ((x, y), arr) p = [(t, arr // [(t, p)]) | t <- changes]
  where
    legit ps = [t | t <- allTurns ! ps, arr ! t == 0]
    changes  = map snd $ sort [(length $ legit t, t) | t <- allTurns ! (x, y), arr ! t == 0]

investigate squares with fewer options first.
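The ordering used in this change is Warnsdorff's rule: try the candidate squares with the fewest onward continuations first. As a standalone sketch (warnsdorffOrder and its generalised type are made up here, not code from the thread), the heuristic amounts to sorting candidates by how many legal continuations each one still has:

import Data.List (sortBy)
import Data.Ord (comparing)

-- Order candidate squares so the ones with the fewest onward
-- continuations come first; 'onward' plays the role of 'legit' above.
warnsdorffOrder :: (sq -> [sq]) -> [sq] -> [sq]
warnsdorffOrder onward = sortBy (comparing (length . onward))

In the program above this would be applied as warnsdorffOrder legit (legit (x, y)). Note that the version in the message sorts (count, square) pairs, so ties are additionally broken by the square coordinates, whereas a stable sort on the count alone keeps the allTurns order for tied squares; the two orderings can therefore explore the search tree differently.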

2010/3/1 Daniel Fischer
In the algorithm. You investigate far too many dead ends. Since the number of dead ends increases fast for larger boards, larger boards take much, much longer. With one little change, I get ... For a reason I don't understand, if the second dimension is 60 and the first is > 18, it takes much longer, ... The magic change:
e ((x, y), arr) p = [(t, arr // [(t, p)]) | t <- changes]
  where
    legit ps = [t | t <- allTurns ! ps, arr ! t == 0]
    changes  = map snd $ sort [(length $ legit t, t) | t <- allTurns ! (x, y), arr ! t == 0]
investigate squares with fewer options first.
Wow! Thank you! By the way, I didn't notice the difference between (59, 59) and (60, 60) on my machine...

On Monday, 01 March 2010, 19:29:45, Artyom Kazak wrote:
2010/3/1 Daniel Fischer
In the algorithm. You investigate far too many dead ends. Since the number of dead ends increases fast for larger boards, larger boards take much, much longer. With one little change, I get ... For a reason I don't understand, if the second dimension is 60 and the first is > 18, it takes much longer, ... The magic change:
e ((x, y), arr) p = [(t, arr // [(t, p)]) | t <- changes]
  where
    legit ps = [t | t <- allTurns ! ps, arr ! t == 0]
    changes  = map snd $ sort [(length $ legit t, t) | t <- allTurns ! (x, y), arr ! t == 0]
investigate squares with fewer options first.
Wow! Thank you! By the way, I didn't notice the difference between (59, 59) and (60, 60) on my machine...
Strangely,

$ echo "62 59" | time ./knights > /dev/null
0.10user 0.08system 0:00.17elapsed 101%CPU
$ echo "65 59" | time ./knights > /dev/null
0.08user 0.07system 0:00.17elapsed 96%CPU

so it's a thing of the second dimension predominantly (the size plays a small role, too). As I said, I don't understand it, but looking at the allocation figures:

70*59:    97,970,072 bytes allocated in the heap
18*60:    12,230,296 bytes allocated in the heap
19*60: 2,374,148,320 bytes allocated in the heap
19*61:    13,139,688 bytes allocated in the heap
60*61:    71,771,324 bytes allocated in the heap
61*61:    72,965,428 bytes allocated in the heap

it seems that something is kicked out of the registers when the second dimension is 60 and the first > 18. Very strange.

2010/3/1 Daniel Fischer
On Monday, 01 March 2010, 19:29:45, Artyom Kazak wrote:
2010/3/1 Daniel Fischer
In the algorithm. You investigate far too many dead ends. Since the number of dead ends increases fast for larger boards, larger boards take much, much longer. With one little change, I get ... For a reason I don't understand, if the second dimension is 60 and the first is > 18, it takes much longer, ... The magic change:
e ((x, y), arr) p = [(t, arr // [(t, p)]) | t <- changes]
  where
    legit ps = [t | t <- allTurns ! ps, arr ! t == 0]
    changes  = map snd $ sort [(length $ legit t, t) | t <- allTurns ! (x, y), arr ! t == 0]
investigate squares with fewer options first.
Wow! Thank you! By the way, I didn't notice the difference between (59, 59) and (60, 60) on my machine...
Strangely,
$ echo "62 59" | time ./knights > /dev/null 0.10user 0.08system 0:00.17elapsed 101%CPU $ echo "65 59" | time ./knights > /dev/null 0.08user 0.07system 0:00.17elapsed 96%CPU
so it's a thing of the second dimension predominantly (the size plays a small role, too).
As I said, I don't understand it, but looking at the allocation figures:

70*59:    97,970,072 bytes allocated in the heap
18*60:    12,230,296 bytes allocated in the heap
19*60: 2,374,148,320 bytes allocated in the heap
19*61:    13,139,688 bytes allocated in the heap
60*61:    71,771,324 bytes allocated in the heap
61*61:    72,965,428 bytes allocated in the heap
it seems that something is kicked out of the registers when the second dimension is 60 and the first > 18.
Very strange.
Maybe we were compiling with different options? I compiled with -O2 -fvia-C -optc-O3. ... Oh, I know! I slightly changed the code:

import Data.Ord

e ((x, y), arr) p = [(t, arr // [(t, p)]) | t <- changes]
  where
    legit ps = [t | t <- allTurns ! ps, arr ! t == 0]
    changes  = sortOn (length . legit) (legit (x, y))
    sortOn   = sortBy . comparing

My version gives an answer for (60, 60) in one second. But if both dimensions are > 60, it won't finish. Yes, very strange.
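A small aside on this definition (an observation of mine, not from the thread): sortBy . comparing recomputes the key on every comparison, whereas the (count, square) version in the earlier message computes length . legit only once per candidate. A sketch of the decorate-sort-undecorate variant (sortOn' is a made-up name), which caches the key and keeps the stable ordering:

import Data.List (sortBy)
import Data.Ord (comparing)

-- Decorate-sort-undecorate: compute the sort key once per element,
-- sort on the cached key, then strip the key again.
sortOn' :: Ord b => (a -> b) -> [a] -> [a]
sortOn' f = map snd . sortBy (comparing fst) . map (\x -> (f x, x))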

On Monday, 01 March 2010, 21:40:16, Artyom Kazak wrote:
Maybe we were compiling with different options? I compiled with -O2 -fvia-C -optc-O3. ... Oh, I know! I slightly changed the code.
import Data.Ord
e ((x, y), arr) p = [(t, arr // [(t, p)]) | t <- changes]
  where
    legit ps = [t | t <- allTurns ! ps, arr ! t == 0]
    changes  = sortOn (length . legit) (legit (x, y))
    sortOn   = sortBy . comparing
Ah, that! I also tried that; it gets stuck for different values. With a little debugging output, I saw that it got stuck in a dead end, always advancing a few steps and then backtracking. I'm now also considering the grand-children; that speeds things up and enters fewer dead ends, but so far I haven't found a valuation which doesn't enter a dead end for some values. I have an idea, though: also consider the distance from the border, and try squares near the border first.
My version gives an answer for (60, 60) in one second. But if both dimensions are > 60, it won't finish. Yes, very strange.
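One way the border idea Daniel mentions might look in the program above (a sketch of my own, not code from the thread: borderDist is a made-up helper, and the ordering shown is only one possible valuation): keep Warnsdorff's count as the primary key and use the distance to the nearest edge as a tie-break, so that among equally constrained squares the ones close to the border are tried first.

-- distance from square (x, y) to the nearest edge of an a-by-b board
borderDist :: Int -> Int -> (Int, Int) -> Int
borderDist a b (x, y) = minimum [x - 1, a - x, y - 1, b - y]

-- drop-in replacement for e inside knightTour's where-block:
-- fewest onward moves first, squares nearer the border breaking ties
e ((x, y), arr) p = [(t, arr // [(t, p)]) | t <- changes]
  where
    legit ps = [t | t <- allTurns ! ps, arr ! t == 0]
    changes  = map snd $ sort
                 [ ((length (legit t), borderDist a b t), t)
                 | t <- legit (x, y) ]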
participants (2)
- Artyom Kazak
- Daniel Fischer