suggestions for optimizing sudoku solver

I set myself a learning task of writing a sudoku solver. (https://github.com/peterjoel/sudoku/blob/master/src/Sudoku.hs) It works pretty well, but struggles with some of the harder grids. e.g. data/hard4.txt and data/hard5.txt take around 18 seconds to solve. Obviously there's a limit, but I feel like I should be able to make this faster. I think the main issue is reading/writing to the cells in the grid, since (!!) is O(n). Even though it never has to go beyond index 8, it will add up over the millions of times it has to do it. I imagine it could be a lot faster if I use a static, non-list data structure, but I was hoping to keep it a bit more flexible. Also, I'm struggling to get started with performance profiling. Can someone point me to some good resources? Peter

From "Pearls of Functional Algorithm Design", Richard Bird, 2010
"Assuming about a half of the 81 entries are fixed initially (a
generous estimate), there are about 9^40 grids to check."
The main value of the book is that Bird shows how to design by
calculation (how equational reasoning) can be used to solve problems
and then the efficiency of the solution can be improved.
On Sun, Jan 1, 2012 at 7:27 PM, Peter Hall
I set myself a learning task of writing a sudoku solver. (https://github.com/peterjoel/sudoku/blob/master/src/Sudoku.hs) It works pretty well, but struggles with some of the harder grids. e.g. data/hard4.txt and data/hard5.txt take around 18 seconds to solve. Obviously there's a limit, but I feel like I should be able to make this faster.
I think the main issue is reading/writing to the cells in the grid, since (!!) is O(n). Even though it never has to go beyond index 8, it will add up over the millions of times it has to do it. I imagine it could be a lot faster if I use a static, non-list data structure, but I was hoping to keep it a bit more flexible.
Also, I'm struggling to get started with performance profiling. Can someone point me to some good resources?
Peter
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- -- Regards, KC

Bird begins with the specification (without regard to efficiency):
solve = filter valid . expand . choices
And ends up with this third version of solve
solve = search . choices
search m
| not (safe m) = []
| complete m' = [map (map head) m']
| otherwise = concat (map search (expand1 m'))
| where m' = prune m
Note: some functions are missing
The interesting idea is how he uses equational reasoning to reach this solve.
On Sun, Jan 1, 2012 at 7:27 PM, Peter Hall
I set myself a learning task of writing a sudoku solver. (https://github.com/peterjoel/sudoku/blob/master/src/Sudoku.hs) It works pretty well, but struggles with some of the harder grids. e.g. data/hard4.txt and data/hard5.txt take around 18 seconds to solve. Obviously there's a limit, but I feel like I should be able to make this faster.
I think the main issue is reading/writing to the cells in the grid, since (!!) is O(n). Even though it never has to go beyond index 8, it will add up over the millions of times it has to do it. I imagine it could be a lot faster if I use a static, non-list data structure, but I was hoping to keep it a bit more flexible.
Also, I'm struggling to get started with performance profiling. Can someone point me to some good resources?
Peter
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- -- Regards, KC

I was about to recommend Bird's solution as well.
An incredibly eye-opening pearl. I believe that he sticks with lists
as the data-structure the entire way through, so that should give an
indication that acceptable performance can be achieved without
requiring mutable data-structures, etc.
The only issue with this approach is that unless you're aware of other
invariants available you'll just end up copying Bird's solution...
I certainly wasn't aware of some of the invariants used in the paper,
and can't think of any others.
On Wed, Jan 4, 2012 at 11:01 AM, KC
Bird begins with the specification (without regard to efficiency):
solve = filter valid . expand . choices
And ends up with this third version of solve
solve = search . choices
search m | not (safe m) = [] | complete m' = [map (map head) m'] | otherwise = concat (map search (expand1 m')) | where m' = prune m
Note: some functions are missing
The interesting idea is how he uses equational reasoning to reach this solve.
On Sun, Jan 1, 2012 at 7:27 PM, Peter Hall
wrote: I set myself a learning task of writing a sudoku solver. (https://github.com/peterjoel/sudoku/blob/master/src/Sudoku.hs) It works pretty well, but struggles with some of the harder grids. e.g. data/hard4.txt and data/hard5.txt take around 18 seconds to solve. Obviously there's a limit, but I feel like I should be able to make this faster.
I think the main issue is reading/writing to the cells in the grid, since (!!) is O(n). Even though it never has to go beyond index 8, it will add up over the millions of times it has to do it. I imagine it could be a lot faster if I use a static, non-list data structure, but I was hoping to keep it a bit more flexible.
Also, I'm struggling to get started with performance profiling. Can someone point me to some good resources?
Peter
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- -- Regards, KC
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

I'm amazed at the invariants he comes up with for all the problems in the book.
At the end of Bird's solution:
"Final Remarks
We tested the solver on 36 puzzles recorded at the website
http://haskell.org/haskellwiki/Sudoku. It solved them in 8.8s (on a
1GHz Pentium 3 PC). We also tested them on six minimal puzzles (each
with 17 non-blank entries) chosen randomly from the 32 000 given at
the site. It solved them in 111.4s. There are about a dozen different
Haskell Sudoku solvers at the site. All of these, including a very
nice solver by Lennart Augustsson, deploy coordinate calculations.
Many use arrays and most use monads. Ours is about twice as slow as
Augustsson’s on the nefarious puzzle (a particularly hard puzzle with
the minimum 17 non-blank entries), but about 30 times faster than Yitz
Gale’s solver on easy puzzles. We also know of solvers that reduce the
problem to Boolean satisfiability, constraint satisfaction, model
checking and so on. I would argue that the one presented above is
certainly one of the simplest and shortest. And at least it was
derived, in part, by equational reasoning."
On Tue, Jan 3, 2012 at 7:25 PM, Lyndon Maydwell
I was about to recommend Bird's solution as well.
An incredibly eye-opening pearl. I believe that he sticks with lists as the data-structure the entire way through, so that should give an indication that acceptable performance can be achieved without requiring mutable data-structures, etc.
The only issue with this approach is that unless you're aware of other invariants available you'll just end up copying Bird's solution...
I certainly wasn't aware of some of the invariants used in the paper, and can't think of any others.
On Wed, Jan 4, 2012 at 11:01 AM, KC
wrote: Bird begins with the specification (without regard to efficiency):
solve = filter valid . expand . choices
And ends up with this third version of solve
solve = search . choices
search m | not (safe m) = [] | complete m' = [map (map head) m'] | otherwise = concat (map search (expand1 m')) | where m' = prune m
Note: some functions are missing
The interesting idea is how he uses equational reasoning to reach this solve.
On Sun, Jan 1, 2012 at 7:27 PM, Peter Hall
wrote: I set myself a learning task of writing a sudoku solver. (https://github.com/peterjoel/sudoku/blob/master/src/Sudoku.hs) It works pretty well, but struggles with some of the harder grids. e.g. data/hard4.txt and data/hard5.txt take around 18 seconds to solve. Obviously there's a limit, but I feel like I should be able to make this faster.
I think the main issue is reading/writing to the cells in the grid, since (!!) is O(n). Even though it never has to go beyond index 8, it will add up over the millions of times it has to do it. I imagine it could be a lot faster if I use a static, non-list data structure, but I was hoping to keep it a bit more flexible.
Also, I'm struggling to get started with performance profiling. Can someone point me to some good resources?
Peter
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- -- Regards, KC
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- -- Regards, KC

Thanks, I have that book coming any day now :)
In the meantime, I started again and tried to avoid accessing the full
List-of-Lists inside the algorithm and to make it more linear. So now,
I'm starting with a list of hints (ie solved cells) and a list of
unsolved cells, each with an index for row, column and block. Each
recursion attempts to move a cell from the unresolved list to the
hints list.
https://github.com/peterjoel/sudoku/blob/master/src/Sudoku/Solver2.hs
The result is actually really simple, and far faster than I had
before. There are some other functions wrapping it, but this is the
meat of it:
solveNext :: [((Int,Int,Int),Int)] -> [(Int,Int,Int)] -> Maybe
[((Int,Int,Int),Int)]
solveNext hints [] = Just hints
solveNext hints (r@(x,y,b):rem) = case catMaybes $ map try remaining of
[] -> Nothing
p:_ -> Just p
where try v = solveNext ((r,v):hints) rem
remaining = [1..9] \\ (map snd .
filter isNeighbour) hints
isNeighbour ((x',y',b'),_) = x==x' ||
y==y' || b==b'
For the two "hardest" grids (hard4.txt and hard5.txt) that were taking
my first solution 17-18 seconds to solve, it's now doing it in around
0.004 seconds, which is about 4000x faster! Strangely, some of the
easier ones are now taking slightly longer. For example this one:
https://github.com/peterjoel/sudoku/blob/master/data/hard2.txt , took
0.18s with Solver1, but 0.65s with Solver2. Still within reason
though.
Peter
On Wed, Jan 4, 2012 at 3:01 AM, KC
Bird begins with the specification (without regard to efficiency):
solve = filter valid . expand . choices
And ends up with this third version of solve
solve = search . choices
search m | not (safe m) = [] | complete m' = [map (map head) m'] | otherwise = concat (map search (expand1 m')) | where m' = prune m
Note: some functions are missing
The interesting idea is how he uses equational reasoning to reach this solve.
On Sun, Jan 1, 2012 at 7:27 PM, Peter Hall
wrote: I set myself a learning task of writing a sudoku solver. (https://github.com/peterjoel/sudoku/blob/master/src/Sudoku.hs) It works pretty well, but struggles with some of the harder grids. e.g. data/hard4.txt and data/hard5.txt take around 18 seconds to solve. Obviously there's a limit, but I feel like I should be able to make this faster.
I think the main issue is reading/writing to the cells in the grid, since (!!) is O(n). Even though it never has to go beyond index 8, it will add up over the millions of times it has to do it. I imagine it could be a lot faster if I use a static, non-list data structure, but I was hoping to keep it a bit more flexible.
Also, I'm struggling to get started with performance profiling. Can someone point me to some good resources?
Peter
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- -- Regards, KC
participants (3)
-
KC
-
Lyndon Maydwell
-
Peter Hall