
Hello, After reading Peter Norvig's take on writing a Sudoku solver (http:// norvig.com/sudoku.html) I decided that I would port his program to Haskell, without changing the algorithm, that'll make a nice exercise I thought and should be fairly easy... Boy, was I wrong ! Anyway, I eventually managed to tiptoe around for loops, mutable state, etc... However, when I run my program against the test data provided (http:// norvig.com/top95.txt), I find it takes around 1m20 s to complete (compiled with -fvia-C and - O2, on a MacBook Pro 2.33GHz Intel Core 2 Duo). That's roughly 8 times longer than Norvig's Python script. That's not what I expected ! My program is also longer than the Python version. Being a beginner, I am convinced my implementation is super naive and non idiomatic. A seasonned Haskeller would do much shorter and much faster. I don't know how to improve it though ! Should I introduce more strictness ? replace lists with more efficient data structures (ByteStrings, Arrays) ? Here is my program, and part of the profiling (memory allocation looks huge !) I hope this post wasn't too long. Thanks for any advice ! Emmanuel. {- This is an attempt to implement in Haskell, Peter Norvig's sudoku solver : "Solving Every Sudoku Puzzle" (http://norvig.com/sudoku.html) In Norvig's program, methods which change a grid return either a new grid, either False (failure). Here I use Maybe, and return Just grid or Nothing in case of failure -} module Main where import Prelude hiding (lookup) import Data.List hiding (lookup) import qualified Data.Map as M import Control.Monad import Maybe import System.IO -------------------------------------------------- -- Types type Digit = Char type Square = String type Unit = [Square] -- We represent our grid as a Map type Grid = M.Map Square [Digit] -------------------------------------------------- -- Setting Up the Problem rows = "ABCDEFGHI" cols = "123456789" digits = "123456789" cross :: String -> String -> [String] cross rows cols = [ r:c:[] | r <- rows, c <- cols ] squares :: [Square] squares = cross rows cols -- ["A1","A2","A3",...] unitlist :: [Unit] unitlist = [ cross rows [c] | c <- cols ] ++ [ cross [r] cols | r <- rows ] ++ [ cross rs cs | rs <- ["ABC","DEF","GHI"], cs <- ["123","456","789"]] units :: M.Map Square [Unit] units = M.fromList [ (s, [ u | u <- unitlist, elem s u ]) | s <- squares ] peers :: M.Map Square [Square] peers = M.fromList [ (s, set [[ p | p <- e, p /= s ] | e <- lookup s units ]) | s <- squares ] where set = nub . concat -------------------------------------------------- -- Wrapper around M.lookup used in list comprehensions lookup :: (Ord a, Show a) => a -> M.Map a b -> b lookup k v = case M.lookup k v of Just x -> x Nothing -> error $ "Error : key " ++ show k ++ " not in map !" -- lookup k m = fromJust . M.lookup k m -------------------------------------------------- -- Parsing a grid into a Map parsegrid :: String -> Maybe Grid parsegrid g = do regularGrid g foldM assign allPossibilities (zip squares g) where allPossibilities :: Grid allPossibilities = M.fromList [ (s,digits) | s <- squares ] regularGrid :: String -> Maybe String regularGrid g = if all (\c -> (elem c "0.-123456789")) g then (Just g) else Nothing -------------------------------------------------- -- Propagating Constraints assign :: Grid -> (Square, Digit) -> Maybe Grid assign g (s,d) = if (elem d digits) then do -- check that we are assigning a digit and not a '.' let toDump = delete d (lookup s g) res <- foldM eliminate g (zip (repeat s) toDump) return res else return g eliminate :: Grid -> (Square, Digit) -> Maybe Grid eliminate g (s,d) = let cell = lookup s g in if not (elem d cell) then return g -- already eliminated -- else d is deleted from s' values else do let newCell = delete d cell newV = M.insert s newCell g -- newV2 <- case length newCell of -- contradiction : Nothing terminates the computation 0 -> Nothing -- if there is only one value (d2) left in square, remove it from peers 1 -> do let peersOfS = [ s' | s' <- lookup s peers ] res <- foldM eliminate newV (zip peersOfS (cycle newCell)) return res -- else : return the new grid _ -> return newV -- Now check the places where d appears in the units of s let dPlaces = [ s' | u <- lookup s units, s' <- u, elem d (lookup s' newV2) ] case length dPlaces of 0 -> Nothing -- d can only be in one place in unit; assign it there 1 -> assign newV2 (head dPlaces, d) _ -> return newV2 -------------------------------------------------- -- Search search :: Maybe Grid -> Maybe Grid search Nothing = Nothing search (Just g) = if all (\xs -> length xs == 1) [ lookup s g | s <- squares ] then (Just g) -- solved else do let (_,s) = minimum [ (length (lookup s g),s) | s <- squares, length (lookup s g) > 1 ] g' = g -- copie of g foldl' some Nothing [ search (assign g' (s,d)) | d <- lookup s g ] where some Nothing Nothing = Nothing some Nothing (Just g) = (Just g) some (Just g) _ = (Just g) -------------------------------------------------- -- Display solved grid printGrid :: Grid -> IO () printGrid = putStrLn . gridToString gridToString :: Grid -> String gridToString g = let l0= map snd (M.toList g) -- [("1537"),("4"),...] l1 = (map (\s -> " " ++ s ++ " ")) l0 -- [" 1 "," 2 ",...] l2 = (map concat . sublist 3) l1 -- [" 1 2 3 "," 4 5 6 ",...] l3 = (sublist 3) l2 -- [[" 1 2 3 "," 4 5 6 "," 7 8 9 "],...] l4 = (map (concat . intersperse "|")) l3 -- [" 1 2 3 | 4 5 6 | 7 8 9 ",...] l5 = (concat . intersperse [line] . sublist 3) l4 in unlines l5 where sublist n [] = [] sublist n xs = take n xs : sublist n (drop n xs) line = hyphens ++ "+" ++ hyphens ++ "+" ++ hyphens hyphens = take 9 (repeat '-') -------------------------------------------------- main :: IO () main = do h <- openFile "top95.txt" ReadMode grids <- hGetContents h let solved = mapMaybe (search . parsegrid) (lines grids) mapM_ printGrid solved hClose h ************************************************************************ *** Sun Aug 26 13:44 2007 Time and Allocation Profiling Report (Final) sudoku_norvig +RTS -p -hc -RTS total time = 49.40 secs (988 ticks @ 50 ms) total alloc = 6,935,777,308 bytes (excludes profiling overheads) COST CENTRE MODULE %time %alloc lookup Main 65.7 22.6 eliminate Main 32.4 70.3 search Main 1.8 6.3 individual inherited COST CENTRE MODULE no. entries % time %alloc %time %alloc MAIN MAIN 1 0 0.0 0.0 100.0 100.0 main Main 190 1 0.0 0.0 100.0 100.0 printGrid Main 214 95 0.0 0.0 0.0 0.1 gridToString Main 215 665 0.0 0.1 0.0 0.1 search Main 208 427143 1.8 6.3 99.4 99.2 assign Main 210 468866 0.1 0.6 90.4 90.3 eliminate Main 212 30626903 32.2 69.8 89.9 89.6 lookup Main 213 172203504 57.7 19.9 57.7 19.9 lookup Main 211 468866 0.4 0.1 0.4 0.1 lookup Main 209 22447632 7.2 2.6 7.2 2.6 parsegrid Main 192 95 0.0 0.0 0.6 0.7 assign Main 198 7695 0.0 0.0 0.6 0.7 eliminate Main 201 51054 0.2 0.5 0.6 0.7 lookup Main 202 1239860 0.4 0.1 0.4 0.1 lookup Main 200 1953 0.0 0.0 0.0 0.0 ... (more innocuous stuff)