Re: [Haskell-cafe] Interpreting profiling results

6 Jan 2010

      On Mon, Jan 4, 2010 at 10:05 AM, Daniel Fischer
 wrote:
...
Am Montag 04 Januar 2010 02:17:06 schrieb Patrick LeBoutillier:
...
Hi,
...
...
This question didn't get any replies on the beginners list, I thought
...
I'd try it here...
Sorry, been occupied with other things. I already took a look, but hadn't
anything conclusive enough to reply yet.
No sweat... I didn't mean to be pushy :)

Thanks a lot for all the pointers, they have speeded up my code a lot.

Patrick
...
...
...
I've written (and improved using other solutions I've found on the
...
net) a simple sudoku solver which I'm trying to profile. Here is the
...
code:
...
...
...
import Array
Better
import Data.Array.Unboxed
*much* faster
...
import List (transpose, nub, (\\))
...
import Data.List
...
...
data Sudoku = Sudoku { unit :: Int, cells :: Array (Int, Int) Int,
cells :: UArray (Int,Int) Int
...
holes :: [(Int, Int)] }
...
...
cell :: Sudoku -> (Int, Int) -> Int
...
cell s i = (cells s) ! i
...
...
instance Read Sudoku where
...
 readsPrec _ s = [(Sudoku unit cells holes, "")]
...
   where unit = length . words . head . lines $ s
...
         cells = listArray ((1, 1), (unit, unit)) (map read . words $ s)
...
         holes = [ c | c <- indices cells, (cells ! c) == 0]
...
...
instance Show Sudoku where
...
 show s = unlines [unwords [show $ cell s (x,y) | x <- [1 .. unit s]]
...
...
| y <- [1 .. unit s]]
...
...
genNums :: Sudoku -> (Int, Int) -> [Int]
...
genNums s c@(i,j) = ([1 .. u] \\) . nub $ used
...
 where
nub isn't nice. It's quadratic in the length of the list. Use e.g.
map head . group . sort
or
Data.[Int]Set.toList . Data.[Int]Set.fromList
if the type is in Ord (and you don't need the distinct elements in the order
they come in). That gives an O(n*log n) nub with a sorted result.
And (\\) isn't particularly fast either (O(m*n), where m and n are the
lengths of the lists). If you use one of the above instead of nub, you can
use the O(min m n) 'minus' for sorted lists:
xxs@(x:xs) `minus` yys@(y:ys)
| x < y = x : xs `minus` yys
| x == y = xs `minus` ys
| otherwise = xxs `minus` ys
xs `minus` _ = xs
Here, you can do better:
genNums s c@(i,j) = nums
where
nums = [n | n <- [1 .. u], arr!n]
arr :: [U]Array Int Bool
arr = accumArray (\_ _ -> False) True (0,u) used
...
   used = (row s u i j) ++ (col s u i j) ++ (square s sq u i j)
...
   u = unit s
Not good to calculate sq here. You'll use it many times, calculate once and
store it in s.
...
   sq = truncate . sqrt . fromIntegral $ u
...
...
row s u i j = [cell s (i, y) | y <- [1 .. u]]
...
...
col s u i j = [cell s (x, j) | x <- [1 .. u]]
...
...
square s sq u i j = [cell s (x, y) | y <- [1 .. u], x <- [1 .. u], f x i,
f
...
y j] where f a b = div (a-1) sq == div (b-1) sq
Test for f y j before you generate x to skip early.
square s sq u i j = [cell s (ni+x,nj+y) | x <- [1 .. sq], y <- [1 .. sq]]
where
qi = (i-1) `div` sq
qj = (j-1) `div` sq
ni = qi*sq
nj = qj*sq
...
...
solve :: Sudoku -> [Sudoku]
...
solve s =
...
 case holes s of
...
   [] -> [s]
...
   (h:hs) -> do
...
     n <- genNums s h
...
     let s' = Sudoku (unit s) ((cells s) // [(h, n)]) hs
...
     solve s'
...
...
main = print . head . solve . read =<< getContents
...
...
...
When I compile as such:
...
...
$ ghc -O2 --make Sudoku.hs -prof -auto-all -caf-all -fforce-recomp
...
...
and run it on the following puzzle:
...
...
0 2 3 4
...
3 4 1 0
...
2 1 4 0
...
0 3 2 1
...
...
I get the following profiling report:
...
...
       Fri Jan  1 10:34 2010 Time and Allocation Profiling Report  (Final)
...
...
          Sudoku +RTS -p -RTS
...
...
       total time  =        0.00 secs   (0 ticks @ 20 ms)
That means the report is basically useless. Not entirely, because the
allocation figures may already contain useful information. Run on a 9x9
puzzle (a not too hard one, but not trivial either).
Also, run the profiling with -P instead of -p, you'll get more info about
time and allocation then.
...
       total alloc =     165,728 bytes  (excludes profiling overheads)
...
...
COST CENTRE                    MODULE               %time %alloc
...
...
CAF                            GHC.Handle             0.0   10.7
...
CAF                            Text.Read.Lex          0.0    2.1
...
CAF                            GHC.Read               0.0    1.2
...
square                         Main                   0.0    2.8
...
solve                          Main                   0.0    1.3
...
show_aVx                       Main                   0.0    3.7
...
readsPrec_aYF                  Main                   0.0   60.6
...
main                           Main                   0.0    9.6
...
genNums                        Main                   0.0    5.0
...
cell                           Main                   0.0    1.2
...
...
...
...
                       individual    inherited
...
COST CENTRE              MODULE
...
      no.    entries  %time %alloc   %time %alloc
...
...
MAIN                     MAIN
...
        1           0   0.0    0.3     0.0  100.0
...
 main                    Main
...
      186           1   0.0    9.6     0.0   85.6
...
 show_aVx               Main
...
      196           2   0.0    3.7     0.0    3.7
...
  cell                  Main
...
      197          16   0.0    0.0     0.0    0.0
...
 solve                  Main
...
      188           5   0.0    1.3     0.0   11.8
...
  genNums               Main
...
      189           8   0.0    5.0     0.0   10.4
...
   square               Main
...
      194          88   0.0    2.8     0.0    3.2
...
    cell                Main
...
      195          16   0.0    0.4     0.0    0.4
...
   col                  Main
...
      192           4   0.0    0.7     0.0    1.1
...
    cell                Main
...
      193          16   0.0    0.4     0.0    0.4
...
   row                  Main
...
      190           4   0.0    0.7     0.0    1.1
...
    cell                Main
...
      191          16   0.0    0.4     0.0    0.4
...
 readsPrec_aYF          Main
...
      187           3   0.0   60.6     0.0   60.6
...
 CAF                     GHC.Read
...
      151           1   0.0    1.2     0.0    1.2
...
 CAF                     Text.Read.Lex
...
      144           8   0.0    2.1     0.0    2.1
...
 CAF                     GHC.Handle
...
      128           4   0.0   10.7     0.0   10.7
...
 CAF                     GHC.Conc
...
      127           1   0.0    0.0     0.0    0.0
...
...
Does the column 'entries' represent the number of times the function
...
was called?
Number of times it was 'entered', not quite the same as the number of times
it was called.
I think (Warning: speculation ahead, I don't *know* how the profiles are
generated) it's thus:
Say you call a function returning a list. One call, first entry. It finds
the beginning of the list, the first k elements and hands them to the
caller. Caller processes these, asks "can I have more, or was that it?".
Same call, second entry: f looks for more, finds the next m elements, hands
them to caller. Caller processes. Repeat until whatever happens first,
caller doesn't ask whether there's more or callee finds there's nothing more
(or hits bottom).
...
If so, I don't understand how the 'square' function could
...
be called 88 times when it's caller is only called 8 times. Same thing
...
with 'genNums' (called 8 times, and solve called 5 times)
...
...
What am I missing here?
...
...
Patrick
-- 
=====================
Patrick LeBoutillier
Rosemère, Québec, Canada