Implementation of the Floyd-Warshall algorithm

Hi, I'm new to Haskell (yet I am very familiar with Lisp and OCaml), and I am trying to implement the Floyd-Warshall algorithm (finding the minimal distance between two nodes in a weighted graph). For an input graph with 101 nodes, the obvious C version takes 0.01 s on my machine. My first totally functional implementation in Haskell took 6s... for a graph with 10 edges. (This version considered that a graph is given as its adjacency matrix, which is represented as a 2-uple in ([k], k -> k -> Double)). [I do not show my code, as I am ashamed of it :-S] My first question is: what would an (efficient?) version of the algorithm using this representation would look like ? Is it possible to do without ressorting to the ST monad ? Now, I have been trying to implement it in a more imperative way, to understand how the ST monad works. It runs in 0.6s for a 101-noded graph, which is much, much faster than the original version but still much slower than the C version. I would be very grateful if someone cared to explain why this is unefficient and how to make it faster (Without using the FFI :-|) Thanks by advance. (BTW, I'm using the ghc-6.42 compiler with -O2 flag). -- Frederic Beal -- Code begins here module FW (bench) where import Control.Monad import Control.Monad.ST import Data.Array.ST update :: STUArray s (Int, Int) Double -> Int -> Int -> Int -> ST s () update arr i j k = do aij <- readArray arr (i, j) ajk <- readArray arr (j, k) aik <- readArray arr (i, k) if aij + ajk < aik then do writeArray arr (i, k) (aij + ajk) else return () updateLine arr i j n = do mapM_ (update arr i j) [0..n] updateRow arr i n = do mapM_ (\x -> updateLine arr i x n) [0..n] updateStep arr n = do mapM_ (\x -> updateRow arr x n) [0..n] -- The actual FW invocation canonicalize = updateStep -- From here on, the "testing" suite count = 100 -- A test array: M[i, j] = 1 + ((x+y) mod count) orgArray :: ST s (STUArray s (Int, Int) Double) orgArray = do v <- newArray ((0, 0), (count, count)) 0.0 mapM_ (\x -> mapM_ (\y -> writeArray v (x, y) ((1+) $ fromIntegral (mod (x+y) count))) [0..count]) [0..count] return v sumDiag :: STUArray s (Int, Int) Double -> Int -> ST s Double sumDiag arr n = do foldM (\y x -> do a <- readArray arr (x, x) return $ a + y) 0.0 [0..n] orgDiag = do arr <- orgArray v <- sumDiag arr count return v cptDiag = do arr <- orgArray canonicalize arr count v <- sumDiag arr count return v bench = do val <- stToIO cptDiag diag <- stToIO orgDiag print val print diag

Every readArray and writeArray checks that the index tuple is in range. You could try an use Data.Array.Base (in GHC) and unsafeWrite and unsafeRead. They do not do bounds checking like readArray and writeArray. Oleg has a good interface to unchecked array usage here: http://okmij.org/ftp/Haskell/types.html#branding -- Chris frederic@ka-ge-ro.org wrote:
Hi,
I'm new to Haskell (yet I am very familiar with Lisp and OCaml), and I am trying to implement the Floyd-Warshall algorithm (finding the minimal distance between two nodes in a weighted graph). For an input graph with 101 nodes, the obvious C version takes 0.01 s on my machine. My first totally functional implementation in Haskell took 6s... for a graph with 10 edges. (This version considered that a graph is given as its adjacency matrix, which is represented as a 2-uple in ([k], k -> k -> Double)). [I do not show my code, as I am ashamed of it :-S] My first question is: what would an (efficient?) version of the algorithm using this representation would look like ? Is it possible to do without ressorting to the ST monad ?
Now, I have been trying to implement it in a more imperative way, to understand how the ST monad works. It runs in 0.6s for a 101-noded graph, which is much, much faster than the original version but still much slower than the C version. I would be very grateful if someone cared to explain why this is unefficient and how to make it faster (Without using the FFI :-|) Thanks by advance. (BTW, I'm using the ghc-6.42 compiler with -O2 flag).
-- Frederic Beal
-- Code begins here module FW (bench) where
import Control.Monad import Control.Monad.ST import Data.Array.ST
update :: STUArray s (Int, Int) Double -> Int -> Int -> Int -> ST s () update arr i j k = do aij <- readArray arr (i, j) ajk <- readArray arr (j, k) aik <- readArray arr (i, k) if aij + ajk < aik then do writeArray arr (i, k) (aij + ajk) else return ()
updateLine arr i j n = do mapM_ (update arr i j) [0..n] updateRow arr i n = do mapM_ (\x -> updateLine arr i x n) [0..n] updateStep arr n = do mapM_ (\x -> updateRow arr x n) [0..n]
-- The actual FW invocation canonicalize = updateStep
-- From here on, the "testing" suite count = 100
-- A test array: M[i, j] = 1 + ((x+y) mod count) orgArray :: ST s (STUArray s (Int, Int) Double) orgArray = do v <- newArray ((0, 0), (count, count)) 0.0 mapM_ (\x -> mapM_ (\y -> writeArray v (x, y) ((1+) $ fromIntegral (mod (x+y) count))) [0..count]) [0..count] return v
sumDiag :: STUArray s (Int, Int) Double -> Int -> ST s Double sumDiag arr n = do foldM (\y x -> do a <- readArray arr (x, x) return $ a + y) 0.0 [0..n]
orgDiag = do arr <- orgArray v <- sumDiag arr count return v
cptDiag = do arr <- orgArray canonicalize arr count v <- sumDiag arr count return v
bench = do val <- stToIO cptDiag diag <- stToIO orgDiag print val print diag
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On 7/28/06, frederic@ka-ge-ro.org
Hi,
I'm new to Haskell (yet I am very familiar with Lisp and OCaml), and I am trying to implement the Floyd-Warshall algorithm (finding the minimal distance between two nodes in a weighted graph). For an input graph with 101 nodes, the obvious C version takes 0.01 s on my machine. My first totally functional implementation in Haskell took 6s... for a graph with 10 edges. (This version considered that a graph is given as its adjacency matrix, which is represented as a 2-uple in ([k], k -> k -> Double)). [I do not show my code, as I am ashamed of it :-S] My first question is: what would an (efficient?) version of the algorithm using this representation would look like ? Is it possible to do without ressorting to the ST monad ?
Dynamic programming is actually quite neat in Haskell. You can express it quite directly using arrays. arr = array (1,n) [ (k, foo k) | k <- [1..n]] foo k = ... now, foo would reference arr in some way, it it should probably contain some base case for k=1. So you basically just let "foo k" be the actual algorithm in question, and then arr!x (x = n most likely) would be your final result. Laziness is really neat here you see. You can define the array 'arr' such that its elements actually reference 'arr' in their definition (no need to obfuscate the algorithm with bottom-up construction like in imperative languages). IIRC Floyd Warshall would be a 3 dimensional array whose k=0 elements would simply be the distance matrix. Something like (untested!): fw :: Array Int Double -> Int -> Int -> Double fw dist i j = d!(n,i,j) where (_,(_,n)) = bounds dist -- find n, assum vertices are 1..n d = array ((0,1,1),(n,n,n)) [((k,i,j), foo k i j) | k <- [0..n], i <- [1..n], j <- [1..n]] foo 0 i j = dist!(i,j) -- base case, just the weighted edges foo k i j = min (d!(k-1,i,j)) (d!(k-1,k,k) + d!(k-1,k,j)) /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

On 7/28/06, Sebastian Sylvan
On 7/28/06, frederic@ka-ge-ro.org
wrote: Hi,
I'm new to Haskell (yet I am very familiar with Lisp and OCaml), and I am trying to implement the Floyd-Warshall algorithm (finding the minimal distance between two nodes in a weighted graph). For an input graph with 101 nodes, the obvious C version takes 0.01 s on my machine. My first totally functional implementation in Haskell took 6s... for a graph with 10 edges. (This version considered that a graph is given as its adjacency matrix, which is represented as a 2-uple in ([k], k -> k -> Double)). [I do not show my code, as I am ashamed of it :-S] My first question is: what would an (efficient?) version of the algorithm using this representation would look like ? Is it possible to do without ressorting to the ST monad ?
Dynamic programming is actually quite neat in Haskell.
You can express it quite directly using arrays.
arr = array (1,n) [ (k, foo k) | k <- [1..n]] foo k = ... now, foo would reference arr in some way, it it should probably contain some base case for k=1. So you basically just let "foo k" be the actual algorithm in question, and then arr!x (x = n most likely) would be your final result. Laziness is really neat here you see. You can define the array 'arr' such that its elements actually reference 'arr' in their definition (no need to obfuscate the algorithm with bottom-up construction like in imperative languages).
IIRC Floyd Warshall would be a 3 dimensional array whose k=0 elements would simply be the distance matrix. Something like (untested!):
fw :: Array Int Double -> Int -> Int -> Double fw dist i j = d!(n,i,j) where (_,(_,n)) = bounds dist -- find n, assum vertices are 1..n d = array ((0,1,1),(n,n,n)) [((k,i,j), foo k i j) | k <- [0..n], i <- [1..n], j <- [1..n]] foo 0 i j = dist!(i,j) -- base case, just the weighted edges foo k i j = min (d!(k-1,i,j)) (d!(k-1,k,k) + d!(k-1,k,j)) ^^^^ typo
Bah, spotted a typo, it's (of course): foo k i j = min (d!(k-1,i,j)) (d!(k-1,i,k) + d!(k-1,k,j)) /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

Sebastian Sylvan wrote:
Dynamic programming is actually quite neat in Haskell.
You can express it quite directly using arrays.
arr = array (1,n) [ (k, foo k) | k <- [1..n]] foo k = ... now, foo would reference arr in some way, it it should probably contain some base case for k=1. So you basically just let "foo k" be the actual algorithm in question, and then arr!x (x = n most likely) would be your final result. Laziness is really neat here you see. You can define the array 'arr' such that its elements actually reference 'arr' in their definition (no need to obfuscate the algorithm with bottom-up construction like in imperative languages).
Anyone interested in the above style of programming should check out the SCP paper "A discipline of dynamic programming over sequence data" or related articles on "algebraic dynamic programming" by Giegerich and colleagues. The heart of their DSL (embedded in Haskell) is very much like the above self-referential array idea. Ciao, Janis. -- Dr. Janis Voigtlaender http://wwwtcs.inf.tu-dresden.de/~voigt/ mailto:voigt@tcs.inf.tu-dresden.de

Hello frederic, Friday, July 28, 2006, 10:44:51 AM, you wrote:
much slower than the C version. I would be very grateful if someone cared to explain why this is unefficient and how to make it faster
update :: STUArray s (Int, Int) Double -> Int -> Int -> Int -> ST s () update arr i j k = do aij <- readArray arr (i, j) try update arr i j k | arr `seq` i `seq` j `seq` k `seq` True = do aij <- readArray arr (i, j) and make the same changes in other update functions. we has http://haskell.org/haskellwiki/Performance page, you should read it second problem is bounds checking in readArray. if you want better speed, you should use unsafeRead/unsafeWrite and compute index with something like (i*count+j) btw, 'newArray ((0, 0), (10, 10))' creates array with 121 elements. is this what you really need? also, "foreach=flip mapM_" will be useful for you. i also recommend you to read http://haskell.org/haskellwiki/IO_inside - ST monad differs from IO one only in what its operations are restricted to array/reference ones -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
participants (5)
-
Bulat Ziganshin
-
Chris Kuklewicz
-
frederic@ka-ge-ro.org
-
Janis Voigtlaender
-
Sebastian Sylvan