
mattcbro:
No doubt any kind of binary serialization would be a lot faster. In my case, however, I just wanted it to work out of the box. I need to read in about 5-10 arrays of only 1000 entries or so, saved in files. I suspect even the ascii parser could do that within a few seconds.
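(The out-of-the-box route mattcbro describes is just the arrays' Show and Read instances. A minimal sketch of that approach, assuming the instances the array package provides; the function names here are mine, not from the thread:

import Data.Array.Unboxed

type A = UArray Int Int

-- Write an array as plain ASCII text via Show, read it back via Read.
writeArrayText :: FilePath -> A -> IO ()
writeArrayText f = writeFile f . show

readArrayText :: FilePath -> IO A
readArrayText f = fmap read (readFile f)

Fine for 5-10 arrays of ~1000 entries, but slow at scale, which is where the reply below comes in.)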
Faster, and trivial to write! Here's a complete example. This little demo shows how to serialise unboxed arrays efficiently using Data.Binary, available from Hackage: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary/0.2 It uses lazy bytestrings to wrap fast serialisation, in the style of NewBinary, in a pure interface. For example, to serialise a bunch of unboxed arrays to and from disk, first import Data.Binary, unboxed arrays, zlib and lazy bytestrings:
import Data.Binary
import Data.Array.Unboxed
import Codec.Compression.Zlib
import qualified Data.ByteString.Lazy as B
type A = UArray Int Int
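Note that Data.Binary already ships Binary instances for immutable arrays, so nothing extra has to be written for A. For intuition, the instance serialises the bounds and then the element list, roughly like this sketch (my names; the demo itself doesn't need these):

-- A sketch of roughly what the library's array instance does:
-- write the bounds, then the elements.
putArray :: A -> Put
putArray a = put (bounds a) >> put (elems a)

getArray :: Get A
getArray = do
    bs <- get
    es <- get
    return (listArray bs es)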
Create some unboxed arrays
a1, a2, a3, a4, a5 :: A
a1 = listArray (0,999) [0..]
a2 = listArray (0,999) [1..]
a3 = listArray (0,999) [2..]
a4 = listArray (0,999) [3..]
a5 = listArray (0,999) [4..]
Collect our arrays, and the files we'll write them to
arrs :: [(FilePath,A)]
arrs = [("a1",a1)
       ,("a2",a2)
       ,("a3",a3)
       ,("a4",a4)
       ,("a5",a5)]
An action to write some unboxed arrays to disk
writeArrays :: [(FilePath,A)] -> IO ()
writeArrays as = mapM_ (uncurry encodeFile) as
An action to read them back from disk:
readArrays :: [FilePath] -> IO [(FilePath,A)]
readArrays fs = zip fs `fmap` mapM decodeFile fs
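For the curious, encodeFile and decodeFile are Data.Binary's own thin wrappers over lazy bytestring IO; they behave roughly like these sketches (primed names so they don't clash with the real functions; uses the imports above):

-- Roughly what Data.Binary's file helpers do (paraphrased):
encodeFile' :: Binary a => FilePath -> a -> IO ()
encodeFile' f = B.writeFile f . encode

decodeFile' :: Binary a => FilePath -> IO a
decodeFile' f = fmap decode (B.readFile f)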
Write, then read, and check that the operation was the identity
main' = do
    writeArrays arrs
    arrs' <- readArrays (map fst arrs)
    print . all (uncurry (==)) $ zip arrs arrs'
Running this:

$ ghc -O A.hs --make
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...
$ time ./A
True
./A  0.01s user 0.02s system 107% cpu 0.022 total

Using zlib (also available from Hackage), we can do some more fun things. For example, compressing the arrays before we write them to disk:
writeGzipArrays :: [(FilePath,A)] -> IO ()
writeGzipArrays as = mapM_ (uncurry gzEncodeFile) as
    where gzEncodeFile f a = B.writeFile f . compress . encode $ a
And read them back in
readGzipArrays :: [FilePath] -> IO [(FilePath,A)]
readGzipArrays fs = zip fs `fmap` mapM gzDecodeFile fs
    where gzDecodeFile f = return . decode . decompress =<< B.readFile f
Our main function then simply becomes:
main = do
    writeGzipArrays arrs
    arrs' <- readGzipArrays (map fst arrs)
    print . all (uncurry (==)) $ zip arrs arrs'
Runs a bit slower:

$ time ./A
True
./A  0.03s user 0.01s system 104% cpu 0.037 total

But the resulting files on disk, instead of being word-sized chunks:

$ wc -c a1
8024 a1

are about four times smaller:

$ wc -c a1
1869 a1

Yay for pure compression and serialisation! Note that you can use this code *right now*, in Hugs or GHC.

Cheers,
Don