Data.PackedString.hGetPS

Hi all,

If I have

\begin{code}
import Data.PackedString (hGetPS)
import System.IO (openBinaryFile, hClose, IOMode(ReadMode))

main :: IO ()
main = do h <- openBinaryFile "1000000x" ReadMode
          p <- hGetPS h 1000000
          hClose h
          return ()
\end{code}

(1000000x is a file containing 1000000 'x' characters) and compile with
ghc -O2 (either 6.2 or a reasonably recent CVS build), then I get:

$ ./foo
Stack space overflow: current size 1048576 bytes.
Use `+RTS -Ksize' to increase it.
$

The definition of hGetPS is:

    -- | Read a 'PackedString' directly from the specified 'Handle'.
    -- This is far more efficient than reading the characters into a
    -- 'String' and then using 'packString'.
    -- NOTE: as with 'hPutPS', the string representation in the file is
    -- assumed to be ISO-8859-1.
    hGetPS :: Handle -> Int -> IO PackedString
    hGetPS h i = do
        arr <- newArray_ (0, i-1)
        l <- hGetArray h arr i
        chars <- mapM (\i -> readArray arr i >>= return . chr . fromIntegral)
                      [0..l-1]
        return (packString chars)

(Is the efficiency comment accurate? This definition calls packString on
a String it builds itself, so the claim is not intuitively obvious to me.)

I think the last line should be replaced with

    return (packNChars l chars)

and that the packNChars definition

    packNChars len str = PS (array (0,len-1) (zip [0..] str))

is equivalent to

    packNChars len str = PS (listArray (0,len-1) str)

which I assume is more efficient (due to not constructing and
destructing tuples, if nothing else). None of this solves the stack
overflow, though (a sketch of a rewrite that I think would is in the
P.S. below).

While I'm here: David Roundy has a PackedString which looks to have been
forked from GHC or its libraries, and is now based around

    data PackedString = PS !(ForeignPtr Word8) !Int !Int

He said it can't serve as a replacement because it doesn't support large
Unicode characters, though (would just changing Word8 to Word32 or Char
fix that?). If you're interested, it's in the darcs repo
(http://abridgegame.org/darcs/) - selecting "FastPackedString.hs" on
http://abridgegame.org/cgi-bin/darcs?darcs* should show you the latest
version. It doesn't have this stack overflow problem.

Thanks
Ian
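
P.S. For concreteness, here is a rough, untested sketch of the kind of
definition I have in mind. It assumes PS wraps a (UArray Int Char), as
the packNChars definition above suggests, and that packNChars is
available to client code (it may in fact be internal to the module):

\begin{code}
import Data.PackedString (PackedString, packNChars)
import Data.Array.IO (IOUArray, hGetArray)
import Data.Array.MArray (newArray_, freeze)
import Data.Array.Unboxed (UArray, (!))
import Data.Char (chr)
import Data.Word (Word8)
import System.IO (Handle)

-- Read at most i bytes, then build the PackedString without the
-- per-element mapM: freeze the buffer once and hand packNChars a
-- lazily produced list, which listArray consumes left-to-right in
-- constant stack space.
hGetPS' :: Handle -> Int -> IO PackedString
hGetPS' h i = do
    arr <- newArray_ (0, i-1) :: IO (IOUArray Int Word8)
    l <- hGetArray h arr i
    frozen <- freeze arr :: IO (UArray Int Word8)
    return (packNChars l
                [ chr (fromIntegral (frozen ! j)) | j <- [0 .. l-1] ])
\end{code}

The point is that mapM's recursion in IO is as deep as the list is long,
so a 1000000-element read presumably blows the default 1MB stack, whereas
listArray only needs constant stack to consume its argument list.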