RE: Data.PackedString.hGetPS

If I have
\begin{code} import Data.PackedString (hGetPS) import System.IO (openBinaryFile, hClose, IOMode(ReadMode))
main :: IO () main = do h <- openBinaryFile "1000000x" ReadMode p <- hGetPS h 1000000 hClose h return () \end{code}
(1000000x is a file containing 1000000 'x' characters) then:
and compile with ghc -O2 (either 6.2 or reasonably recent CVS) then I get:
$ ./foo Stack space overflow: current size 1048576 bytes. Use `+RTS -Ksize' to increase it. $
I noticed a while back that Data.PackedString isn't being optimised as well as it should. Most of the code in there is written with the assumption that lots of Deforestation is going to happen, and in several of the cases I looked at it isn't. This is on our list of things to investigate (along with performance of the overloaded array support).
-- | Read a 'PackedString' directly from the specified 'Handle'. -- This is far more efficient than reading the characters into a 'String' -- and then using 'packString'. -- NOTE: as with 'hPutPS', the string representation in the file is -- assumed to be ISO-8859-1. hGetPS :: Handle -> Int -> IO PackedString hGetPS h i = do arr <- newArray_ (0, i-1) l <- hGetArray h arr i chars <- mapM (\i -> readArray arr i >>= return.chr.fromIntegral) [0..l-1] return (packString chars)
(is the efficiency comment accurate? This definition is calling packString on a String it makes, so it's not intuitively obvious to me).
I think the last line should be replaced with
return (packNChars l chars)
Indeed that would make a small improvement, yes. Cheers, Simon
participants (1)
-
Simon Marlow