
I always thought that signal processing must be a good benchmark for compiler optimizations and optimizer rules of libraries like 'binary' and 'fps'. Many signal processing functions process a signal from the beginning to the end with a little of state, thus in many cases I expect that they can be translated to simple loops. However, I still cannot get the necessary speed for real-time processing out of 'binary' and 'fps'. In my setup a signal is of type [Double]. After a series of sound transformations a signal is finally converted to [Int16] and then to ByteString. Can I expect that interim signals represented as lists are optimized away? I have simple example which writes zeros to disk. The content of the first file is created by ByteString.replicate for the purpose of comparison with the second file. With the second file I want to check the speed of the conversion from [Int16] to ByteString. module Main (main) where import System.Time (getClockTime, diffClockTimes, tdSec, tdPicosec) import qualified Data.ByteString.Lazy as B import qualified Data.Binary.Put as Bin import Foreign (Int16) signalToBinaryPut :: [Int16] -> B.ByteString signalToBinaryPut = Bin.runPut . mapM_ (Bin.putWord16host . fromIntegral) writeSignalBinaryPut :: FilePath -> [Int16] -> IO () writeSignalBinaryPut fileName = B.writeFile fileName . signalToBinaryPut measureTime :: String -> IO () -> IO () measureTime name act = do putStr (name++": ") timeA <- getClockTime act timeB <- getClockTime let td = diffClockTimes timeB timeA print (fromIntegral (tdSec td) + fromInteger (tdPicosec td) * 1e-12 :: Double) numSamples :: Int numSamples = 1000000 zeroSignal16 :: [Int16] zeroSignal16 = replicate numSamples 0 zeroByteString :: B.ByteString zeroByteString = B.replicate (fromIntegral (2 * numSamples)) 0 main :: IO () main = do measureTime "write zero bytestring" (B.writeFile "zero-bytestring.sw" zeroByteString) measureTime "put zero int16" (writeSignalBinaryPut "zero-int16string.sw" zeroSignal16) The program is compiled with GHC-6.4 and option -O2, CPU clock 1.7 GHz. This yields: $ speedtest write zero bytestring: 2.5674e-2 put zero int16: 1.080541 That is, not using ByteString.replicate and converting from [Int16] to ByteString slows down computation by a factor of 40. What am I doing wrong?