Haskell Gurus, I have tried to use profiling to tell me what's going on here, but it hasn't helped much, probably because I'm not interpreting the results correctly. Empirically I have determined that the show's are pretty slow, so an alternative to them would be helpful. I replaced the show's with "", and compiled with -O2 and not much improvement. I need to write _a lot_ of code in this style. A few words about how best to do this would be helpful. Laziness, infinite lists, uvector ?? Help... Thanks, Brian import Complex import System.IO genData :: Double -> Int -> (Double -> Complex Double) -> ([Double], [Complex Double]) genData tstop n f = let deltat = tstop / (fromIntegral n) t = [ fromIntegral(i) * deltat | i <- [0..n-1]] in (t, map f t) main = do let (t, y) = genData 100.0E-6 (2 ^ 15) (\x -> x :+ 0.0) h <- openFile "data.txt" WriteMode mapM_ (\(x, y) -> do hPutStr h (show t) hPutStr h " " hPutStrLn h (show (realPart y))) (zip t y) hClose h
How much output does this generate? Does it matter if you send the
output to /dev/null? This looks as if the bottleneck might well be in
I/O operations, not in the code itself. To find this out, you could
rewrite the code in C and see if that makes a difference?
Thomas
On Sun, Jun 14, 2009 at 20:44, brian
Haskell Gurus,
I have tried to use profiling to tell me what's going on here, but it hasn't helped much, probably because I'm not interpreting the results correctly.
Empirically I have determined that the show's are pretty slow, so an alternative to them would be helpful. I replaced the show's with "", and compiled with -O2 and not much improvement.
I need to write _a lot_ of code in this style. A few words about how best to do this would be helpful. Laziness, infinite lists, uvector ??
Help...
Thanks,
Brian
import Complex import System.IO
genData :: Double -> Int -> (Double -> Complex Double) -> ([Double], [Complex Double]) genData tstop n f = let deltat = tstop / (fromIntegral n) t = [ fromIntegral(i) * deltat | i <- [0..n-1]] in (t, map f t)
main = do let (t, y) = genData 100.0E-6 (2 ^ 15) (\x -> x :+ 0.0) h <- openFile "data.txt" WriteMode mapM_ (\(x, y) -> do hPutStr h (show t) hPutStr h " " hPutStrLn h (show (realPart y))) (zip t y) hClose h _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
I have included a new and improved version. Just to make the comparison a little more reasonable I re-wrote the program using ML and ran it with SMLNJ eal 0m3.175s user 0m0.935s sys 0m0.319s Here's the compiled haskell (ghc -O2 foo.hs -o foo): real 0m16.855s user 0m9.724s sys 0m0.495s OUCH. I verified to make sure they were both writing valid data files. I'm trying to learn how to fish, so I'm truly interested in finding out _how_ to optimize using profiling and other such tools. Here's the header of the foo.prof file: total time = 9.44 secs (472 ticks @ 20 ms) total alloc = 2,171,923,916 bytes (excludes profiling overheads) 2GB of allocation ??? with a base size of 131k. that seems excessive, which gets me back to the , I don't think I'm interpreting profiling stuff correctly. This line is a little more interesting: COST CENTRE MODULE no. entries %time %alloc %time %alloc main Main 178 1 98.7 99.1 98.7 99.1 So even though getData should be doing all of the allocation, main's using a lot of time and effort. I figured it was the show's that were slowing things up (how do I get profiling to show that detail ?), so I had it output just "\n". Well that finishes in no time at all. And yea, verily, the output of the .prof file. total time = 0.14 secs (7 ticks @ 20 ms) total alloc = 65,562,824 bytes (excludes profiling overheads) So I guess it's the show's, but I can't seem to find more efficient float output. FFI to sprintf ? yuch. Brian foo.hs import Numeric import Complex import IO genData :: Double -> Int -> (Double -> Complex Double) -> ([Double], [Complex Double]) genData tstop n f = let deltat = tstop / (fromIntegral n) t = [ fromIntegral(i) * deltat | i <- [0..n-1]] in (t, map f t) main = do let (t, y) = genData 100.0E-6 (2 ^ 17) (\x -> x :+ 0.0) h <- openFile "/dev/null" WriteMode mapM_ (\(x, y) -> do hPutStr h ((showEFloat (Just 6) x) " ") hPutStr h (showEFloat (Just 6) (realPart y) "\n")) (zip t y) hClose h print "Done" foo.sml let fun genData tstop n f = let val deltat = tstop / (Real.fromInt n) val t = List.tabulate(n, fn i => Real.fromInt(i) * deltat) in (t, map f t) end val (t, y) = genData 100.0E~6 131072 (fn x => (x, 0.0)) val h = TextIO.openOut("data.txt") in List.app (fn (x, (a,b)) => (TextIO.output(h, Real.fmt (StringCvt.SCI(SOME 6)) x); TextIO.output(h, Real.fmt (StringCvt.SCI(SOME 6)) a ^ "\n"))) (ListPair.zip (t, y)); TextIO.closeOut(h); print "Done"; () end On Jun 15, 2009, at 12:15 AM, Thomas ten Cate wrote:
How much output does this generate? Does it matter if you send the output to /dev/null? This looks as if the bottleneck might well be in I/O operations, not in the code itself. To find this out, you could rewrite the code in C and see if that makes a difference?
Thomas
On Sun, Jun 14, 2009 at 20:44, brian
wrote: Haskell Gurus,
I have tried to use profiling to tell me what's going on here, but it hasn't helped much, probably because I'm not interpreting the results correctly.
Empirically I have determined that the show's are pretty slow, so an alternative to them would be helpful. I replaced the show's with "", and compiled with -O2 and not much improvement.
I need to write _a lot_ of code in this style. A few words about how best to do this would be helpful. Laziness, infinite lists, uvector ??
Help...
Thanks,
Brian
import Complex import System.IO
genData :: Double -> Int -> (Double -> Complex Double) -> ([Double], [Complex Double]) genData tstop n f = let deltat = tstop / (fromIntegral n) t = [ fromIntegral(i) * deltat | i <- [0..n-1]] in (t, map f t)
main = do let (t, y) = genData 100.0E-6 (2 ^ 15) (\x -> x :+ 0.0) h <- openFile "data.txt" WriteMode mapM_ (\(x, y) -> do hPutStr h (show t) hPutStr h " " hPutStrLn h (show (realPart y))) (zip t y) hClose h _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
briand:
I have included a new and improved version.
Just to make the comparison a little more reasonable I re-wrote the program using ML and ran it with SMLNJ
eal 0m3.175s user 0m0.935s sys 0m0.319s
Here's the compiled haskell (ghc -O2 foo.hs -o foo):
real 0m16.855s user 0m9.724s sys 0m0.495s
OUCH.
I verified to make sure they were both writing valid data files.
I'm trying to learn how to fish, so I'm truly interested in finding out _how_ to optimize using profiling and other such tools.
Here's the header of the foo.prof file:
total time = 9.44 secs (472 ticks @ 20 ms) total alloc = 2,171,923,916 bytes (excludes profiling overheads)
2GB of allocation ??? with a base size of 131k. that seems excessive, which gets me back to the , I don't think I'm interpreting profiling stuff correctly.
This line is a little more interesting:
COST CENTRE MODULE no. entries %time %alloc %time %alloc main Main 178 1 98.7 99.1 98.7 99.1
So even though getData should be doing all of the allocation, main's using a lot of time and effort. I figured it was the show's that were slowing things up (how do I get profiling to show that detail ?), so I had it output just "\n". Well that finishes in no time at all.
And yea, verily, the output of the .prof file.
total time = 0.14 secs (7 ticks @ 20 ms) total alloc = 65,562,824 bytes (excludes profiling overheads)
So I guess it's the show's, but I can't seem to find more efficient float output. FFI to sprintf ? yuch.
Is your SMLNJ using lazy lists? :) Try hmatrix or uvector. -- Don
On Jun 16, 2009, at 8:58 AM, Don Stewart wrote:
So I guess it's the show's, but I can't seem to find more efficient float output. FFI to sprintf ? yuch.
Is your SMLNJ using lazy lists? :)
strictly speaking : no.
Try hmatrix or uvector.
uvector is _probably_ the long term answer even after I solve the double -> string problem. However, I would like to reiterate that it's the double -> string which is really the time/memory sink. I verified this by printing a simple string based on the value (to make sure the value was evaluated) and it runs fast enough for me. Is there an efficient way to output double -> binary ? I typically write my data files as binary anyway, because it's faster for graph and the like to handle them anyway. Brian
On Tue, Jun 16, 2009 at 6:47 PM, brian
On Jun 16, 2009, at 8:58 AM, Don Stewart wrote:
So I guess it's the show's, but I can't seem to find more efficient
float output. FFI to sprintf ? yuch.
Is your SMLNJ using lazy lists? :)
strictly speaking : no.
Try hmatrix or uvector.
uvector is _probably_ the long term answer even after I solve the double -> string problem.
However, I would like to reiterate that it's the double -> string which is really the time/memory sink. I verified this by printing a simple string based on the value (to make sure the value was evaluated) and it runs fast enough for me.
You might want to look at the source and see if you can find a faster way to convert it: http://haskell.org/ghc/docs/latest/html/libraries/base/src/GHC-Float.html#sh... Jason
brian
However, I would like to reiterate that it's the double -> string which is really the time/memory sink. I verified this by printing a simple string based on the value (to make sure the value was evaluated) and it runs fast enough for me.
Is there an efficient way to output double -> binary ?
Not as far as I know. I had the same problem, and ended up building a array of float representations: farray :: Array Int ByteString farray = listArray (0,9999) [B.pack (showFFloat (Just 2) i "") | i <- [0,0.01..99.99::Double]] and using a lookup function to show the floats: fi :: Int -> ByteString fi f | f <= 9999 && f >= 0 = farray!f | otherwise = error ("Can't show a value of "++show f) This works for me, since I have a very limited range of Doubles to deal with (stored as fixed-precision Ints). Still, a fast and general way to output primitive data types would be quite welcome. -k -- If I haven't seen further, it is by standing in the footprints of giants
participants (6)
-
brian -
Don Stewart -
Jason Dagit -
Ketil Malde -
Matthias Görgens -
Thomas ten Cate