
And against gawk 3.1.5:
$ time awk -F: '{sum += 1 / $2} END{print sum}' test.out
3155.63
real 0m0.197s
user 0m0.184s
sys 0m0.004s
compared to Don's Haskell version:
$ time ./fastSum < test.out
3155.626666664377
real 0m0.072s
user 0m0.056s
sys 0m0.004s
compared to the Corey's perl version:
$ time perl Sum.pl
Duration (sec): 3155.62666666438
real 0m0.181s
user 0m0.164s
sys 0m0.012s
Regards,
Brad Larsen
On Wed, 23 Jul 2008 15:01:24 -0400, Don Stewart
coreyoconnor:
I have the need to regularly write tiny programs that analyze output logs. The output logs don't have a consistent formatting so I typically choose Perl for these tasks.
The latest instance of having to write such a program was simple enough I figured I'd try my hand at using Haskell instead. The short story is that I could quickly write a Haskell program that achieved the goal. Yay! However, the performance was ~8x slower than a comparable Perl implementation. With a lot of effort I got the Haskell version to only 2x slower. A lot of the optimization was done with guesses that the performance difference was due to how each line was being read from the file. I couldn't determine much using GHC's profiler.
{-# OPTIONS -fbang-patterns #-}
import qualified Data.ByteString.Char8 as S import Data.ByteString.Lex.Double import Debug.Trace
main = print . go 0 =<< S.getContents where go !n !s = case readDouble str of Nothing -> n Just (k,t) -> let delta = 1.0 / k in go (n+delta) t where (_, xs) = S.break ((==) ':') s str = S.drop 2 xs
It uses the bytestring-lexing package on Hackage to read the Doubles out,
$ ghc --make Fast.hs -O2 -fvia-C -optc-O2 $ time ./Fast < test.out 3155.626666664377 ./Fast < test.out 0.07s user 0.01s system 97% cpu 0.078 total
So that's twice as fast as the perl entry on my box,
$ time perl Sum.pl < test.out Duration (sec): 3155.62666666438 perl Sum.pl < test.out 0.15s user 0.03s system 100% cpu 0.180 total
Note that the safe Double lexer uses a bit of copying, so we can in fact do better still with a non-copying Double parser, but that's only for the hardcore.
-- Don _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe