
Hi, I'm tackling a Sphere Online Judge tutorial question where it tests how fast you can process input data. You need to achieve at least 2.5MB of input data per second at runtime (on an old machine running ghc 6.6.1). This is probably close to the limit of Haskell's ability. https://www.spoj.pl/problems/INTEST/ I can see that 24 haskell programmers have solved it, but most are very close to the 8 secs limit (and 6/24 are even over the limit!). Here's my code. It fails with a "time limit exceeded" error. (I think it would calculate the correct result, eventually). module Main where import qualified Data.List as DLi import qualified System.IO as SIO main :: IO () main = do line1 <- SIO.hGetLine SIO.stdin let k = read $ words line1 !! 1 s <- SIO.hGetContents SIO.stdin print $ count s k count :: String -> Int -> Int count s k = DLi.foldl' foldFunc 0 (map read $ words s) where foldFunc :: Int -> Int -> Int foldFunc a b | mod b k == 0 = a+1 | otherwise = a I tried using Data.ByteString but then found that 'read' needs a String, not a ByteString. I tried using buffered IO, but it did not make any difference. Any suggestions on how to speed it up? Regards, Steve

On Sun, Aug 30, 2009 at 6:14 AM, Steve
Hi, I'm tackling a Sphere Online Judge tutorial question where it tests how fast you can process input data. You need to achieve at least 2.5MB of input data per second at runtime (on an old machine running ghc 6.6.1). This is probably close to the limit of Haskell's ability.
https://www.spoj.pl/problems/INTEST/
I can see that 24 haskell programmers have solved it, but most are very close to the 8 secs limit (and 6/24 are even over the limit!).
Here's my code. It fails with a "time limit exceeded" error. (I think it would calculate the correct result, eventually).
module Main where
import qualified Data.List as DLi import qualified System.IO as SIO
main :: IO () main = do line1 <- SIO.hGetLine SIO.stdin let k = read $ words line1 !! 1 s <- SIO.hGetContents SIO.stdin print $ count s k
count :: String -> Int -> Int count s k = DLi.foldl' foldFunc 0 (map read $ words s) where foldFunc :: Int -> Int -> Int foldFunc a b | mod b k == 0 = a+1 | otherwise = a
I tried using Data.ByteString but then found that 'read' needs a String, not a ByteString. I tried using buffered IO, but it did not make any difference.
Any suggestions on how to speed it up?
Regards, Steve
Did you try readInt? http://hackage.haskell.org/packages/archive/bytestring/0.9.1.4/doc/html/Data... -- gwern

On Sun, 2009-08-30 at 06:30 -0400, Gwern Branwen wrote:
On Sun, Aug 30, 2009 at 6:14 AM, Steve
wrote: Hi, I'm tackling a Sphere Online Judge tutorial question where it tests how fast you can process input data. You need to achieve at least 2.5MB of input data per second at runtime (on an old machine running ghc 6.6.1). This is probably close to the limit of Haskell's ability.
https://www.spoj.pl/problems/INTEST/
I can see that 24 haskell programmers have solved it, but most are very close to the 8 secs limit (and 6/24 are even over the limit!).
Here's my code. It fails with a "time limit exceeded" error. (I think it would calculate the correct result, eventually).
module Main where
import qualified Data.List as DLi import qualified System.IO as SIO
main :: IO () main = do line1 <- SIO.hGetLine SIO.stdin let k = read $ words line1 !! 1 s <- SIO.hGetContents SIO.stdin print $ count s k
count :: String -> Int -> Int count s k = DLi.foldl' foldFunc 0 (map read $ words s) where foldFunc :: Int -> Int -> Int foldFunc a b | mod b k == 0 = a+1 | otherwise = a
I tried using Data.ByteString but then found that 'read' needs a String, not a ByteString. I tried using buffered IO, but it did not make any difference.
Any suggestions on how to speed it up?
Regards, Steve
Did you try readInt? http://hackage.haskell.org/packages/archive/bytestring/0.9.1.4/doc/html/Data...
Thanks. I didn't see readInt. It allows me to use ByteString and produce results about 10 times faster than System.IO hGetContents. It makes me wonder why the System.IO functions have not been replaced by Data.ByteString. My program runs in 8.56 seconds (its over the 8 secs limit but it was accepted). I compared the top 10 C/C++ results against the top 10 Haskell results: C/C++ ~0.4 secs Haskell ~5.0 secs So it looks like Haskell is ~13 slower for IO than C/C++, even (I assume) when using Data.ByteString or other speed-up tricks. Steve

Hello Steve, Sunday, August 30, 2009, 3:54:53 PM, you wrote:
So it looks like Haskell is ~13 slower for IO than C/C++, even (I assume) when using Data.ByteString or other speed-up tricks.
it means that *your* program is 13x slower than C one and nothing more. in particular, your program may be constrained by readInt speed -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Well, Steve wrote:
I compared the top 10 C/C++ results against the top 10 Haskell results:
So to me it seems he's not talking about his code.
Anyway, I thought Haskell's ByteString IO should not be that much slower
anyway.
Not sure how lazy ByteString IO is implemented, but if it performs async
(aka overlapped) IO, it could be very very fast (faster than C), since the
reading of the next buffer from (or writing of the previous buffer to) the
file is then completely parallel with the computation (when done inplace you
even don't need a memcpy, although these days the overhead of copying 64KB
of memory might be very tiny, it used to be different in the old days :-) At
least that's how I did it in the past in C++ with templates, which was
faster than the C approach.
On Sun, Aug 30, 2009 at 2:15 PM, Bulat Ziganshin
Hello Steve,
Sunday, August 30, 2009, 3:54:53 PM, you wrote:
So it looks like Haskell is ~13 slower for IO than C/C++, even (I assume) when using Data.ByteString or other speed-up tricks.
it means that *your* program is 13x slower than C one and nothing more. in particular, your program may be constrained by readInt speed
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hello Peter, Sunday, August 30, 2009, 4:36:55 PM, you wrote:
I compared the top 10 C/C++ results against the top 10 Haskell results: So to me it seems he's not talking about his code.
well, he talks about 20 programs
Anyway, I thought Haskell's ByteString IO should not be that much slower anyway.
what you mean by ByteString IO speed? speed of reading 100 gb file? speed of reading 1 byte? speed of readInt? speed of those 10 programs? these all are different things, and talking about ByteString IO speed is the same as talking of speed of red cars -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Sun, 2009-08-30 at 16:15 +0400, Bulat Ziganshin wrote:
Hello Steve,
Sunday, August 30, 2009, 3:54:53 PM, you wrote:
So it looks like Haskell is ~13 slower for IO than C/C++, even (I assume) when using Data.ByteString or other speed-up tricks.
it means that *your* program is 13x slower than C one and nothing more. in particular, your program may be constrained by readInt speed
No, not at all. I did not count my program when comparing C/C++ to Haskell. I was counting the the *top 10* programs (submitted by everybody) in the "Best Solutions" list. So its a general survey of the best haskell solutions against the best C/C++ solutions. Steve

Here's my version that works in 0.7s for me for a file with 10^7
"999999999"'s but for some reason gets a 'wrong answer' at SPOJ :)
{-# LANGUAGE BangPatterns #-}
module Main where
import qualified Data.ByteString.Lazy as B
import Data.Word
answer :: Int -> B.ByteString -> Int
answer k = fst . B.foldl' f (0, 0)
where f :: (Int,Int) -> Word8 -> (Int,Int)
f (!countSoFar, !x) 10
| x`mod`k==0 = (countSoFar+1, 0)
| otherwise = (countSoFar, 0)
f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
readInt :: B.ByteString -> Int
readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
main = do
(line, rest) <- B.break (==10) `fmap` B.getContents
let [n, k] = map readInt . B.split 32 $ line
putStrLn . show $ answer k rest - 1
2009/8/30 Steve
On Sun, 2009-08-30 at 06:30 -0400, Gwern Branwen wrote:
On Sun, Aug 30, 2009 at 6:14 AM, Steve
wrote: Hi, I'm tackling a Sphere Online Judge tutorial question where it tests how fast you can process input data. You need to achieve at least 2.5MB of input data per second at runtime (on an old machine running ghc 6.6.1). This is probably close to the limit of Haskell's ability.
https://www.spoj.pl/problems/INTEST/
I can see that 24 haskell programmers have solved it, but most are very close to the 8 secs limit (and 6/24 are even over the limit!).
Here's my code. It fails with a "time limit exceeded" error. (I think it would calculate the correct result, eventually).
module Main where
import qualified Data.List as DLi import qualified System.IO as SIO
main :: IO () main = do line1 <- SIO.hGetLine SIO.stdin let k = read $ words line1 !! 1 s <- SIO.hGetContents SIO.stdin print $ count s k
count :: String -> Int -> Int count s k = DLi.foldl' foldFunc 0 (map read $ words s) where foldFunc :: Int -> Int -> Int foldFunc a b | mod b k == 0 = a+1 | otherwise = a
I tried using Data.ByteString but then found that 'read' needs a String, not a ByteString. I tried using buffered IO, but it did not make any difference.
Any suggestions on how to speed it up?
Regards, Steve
Did you try readInt? http://hackage.haskell.org/packages/archive/bytestring/0.9.1.4/doc/html/Data...
Thanks. I didn't see readInt. It allows me to use ByteString and produce results about 10 times faster than System.IO hGetContents. It makes me wonder why the System.IO functions have not been replaced by Data.ByteString.
My program runs in 8.56 seconds (its over the 8 secs limit but it was accepted).
I compared the top 10 C/C++ results against the top 10 Haskell results: C/C++ ~0.4 secs Haskell ~5.0 secs So it looks like Haskell is ~13 slower for IO than C/C++, even (I assume) when using Data.ByteString or other speed-up tricks.
Steve
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Eugene Kirpichov Web IR developer, market.yandex.ru

On Sun, 2009-08-30 at 16:34 +0400, Eugene Kirpichov wrote:
Here's my version that works in 0.7s for me for a file with 10^7 "999999999"'s but for some reason gets a 'wrong answer' at SPOJ :)
{-# LANGUAGE BangPatterns #-} module Main where
import qualified Data.ByteString.Lazy as B import Data.Word
answer :: Int -> B.ByteString -> Int answer k = fst . B.foldl' f (0, 0) where f :: (Int,Int) -> Word8 -> (Int,Int) f (!countSoFar, !x) 10 | x`mod`k==0 = (countSoFar+1, 0) | otherwise = (countSoFar, 0) f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
readInt :: B.ByteString -> Int readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
main = do (line, rest) <- B.break (==10) `fmap` B.getContents let [n, k] = map readInt . B.split 32 $ line putStrLn . show $ answer k rest - 1
Eugene, I ran your code on one of my test files and it gave the same answer as my code. So I submitted it and it was accepted. Its fast - twice as fast as my solution, using much less memory. Overall its the 4th fastest Haskell solution. (but its still 10 * slower than C/C++) I'll have to read up on BangPatterns to try to understand what its doing! I submitted it as: {-# LANGUAGE BangPatterns #-} module Main where import qualified Data.ByteString.Lazy as B import qualified Data.Word as DW answer :: Int -> B.ByteString -> Int answer k = fst . B.foldl' f (0, 0) where f :: (Int,Int) -> DW.Word8 -> (Int,Int) f (!countSoFar, !x) 10 | x`mod`k==0 = (countSoFar+1, 0) | otherwise = (countSoFar, 0) f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48) readInt :: B.ByteString -> Int readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0 main :: IO () main = do (line, rest) <- B.break (==10) `fmap` B.getContents let [_, k] = map readInt . B.split 32 $ line putStrLn . show $ answer k rest - 1

Thanks :) I wonder why SPOJ didn't accept the same thing from me.
I think that in order to obtain even higher performance we need to
resort to low-level IO: raw reading into a byte buffer and parsing the
very buffer to avoid memcpy'ing.
Or, better, to use Oleg's iteratees with a file handle enumerator.
I'll probably give it a try when I have time, but there's a 70% chance
that I won't, so someone please try it, it should work :)
2009/8/30 Steve
On Sun, 2009-08-30 at 16:34 +0400, Eugene Kirpichov wrote:
Here's my version that works in 0.7s for me for a file with 10^7 "999999999"'s but for some reason gets a 'wrong answer' at SPOJ :)
{-# LANGUAGE BangPatterns #-} module Main where
import qualified Data.ByteString.Lazy as B import Data.Word
answer :: Int -> B.ByteString -> Int answer k = fst . B.foldl' f (0, 0) where f :: (Int,Int) -> Word8 -> (Int,Int) f (!countSoFar, !x) 10 | x`mod`k==0 = (countSoFar+1, 0) | otherwise = (countSoFar, 0) f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
readInt :: B.ByteString -> Int readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
main = do (line, rest) <- B.break (==10) `fmap` B.getContents let [n, k] = map readInt . B.split 32 $ line putStrLn . show $ answer k rest - 1
Eugene, I ran your code on one of my test files and it gave the same answer as my code. So I submitted it and it was accepted. Its fast - twice as fast as my solution, using much less memory. Overall its the 4th fastest Haskell solution. (but its still 10 * slower than C/C++) I'll have to read up on BangPatterns to try to understand what its doing!
I submitted it as:
{-# LANGUAGE BangPatterns #-} module Main where
import qualified Data.ByteString.Lazy as B import qualified Data.Word as DW
answer :: Int -> B.ByteString -> Int answer k = fst . B.foldl' f (0, 0) where f :: (Int,Int) -> DW.Word8 -> (Int,Int) f (!countSoFar, !x) 10 | x`mod`k==0 = (countSoFar+1, 0) | otherwise = (countSoFar, 0) f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
readInt :: B.ByteString -> Int readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
main :: IO () main = do (line, rest) <- B.break (==10) `fmap` B.getContents let [_, k] = map readInt . B.split 32 $ line putStrLn . show $ answer k rest - 1
-- Eugene Kirpichov Web IR developer, market.yandex.ru

On Sun, 2009-08-30 at 18:50 +0400, Eugene Kirpichov wrote:
Thanks :) I wonder why SPOJ didn't accept the same thing from me.
I think that in order to obtain even higher performance we need to resort to low-level IO: raw reading into a byte buffer and parsing the very buffer to avoid memcpy'ing. Or, better, to use Oleg's iteratees with a file handle enumerator. I'll probably give it a try when I have time, but there's a 70% chance that I won't, so someone please try it, it should work :)
I just discovered that the SPOJ question regarding the problem http://www.spoj.pl/problems/INTEST/ had already been asked about 2 years ago. http://groups.google.com/group/fa.haskell/browse_thread/thread/4133fa71ce97e... Donald Stewart gave a solution - long, complex and highly optimised using knowledge of Data.ByteString internals. But fast - 2 or 3 times as fast as your method. Steve

I've updated Don Stewart's solution to compile with the modern ByteString libs. I'll be looking at ways to improve the performance of the `bytestring-nums` package. -- Jason Dusek http://github.com/jsnx/bytestring-nums/blob/d7de9db83e44ade9958fb3bfad0b29ed...

Hm, on my machine Don's code has exactly the same performance my code above.
Also, replacing the 'test' and 'parse' functions with this one
add :: Int -> Int -> S.ByteString -> Int
add k i s = fst $ S.foldl' f (i, 0) s
where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0)
| otherwise = (i, 0)
f (!i, !n) w = (i, 10*n+ord w-ord '0')
increases performance by another 15% (0.675s vs 0.790s)
2009/9/1 Jason Dusek
I've updated Don Stewart's solution to compile with the modern ByteString libs. I'll be looking at ways to improve the performance of the `bytestring-nums` package.
-- Jason Dusek
http://github.com/jsnx/bytestring-nums/blob/d7de9db83e44ade9958fb3bfad0b29ed... _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Eugene Kirpichov Web IR developer, market.yandex.ru

Good work guys. If you can abstract out a common function for lexing ints out of bytestrings, we could add it to the bytestring-lexing package. ekirpichov:
Hm, on my machine Don's code has exactly the same performance my code above.
Also, replacing the 'test' and 'parse' functions with this one
add :: Int -> Int -> S.ByteString -> Int add k i s = fst $ S.foldl' f (i, 0) s where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0) | otherwise = (i, 0) f (!i, !n) w = (i, 10*n+ord w-ord '0')
increases performance by another 15% (0.675s vs 0.790s)
2009/9/1 Jason Dusek
: I've updated Don Stewart's solution to compile with the modern ByteString libs. I'll be looking at ways to improve the performance of the `bytestring-nums` package.
-- Jason Dusek
http://github.com/jsnx/bytestring-nums/blob/d7de9db83e44ade9958fb3bfad0b29ed... _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Eugene Kirpichov Web IR developer, market.yandex.ru _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

2009/08/31 Don Stewart
If you can abstract out a common function for lexing ints out of bytestrings, we could add it to the bytestring-lexing package.
All the really performant implementations operate on strings with multiple ints in them; I suspect this reduces memory traffic -- and indeed, Eugene's code using my libs allocates about twice as much memory as Don's code. I've tried a few different things with strictness annotations to no avail. I'm having some trouble understanding the meaning of "entries" in the profiler's output. I have a file with 5 million random integers in it, totalling 26210408 bytes (21210408 bytes of which are not newlines). The relevant part is here: COST CENTRE MODULE entries MAIN MAIN 0 main Main 1 bint Main 5000001 lazy_int Data.ByteString.Nums.Careless.Int 41211385 digitize Data.ByteString.Nums.Careless.Int 21210408 The number of "entries" to `lazy_int` is puzzling. Eugene's `bint` is called for each line of the file -- once for the header and then 5 million times for each of the integers. (There are two numbers on the first line but Eugene's program only uses `k` so `bint` is only actually entered once.) However, `bint` just calls my `int` and `int` calls `lazy_int` so why are there 41 million plus "entries" of `lazy_int`? -- Jason Dusek

On Tue, 2009-09-01 at 08:45 +0400, Eugene Kirpichov wrote:
Hm, on my machine Don's code has exactly the same performance my code above. That's strange.
Also, replacing the 'test' and 'parse' functions with this one
add :: Int -> Int -> S.ByteString -> Int add k i s = fst $ S.foldl' f (i, 0) s where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0) | otherwise = (i, 0) f (!i, !n) w = (i, 10*n+ord w-ord '0')
increases performance by another 15% (0.675s vs 0.790s)
On my system I get a 50% slowdown using this add function! I guess is just shows that benchmarking code on one single CPU/memory/OS/ghc combination does not give results that apply widely. I'm using: AMD Athlon X2 4800 2GB memory Linux (Fedora 11, 64-bit version) ghc 6.10.3 Steve

2009/9/2 Steve
On Tue, 2009-09-01 at 08:45 +0400, Eugene Kirpichov wrote:
Hm, on my machine Don's code has exactly the same performance my code above. That's strange.
Also, replacing the 'test' and 'parse' functions with this one
add :: Int -> Int -> S.ByteString -> Int add k i s = fst $ S.foldl' f (i, 0) s where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0) | otherwise = (i, 0) f (!i, !n) w = (i, 10*n+ord w-ord '0')
increases performance by another 15% (0.675s vs 0.790s)
On my system I get a 50% slowdown using this add function!
I guess is just shows that benchmarking code on one single CPU/memory/OS/ghc combination does not give results that apply widely. I'm using: AMD Athlon X2 4800 2GB memory Linux (Fedora 11, 64-bit version) ghc 6.10.3
I've got a Centrino Duo 2000 (I'm on a notebook), Ubuntu 9.04 and ghc 6.10.2. However, we have not set up on what exact input file we're using :) I'm using one where it is written "10000000 3" and then 10000000 lines of "999999999" follow. Also, I wonder what one'd get if one compiled this program with jhc, but I don't know whether jhc is able to compile Data.ByteString.
Steve
-- Eugene Kirpichov Web IR developer, market.yandex.ru

On 02/09/2009, at 2:26 PM, Eugene Kirpichov wrote:
I've got a Centrino Duo 2000 (I'm on a notebook), Ubuntu 9.04 and ghc 6.10.2.
However, we have not set up on what exact input file we're using :) I'm using one where it is written "10000000 3" and then 10000000 lines of "999999999" follow.
Also, I wonder what one'd get if one compiled this program with jhc, but I don't know whether jhc is able to compile Data.ByteString.
It couldn't last time I tried - choked on some INLINE pragmas. Might not be a massive job, but there aren't enough hours in the day... mark

On Wed, 2009-09-02 at 11:55 +0800, Steve wrote:
On Tue, 2009-09-01 at 08:45 +0400, Eugene Kirpichov wrote:
Hm, on my machine Don's code has exactly the same performance my code above. That's strange.
Also, replacing the 'test' and 'parse' functions with this one
add :: Int -> Int -> S.ByteString -> Int add k i s = fst $ S.foldl' f (i, 0) s where f (!i, !n) '\n' | n`divisibleBy`k = (i+1, 0) | otherwise = (i, 0) f (!i, !n) w = (i, 10*n+ord w-ord '0')
increases performance by another 15% (0.675s vs 0.790s)
On my system I get a 50% slowdown using this add function!
I guess is just shows that benchmarking code on one single CPU/memory/OS/ghc combination does not give results that apply widely. I'm using: AMD Athlon X2 4800 2GB memory Linux (Fedora 11, 64-bit version) ghc 6.10.3
I should have also said that the test method and test data is important too. This is what I have been using: $ time ./0450 < 0450.input.data and looking at the 'real' value. The file 0450.input.data is generated with a Python script: #!/usr/bin/env python ''' generate a data file for problem 0450 ''' from __future__ import division # new in 2.2, redundant in 3.0 from __future__ import absolute_import # new in 2.5, redundant in 2.7/3.0 from __future__ import print_function # new in 2.6, redundant in 3.0 import io import random inFile = '0450.input.data' #n, k, tiMax = 10**6, 3, 10**9 n, k, tiMax = 10**7, 3, 10**9 with io.open(inFile, 'wb') as f: f.write('%d %d\n' % (n, k)) for i in xrange(n): ti = random.randint(1, tiMax) f.write('%d\n' % (ti,)) Steve

Bang patterns are easy: they make a part of the pattern strict by seq'ing it.
f (x, !y) = ...
~
f (x, y) = y `seq` ...
2009/8/30 Steve
On Sun, 2009-08-30 at 16:34 +0400, Eugene Kirpichov wrote:
Here's my version that works in 0.7s for me for a file with 10^7 "999999999"'s but for some reason gets a 'wrong answer' at SPOJ :)
{-# LANGUAGE BangPatterns #-} module Main where
import qualified Data.ByteString.Lazy as B import Data.Word
answer :: Int -> B.ByteString -> Int answer k = fst . B.foldl' f (0, 0) where f :: (Int,Int) -> Word8 -> (Int,Int) f (!countSoFar, !x) 10 | x`mod`k==0 = (countSoFar+1, 0) | otherwise = (countSoFar, 0) f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
readInt :: B.ByteString -> Int readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
main = do (line, rest) <- B.break (==10) `fmap` B.getContents let [n, k] = map readInt . B.split 32 $ line putStrLn . show $ answer k rest - 1
Eugene, I ran your code on one of my test files and it gave the same answer as my code. So I submitted it and it was accepted. Its fast - twice as fast as my solution, using much less memory. Overall its the 4th fastest Haskell solution. (but its still 10 * slower than C/C++) I'll have to read up on BangPatterns to try to understand what its doing!
I submitted it as:
{-# LANGUAGE BangPatterns #-} module Main where
import qualified Data.ByteString.Lazy as B import qualified Data.Word as DW
answer :: Int -> B.ByteString -> Int answer k = fst . B.foldl' f (0, 0) where f :: (Int,Int) -> DW.Word8 -> (Int,Int) f (!countSoFar, !x) 10 | x`mod`k==0 = (countSoFar+1, 0) | otherwise = (countSoFar, 0) f (!countSoFar, !x) c = (countSoFar, 10*x+(fromIntegral c)-48)
readInt :: B.ByteString -> Int readInt = B.foldl' (\x c -> 10*x+fromIntegral c-48) 0
main :: IO () main = do (line, rest) <- B.break (==10) `fmap` B.getContents let [_, k] = map readInt . B.split 32 $ line putStrLn . show $ answer k rest - 1
-- Eugene Kirpichov Web IR developer, market.yandex.ru

module Main where
import qualified Data.ByteString.Lazy as B
import Data.ByteString.Nums.Careless -- from bytestring-nums package
bint :: B.ByteString -> Int
bint = int
main = do
line : rest <- B.split 10 `fmap` B.getContents
let [n, k] = map int . B.split 32 $ line
putStrLn . show . length . tail . filter ((==0).(`mod`k).bint) $ rest
This does a 100MB file in 2.7s (probably because the file is cached by
the filesystem).
2009/8/30 Steve
Hi, I'm tackling a Sphere Online Judge tutorial question where it tests how fast you can process input data. You need to achieve at least 2.5MB of input data per second at runtime (on an old machine running ghc 6.6.1). This is probably close to the limit of Haskell's ability.
https://www.spoj.pl/problems/INTEST/
I can see that 24 haskell programmers have solved it, but most are very close to the 8 secs limit (and 6/24 are even over the limit!).
Here's my code. It fails with a "time limit exceeded" error. (I think it would calculate the correct result, eventually).
module Main where
import qualified Data.List as DLi import qualified System.IO as SIO
main :: IO () main = do line1 <- SIO.hGetLine SIO.stdin let k = read $ words line1 !! 1 s <- SIO.hGetContents SIO.stdin print $ count s k
count :: String -> Int -> Int count s k = DLi.foldl' foldFunc 0 (map read $ words s) where foldFunc :: Int -> Int -> Int foldFunc a b | mod b k == 0 = a+1 | otherwise = a
I tried using Data.ByteString but then found that 'read' needs a String, not a ByteString. I tried using buffered IO, but it did not make any difference.
Any suggestions on how to speed it up?
Regards, Steve
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Eugene Kirpichov Web IR developer, market.yandex.ru

On Sun, 2009-08-30 at 14:40 +0400, Eugene Kirpichov wrote: Thanks, that works nicely too. However, I believe its not a standard package, so I don't think it can be used for Sphere Online problems. I timed a test run on a 10MB file and its a little slower than my solution with the ByteString readInt improvement. Steve
module Main where
import qualified Data.ByteString.Lazy as B import Data.ByteString.Nums.Careless -- from bytestring-nums package
bint :: B.ByteString -> Int bint = int
main = do line : rest <- B.split 10 `fmap` B.getContents let [n, k] = map int . B.split 32 $ line putStrLn . show . length . tail . filter ((==0).(`mod`k).bint) $ rest
This does a 100MB file in 2.7s (probably because the file is cached by the filesystem).
2009/8/30 Steve
: Hi, I'm tackling a Sphere Online Judge tutorial question where it tests how fast you can process input data. You need to achieve at least 2.5MB of input data per second at runtime (on an old machine running ghc 6.6.1). This is probably close to the limit of Haskell's ability.
https://www.spoj.pl/problems/INTEST/
I can see that 24 haskell programmers have solved it, but most are very close to the 8 secs limit (and 6/24 are even over the limit!).
Here's my code. It fails with a "time limit exceeded" error. (I think it would calculate the correct result, eventually).
module Main where
import qualified Data.List as DLi import qualified System.IO as SIO
main :: IO () main = do line1 <- SIO.hGetLine SIO.stdin let k = read $ words line1 !! 1 s <- SIO.hGetContents SIO.stdin print $ count s k
count :: String -> Int -> Int count s k = DLi.foldl' foldFunc 0 (map read $ words s) where foldFunc :: Int -> Int -> Int foldFunc a b | mod b k == 0 = a+1 | otherwise = a
I tried using Data.ByteString but then found that 'read' needs a String, not a ByteString. I tried using buffered IO, but it did not make any difference.
Any suggestions on how to speed it up?
Regards, Steve
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

I've uploaded a new version of bytestring-nums that, while still slower than the fast/custom codes, allows Eugene's earlier program to a little more than 20% faster than it did before. It no longer handles spurious characters in the input by skipping over them (this is probably not a common requirement, anyways). http://hackage.haskell.org/package/bytestring-nums-0.3.0 I suspect that splitting the string into pieces and then mapping the parser over the pieces will never be faster than an all-in-one parser/tester/incrementer like the fast programs have. -- Jason Dusek
participants (8)
-
Bulat Ziganshin
-
Don Stewart
-
Eugene Kirpichov
-
Gwern Branwen
-
Jason Dusek
-
Mark Wotton
-
Peter Verswyvelen
-
Steve