Fast Integer Input

I need a function which has fast conversion from Bytestring to Integer. I am currently using this: import qualified Data.ByteString.Lazy.Char8 as BS readInteger :: BS.ByteString -> Integer readInteger x = case BS.readInteger x of Just (i,_) -> i Are there faster implementations? This function takes 1.8 seconds to convert 2000 integers of length 10^13000. I need it to be smaller that 0.5 sec. Is it possible?

2010/8/23 <200901104@daiict.ac.in>:
This function takes 1.8 seconds to convert 2000 integers of length 10^13000. I need it to be smaller that 0.5 sec. Is it possible?
2000 integers of magnitude 10^13000 equals to about 26 MBytes of data (2000 numbers each 13000 digits long). Rounding 1.8 seconds to two seconds we conclude that you proceed with speed about 13MBytes per second. Assuming you have CPU clock frequency near 2.6GHz, you performing about 200 clock cycles per input digit. 10^13000 roughly equal to 2^39000. Or (2^32)^1219 - 1219 32-bit words of representation. So you're doing some last nextN = (n*10)+currentDigit conversion operations in less that one clock cycle per word. Either I err'd in my math, or you're performing better than most of us could expect. Maybe you are off in your 10^13000 by an order of magnitude.

Serguey Zefirov wrote:
2010/8/23 <200901104@daiict.ac.in>:
This function takes 1.8 seconds to convert 2000 integers of length 10^13000. I need it to be smaller that 0.5 sec. Is it possible?
2000 integers of magnitude 10^13000 equals to about 26 MBytes of data (2000 numbers each 13000 digits long). Rounding 1.8 seconds to two seconds we conclude that you proceed with speed about 13MBytes per second. Assuming you have CPU clock frequency near 2.6GHz, you performing about 200 clock cycles per input digit.
10^13000 roughly equal to 2^39000. Or (2^32)^1219 - 1219 32-bit words of representation. So you're doing some last nextN = (n*10)+currentDigit conversion operations in less that one clock cycle per word.
You can do better than calculating (n*10 + d) repeatedly, using a divide and conquer scheme, which is in fact implemented in ByteString's readInteger: ((a*10 + b) * 100 + (c*10 + d)) * 10000 + ... This helps because now we're multiplying numbers of roughly equal size, which using FFT and related methods, as gmp uses, can be sped up immensely, beating the quadratic complexity that you get with the naive approach. The timings seem about right. HTH, Bertram

2010/8/23 Bertram Felgenhauer
Serguey Zefirov wrote: The timings seem about right.
Thank you for letting me know about divide-and-conquer variant. But I am still amuzed that producing 1200 words of data from 13Kbytes of text took those little 200 cycles of CPU. This is quite interesting and I think I should investigate that later.
participants (3)
-
200901104@daiict.ac.in
-
Bertram Felgenhauer
-
Serguey Zefirov