
On Tuesday, 7 October 2008 04:21, Jason Dagit wrote:
On Mon, Oct 6, 2008 at 7:06 PM, Mike Coleman wrote:
Hi,
I could use a little help. I was looking through the Real World Haskell book and came across a trivial program for summing numbers in a file. They mentioned that that implementation was very slow, as it's based on Strings, so I thought I'd try my hand at converting it to use lazy ByteStrings. I've made some progress, but now I'm a little stuck because that module doesn't seem to have a 'read' method.
There's a readInt method, which I guess I could use, but it returns a Maybe, and I don't see how I can easily strip that off.
So:
1. Is there an easy way to strip off the Maybe that would allow an equivalently concise definition for sumFile? I can probably figure out how to do it with pattern matching and a separate function--I'm just wondering if there's a more concise way.
I'm not a ByteString expert, but there should be an easy way to solve this issue of Maybe.
If you go to Hoogle (http://www.haskell.org/hoogle/) and type this: [Maybe a] -> [a] it says: Data.Maybe.catMaybes :: [Maybe a] -> [a] <http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Maybe.html#v:catMaybes>
as the top search result.
This means that you can convert any list of maybes into a list of what you want. It just tosses out the Nothings.
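To make the catMaybes suggestion concrete, here is a minimal sketch, assuming the input has one integer per line and that lines which don't parse should simply be skipped. The name sumInts is made up for illustration; readInt, lines and getContents are from Data.ByteString.Lazy.Char8.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Maybe (catMaybes)

-- Sum the integers found at the start of each line.
-- readInt yields Maybe (Int, ByteString); catMaybes drops the Nothings
-- (lines that do not start with an integer), and fst discards whatever
-- is left on the line after the number.
sumInts :: L.ByteString -> Int
sumInts = sum . map fst . catMaybes . map L.readInt . L.lines

main :: IO ()
main = L.getContents >>= print . sumInts
```

Note that this silently ignores bad lines rather than reporting them, which may or may not be what you want. Data.Maybe.mapMaybe would fuse the map and catMaybes steps into one.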
Since readInt returns a Maybe (Int,ByteString), Data.List.unfoldr would be a better fit for his needs. The bytestring-lexing package (http://hackage.haskell.org/packages/archive/bytestring-lexing/0.1.2/doc/html...) provides readDouble, which is also pretty fast, I think.
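A rough sketch of the unfoldr idea, assuming the numbers are separated by whitespace. The names readInts and step are made up for illustration. The type of readInt already has the shape unfoldr wants (seed -> Maybe (element, next seed)), so the only extra work is skipping whitespace before each number.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Char (isSpace)
import Data.List (foldl', unfoldr)

-- Repeatedly read an Int off the front of the input.
-- unfoldr stops as soon as step returns Nothing, i.e. at end of input
-- or at the first token that is not an integer.
readInts :: L.ByteString -> [Int]
readInts = unfoldr step
  where
    step = L.readInt . L.dropWhile isSpace

main :: IO ()
main = L.getContents >>= print . foldl' (+) 0 . readInts
```

Unlike the catMaybes approach, this stops at the first thing that isn't a number instead of skipping it, and it doesn't care whether the numbers are separated by newlines or spaces.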
2. Why doesn't ByteString implement 'read'? Is it just that this function (like 'input' in Python) isn't really very useful for real programs?
I think probably for things more complex than parsing ints it's best to make your own parser? I seem to recall that someone was working on a library of parsing functions based on bytestring? Maybe someone else can comment?
At least parsec 3.0.0 has ByteString parsing modules (I've never used it, so I don't know how well they work). IIRC, there's a plan to expand the bytestring-lexing package, too.
3. Why doesn't ByteString implement 'readDouble', etc.? That is, why are Int and Integer treated specially? Do I not need readDouble?
I think readInt was mostly implemented because integer reading was needed a lot for benchmarks and programming challenge sites. People noticed it was way slower than it needed to be, so someone put in the effort to optimize it. Once it was optimized, that must have satisfied the need for fast number reading.
More's underway, readDouble already delivered.
I would agree that at least for Prelude types it would be nice to have efficient bytestring-based parsers. Do we have Read/Show classes specifically for working in bytestrings? Maybe that's what the world needs in the bytestring API. Then again, I'm not really qualified to comment :) For all I know it already exists.
partially.
Jason