
Using insertWith' gets the time down to 30-40 secs (thus only 3-4 times slower than PHP). PHP is still at 13 secs, does not require installing libraries or compiling, and is trivial to write. A trivial C++ application takes 11-12 secs and, even with some googling, was trivial to write.

Excerpts from Felipe Almeida Lessa's message of Mon Jan 30 17:36:46 +0100 2012:
> Then please take a deeper look into my code. What you said that you've
> tried is something else.

I didn't say that I tried your code. I gave the enumerator package a try, counting lines, which I expected to behave similarly to conduits because both serve a similar purpose. Then I hit the "sourceFile returns chunked lines" issue (reported it, got fixed) - ....
Anyway: My log files are a JSON dictionary on each line:

  { id : "foo", ... }
  { id : "bar", ... }

Now how do I use the conduit package to split a "chunked" file into lines? Or should I create a new parser "many json >> newline"?

Except that I think my processJson for this test should look like this, because I want to count how often the clients queried the server. Probably I should also be using CL.fold as shown in the test cases of conduit.

If you tell me how you'd cope with the "one json dict on each line" issue I'll try to benchmark this solution as well.

  -- probably existing library functions can be used here ..
  processJson :: (M.Map T.Text Int) -> Value -> (M.Map T.Text Int)
  processJson m value = case value of
    Ae.Object hash_map ->
      case HMS.lookup (T.pack "id") hash_map of
        Just id_o -> case id_o of
          Ae.String id -> M.insertWith' (+) id 1 m
          _ -> m
        _ -> m
    _ -> m

Marc Weber
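
For reference, the "one record per line, but the source hands out arbitrary chunks" problem can be sketched without any streaming library at all. This is only an illustration under stated assumptions, not how conduit or enumerator solve it: linesFromChunks, countKeys and the extract argument are hypothetical names made up here, chunks are modelled as a plain list of strict ByteStrings, and extract stands in for whatever decodes a line and pulls out the "id" field. The counting step uses insertWith from Data.Map.Strict, which forces the new count in the same way insertWith' does.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- | Re-split arbitrary chunks into complete lines. A line that straddles a
-- chunk boundary is kept as leftover and glued onto the next chunk.
linesFromChunks :: [B.ByteString] -> [B.ByteString]
linesFromChunks = go B.empty
  where
    go leftover []
      | B.null leftover = []
      | otherwise       = [leftover]   -- last line had no trailing newline
    go leftover (c:cs) =
      case B.split '\n' (B.append leftover c) of
        []    -> go B.empty cs          -- nothing buffered and an empty chunk
        parts -> init parts ++ go (last parts) cs
        -- everything before the final '\n' is complete; the last piece
        -- (possibly empty) is the start of the next line

-- | Count how often each key occurs. 'extract' is a hypothetical hook:
-- it returns the key for a line, or Nothing if the line should be skipped.
countKeys :: Ord k => (B.ByteString -> Maybe k) -> [B.ByteString] -> M.Map k Int
countKeys extract = foldl' step M.empty
  where
    step m line = case extract line of
      Just k  -> M.insertWith (+) k 1 m   -- strict in the stored count
      Nothing -> m
```

With a real decoder, extract would parse the line as JSON and return the "id" string, so the fold plays the role of processJson above. Appending the leftover to each chunk is the simplest thing that works; it copies data and would be worth revisiting if lines are much longer than chunks.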