I have been surprised at how rarely switching to Text or ByteString makes things significantly faster. If you do switch, you should look at Data.ByteString.Builder or Data.Text.Lazy.Builder.

On Fri, May 19, 2017 at 4:17 AM, Станислав Черничкин <schernichkin@gmail.com> wrote:

Try to use Text or ByteString instead of String. Try to use the compile and execute functions (http://hackage.haskell.org/package/regex-tdfa-1.2.1/docs/Text-Regex-TDFA-ByteString.html), and make sure the regex gets compiled only once.

2017-05-16 12:12 GMT+03:00 Bram Neijt <bneijt@gmail.com>:

Dear reader,
I decided to do a little project which is a simple search and replace
program for large text files.
Written in Haskell, it does a few different regex matches on each line
and stores them in a leveldb key-value store to create a
consistent/reviewable search-replace index. It should provide for some
simple/brute-force anonymization of data and therefore I called it
hanon (sorry, could not think of a better name).
https://github.com/BigDataRepublic/hanon
The code works, but I've done some benchmarking to compare it with
Python and the code is about 80x slower than doing the same thing in
Python, making it useless for larger data files.
I'm obviously doing something wrong.
Could you give me tips on improving the performance of this code?
Probably mainly looking at
https://github.com/BigDataRepublic/hanon/blob/master/src/Mapper.hs
where the regex code lives?
Greetings,
Bram
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Sincerely, Stanislav Chernichkin.
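
For readers following the thread, here is a minimal sketch of the advice to compile the regex once and match on ByteString, using compile and regexec from Text.Regex.TDFA.ByteString. It is not taken from hanon: the pattern emailPattern, the helper redactLine and the "<redacted>" placeholder are hypothetical names chosen for illustration, and whether this closes the gap with Python would still need to be measured.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as BL
import Text.Regex.TDFA (Regex, defaultCompOpt, defaultExecOpt)
import Text.Regex.TDFA.ByteString (compile, regexec)

-- Hypothetical pattern, for illustration only (not from hanon).
emailPattern :: B.ByteString
emailPattern = "[a-z]+@[a-z]+\\.[a-z]+"

-- Replace the first match on a line with a placeholder.
-- The already compiled Regex is passed in, so no compilation happens here.
redactLine :: Regex -> B.ByteString -> B.ByteString
redactLine re line =
  case regexec re line of
    Right (Just (before, _matched, after, _groups)) ->
      B.concat [before, "<redacted>", after]
    _ -> line  -- no match (or a matcher error): keep the line unchanged

main :: IO ()
main =
  -- Compile exactly once, before any input is processed.
  case compile defaultCompOpt defaultExecOpt emailPattern of
    Left err -> putStrLn ("bad regex: " ++ err)
    Right re ->
      -- Read stdin lazily so a large file is not held in memory at once;
      -- each line is made strict only for the duration of the match.
      BL.interact
        (BL.unlines . map (BL.fromStrict . redactLine re . BL.toStrict) . BL.lines)
```

Built against the regex-tdfa and bytestring packages, this reads from stdin and writes to stdout. The point of the design is that the Regex value is created outside the per-line function, so the cost of compiling the pattern is paid once rather than on every line.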