
On Monday 27 September 2010 10:54:56, Christian Maeder wrote:
> I wonder if a generic version costs performance?
> lines = breaksBy (== "\n") (or "linesBy" or "splitBy")
> Cheers Christian
Gut feeling said it shouldn't, and benchmarking supports that.

So, is that a generic enough operation to add to the Data.List API? If yes, the best name has to be found. On the one hand, splitBy or breaksBy sounds nicer than linesBy because, generically, it has nothing to do with lines. On the other hand, neither break nor split[At] removes the separators, while lines does. Also, there's linesBy in Data.List.Split (http://hackage.haskell.org/packages/archive/split/0.1.2.1/doc/html/Data-List-Split.html), which does exactly that.

But Data.List.Split.linesBy is faster (for reasonably short lines). However, it dies a horrible death (Stack space overflow: current size 67108864 bytes.) for very long lines and is then much slower if you give it enough stack to complete.

So I would say, put the generic version into Data.List as linesBy. I think that deserves its own proposal.
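For concreteness, here is a minimal sketch of what such a generic linesBy could look like. It is only an illustration built on break, not the version that was benchmarked and not the Data.List.Split implementation. The name lines' is hypothetical and just avoids a clash with the Prelude. Note that the separator test has to compare against the Char '\n' rather than the String "\n", since the elements of a String are Chars.

```haskell
-- Sketch only: an illustrative generic version, not the benchmarked code.
-- Splits a list at every element satisfying the predicate and drops the
-- separators. Like Prelude.lines, a trailing separator does not produce
-- a final empty chunk.
linesBy :: (a -> Bool) -> [a] -> [[a]]
linesBy _ [] = []
linesBy p xs = chunk : rest
  where
    -- break stops at the first separator (if any) without consuming it
    (chunk, remainder) = break p xs
    rest = case remainder of
      []       -> []            -- no separator left, we are done
      (_ : ys) -> linesBy p ys  -- drop the separator and continue

-- Hypothetical name: lines expressed through the generic version.
lines' :: String -> [String]
lines' = linesBy (== '\n')
```

Since the first chunk is returned before the rest of the input is examined, this sketch stays lazy in the same way lines is, and it works for any element type, e.g. linesBy (== 0) on a list of numbers.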