On Sun, Sep 28, 2014 at 6:59 PM, Greg Weber <greg@gregweber.info> wrote:

Should the code be using Seq instead of a List?

On Sun, Sep 28, 2014 at 3:53 PM, David Feuer <david.feuer@gmail.com> wrote:
One good option might be dropWhileR, to match Data.Sequence, but I'd only want to use that if the dropWhileEnd name were deprecated in favor of takeUntilLastSatisfying or whatever—otherwise it's just too confusing.

On Sep 28, 2014 4:39 PM, "David Feuer" <david.feuer@gmail.com> wrote:
BACKGROUND:

A somewhat common idiom I discovered digging around the GHC tree is the use of

    reverse . dropWhile p . reverse

to remove some unwanted things from the end of a list. The lists involved tend to be short (e.g., file names, package names, lines of user input, etc.) and the amount removed from the end of the list tends to be short (a trailing newline, the last part of a filename, extra spaces, etc.). I initially intended to replace all of these with Data.List.dropWhileEnd p. Unfortunately, my benchmarking showed that this had a substantial negative impact on performance. Data.List.dropWhileEnd is defined like this:

    dropWhileEnd p = foldr (\x r -> if p x && null r then [] else x:r) []

This is lazy in the *spine* of the list--it can "flush its buffer" and produce list elements any time p x is found to be false. This is the best you can do if you need to process a very large list and don't want to have to load the whole thing into memory, and/or your predicate is *very* cheap. Unfortunately, it pays a steep price: for non-trivial p, it's strict in the *elements* of the list. This makes it unsuitable, from a performance standpoint, for most of the purposes for which I've seen reverse . dropWhile p . reverse used.

CONCEPT:

Add (by some name) a function that is lazy in the elements while strict in the spine:

dropWhileEndLE :: (a -> Bool) -> [a] -> [a]
-- specification: dropWhileEndLE p = reverse . dropWhile p . reverse
dropWhileEndLE p = foldr (\x r -> if null r && p x then [] else x:r) []

This is a good bit faster than the double-reversing version, presumably because it presumably allocates its buffer stack on the GHC stack rather than spraying junk across the heap.

If I were a Time Lord, I'd go back in time and add to the library, instead of dropWhileEnd as it is, a function with a name like

    takeUntilLastSatisfying p = dropWhileEnd (not . p)

which has a name that explains better what it does, and I'd define dropWhileEnd to be dropWhileEndLE. But I'm not a Time Lord, so I'd like to hear what name or naming approach other people would recommend.

_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries