streaming with logs / metadata

8 Apr 2019

      I have a pattern that's come up several times, and I'm wondering if there's
a name for it, or a better way to do it.  It has to do with processing
streams, which in the presence of laziness and the absence of effects can be
done with just lists.  For instance, imagine a parser that produces [Token],
then various processing stages that go [Token] -> [x], [x] -> [y],
[y] -> [Output].

If I have a transform function that  works on a single element 1:1 and is
stateless, then I can express it with 'map'.  If it needs simple state, then I
can switch 'map' for 'mapAccumL', which is just a specialization of the State
monad.  If it needs to know the next element, I can compose a 'zipNext' (zip xs
(drop 1 xs)) on the front.  If it returns multiple elements, there's concatMap,
and of course mapM for arbitrary monads.

Now suppose it needs warnings, or logging, so to preserve streaming I wind up
with '[Either e a] -> [Either e b]'.  I still want to do all the same stuff,
but factor out the Lefts so I don't have to be explicitly passing through all
the time.  So I start reimplementing the usual list functions:

    module E where
    map :: (a -> b) -> [Either e a] -> [Either e b]
    mapE :: (a -> Either e b) -> [Either e a] -> [Either e b]
    concatMap :: (a -> [b]) -> [Either e a] -> [Either e b]
    concatMapE :: (a -> [Either e b]) -> [Either e a] -> [Either e b]
    mapAccumL :: (state -> a -> (state, Either e b)) -> state -> [Either e a]
        -> (state, [Either e b])

And of course the monadic / applicative variants:

    mapM :: Applicative m => (a -> m b) -> [Either e a] -> [Either e b]
    mapEM :: Monad m => (a -> m (Either e b)) -> [Either e a] -> m [Either e b]

... etc.  But to deal with supplying extra state, or looking into the future
or filtering I have to reinvent those too:

    zip :: [a] -> [Either e b] -> [Either e (a, b)]
    zip{2,3,4,...} :: ...
    zipNexts :: [Either e a] -> [Either e (a, [a])]
    filter :: (a -> Bool) -> [Either e a] -> [Either e a]

The zips become asymmetrical, but they're in service of a common pattern which
I'm not sure how to describe, but is used to get the Lefts out of the way.
E.g. if a duration can be derived from a Right value, to annotate Rights with
start times: 'E.zip (scanl (+) 0 (map durationOf (Either.rights xs))) xs'.
It's a bit unsatisfying because it relies on the transformation preserving 1:1
with its input, or at least new elements are added only at the end and I can
ignore them.  'scanl . map' happens to have this property, but it's not obvious
or guaranteed.

My questions are, is this a known pattern?  Is there an implementation on
hackage, or at least a good thing to call it?

And then, is there a a way to construct these functions systematically, or to
compose them from smaller parts?  The 1:1 stateless ones like 'map' and 'mapM'
work out nicely because they're just 'fmap . fmap' or 'traverse . traverse',
and I can implement 'concatMapM' and then implement a lot of other kinds of
maps as specializations, though efficiency may suffer.  But for the zips and
filters and takeWhiles I wind up just rewriting the recursion.  They feel
ad-hoc and unsatisfying, that if I were more clever I would figure out how to
abstract out the Lefts in one spot, once and for all.

There is a general way to get all of the '[a] -> [b]' functions into
'[Either e a] -> [Either e b]', and that's to partitionsEithers, apply second,
then concat, but of course it spoils streaming and is inefficient if there are
lots of Lefts.  Still, it's the only way I can think of to get those Lefts out
of the picture in a modular way.  Using a Writer I think is equivalent.

There's a similar problem with being able factor out annotations, similar to
the way I want to factor out logs / warnings / other metadata above.  E.g. if
we have '[(annot, a)] -> [(annot, b)]' then we start with a simple fmap.fmap,
but any more complicated transforms means you either have to intersperse the
"ignore annotation" logic and clutter things up with 'snd's and 'second's
everywhere, or come up with a whole new map filter zip toolkit.

Evan Laforge

tags

participants (1)