My proposal to move liftA2 into the Applicative class should help get fmaps out of internal nodes in traverse implementations. But there are still two trouble spots:

1. The smaller trouble spot is at the root, when dealing with newtype wrappers. For example, the simplest way to write traverse for Data.Sequence.Seq would be

instance Traversable Seq where

traverse f (Seq xs) = Seq <$> traverse f xs

The extra fmap may not be free. Today, the only fix is to manually inline the definition of the underlying traversal and manually fuse the fmaps. It works, but it's gross! We could work around this easily using an extra Traversable method:

mapTraversed :: Applicative f

=> (t b -> r) -> (a -> f b) -> t a -> f r

but that wouldn't help with the larger problem below.

2. Leaves are really painful. Data structures that store at most one element per leaf make obvious traversals expensive. For example, in Data.Map, the obvious implementation would be

instance Traversable (Map k) where

traverse _ Tip = pure Tip

traverse f (Bin s k v l r) =

liftA3 (Bin s k) (f v) (traverse f l) (traverse f r)

The trouble is that we would generate a slew of `pure Tip`s that we'd then have to combine. For instance, (Bin 1 k v Tip Tip) would give liftA3 (Bin 1 k) (f v) (pure Tip) (pure Tip)

Today we work around this by looking ahead at the children of the element we're considering, so we get

fmap (\v' -> Bin 1 k v' Tip Tip) (f v)

instead, which is still pretty lousy but better.

It makes me rather sad to have to write disgustingly ugly Traversable instances just to avoid silly performance issues like this. Does anyone have an idea for fixing (2), and ideally simultaneously taking care of (1)?

Thanks,

David Feuer