it's easy to see that it's not just about more than one output per input. It's about n pieces of input producing m pieces of output, where (n,m) may even -- and probably does -- depend on previous inputs!
The exercise asks for an implementation of the following Arrow instance:
> first :: arr a b -> arr (a,c) (b,c)
which, specialized to our case, is just SP a b -> SP (a,c) (b,c).
It should now be apparent what the 'trickiness' is. On the one hand, indeterminate a's need to be fed in before indeterminate b's get pulled out. On the other hand, the c's need to behave as if they were in a no-op assembly line. One c goes in, the one (and same!) c drops out.