I'm not sure it's possible to do that efficiently.
Here's a hypothetical situation: you have a stream with elements tagged as 1 and 2, but none tagged as 0. If somebody applies an operation to the stream of 0 elements (say stdoutLn), we have to process every single element—and perform every single effect—in the input stream before we know that the stream at index 0 is empty. In general, if we apply an operation to an element of one of the output streams, we'd have to process at minimum all the input elements up to and including that particular element. The important thing is that, semantically, the output of your demux operation is not n independent streams, but n views into a single stream.
It's probably possible to implement a version of this function that doesn't process the *entire* input stream up-front—it would just process as much of the input as it needed when you look at any given element in the output—but it probably needs a different type than just Vector to make it work correctly, and I'm not sure how to do that. More importantly, the behavior of this function would still be confusing; it might *look* like you have several distinct streams, but you'd have to do the effects and store the results of every single step in the input stream even if you only used one of your demuxed streams.
Does that explanation make sense? I haven't done much streaming stuff in a while, so I'm struggling a bit with how to express my intuitions about it.