
Michael Snoyman
In this particular case, it will work due to the implementation of snk. In general, however, you're correct: you should not use the same sink twice.
I haven't thought about it much yet, but my initial recommendation would be to create a new Conduit using SequencedSink, which takes the three lines and then switches over to a passthrough conduit. The result looks like this:
I think I'm getting the conduit stuff, at least on a high level. As a
little exercise I have ported a simplified variant of the 'netlines'
enumerator to the conduit library. This is the code:

import qualified Data.ByteString as B

netLine :: (Resource m) => Int -> Sink B.ByteString m B.ByteString
netLine n0 = sinkState (n0, B.empty) push (return . snd)
    where
    push (n, str') dstr' =
        return $
            case B.elemIndex 10 dstr' of
              Nothing ->
                  let dstr = B.take n dstr'
                      str  = B.append str' dstr
                  in str `seq` StateProcessing (n - B.length dstr, str)
              Just i ->
                  let (pfx, sfx) = B.splitAt i dstr'
                      str        = B.append str' (B.take n pfx)
                  in str `seq` StateDone (Just . B.copy $ B.tail sfx) str

netLines :: (Resource m) => Int -> Conduit B.ByteString m B.ByteString
netLines n = sequenceSink () (\s -> fmap (\ln -> Emit s [ln]) (netLine n))

It reads a 256 MiB file of random data in 1.3 seconds and runs in
constant memory even for infinitely long lines. This is reassuring.
But anyway, is this the proper/idiomatic way to do it, or would you go
for a different direction?
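
The chunk-handling logic inside push can be checked without any conduit
machinery. What follows is only a rough sketch of that idea: the names
Step, More, Done and step are made up for illustration and are not part
of the conduit library, and the B.copy on the leftover is left out. It
models one push as a pure function from the remaining length budget, the
line so far, and a new chunk.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical pure model of one 'push' step (not a conduit type).
data Step
    = More Int B.ByteString           -- remaining budget, line so far
    | Done B.ByteString B.ByteString  -- leftover input, finished line
    deriving (Eq, Show)

-- Feed one chunk, keeping at most n more bytes of the current line.
step :: Int -> B.ByteString -> B.ByteString -> Step
step n acc chunk =
    case B.elemIndex 10 chunk of
      -- No newline in this chunk: keep what fits and shrink the budget.
      Nothing ->
          let kept = B.take n chunk
          in More (n - B.length kept) (B.append acc kept)
      -- Newline found: finish the line, return the rest as leftover.
      Just i ->
          let (pfx, sfx) = B.splitAt i chunk
          in Done (B.drop 1 sfx) (B.append acc (B.take n pfx))

main :: IO ()
main = do
    print (step 5 B.empty (BC.pack "abcdefgh"))
    print (step 0 (BC.pack "abcde") (BC.pack "xy\nrest"))
```

Once the budget reaches zero, further bytes are dropped until the next
newline, which is why the accumulated line stays bounded by the initial
limit no matter how long the input line is.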
Greets,
Ertugrul