file splitter with enumerator package

22 Jul 2011

      Hi everyone,

A friend of mine recently asked if I knew of a utility to split a
large file (4gb in his case) into arbitrarily-sized files on Windows.
Although there are a number of file-splitting utilities, the catch was
it couldn't break in the middle of a line. When the standard "why
don't you use Linux?" response proved unhelpful, I took this as an
opportunity to write my first program using the enumerator package.

If anyone has time, I'm really interested in knowing if there's a
better way to take the incoming stream and output it directly to a
file. The basic steps I'm taking are:

1) Data.Enumerator.Binary.take -- grabs the user-specified number of
bytes, then (because it returns a lazy ByteString) I use
Data.ByteString.Lazy.hPut to output the chunk
2) Data.Enumerator.Binary.head -- after using take for the big chunk,
it inspects and outputs individual characters and stops after it
outputs the next newline character
3) I close the handle that steps 1&2 used to output the data and then
repeat 1&2 with the next handle (an infinite lazy list of filepaths
like part1.csv, part2.csv, and so on)

The full code is pasted here: http://hpaste.org/49366, and while I'd
like to get any other feedback on how to make it better, I want to
note that I'm not planning to release this as a utility so I wouldn't
want anyone to spend extra time performing a full code review.

Thanks!
Eric

Eric Rasmussen

Felipe Almeida Lessa

Eric Rasmussen

Yves Parès

Felipe Almeida Lessa

Eric Rasmussen

Yves Parès

David McBride

Yves Parès

David McBride

Eric Rasmussen

David McBride

yi huang

yi huang

tags

participants (5)