
I'm reworking external sort to meet the needs of my app and I'm running into some trouble that I was hoping to get some advice on. I'm asking a lot of questions here, so don't feel obligated to answer them all if you only have input on one point.
The code is here: http://tinyurl.com/extsort
1) The sort gives the correct result, but when I tried to sort very large files (~4 GB) I got this message:
tssql: /var/folders/Tl/TlS1rTCyFpWU9s-IsNFkdE+++TI/-Tmp-/sort4106.txt: openBinaryFile: resource exhausted (Too many open files)
I tried to be careful about limiting the number of open file handles, but I must have done something wrong. Do you see where I'm leaking file handles?
2) I'm guessing there's a smarter way to do unwrapMonads?
3) I'd like to create a function like BS.hGetContents which lazily reads the ByteString, but in addition to closing the file after the last byte is read I want it to delete the file that I'm reading from. Like hGetContentsThenDelete. It seems like http://hackage.haskell.org/package/lazyio could help, but it isn't clear to me how I would do this. I can't do
    LazyIO.run $ do
        str <- BS.hGetContents file
        removeFile file
        return str
can I? I assume that would close the file too soon.
4) Is there any other wacky stuff in my code that I should change?
I told you it was a lot :-)
Thanks!
-Keith
-- keithsheppard.name

On Sat, Jul 11, 2009 at 08:40:10PM -0400, Keith Sheppard wrote:
2) I'm guessing there's a smarter way to do unwrapMonads?
unwrapMonads is actually sequence :), see [1].
[1] http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/Control...
-- Felipe.
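To illustrate the point about sequence: the original unwrapMonads lives behind the tinyurl link and is not shown in this thread, so the definition below is only a guess at its shape (a hand-rolled function that turns a list of monadic actions into one action returning a list). Under that assumption, it behaves the same as the Prelude's sequence.

```haskell
-- Hypothetical stand-in for unwrapMonads; the real definition is not
-- shown in the thread, so this shape is an assumption.
unwrapMonads :: Monad m => [m a] -> m [a]
unwrapMonads []       = return []
unwrapMonads (m : ms) = do
  x  <- m                 -- run the first action
  xs <- unwrapMonads ms   -- run the remaining actions in order
  return (x : xs)

-- The same thing using the Prelude function.
unwrapMonads' :: Monad m => [m a] -> m [a]
unwrapMonads' = sequence
```

A related shortcut: mapM f is sequence . map f, so code that maps an action over a list and then unwraps the result can usually call mapM directly.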

Felipe,
Thanks. I need to learn how to use hoogle better :-)
All,
I've just figured out why I'm leaking file handles so please ignore question 1
-Keith
-- keithsheppard.name

On Sat, Jul 11, 2009 at 08:40:10PM -0400, Keith Sheppard wrote:
4) Is there any other wacky stuff in my code that I should change?
I would probably write readBinFiles as
    readBinFiles :: [String] -> IO [BS.ByteString]
    readBinFiles = mapM readB
      where readB file = openBinaryFile file ReadMode >>= BS.hGetContents
You may also write pointless code ;)
    readBinFiles :: [String] -> IO [BS.ByteString]
    readBinFiles = mapM $ flip (>>=) BS.hGetContents . flip openBinaryFile ReadMode
Another way of improving your code is to write the functions in the order that one would read them (that is, bottom-up or top-down). At the start you seem to be following a top-down approach until you reach a reference to bufferPartialSortsBy, which is on the other side :).
HTH,
-- Felipe.

You can greatly improve that by using the Kleisli composition operator (<=<, from Control.Monad):
readBinFiles = mapM (BS.hGetContents <=< flip openBinaryFile ReadMode)
But if these are lazy ByteStrings, this will leak handles like crazy... A nice solution would be to use the safe-lazy-io package: it is easy to add a finalizer that will remove the file once hGetContents is finished (with System.IO.Lazy.Internal.finallyLI) and to read a list of files lazily without leaking handles (see System.IO.Lazy.concat).
http://hackage.haskell.org/package/safe-lazy-io
-- Jedaï
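For readers who want to see the finalizer idea without the package, here is a minimal sketch of an hGetContentsThenDelete. It does not use safe-lazy-io or finallyLI; it swaps in a hand-rolled lazy chunk reader built on unsafeInterleaveIO. The 64 KiB chunk size and the choice to take a FilePath rather than a Handle are assumptions made for illustration. The handle is closed and the file removed only when the consumer reaches the end of the data, so a result that is never fully consumed still leaks its handle and leaves the file behind.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import System.Directory
import System.IO
import System.IO.Unsafe (unsafeInterleaveIO)

-- Assumed chunk size for each read; not taken from the thread.
chunkSize :: Int
chunkSize = 64 * 1024

-- Lazily read a file as a lazy ByteString. When the final (empty) chunk
-- is demanded, close the handle and delete the file.
hGetContentsThenDelete :: FilePath -> IO BL.ByteString
hGetContentsThenDelete path = do
  h <- openBinaryFile path ReadMode
  BL.fromChunks <$> readChunks h
  where
    -- Each step is deferred until the consumer forces the next list cell.
    readChunks h = unsafeInterleaveIO $ do
      chunk <- BS.hGet h chunkSize
      if BS.null chunk
        then do
          hClose h          -- end of file: release the handle first
          removeFile path   -- then delete the temporary file
          return []
        else (chunk :) <$> readChunks h
```

Because only one handle per file is open and it is released at end of input, a merge step that consumes every temporary file to the end should not accumulate handles, although it still holds one handle per file being merged at the same time.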
participants (3)
- Chaddaï Fouché
- Felipe Lessa
- Keith Sheppard