
Hi there,

If I take example imap.hs

    import System.IO
    import Network.HaskellNet.IMAP
    import Text.Mime
    import qualified Data.ByteString.Char8 as BS
    import Control.Monad

    -- the next lines were changed to fit to my local imap server
    imapServer = "imap.mail.org"
    user = ""
    pass = ""

    main = do
        con <- connectIMAP imapServer
        login con user pass
        mboxes <- list con
        mapM print mboxes
        select con "INBOX"
        msgs <- search con [ALLs]
        mapM_ (\x -> print x) (take 4 msgs)
        forM_ (take 4 msgs) (\x -> fetch con x >>= print)

and change the last line (in order to print all messages of the mailbox) into:

    forM_ msgs (\x -> fetch con x >>= print)

I get

    Stack space overflow: current size 8388608 bytes.
    Use `+RTS -Ksize -RTS' to increase it.

Is this something to be fixed in the HaskellNet code, or in the example code?

--
Manfred

On Mon, 25 Jul 2011, Manfred Lotz wrote:
> Hi there, If I take example imap.hs
>
> import System.IO
> import Network.HaskellNet.IMAP
> import Text.Mime
> import qualified Data.ByteString.Char8 as BS
> import Control.Monad
>
> -- the next lines were changed to fit to my local imap server
> imapServer = "imap.mail.org"
> user = ""
> pass = ""
>
> main = do
>     con <- connectIMAP imapServer
>     login con user pass
>     mboxes <- list con
>     mapM print mboxes

This should be mapM_ and 'ghc -Wall' spots this problem since 6.12.

>     select con "INBOX"
>     msgs <- search con [ALLs]
>     mapM_ (\x -> print x) (take 4 msgs)
>     forM_ (take 4 msgs) (\x -> fetch con x >>= print)

On Tue, 26 Jul 2011 10:17:22 +0200 (CEST)
Henning Thielemann
> On Mon, 25 Jul 2011, Manfred Lotz wrote:
>> Hi there, If I take example imap.hs
>>
>> import System.IO
>> import Network.HaskellNet.IMAP
>> import Text.Mime
>> import qualified Data.ByteString.Char8 as BS
>> import Control.Monad
>>
>> -- the next lines were changed to fit to my local imap server
>> imapServer = "imap.mail.org"
>> user = ""
>> pass = ""
>>
>> main = do
>>     con <- connectIMAP imapServer
>>     login con user pass
>>     mboxes <- list con
>>     mapM print mboxes
>
> This should be mapM_ and 'ghc -Wall' spots this problem since 6.12.

The compiler (7.0.4) doesn't tell me anything about it.

>>     select con "INBOX"
>>     msgs <- search con [ALLs]
>>     mapM_ (\x -> print x) (take 4 msgs)
>>     forM_ (take 4 msgs) (\x -> fetch con x >>= print)

I'm not quite sure I understand what you mean. Stack overflow comes from this:

    forM_ msgs (\x -> fetch con x >>= print)

If I change it to:

    mapM_ (\x -> fetch con x >>= print) msgs

there is the same stack overflow.

--
Thanks,
Manfred

Quoth Manfred Lotz
> I'm not quite sure I understand what you mean. Stack overflow comes from this:
>     forM_ msgs (\x -> fetch con x >>= print)
>
> If I change it to:
>     mapM_ (\x -> fetch con x >>= print) msgs
>
> there is the same stack overflow.

I didn't understand that myself, but neither do I know what might be wrong. One thing to consider is that email messages can be very large. Looking at messages received in the last 10 days I see I have one that exceeds your reported stack size, and that isn't counting the extra space required for text representation of non-printing characters etc. There may be messages that you simply can't "print".

The HaskellNet IMAP "fetch" is actually FETCH ... BODY[], i.e., the whole contents of the message. Normal practice for giant data files is to send them as part of a MIME multipart/mixed message, and something like the above can proceed with a reasonable chance of success if it avoids these attachments by fetching BODY[1] (or BODY[1.1], etc. depending on actual structure.) I just fetched the 10 MB message I mentioned above to check the structure, and it happened in the blink of an eye - BODY[1] is smaller than the header.

I don't see any support for fetch by part, so you might have to hack that up yourself. You may ideally also want to fetch BODYSTRUCTURE, but practically I might go out on a limb and predict that you won't run into messages where the first part is a multipart/mixed with a large attachment - so if the object is just a survivable first part, you could live without BODYSTRUCTURE analysis and optimistically ask for BODY[1].

Moving on to practical use of email via IMAP, you'd also want to be able to fetch and decode the attachments. At this point, it's interesting to return to the question of space requirements.

	Donn
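
To make the difference between BODY[], BODY[1] and BODYSTRUCTURE concrete, here is a minimal sketch that only builds the FETCH command lines described above. It does not use HaskellNet at all; the names Section, sectionItem and fetchCommand are hypothetical, and the tag "a001" is just an example. The message number 2092 is borrowed from later in this thread.

```haskell
import Data.List (intercalate)

-- | Which slice of a message to ask the server for (hypothetical type).
data Section
  = WholeMessage   -- ^ BODY[]: headers plus every MIME part
  | Part [Int]     -- ^ BODY[1], BODY[1.1], ...: one MIME part, by its path
  | Structure      -- ^ BODYSTRUCTURE: the MIME tree without any content
  deriving (Eq, Show)

-- | Render the data item that goes after the message number.
sectionItem :: Section -> String
sectionItem WholeMessage = "BODY[]"
sectionItem (Part path)  = "BODY[" ++ intercalate "." (map show path) ++ "]"
sectionItem Structure    = "BODYSTRUCTURE"

-- | Build a tagged FETCH command line (without the trailing CRLF).
fetchCommand :: String -> Int -> Section -> String
fetchCommand tag msgNo section =
  unwords [tag, "FETCH", show msgNo, sectionItem section]
```

For example, fetchCommand "a001" 2092 (Part [1]) asks only for the first part, while fetchCommand "a001" 2092 WholeMessage is the expensive request that the library issues according to the description above.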

On Tue, 26 Jul 2011 14:17:08 -0700 (PDT)
Donn Cave
> Quoth Manfred Lotz, ...
>> I'm not quite sure I understand what you mean. Stack overflow comes from this:
>>     forM_ msgs (\x -> fetch con x >>= print)
>>
>> If I change it to:
>>     mapM_ (\x -> fetch con x >>= print) msgs
>>
>> there is the same stack overflow.
>
> I didn't understand that myself, but neither do I know what might be wrong. One thing to consider is that email messages can be very large. Looking at messages received in the last 10 days I see I have one that exceeds your reported stack size, and that isn't counting the extra space required for text representation of non-printing characters etc. There may be messages that you simply can't "print".
>
> The HaskellNet IMAP "fetch" is actually FETCH ... BODY[], i.e., the whole contents of the message. Normal practice for giant data files is to send them as part of a MIME multipart/mixed message, and something like the above can proceed with a reasonable chance of success if it avoids these attachments by fetching BODY[1] (or BODY[1.1], etc. depending on actual structure.) I just fetched the 10 MB message I mentioned above to check the structure, and it happened in the blink of an eye - BODY[1] is smaller than the header.
>
> I don't see any support for fetch by part, so you might have to hack that up yourself. You may ideally also want to fetch BODYSTRUCTURE, but practically I might go out on a limb and predict that you won't run into messages where the first part is a multipart/mixed with a large attachment - so if the object is just a survivable first part, you could live without BODYSTRUCTURE analysis and optimistically ask for BODY[1].
>
> Moving on to practical use of email via IMAP, you'd also want to be able to fetch and decode the attachments. At this point, it's interesting to return to the question of space requirements.
>
>	Donn

The problem seems to lie in the HaskellNet package. If for example I only fetch a specific message

    m <- fetch con 2092

having a size of some 1.2 MB then I get the same stack overflow.

If at runtime I specify +RTS -K40M -RTS it works but takes over 40 seconds to fetch the message.

--
Manfred

Quoth Manfred Lotz
> The problem seems to lie in the HaskellNet package. If for example I only fetch a specific message
>     m <- fetch con 2092
> having a size of some 1.2 MB then I get the same stack overflow.
>
> If at runtime I specify +RTS -K40M -RTS it works but takes over 40 seconds to fetch the message.

That's not so good, but I wouldn't be surprised if it's a natural parsing problem, I mean it's just a lot of data to run through a Haskell parser.

IMAP does give you the means to mitigate the problem - the big data transfer in a FETCH response is preceded by a byte count - but to really take advantage of that, how far do you go? I don't have much experience with general purpose parsers, do they often support an efficient counted string read? Is it OK to return String, or do we need a more efficient type (e.g., ByteString?) Is it OK to return any kind of string value - given that a message part could be arbitrarily long (and needs to be decoded), do you go to a lot of trouble to support large message parts but not extremely large ones?

For me, the answer is for the parser to bail out, reporting the counted input as a count but leaving it to the application to actually effect the data transfer and return to finish the parse. That's only moderately complicated, but it's part of a general philosophy about application driven I/O vs. protocol parsing that seems to be mine alone.

I have no idea how much could be done to tighten up HaskellNet.IMAP. Someone who understands it well enough might be able to get a miraculous improvement with a strictness annotation or something. Maybe you could track that down with profiling.

	Donn
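
The "bail out and report the count" idea can be sketched as a pure function over one response line. This is only an illustration of the design, not HaskellNet code: the names LineResult and classifyLine are made up here, and the input is assumed to be a single line with its CRLF already removed.

```haskell
import qualified Data.ByteString.Char8 as BS
import Data.Char (isDigit)

-- | Outcome of inspecting one response line (hypothetical type).
data LineResult
  = Complete BS.ByteString         -- ^ ordinary line, nothing more to read
  | NeedLiteral BS.ByteString Int  -- ^ text before "{n}", and the byte count n
  deriving (Eq, Show)

-- | Stop parsing as soon as a line ends in a "{n}" literal marker and hand
-- the count back to the caller, who decides how to transfer those n bytes.
classifyLine :: BS.ByteString -> LineResult
classifyLine line =
  case BS.unsnoc line of
    Just (body, '}') ->
      let (prefix, digits) = BS.spanEnd isDigit body
      in case (BS.unsnoc prefix, BS.readInt digits) of
           (Just (before, '{'), Just (n, rest))
             | BS.null rest -> NeedLiteral before n
           _ -> Complete line
    _ -> Complete line
```

The parser never touches the payload itself, so the application is free to stream it to a file, skip it, or read it into a strict ByteString, whatever suits the message size.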

On Wed, 27 Jul 2011 09:47:52 -0700 (PDT)
Donn Cave
> Quoth Manfred Lotz, ...
>> The problem seems to lie in the HaskellNet package. If for example I only fetch a specific message
>>     m <- fetch con 2092
>> having a size of some 1.2 MB then I get the same stack overflow.
>>
>> If at runtime I specify +RTS -K40M -RTS it works but takes over 40 seconds to fetch the message.
>
> That's not so good, but I wouldn't be surprised if it's a natural parsing problem, I mean it's just a lot of data to run through a Haskell parser.

Yep, I agree. Perhaps the library should provide a fetchRaw function to get the whole message without much parsing. Perhaps I could tell the package author what the problem is, and he would be happy to provide a solution.

In the end the only thing I need is to get the full message because I want to feed bogofilter to learn that a message is ham or spam.

For the time being I decided to write my own program to fetch the data because it is a good exercise for a Haskell beginner as I am.

--
Manfred

Quoth Manfred Lotz
> In the end the only thing I need is to get the full message because I want to feed bogofilter to learn that a message is ham or spam.
>
> For the time being I decided to write my own program to fetch the data because it is a good exercise for a Haskell beginner as I am.

Sure, for a very limited case where you don't have to support any options at all, it's as easy as you want it to be. All the responses are one line only and only one to a line, so you can read line by line (and then switch to block read after a line that ends with {count}.)

The way I understand it, though, you do not need the full message, you would be better off with the first part only. The following parts in a multipart/mixed message will just be reams of base64 encoded nonsense for bogofilter's purposes, true? If you fetch "BODY[1]", you may once in a while get more than you need - both text and HTML versions of a multipart/alternative sub-part - but that won't happen often.

	Donn
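
A rough sketch of that "read line by line, then switch to block read" loop step, assuming the connection is already available as a Handle. The function names (stripCR, literalSize, readLineAndLiteral) are invented for this example, and error handling, timeouts and the tagged completion line are left out.

```haskell
import Control.Monad (guard)
import qualified Data.ByteString.Char8 as BS
import Data.Char (isDigit)
import System.IO (Handle)

-- | Drop the trailing CR that is left after hGetLine has removed the LF.
stripCR :: BS.ByteString -> BS.ByteString
stripCR s = case BS.unsnoc s of
  Just (rest, '\r') -> rest
  _                 -> s

-- | Byte count announced by a line that ends in "{count}", if any.
literalSize :: BS.ByteString -> Maybe Int
literalSize line = do
  (body, '}') <- BS.unsnoc line
  let (prefix, digits) = BS.breakEnd (== '{') body
  guard (not (BS.null prefix) && not (BS.null digits) && BS.all isDigit digits)
  fst <$> BS.readInt digits

-- | Read one line; if it announces a literal, block-read exactly that many
-- bytes instead of scanning them for line ends.
readLineAndLiteral :: Handle -> IO (BS.ByteString, Maybe BS.ByteString)
readLineAndLiteral h = do
  line <- stripCR <$> BS.hGetLine h
  case literalSize line of
    Nothing -> return (line, Nothing)
    Just n  -> do
      payload <- BS.hGet h n
      return (line, Just payload)
```

A caller would keep invoking readLineAndLiteral until it sees the line carrying its own command tag; the payload comes back as a strict ByteString that can be handed to bogofilter unchanged.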

Hello,

I guess I should stick my hand up as the supposed maintainer of HaskellNet.
Unfortunately I can't say that I know the code that well. Two years ago I
rescued it from bitrot, cabalized it, and, when I couldn't get any response
from the original author, put myself down as the maintainer.

It is a package which is starting to show its age. Michael Snoyman and I
had a conversation in February agreeing that I should try to revamp it by
applying some techniques such as those used in blaze-html. Unfortunately,
I haven't had the time.

I agree with the post above that for MIME mail HaskellNet shouldn't be
retrieving all of the messages with their message bodies. I might see if I
can get a chance to work on it a little this weekend, but if someone is using
the library and has some time to make some changes, that person would be
very welcome (and I'd be more than happy if someone wishes to take over as
the maintainer).

-Rob

On Tue, 26 Jul 2011, Manfred Lotz wrote:
>>> main = do
>>>     con <- connectIMAP imapServer
>>>     login con user pass
>>>     mboxes <- list con
>>>     mapM print mboxes
>>
>> This should be mapM_ and 'ghc -Wall' spots this problem since 6.12.
>
> The compiler (7.0.4) doesn't tell me anything about it.

It seems that it is no longer part of -Wall. But since this mistake is very common, I think it would be better if it were. Problem is, that several libraries like parser libraries are designed for silently throwing away results. You have to switch on -fwarn-unused-do-bind, according to
   http://www.haskell.org/ghc/docs/7.0-latest/html/users_guide/options-sanity.h...

Even if this does not fix your stack space overflow, (mapM_ print) is the correct (space-efficient) way.

> I'm not quite sure I understand what you mean. Stack overflow comes from this:
>     forM_ msgs (\x -> fetch con x >>= print)
>
> If I change it to:
>     mapM_ (\x -> fetch con x >>= print) msgs
>
> there is the same stack overflow.

forM_ and mapM_ are equal in this respect; the underscore is what matters.
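
A small self-contained sketch of the point about the underscore, with the warning switched on per file. The function names are made up for the illustration. The flag is spelled as in the GHC 7.0 documentation cited above; newer compilers call it -Wunused-do-bind.

```haskell
{-# OPTIONS_GHC -Wall -fwarn-unused-do-bind #-}

-- | Print every item and throw each () result away immediately.
printAll :: Show a => [a] -> IO ()
printAll = mapM_ print

-- | Same output, but mapM first collects a list of () results, one per
-- element, which costs space for long lists. Writing the statement as a
-- bare "mapM print xs" in a do block is what the warning reports; the
-- explicit "_ <-" below is the documented way to say the discard is meant.
printAllCollecting :: Show a => [a] -> IO ()
printAllCollecting xs = do
  _ <- mapM print xs
  return ()
```

In the program from this thread only the "mapM print mboxes" line is affected; the later forM_ line already discards its results.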

>> This should be mapM_ and 'ghc -Wall' spots this problem since 6.12.
> The compiler (7.0.4) doesn't tell me anything about it.

Henning> It seems that it is no longer part of -Wall.

Indeed, that's not part of -Wall.

http://www.haskell.org/ghc/docs/7.0.4/html/users_guide/options-sanity.html

Am I the only one who assumed so far that -Wall turned on all existing warnings?

From the doc:

-Wall: Turns on all warning options that indicate potentially suspicious code. The warnings that are not enabled by -Wall are -fwarn-tabs, -fwarn-incomplete-record-updates, -fwarn-monomorphism-restriction, -fwarn-unused-do-bind, and -fwarn-implicit-prelude.

-w: Turns off all warnings, including the standard ones and those that -Wall doesn't enable.

If there were no backward compatibility issues, I'd prefer to just see -w and -Wall swapped. -w would mean "We let the GHC team decide what subset of warnings they really want us to observe", and -Wall would mean "We really want them all".

--
Paul

On 27 July 2011 17:42, Paul R
>>> This should be mapM_ and 'ghc -Wall' spots this problem since 6.12.
>> The compiler (7.0.4) doesn't tell me anything about it.
>
> Henning> It seems that it is no longer part of -Wall.
>
> Indeed, that's not part of -Wall.
>
> http://www.haskell.org/ghc/docs/7.0.4/html/users_guide/options-sanity.html
>
> Am I the only one who assumed so far that -Wall turned on all existing warnings?
>
> From the doc:
>
> -Wall: Turns on all warning options that indicate potentially suspicious code. The warnings that are not enabled by -Wall are -fwarn-tabs, -fwarn-incomplete-record-updates, -fwarn-monomorphism-restriction, -fwarn-unused-do-bind, and -fwarn-implicit-prelude.
>
> -w: Turns off all warnings, including the standard ones and those that -Wall doesn't enable.
>
> If there were no backward compatibility issues, I'd prefer to just see -w and -Wall swapped. -w would mean "We let the GHC team decide what subset of warnings they really want us to observe", and -Wall would mean "We really want them all".

Ummm, going by what you quoted, -w _disables_ warnings, which isn't at all like you say you want it to be! ;-)

--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
IvanMiljenovic.wordpress.com

Hem hem ... I should never try to write anything sensible before putting on my thick glasses. -w does not turn ON all warnings, but turns them OFF, so my previous comment regarding swapping its definition with -Wall is just nonsense. Sorry for the noise.

Still, do you think there could be room for a -Wsuspicious that would be defined as current -Wall, and for a more intuitive meaning for -Wall: turns on really all warnings?

Paul> Indeed, that's not part of -Wall.
Paul> http://www.haskell.org/ghc/docs/7.0.4/html/users_guide/options-sanity.html
Paul> Am I the only one who assumed so far that Wall turned on all
Paul> existing warnings ?
Paul> From the doc :
Paul> -Wall: Turns on all warning options that indicate potentially
Paul> suspicious code. The warnings that are not enabled by -Wall
Paul> are -fwarn-tabs, -fwarn-incomplete-record-updates, -fwarn-monomorphism-restriction, -fwarn-unused-do-bind,
Paul> and -fwarn-implicit-prelude.
Paul> -w: Turns off all warnings, including the standard ones and those
Paul> that -Wall doesn't enable.
Paul> If there were no backward compatibility issues, I'd prefer to just
Paul> see -w and -Wall swapped. -w would mean "We let the GHC team decide
Paul> what subset of warnings they really want us to observe", and -Wall
Paul> would mean "We really want them all".

--
Paul

On 27 July 2011 17:56, Paul R
> Hem hem ... I should never try to write anything sensible before putting on my thick glasses. -w does not turn ON all warnings, but turns them OFF, so my previous comment regarding swapping its definition with -Wall is just nonsense. Sorry for the noise.
>
> Still, do you think there could be room for a -Wsuspicious that would be defined as current -Wall, and for a more intuitive meaning for -Wall: turns on really all warnings?

Definitely.

--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
IvanMiljenovic.wordpress.com
participants (6)
- Donn Cave
- Henning Thielemann
- Ivan Lazar Miljenovic
- Manfred Lotz
- Paul R
- Robert Wills