
Hi,
What is the status of the MIME Strike Force? The goals proposed at http://www.haskell.org/haskellwiki/Libraries_and_tools/MIMEStrikeForce promise a very useful library. Has the design been initiated?
Kind regards,
Arie

At Wed, 27 Jun 2007 18:54:58 +0200 (CEST), Arie Peterson wrote:
What is the status of the MIME Strike Force?
The goals proposed at http://www.haskell.org/haskellwiki/Libraries_and_tools/MIMEStrikeForce promise a very useful library. Has the design been initiated?
Currently it is on hold while I work on some other higher priority projects. But, I do hope to get back to it soon. (Or, perhaps someone else will have time to work on it).
It seems to me that the mime-string package is very close to what we need for parsing MIME messages. If all you care about is parsing MIME messages, I highly recommend it.
This leaves the problem of creating and transforming MIME messages. The real difficulty in this area is creating a good API. If you want to take a peek at what I have, take a look at: http://www.n-heptane.com/nhlab/repos/haskell-mime/APIs.hs
Currently I am having difficulty dealing with headers that could appear more than once. For example, the 'Keywords' header can appear multiple times. So, there are two cases to handle:
1. add an additional Keywords header
2. delete any existing Keywords headers and add the new one
Other fields, such as 'Subject', can appear only once.
One way to express a filter that modifies an existing message is the following:
exampleHeaders =
    ( (setHeader (Subject "whee"))
    . (setHeader (Subject "bork"))
    . (addHeader (Keywords ["baz", "bar", "bam"]))
    . (addHeader (Keywords ["zip", "zap", "zop"]))
    )
where setHeader ensures that a header only appears once, and addHeader appends the header, leaving existing instances alone. The type system ensures that you can never call addHeader on (Subject "whee").
Unfortunately, that code seems a bit verbose. We can make some helper functions that reduce the verbosity a bit:
subject = setHeader . Subject
keywords = addHeader . Keywords
exampleHeaders3 :: [RawHeader] -> [RawHeader]
exampleHeaders3 =
    ( (subject "whee")
    . (subject "bork")
    . (keywords ["baz", "bar", "bam"])
    . (keywords ["zip", "zap", "zop"])
    )
That is nice, except that we don't know which headers are going to use setHeader and which are going to use addHeader. So, the results might be a bit surprising. Additionally, keywords always uses addHeader, but in some cases we might want it to use setHeader.
Another option is to try using infix operators, such as:
(.+.) for setHeader
(.*.) for addHeader
exampleHeaders2 :: [RawHeader]
exampleHeaders2 =
    ((Subject "whee") .+.
     (Subject "bork") .+.
     (Keywords ["baz", "bar", "bam"]) .*.
     (Keywords ["zip", "zap", "zop"]) .*.
     empty
    )
This is good because:
1. it is shorter than the first example that used setHeader/addHeader
2. the information about whether setHeader/addHeader is being used is preserved
3. it is easy to choose which one you want (setHeader/addHeader)
But, it is also really wonky because the operator has a bit of a postfix feel to it. For example, it is the .*. at the end of this line that is making it use addHeader.
(Keywords ["baz", "bar", "bam"]) .*.
If we wanted to use setHeader here, we would have to change it to:
(Keywords ["baz", "bar", "bam"]) .+.
This behaviour seems pretty unintuitive.
A whole other area I have not dealt with yet is data-mining and filters that depend on the values of existing fields. For example:
1. find all the headers that contain the string XXX
2. find all the Keywords fields and merge them into a single Keywords field
So, that is where I currently am. Once the API is worked out, I think things should progress pretty easily. If you have any ideas, let me know. I think I may be picking this up again in September or October.
HaXml, SYB, Uniplate, and HList seem like good places to get some ideas. If anyone has suggestions for other projects to look at, let me know.
j.
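For readers following along, here is a minimal sketch of how the setHeader/addHeader distinction described in this message could be enforced by the type system. It is not taken from APIs.hs: the classes Header and Repeatable, the function toRaw, and the choice of RawHeader as a (name, body) pair are all assumptions made for illustration.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- Assumption: a raw header is just a (field name, field body) pair.
type RawHeader = (String, String)

newtype Subject  = Subject String
newtype Keywords = Keywords [String]

-- Hypothetical class: every typed header knows its name and body.
class Header h where
  fieldName :: h -> String
  fieldBody :: h -> String

-- Hypothetical marker class: only headers that may repeat get an instance.
class Header h => Repeatable h

instance Header Subject where
  fieldName _           = "Subject"
  fieldBody (Subject s) = s

instance Header Keywords where
  fieldName _             = "Keywords"
  fieldBody (Keywords ks) = intercalate ", " ks

instance Repeatable Keywords

toRaw :: Header h => h -> RawHeader
toRaw h = (fieldName h, fieldBody h)

-- Remove every existing field with the same (case-insensitive) name,
-- then append the new one, so the field ends up appearing exactly once.
setHeader :: Header h => h -> [RawHeader] -> [RawHeader]
setHeader h hs = filter (not . sameName) hs ++ [toRaw h]
  where sameName (n, _) = map toLower n == map toLower (fieldName h)

-- Append the field, leaving existing instances alone.
-- The Repeatable constraint rejects addHeader (Subject "whee") at compile time.
addHeader :: Repeatable h => h -> [RawHeader] -> [RawHeader]
addHeader h hs = hs ++ [toRaw h]

exampleHeaders :: [RawHeader] -> [RawHeader]
exampleHeaders =
    setHeader (Subject "whee")
  . setHeader (Subject "bork")
  . addHeader (Keywords ["baz", "bar", "bam"])
  . addHeader (Keywords ["zip", "zap", "zop"])
```

Because function composition applies the rightmost filter first, the two Keywords fields are appended before either Subject is set, and the later setHeader (Subject "whee") replaces the earlier "bork".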

exampleHeaders2 :: [RawHeader]
exampleHeaders2 =
    ((Subject "whee") .+.
     (Subject "bork") .+.
     (Keywords ["baz", "bar", "bam"]) .*.
     (Keywords ["zip", "zap", "zop"]) .*.
     empty
    )
[...]
But, it is also really wonky because the operator has a bit of a postfix feel to it. For example, it is the .*. at the end of this line that is making it use addHeader.
So why not use flip on each operator to get empty .*. (Subject "blah") .+. ...? That might be a little bit more comfortable?
Marc Weber
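A rough sketch of the flipped operators suggested here, under simplifying assumptions: headers are plain (name, body) pairs, so the compile-time check against adding a second Subject is not modelled, and the helpers subject and keywords build pairs rather than filters. The fixity declaration is also an assumption. The point is only to show that with left-associative, flipped operators each operator reads as a prefix of the header it applies to.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- Assumption: a raw header is a (field name, field body) pair.
type RawHeader = (String, String)

-- Left-associative, so  empty .+. a .*. b  parses as  (empty .+. a) .*. b.
infixl 5 .+., .*.

-- Flipped setHeader: drop existing fields of the same name, then append.
(.+.) :: [RawHeader] -> RawHeader -> [RawHeader]
hs .+. h = filter (not . sameName h) hs ++ [h]

-- Flipped addHeader: append, leaving existing fields alone.
(.*.) :: [RawHeader] -> RawHeader -> [RawHeader]
hs .*. h = hs ++ [h]

sameName :: RawHeader -> RawHeader -> Bool
sameName (a, _) (b, _) = map toLower a == map toLower b

-- Hypothetical constructors standing in for Subject and Keywords.
subject :: String -> RawHeader
subject s = ("Subject", s)

keywords :: [String] -> RawHeader
keywords ks = ("Keywords", intercalate ", " ks)

empty :: [RawHeader]
empty = []

exampleFlipped :: [RawHeader]
exampleFlipped =
  empty
    .+. subject "whee"
    .+. subject "bork"
    .*. keywords ["baz", "bar", "bam"]
    .*. keywords ["zip", "zap", "zop"]
```

Read top to bottom, the second .+. replaces the first Subject, and each .*. appends another Keywords field after it.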

At Wed, 27 Jun 2007 21:14:02 +0200, Marc Weber wrote:
exampleHeaders2 :: [RawHeader]
exampleHeaders2 =
    ((Subject "whee") .+.
     (Subject "bork") .+.
     (Keywords ["baz", "bar", "bam"]) .*.
     (Keywords ["zip", "zap", "zop"]) .*.
     empty
    )
[...]
But, it is also really wonky because the operator has a bit of a postfix feel to it. For example, it is the .*. at the end of this line that is making it use addHeader.
So why not use flip on each operator to get empty .*. (Subject "blah") .+. ...?
That might be a little bit more comfortable?
Ah, good idea. I'll play with that. Thanks!
j.

On Wed, 27 Jun 2007, Jeremy Shaw wrote:
Currently I am having difficulty dealing with headers that could appear more than once.
I recommend that you treat the header fields as an ordered list. Do not
use the latitude that the specification gives you to re-order headers, and
do not assume that messages you have to process will be within the minimum
and maximum count requirements for each field. (This rules out encoding
those requirements in the type system.) Postmasters will hate your
software if you do either of these things :-)
You need to support appending new fields to the top as well as the bottom
of the header. Although it's traditional in most situations to add a new
field to the bottom of the header, Received: fields must be added to the
start. For any application that does message processing on an MTA, it's a
*really* good idea to add new fields to the top of the header, so that
they appear interspersed amongst the Received: lines in a way that
indicates where and when the processing happened. Doing this also means
your program will play nicely with DKIM. Ignore the strict syntax in RFC
2822 that prevents arbitrary trace header fields: this is a bug that will
be fixed in the next version of the spec, and in practice software doesn't
mind unexpected trace fields.
Tony.
--
f.a.n.finch
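The advice in this message can be sketched as a data structure: the header is an ordered list of fields that is never reordered, no minimum or maximum counts are enforced, and new fields can go at the top or at the bottom. All names below (Field, HeaderBlock, prependField, appendField, lookupFields) are hypothetical and chosen only for illustration.

```haskell
import Data.Char (toLower)

-- A field keeps its name and body exactly as they appeared in the message.
data Field = Field { fieldName :: String, fieldBody :: String }
  deriving (Eq, Show)

-- The header is an ordered list of fields. Nothing here reorders the
-- list or rejects a message for having too few or too many of a field.
newtype HeaderBlock = HeaderBlock [Field]
  deriving (Eq, Show)

-- Add a field at the top, as is done for Received: and other trace fields.
prependField :: Field -> HeaderBlock -> HeaderBlock
prependField f (HeaderBlock fs) = HeaderBlock (f : fs)

-- Add a field at the bottom, the traditional place for other new fields.
appendField :: Field -> HeaderBlock -> HeaderBlock
appendField f (HeaderBlock fs) = HeaderBlock (fs ++ [f])

-- All bodies for a field name (compared case-insensitively), in message
-- order. The result may be empty or hold several entries, even for
-- fields the specification says should occur exactly once.
lookupFields :: String -> HeaderBlock -> [String]
lookupFields name (HeaderBlock fs) =
  [ fieldBody f | f <- fs, map toLower (fieldName f) == map toLower name ]
```

With this representation a message carrying two Subject fields is still representable, and lookupFields "subject" returns both bodies in their original order, leaving the caller to decide what to do about them.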

Jeremy Shaw wrote:
What is the status of the MIME Strike Force?
Currently it is on hold while I work on some other higher priority projects. But, I do hope to get back to it soon. (Or, perhaps someone else will have time to work on it).
OK. Good to hear it is still alive, if slumbering. I know nothing about MIME, so I will make some comments ;-).
One way to express a filter that modifies an existing message is the following:
exampleHeaders =
    ( (setHeader (Subject "whee"))
    . (setHeader (Subject "bork"))
    . (addHeader (Keywords ["baz", "bar", "bam"]))
    . (addHeader (Keywords ["zip", "zap", "zop"]))
    )
where setHeader ensures that a header only appears once, and addHeader appends the header, leaving existing instances alone. The type system ensures that you can never call addHeader on (Subject "whee"). Unfortunately, that code seems a bit verbose.
If you want to signify that some headers are added and others replaced - which seems a good idea - then it's not so bad, is it? Perhaps replacing 'setHeader' by 'set', and removing some parentheses, it's pretty minimal:
modifyHeaders = set (Subject "whee") . add (Keywords ["quux","blub"])
Anyway, I wouldn't bother too much about the exact syntax, at this stage.
A whole other area I have not dealt with yet is data-mining and filters that depend on the values of existing fields. For example:
1. find all the headers that contain the string XXX
2. find all the Keywords fields and merge them into a single Keywords field
Right. Another one:
3. Examine the contents of a certain header, and use the result to modify other headers. (Think of a spam filter, for instance.)
I'm not sure what types of transformation we need to support. (I personally only need 'pure' parsing and composing, not this 'on-the-fly' transforming, but it is clearly necessary for some applications.)
There are many similar problems. Suppose you need to change a CSS file, by changing RGB colours to corresponding HSL colours, but only within certain media sections and selectors. Furthermore, layout/whitespace/comments must be preserved as much as possible. As with modifying an e-mail on-the-fly, this cannot be done by first parsing the whole thing, then applying the transformation as a pure function, then unparsing.
I have a vague idea on a way to deal with this, using some kind of stateful stream processor. I'll try to code it up some time; maybe it could be useful for MIME handling as well.
Greetings,
Arie
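For concreteness, the three example filters discussed in this thread can be sketched as pure functions over an already-parsed header list. This is deliberately the simple 'parse first, then transform' approach, not the stateful stream processor hinted at above, so it preserves field order but not the original layout. The (name, body) representation, the function names, and the use of an X-Spam-Flag field as the trigger are assumptions made for illustration.

```haskell
import Data.Char (toLower)
import Data.List (intercalate, isInfixOf)

-- Assumption: a raw header is a (field name, field body) pair.
type RawHeader = (String, String)

hasName :: String -> RawHeader -> Bool
hasName n (k, _) = map toLower k == map toLower n

-- 1. Find all the headers whose body contains the given string.
containing :: String -> [RawHeader] -> [RawHeader]
containing s = filter ((s `isInfixOf`) . snd)

-- 2. Merge all Keywords fields into a single field, placed where the
--    first Keywords field was; other fields keep their order.
mergeKeywords :: [RawHeader] -> [RawHeader]
mergeKeywords hs =
  case break (hasName "Keywords") hs of
    (_, [])                 -> hs
    (before, (k, _) : rest) ->
      let bodies = [ v | h@(_, v) <- hs, hasName "Keywords" h ]
      in  before ++ (k, intercalate ", " bodies)
                  : filter (not . hasName "Keywords") rest

-- 3. Examine one header and use the result to modify another: if an
--    (illustrative) X-Spam-Flag field says "yes", tag every Subject.
tagSpam :: [RawHeader] -> [RawHeader]
tagSpam hs
  | any flagged hs = map tag hs
  | otherwise      = hs
  where
    flagged h = hasName "X-Spam-Flag" h && map toLower (snd h) == "yes"
    tag h@(k, v)
      | hasName "Subject" h = (k, "[SPAM] " ++ v)
      | otherwise           = h
```

All three compose with ordinary function composition, for example tagSpam . mergeKeywords, which is the same shape as the setHeader/addHeader filters quoted earlier.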
participants (4): Arie Peterson, Jeremy Shaw, Marc Weber, Tony Finch