Re: [Haskell-beginners] How to Lex, Parse, and Serialize-to-XML email messages

Hi Roger,
I realize you've already finished with the project, but for the future I
think it's a lot easier to use parser combinators, with Text.Parsec and
Text.Parsec.String, to do a similar thing. For example, if you were parsing
XML, to parse a single tag name you would try something like this:

    type Tag = String  -- assumed; Tag is not defined in this message

    parseTag :: Parser Tag
    parseTag = many1 alphaNum <?> "tag"
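
To see what such a parser returns, it can be handed to Parsec's `parse` function. The following is only a minimal sketch: the `Tag` synonym and the helper name `runTag` are assumptions made for illustration, not something from Roger's code.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumption: Tag is a plain String; the message never defines it.
type Tag = String

parseTag :: Parser Tag
parseTag = many1 alphaNum <?> "tag"

-- Hypothetical helper: run parseTag on a string. `parse` returns
-- Left ParseError on failure and Right result on success. The second
-- argument is only a source name used in error messages.
runTag :: String -> Either ParseError Tag
runTag = parse parseTag "(example)"
```

With this, `runTag "body>"` should give `Right "body"`, since `many1 alphaNum` stops at the first non-alphanumeric character, while `runTag "<"` gives a `Left` whose message mentions the "tag" label attached with `<?>`.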

To get a tagged form, try

    parseTagged :: Parser (Tag, [Elem])
    parseTagged = do
        char '<'
        name <- parseTag
        char '>'
        content <- many (try parseElem)
        string "</"
        parseTag
        char '>'
        return (name, content)
      <?> "tagged form"

and so on. I haven't tried this out, but a parser similar to yours would
go something like this:

    -- Datatypes
    type DisplayName = String
    type EmailAddress = String
    data Mailbox = Mailbox DisplayName EmailAddress deriving (Show)

    parseFromHeader :: Parser [Mailbox]
    parseFromHeader = do
        string "From: "
        mailboxes <- many (try parseMailbox)
        return mailboxes

    parseMailbox :: Parser Mailbox
    parseMailbox = do
        parseComments
        -- Names are optional, so fall back to an empty name
        name <- option "" (try parseDisplayName)
        parseComments
        address <- parseEmailAddress
        parseComments
        -- A comma separates this mailbox from the next one, if any
        optional (char ',')
        return (Mailbox name address)
      <?> "an individual's mailbox"

    parseEmailAddress :: Parser EmailAddress
    parseEmailAddress = do
        optional (char '<')
        handle <- many1 (noneOf "@") -- Or whatever is valid here
        char '@'
        domain <- parseDomain
        optional (char '>')
        return (handle ++ "@" ++ domain)

    parseDomain :: Parser String
    parseDomain =
            -- Either a domain wrapped in square brackets...
            between (char '[') (char ']') parseDomain
            -- ...or a bare website name
        <|> parseWebsiteName

And so on. Again, I've tested none of the email header bits, but the XML bit
works. It requires some level of comfort with monadic operations, but
beyond that I think it's a much simpler way to parse.
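
To try the header parser end to end, the helpers left undefined above need stand-ins. The following is a rough sketch under loud assumptions: `parseComments`, `parseDisplayName` and `parseDomain` are hypothetical simplifications (whitespace only, no RFC 5322 comments, no quoted names, no bracketed domains), and `sepBy1` is swapped in for the `many (try ...)` plus optional comma used above.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

type DisplayName  = String
type EmailAddress = String
data Mailbox = Mailbox DisplayName EmailAddress deriving (Show, Eq)

-- Hypothetical stand-in: only whitespace is skipped; parenthesised
-- comments are not handled.
parseComments :: Parser ()
parseComments = spaces

-- Hypothetical stand-in: a display name is everything up to '<',
-- with trailing spaces dropped. Fails if no '<' follows.
parseDisplayName :: Parser DisplayName
parseDisplayName = do
    name <- many1 (noneOf "<,@")
    _ <- lookAhead (char '<')
    return (trimEnd name)
  where
    trimEnd = reverse . dropWhile (== ' ') . reverse

-- Hypothetical stand-in: letters, digits, dots and hyphens.
parseDomain :: Parser String
parseDomain = many1 (alphaNum <|> oneOf ".-")

parseEmailAddress :: Parser EmailAddress
parseEmailAddress = do
    bracketed <- optionMaybe (char '<')
    handle <- many1 (noneOf "@<>, ")
    _ <- char '@'
    domain <- parseDomain
    -- Require the closing bracket only if an opening one was seen.
    case bracketed of
        Just _  -> () <$ char '>'
        Nothing -> return ()
    return (handle ++ "@" ++ domain)

parseMailbox :: Parser Mailbox
parseMailbox = do
    parseComments
    name <- option "" (try parseDisplayName)  -- names are optional
    parseComments
    address <- parseEmailAddress
    parseComments
    return (Mailbox name address)

-- Mailboxes are separated by commas; at least one is required.
parseFromHeader :: Parser [Mailbox]
parseFromHeader = do
    _ <- string "From: "
    parseMailbox `sepBy1` char ','
```

In ghci, `parse parseFromHeader "" "From: Tim Holland <tim@example.org>, roger@example.com"` should yield two mailboxes, the second with an empty display name. The `try` around `parseDisplayName` matters: for a bare address the name parser consumes input before failing, and without `try` the `option` fallback would not be reached.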
Regards,
Tim Holland
On 28 June 2013 03:00,
Today's Topics:

   1. data declaration using other type's names? (Patrick Redmond)
   2. Re: data declaration using other type's names? (Brandon Allbery)
   3. Re: data declaration using other type's names? (Nikita Danilenko)
   4. Re: what to do about excess memory usage (Chaddaï Fouché)
   5. Re: what to do about excess memory usage (James Jones)
   6. How to Lex, Parse, and Serialize-to-XML email messages (Costello, Roger L.)

----------------------------------------------------------------------

Message: 1
Date: Thu, 27 Jun 2013 11:24:51 -0400
From: Patrick Redmond
Subject: [Haskell-beginners] data declaration using other type's names?
To: beginners@haskell.org
Message-ID:
Content-Type: text/plain; charset=UTF-8

Hey Haskellers,

I noticed that ghci lets me do this:

    data Foo = Int Int | Float
    :t Int
    Int :: Int -> Foo
    :t Float
    Float :: Foo
    :t Int 4
    Int 4 :: Foo

It's confusing to have type constructors that use names of existing types. It's not intuitive that the name "Int" could refer to two different things, which brings me to:

    data Bar = Bar Int
    :t Bar
    Bar :: Int -> Bar

Yay? I can have a simple type with one constructor named the same as the type.
Why is this allowed? Is it useful somehow?
--Patrick
------------------------------

Message: 2
Date: Thu, 27 Jun 2013 11:37:46 -0400
From: Brandon Allbery
Subject: Re: [Haskell-beginners] data declaration using other type's names?
To: The Haskell-Beginners Mailing List - Discussion of primarily beginner-level topics related to Haskell
Message-ID: <CAKFCL4U-E4B_+cts0vpNX8Ar9wccQDjgzWOYHLXLsLAv+Qn_cg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Thu, Jun 27, 2013 at 11:24 AM, Patrick Redmond wrote:

> I noticed that ghci lets me do this:

Not just ghci, but ghc as well.

> Yay? I can have a simple type with one constructor named the same as the type. Why is this allowed? Is it useful somehow?

It's convenient for pretty much the situation you showed, where the type constructor and data constructor have the same name. A number of people do advocate that it not be used, though, because it can be confusing for people. (Not for the compiler; data and type constructors can't be used in the same places, it never has trouble keeping straight which is which.)
It might be best to consider this as "there is no good reason to *prevent* it from happening, from a language standpoint".
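
The point about separate namespaces can be checked with a small module. This is only an illustrative sketch: `Foo` and `Bar` are the declarations from the question, while the functions `describe` and `unwrapBar` are hypothetical names added here.

```haskell
-- The first Int below is a new data constructor; the second is the
-- ordinary Prelude type Int. Types and values live in separate namespaces.
data Foo = Int Int | Float
    deriving (Show, Eq)

-- The common idiom: type constructor and data constructor share a name.
data Bar = Bar Int
    deriving (Show, Eq)

-- In a type signature, Foo, Bar, Int and String are read as types...
describe :: Foo -> String
-- ...while in patterns and expressions, Int and Float are read as the
-- data constructors of Foo.
describe (Int n) = "Int constructor holding " ++ show n
describe Float   = "Float constructor"

unwrapBar :: Bar -> Int
unwrapBar (Bar n) = n
```

Here `describe (Int 4)` evaluates to `"Int constructor holding 4"`, and the compiler accepts the module because position alone decides whether a capitalised name is looked up as a type or as a data constructor.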

-- 
brandon s allbery kf8nh                               sine nomine associates
allbery.b@gmail.com                                  ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net