It fails. Note that this isn't a problem with parseOnly- the real code looks like:
parseAllUris :: AP.Parser (Bldr.Builder, [ Urls ])
parseAllUris = msum [ aUri, noUri, finished ]
where
finished = return (mempty, [])
aUri = do
(txt, url) <- AP.match $
URI.uriParser URI.laxURIParserOptions
(bldr, us) <- msum [ noUri, finished ]
return $ (mempty, (Url (E.decodeUtf8 txt) url
: prependText bldr us))
noUri = do
c <- AP.anyChar
(bldr, us) <- parseAllUris
return $ ((Bldr.charUtf8 c) `mappend` bldr, us)
And this has the problem as well- parsing a URL with anything following it fails, and it doesn't detect any URLs. The parseOnly is just the easy way to demonstrate it.
So, my question is, is there some way in attoparsec to tell it to just parse as much as makes sense, and leave the rest? Alternatively, is this a problem with the way URI.ByteString module constructed it's parser, and a different parser could work? Or, worst of all, is this a problem with the way that URIs are defined and no conforming parser will work?
Thanks.