
See also the incremental XML parser in HaXml, described in "Partial parsing: combining choice with commitment", IFL 2006. It has constant space usage (for some patterns of usage), even with extremely large inputs. http://www.google.co.uk/url?sa=t&rct=j&q=malcolm+wallace+partial+parsing&source=web&cd=2&ved=0CEEQFjAB&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.135.7512%26rep%3Drep1%26type%3Dpdf&ei=Db3BUNmiOsfS4QTAkoDYAw&usg=AFQjCNHHywUCvaFv8eBoQ-x9jj4GOMHo2w On 5 Dec 2012, at 05:37, Johan Tibell wrote:
Hi Oleg,
On Tue, Dec 4, 2012 at 9:13 PM,
wrote: I am doing, for several months, constant-space processing of large XML files using iteratees. The file contains many XML elements (which are a bit complex than a number). An element can be processed independently. After the parser finished with one element, and dumped the related data, the processing of the next element starts anew, so to speak. No significant state is accumulated for the overall parsing sans the counters of processed and bad elements, for statistics. XML is somewhat like JSON, only more complex: an XML parser has to deal with namespaces, parsed entities, CDATA sections and the other interesting stuff. Therefore, I'm quite sure there should not be fundamental problems in constant-space parsing of JSON.
BTW, the parser itself is described there http://okmij.org/ftp/Streams.html#xml
It certainly is possible (using a SAX style parser). What you can't have (I think) is a function:
decode :: FromJSON a => ByteString -> Maybe a
and constant-memory parsing at the same time. The return type here says that we will return Nothing if parsing fails. We can only do so after looking at the whole input (otherwise how would we know if it's malformed).
The use cases aeson was designed for (which I bet is the majority use case) is parsing smaller messages sent over the network (i.e. in web service APIs) so this is the only mode of parsing it supplies.
-- Johan
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe