Re: haskell xml parsing for larger files?

Have you looked at tagsoup?
On Feb 20, 2014 3:30 AM, "Christian Maeder" <Christian.Maeder@dfki.de> wrote:

I've just tried:

import Text.HTML.TagSoup
import Text.HTML.TagSoup.Tree

main :: IO ()
main = getContents >>= putStr . renderTags . flattenTree . tagTree . parseTags

which also ends with the getMBlock error. Only "renderTags . parseTags" works fine (like the hexpat SAX parser).

Why should tagsoup be better suited for building trees from large files?

C.

On 20.02.2014 15:30, Chris Smith wrote:
Have you looked at tagsoup?
On Feb 20, 2014 3:30 AM, "Christian Maeder" <Christian.Maeder@dfki.de> wrote:

Hi,
I've got some difficulties parsing "large" xml files (> 100MB). A plain SAX parser, as provided by hexpat, is fine. However, constructing a tree consumes too much memory on a 32bit machine.
see http://trac.informatik.uni-bremen.de:8080/hets/ticket/1248
I suspect that sharing strings when constructing trees might greatly reduce memory requirements. What are suitable libraries for string pools?
Before trying to implement something myself, I'd like to ask who else has tried to process large xml files (and met similar memory problems)?
I have not yet investigated xml-conduit and hxt for our purpose. (These look scary.)
In fact, I've basically used the content trees from "The (simple) xml package", and switching to another tree type is no fun, in particular if it does not gain much.
Thanks Christian

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
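As an aside on the string-pool question above: sharing could be prototyped with nothing more than Data.Map from containers. The sketch below is not taken from hexpat, tagsoup, or any existing pooling library; Pool, intern and internAll are hypothetical names invented for this example. The idea is that every tag or attribute name is looked up in the pool before it goes into the tree, so each distinct string is kept on the heap only once.

import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- Hypothetical string pool: maps every string seen so far to its first (shared) copy.
type Pool = Map.Map String String

-- Look a string up in the pool; on first sight insert it, afterwards reuse the stored copy.
intern :: Pool -> String -> (Pool, String)
intern pool s = case Map.lookup s pool of
  Just shared -> (pool, shared)
  Nothing     -> (Map.insert s s pool, s)

-- Thread the pool through many strings, e.g. all tag and attribute names of a document.
internAll :: Pool -> [String] -> (Pool, [String])
internAll = mapAccumL intern

Whether this alone makes a full tree of a >100MB document fit into a 32-bit heap is another question, but repeated names then cost one copy instead of one per occurrence.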

Ah, I'd misunderstood your question, and thought you were looking for a SAX-like alternative.
On Feb 20, 2014 6:57 AM, "Christian Maeder" <Christian.Maeder@dfki.de> wrote:
I've just tried:
import Text.HTML.TagSoup
import Text.HTML.TagSoup.Tree

main :: IO ()
main = getContents >>= putStr . renderTags . flattenTree . tagTree . parseTags
which also ends with the getMBlock error. Only "renderTags . parseTags" works fine (like the hexpat SAX parser).
Why should tagsoup be better suited for building trees from large files?
C.
On 20.02.2014 15:30, Chris Smith wrote:
Have you looked at tagsoup?
On Feb 20, 2014 3:30 AM, "Christian Maeder" <Christian.Maeder@dfki.de> wrote:

Hi,
I've got some difficulties parsing "large" xml files (> 100MB). A plain SAX parser, as provided by hexpat, is fine. However, constructing a tree consumes too much memory on a 32bit machine.
see http://trac.informatik.uni-bremen.de:8080/hets/ticket/1248
I suspect that sharing strings when constructing trees might greatly reduce memory requirements. What are suitable libraries for string pools?
Before trying to implement something myself, I'd like to ask who else has tried to process large xml files (and met similar memory problems)?
I have not yet investigated xml-conduit and hxt for our purpose. (These look scary.)
In fact, I've basically used the content trees from "The (simple) xml package", and switching to another tree type is no fun, in particular if it does not gain much.
Thanks Christian
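For reference, the pipeline Christian reports as working, "renderTags . parseTags", can be written as a complete program; this is only a minimal sketch of that streaming variant, assuming the document arrives on stdin as in his tree-building version. Because both functions are lazy and no TagTree is ever built, the tags flow through as a stream and memory use stays roughly constant regardless of file size.

import Text.HTML.TagSoup (parseTags, renderTags)

-- Read stdin lazily, tokenise it into tags, and render the tags straight back out.
-- Nothing is accumulated, so this copes with files far larger than the available heap.
main :: IO ()
main = interact (renderTags . parseTags)

Any per-tag processing that does not need the whole document at once (counting elements, extracting text, filtering) can be slotted in between parseTags and renderTags in the same style; tagsoup can also parse ByteString or Text input, which should reduce the per-character overhead of String further.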
participants (2)
- Chris Smith
- Christian Maeder