
7 Oct 2010, 8:41 a.m.
Michael Snoyman
As far as I know, Neil Mitchell's tagsoup[1] parses according to the HTML 5 parsing rules, but it just generates a list of Tags[2], so you'd have to build the DOM tree up from there. I personally have had great experience with tagsoup. It's even the core of the HTML-scraping technology powering searchonce[3].
Yep, someone else wrote me privately to say the same thing (that tagsoup
respects the HTML5 lexing rules). So I'll be using tagsoup as the basis of
an HTML5 DOM parser. Stay tuned!
G
--
Gregory Collins
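
For anyone following the thread, here is a rough sketch of the "build the DOM tree up from a list of tags" step discussed above. It is not tagsoup's code and not the HTML5 tree-construction algorithm: the Tag type is a simplified local stand-in for the one tagsoup exposes, and Node, siblings and buildForest are hypothetical names made up for this example. With tagsoup itself the flat list would come from parseTags. The sketch only nests elements by matching close tags, closes anything left open when an ancestor's close tag shows up, and ignores stray close tags. It knows nothing about void elements such as br, so a real parser would need much more.

```haskell
-- Simplified stand-in for a tagsoup-style token (assumption: only open tags,
-- close tags and text are modelled here).
data Tag
  = TagOpen String [(String, String)]
  | TagClose String
  | TagText String
  deriving (Eq, Show)

-- Hypothetical tree type for the result.
data Node
  = Element String [(String, String)] [Node]
  | Text String
  deriving (Eq, Show)

-- Parse a run of sibling nodes. The first argument lists the names of the
-- currently open elements, innermost first. Parsing stops at the end of the
-- input or at a close tag belonging to one of those open elements; that
-- close tag is left in the remaining input for its owner to consume.
siblings :: [String] -> [Tag] -> ([Node], [Tag])
siblings _ [] = ([], [])
siblings open (TagText s : rest) =
  let (ns, rest') = siblings open rest
  in (Text s : ns, rest')
siblings open (TagOpen name attrs : rest) =
  let (children, afterChildren) = siblings (name : open) rest
      -- Consume our own close tag if it is next; otherwise the element was
      -- closed implicitly by an ancestor's close tag or by end of input.
      afterClose = case afterChildren of
        (TagClose n : more) | n == name -> more
        other -> other
      (ns, rest') = siblings open afterClose
  in (Element name attrs children : ns, rest')
siblings open ts@(TagClose name : rest)
  | name `elem` open = ([], ts)        -- closes an open element: stop here
  | otherwise = siblings open rest     -- stray close tag: skip it

-- Turn a flat tag list into a forest of nodes.
buildForest :: [Tag] -> [Node]
buildForest = fst . siblings []
```

Feeding it the tags for `<div id="a"><p>hi</div>x` gives a div element containing a p element (closed implicitly by the div's close tag), followed by a top-level text node.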