
18 Apr
2011
5:41 p.m.
Since the document claims to be HTML, you should parse it with an HTML parser. Try hxt-tagsoup, specifically the "parseHtmlTagSoup" arrow.
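A minimal sketch of what that could look like, assuming the hxt 9.x and hxt-tagsoup 9.x packages. In those versions, as far as I know, the usual way to get the TagSoup parser is the `withTagSoup` configuration option passed to `readString` or `readDocument` rather than calling the parser arrow directly, so that is what is used here. The sample HTML string and the choice to extract `href` attributes are made up for illustration.

```haskell
import Text.XML.HXT.Core
import Text.XML.HXT.TagSoup (withTagSoup)

-- Hypothetical sample input: deliberately sloppy HTML (unclosed <p> tags).
sampleHtml :: String
sampleHtml =
  "<html><body><p>First<p>Second <a href=\"http://example.com/\">link</a></body></html>"

main :: IO ()
main = do
  -- withParseHTML selects HTML parsing; withTagSoup swaps in the lenient
  -- TagSoup-based parser; withWarnings no silences parser warnings.
  links <- runX $
    readString [withParseHTML yes, withTagSoup, withWarnings no] sampleHtml
      >>> deep (isElem >>> hasName "a")
      >>> getAttrValue "href"
  mapM_ putStrLn links
```

To read from a file or URL instead of a string, `readDocument` takes the same list of options followed by the source location.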