
Peter,

I haven't looked at this code in a while, but... as far as I'm aware it's stable and reliable. The parser was written to follow, as closely as I could manage, the specification in RFC2396 (http://www.ietf.org/rfc/rfc2396.txt) - experience in writing this parser was used as feedback (among many others) in the development of RFC2396.

The parser does not attempt to be in any respect scheme-aware. The parentheses here are, as far as I'm aware, quite legitimate in a generic URI, and I think no warning or refusal is appropriate for a generic URI parser. (URIs can be and are used in many places other than web pages.)

However, there are additional constraints that may be appropriate for specific URI schemes - maybe like reserving parentheses as you suggest - and were I to implement these I would do so in a layer built upon the generic URI parser: given the generic parse, a lookup on the scheme name could select an additional validation function.

#g
--

Peter Gammie wrote:
Hello,
I'm wondering what the state of this parser is.
It parses the contents of the src attribute in the following:
<p><img src="javascript:alert('XSS');" alt=""/></p>
which causes IE 5.5 (and probably 6) to show a dialog box. (I lifted this example from the list at http://ha.ckers.org/xss.html)
I was hoping the parser in Network.URI would choke on it - the parentheses are reserved, at least.
cheers
peter
-- Graham Klyne Contact info: http://www.ninebynine.org/#Contact
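
For illustration only, here is a minimal sketch of the layered approach described in the reply: parse generically with Network.URI first, then look up the scheme name to select an extra validation function. The names SchemeCheck, schemeChecks, noParens and validateURI are hypothetical and not part of Network.URI, and the policies in the table (rejecting parentheses for http/https, refusing javascript: outright) are assumptions made up for this example. It assumes the network-uri and containers packages are available.

```haskell
import Data.Char (toLower)
import qualified Data.Map as Map
import Network.URI (URI (..), parseURI)

-- Hypothetical: a scheme-specific check either accepts the parsed URI
-- or rejects it with a message.
type SchemeCheck = URI -> Either String URI

-- Hypothetical policy table, keyed by lower-case scheme name without the colon.
schemeChecks :: Map.Map String SchemeCheck
schemeChecks =
  Map.fromList
    [ ("http", noParens)
    , ("https", noParens)
    , ("javascript", const (Left "javascript: URIs rejected by policy"))
    ]

-- Example policy (an assumption, not a rule of generic URI syntax):
-- refuse parentheses in the path or query.
noParens :: SchemeCheck
noParens u
  | any (`elem` "()") (uriPath u ++ uriQuery u) = Left "parentheses rejected by policy"
  | otherwise = Right u

-- Generic parse first; then the scheme name selects an additional check.
-- Schemes with no entry in the table are accepted as parsed.
validateURI :: String -> Either String URI
validateURI s =
  case parseURI s of
    Nothing -> Left "not a valid generic URI"
    Just u ->
      -- uriScheme keeps the trailing colon (e.g. "http:"), so strip it.
      let scheme = map toLower (takeWhile (/= ':') (uriScheme u))
       in Map.findWithDefault Right scheme schemeChecks u
```

With this arrangement the generic parser stays scheme-agnostic, and an application that embeds URIs in web pages can apply validateURI to an img src value, while other uses of URIs are free to call parseURI directly.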