
Graham:

Thanks for your sterling efforts here. I concur with your professional opinion. I will investigate what I can layer on top of your library.

cheers
peter

On 27/05/2008, at 3:52 PM, Graham Klyne wrote:
Peter,
[I should be more careful ... I meant RFC 3986, as in:]
I haven't looked at this code in a while, but... as far as I'm aware it's stable and reliable. The parser was written to follow, as closely as I could manage, the specification in RFC 3986 (http://www.ietf.org/rfc/rfc3986.txt) - experience in writing this parser was used as feedback (among many others) in the development of RFC 3986.
The parser does not attempt to be in any respect scheme-aware. The parentheses here are, as far as I'm aware, quite legitimate in a generic URI, and I think no warning or refusal is appropriate for a generic URI parser. (URIs can be and are used in many places other than web pages.)
However, there are additional constraints that may be appropriate for specific URI schemes - maybe like reserving parentheses as you suggest - and were I to implement these I would do so in a layer built upon the generic URI parser: given the generic parse, a lookup on the scheme name could select an additional validation function.
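A minimal sketch of that layering, assuming the Network.URI module from the network-uri package. The names SchemeCheck, schemeChecks, noParens and validate are hypothetical and not part of the library, and the rules in the table are only illustrations of what a scheme-specific policy might look like:

```haskell
import Data.Char (toLower)
import Network.URI (URI, parseURI, uriPath, uriScheme)

-- Hypothetical type: a scheme-specific check returns Nothing when the
-- URI is acceptable, or Just a reason when it should be rejected.
type SchemeCheck = URI -> Maybe String

-- Hypothetical policy table. uriScheme includes the trailing colon,
-- so the keys do too. These rules are examples, not recommendations.
schemeChecks :: [(String, SchemeCheck)]
schemeChecks =
  [ ("javascript:", const (Just "javascript: URIs are not allowed here"))
  , ("http:", noParens)
  , ("https:", noParens)
  ]

-- Example of an extra constraint: refuse parentheses in the path.
noParens :: SchemeCheck
noParens u
  | any (`elem` "()") (uriPath u) = Just "parentheses in path"
  | otherwise = Nothing

-- Generic parse first, then look up the scheme name (case-insensitively)
-- to select an additional validation function. Schemes without an
-- entry are accepted on the strength of the generic parse alone.
validate :: String -> Either String URI
validate s =
  case parseURI s of
    Nothing -> Left "not a valid generic URI"
    Just u ->
      case lookup (map toLower (uriScheme u)) schemeChecks of
        Nothing -> Right u
        Just check -> maybe (Right u) Left (check u)

main :: IO ()
main = mapM_ (print . validate)
  [ "javascript:alert('XSS');"
  , "http://example.org/a(b)"
  , "http://example.org/ok"
  ]
```

The generic parser stays untouched in this arrangement: parseURI still accepts the javascript: example as a well-formed generic URI, and only the layer above decides that it is unwelcome in a particular context such as an img src attribute.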
#g --
Peter Gammie wrote:
Hello,

I'm wondering what the state of this parser is. It parses the contents of the src attribute in the following:

<p><img src="javascript:alert('XSS');" alt=""/></p>

which causes IE 5.5 (and probably 6) to show a dialog box. (I lifted this example from the list at http://ha.ckers.org/xss.html)

I was hoping the parser in Network.URI would choke on it - the parentheses are reserved, at least.

cheers
peter
-- Graham Klyne Contact info: http://www.ninebynine.org/#Contact