
On Friday, 14 October 2005 16:25, you wrote:
> On Fri, Oct 14, 2005 at 04:20:24PM +0200, Wolfgang Jeltsch wrote:
> > I could never understand why one has to write regular expressions
> > as strings
>
> Because the language used inside these strings is standard,
> multi-language, widely used and documented?

Well, in my opinion, the standard regexp syntax is rather awkward, so diverging from the standard might be a good thing. However, my proposal was not about introducing a new syntax. If I had merely wanted a different syntax, I would have used strings for representing regexps as well.

My main point is to not use strings for representing regexps at runtime, because this means that parsing is done at runtime. This might result in a loss of efficiency. In addition, no syntax checks can be done at compile time. The situation gets worse if you try to manipulate regular expressions.

Now let's consider using an algebraic datatype for regexps:

    data RegExp = Empty
                | Single Char
                | RegExp :+: RegExp
                | RegExp :|: RegExp
                | Iter RegExp

Manipulating regular expressions now becomes easy and safe: you are simply unable to create "syntactically incorrect regular expressions", since at runtime you don't deal with syntax at all.

In addition, the use of a special datatype can provide more flexibility. Representing regexps as strings means that regexps can only denote sets of strings. In contrast, the above datatype could easily be extended to allow arbitrary lists instead of just strings:

    data RegExp token = Empty
                      | Single token
                      | RegExp token :+: RegExp token
                      | ...

If you really need a Perl-like syntax for regular expressions, the strings representing the regexps should be parsed at compile time and transformed into expressions of a special regexp datatype like the one above.

However, I don't like the idea of extending the language with a special regexp syntax. Why should a specific, albeit common, syntax for a special case of regexps (string-only) get special treatment? What about things other than regexps? Should they also get a language extension? I'd say that the better way would be to use Template Haskell for this purpose:

    myRegExp = $(regExp "[a-z0-9]")

This way, special syntaxes are not hard-wired into the language but can be activated by importing a corresponding module.

Best wishes,
Wolfgang
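
As a rough illustration of the datatype approach described in the message above, here is a minimal sketch of the token-parameterised RegExp type together with a small matcher. Several things here are assumptions rather than part of the original proposal: the message elides the remaining constructors with "...", so the sketch simply reuses :|: and Iter from the Char-only version; Empty is read as "matches the empty sequence"; the fixity declarations are chosen for convenience; and the names matchK and matches are hypothetical helpers.

```haskell
-- Assumed fixities: sequencing (:+:) binds tighter than alternation (:|:).
infixr 6 :+:
infixr 5 :|:

-- Regular expressions over an arbitrary token type.
-- Assumption: Empty denotes the expression matching the empty sequence.
data RegExp token
  = Empty
  | Single token
  | RegExp token :+: RegExp token   -- sequence
  | RegExp token :|: RegExp token   -- alternative
  | Iter (RegExp token)             -- zero or more repetitions
  deriving (Eq, Show)

-- matchK r ts k: can r consume some prefix of ts such that the
-- continuation k accepts the remaining tokens?
matchK :: Eq token => RegExp token -> [token] -> ([token] -> Bool) -> Bool
matchK Empty      ts k = k ts
matchK (Single c) ts k = case ts of
  (t : rest) | t == c -> k rest
  _                   -> False
matchK (r :+: s)  ts k = matchK r ts (\rest -> matchK s rest k)
matchK (r :|: s)  ts k = matchK r ts k || matchK s ts k
matchK (Iter r)   ts k =
  k ts || matchK r ts (\rest ->
    -- Only iterate again if r consumed something; this keeps the
    -- matcher from looping when r can match the empty sequence.
    length rest < length ts && matchK (Iter r) rest k)

-- Does the expression match the whole input?
matches :: Eq token => RegExp token -> [token] -> Bool
matches r ts = matchK r ts null
```

For example, matches (Iter (Single 'a' :|: Single 'b') :+: Single 'c') "abac" yields True, and because the type is parameterised, the same matcher works on non-string input such as matches (Iter (Single 1 :+: Single 2)) [1, 2, 1, 2 :: Int]. No string is parsed at runtime; an ill-formed expression is rejected by the type checker instead.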