ANN: TextRegexLazy-0.56, (=~) and (=~~) are here

Announcing: TextRegexLazy version 0.56

Where: Tarball from http://sourceforge.net/projects/lazy-regex
       darcs get --partial [--tag=0.56] http://evenmere.org/~chrisk/trl/stable/

License: BSD, except for DFAEngine.hs which is LGPL (derived from CTK light)

Development/unstable version is at:
       darcs get [--partial] http://evenmere.org/~chrisk/trl/devel/

This is the version that has eaten John Meacham's JRegex library and survived to become strong. Thanks John!

It now compiles against the POSIX regexp provided by the C library and against the PCRE library, in addition to the "full lazy" and the "DFA" backends.

All 4 backends can accept regular expressions given as String and as ByteString.

All 4 backends can run regular expressions against String and ByteString.

In particular, the PosixRE and PCRE backends can run very efficiently against ByteString. (Though the input for PosixRE needs to end in a \NUL character for efficiency.)

So there are 4*2*2 = 16 ways to provide input to this library. And the RegexContext class has at least 11 instances that both (=~) and (=~~) can target. So that is 4*2*2*11*2 = 352 things you can do with this library! Get your copy today!

To run with cabal before 1.1.4 you will need to comment out the "Extra-Source-Files:" line in the TextRegexLazy.cabal file.

The Example.hs file:
{-# OPTIONS_GHC -fglasgow-exts #-}
import Text.Regex.Lazy
import Text.Regex.Full((=~),(=~~)) -- or DFA or PCRE or PosixRE

main = let b :: Bool
           b = ("abaca" =~ "(.)a")
           c :: [MatchArray]
           c = ("abaca" =~ "(.)a")
           d :: Maybe (String,String,String,[String])
           d = ("abaca" =~~ "(.)a")
       in do print b
             print c
             print d
This produces:
True
[array (0,1) [(0,(1,2)),(1,(1,1))],array (0,1) [(0,(3,2)),(1,(3,1))]]
Just ("a","ba","ca",["b"])
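To help read the second line of that output, here is a minimal sketch that turns each array back into substrings. It assumes each pair is (offset, length), which is consistent with the third line, where the match "ba" sits between "a" and "ca". The local MatchArray synonym, exampleMatches and matchStrings are names made up for this illustration; the library's own MatchArray definition is not shown in this message, and the sketch uses only Data.Array, not TextRegexLazy itself.

```haskell
import Data.Array (Array, elems, listArray)

-- Assumption: a stand-in with the same shape as the printed output;
-- this is not the library's own definition.
type MatchArray = Array Int (Int, Int)

-- The two arrays printed by Example.hs for "abaca" =~ "(.)a".
exampleMatches :: [MatchArray]
exampleMatches =
  [ listArray (0, 1) [(1, 2), (1, 1)]
  , listArray (0, 1) [(3, 2), (3, 1)]
  ]

-- Slice the input for every (offset, length) pair. Index 0 is the whole
-- match; higher indices are the parenthesised groups.
matchStrings :: String -> MatchArray -> [String]
matchStrings input = map slice . elems
  where
    slice (off, len) = take len (drop off input)

main :: IO ()
main = mapM_ (print . matchStrings "abaca") exampleMatches
```

Under that reading, the first array covers "ba" with group "b", and the second covers "ca" with group "c".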
You can also use makeRegex and makeRegexOpts to compile and save a regular expression which will be used multiple times. Each of the 4 backends has a separate "Regex" data type with its own option types.

For low level access, the WrapPCRE and WrapPosix modules expose a typesafe layer around the C libraries. You can query "getVersion :: Maybe String" to see if they have been compiled into the library.

It may be possible to use WrapPCRE and the UTF8 option flags to do Unicode regex matching with PCRE. (The Full and DFA backends use the Haskell Unicode Char already.)

Supporting new types in addition to String/ByteString is a matter of adding instances to the existing classes.

Feedback and comments of any length are welcome.

--
Chris Kuklewicz

Ooops. I just patched the efficiency of ByteStringPCRE to agree with the original announcement.

Use
       darcs get --partial http://evenmere.org/~chrisk/trl/stable/
to get the fixed version. A new 0.57 tarball will go to sourceforge soon.

Chris Kuklewicz wrote:
Announcing: TextRegexLazy version 0.56 Where: Tarball from http://sourceforge.net/projects/lazy-regex darcs get --partial [--tag=0.56] http://evenmere.org/~chrisk/trl/stable/ License : BSD, except for
Great! - Thanks for all your hard work in making this available to everyone!
DFAEngine.hs which is LGPL (derived from CTK light)
I sense some possible problems coming... [in another post]
Bulat Ziganshin wrote:
Hello Chris,
Wednesday, August 2, 2006, 3:16:58 PM, you wrote:
Announcing: TextRegexLazy version 0.56
your feature list is really strong! it will be great now to make it a part of GHC standard distribution
Does the LGPL license for DFAEngine.hs use the static linking exception or not?

If so, and if it is desirable to allow LGPL code without the static linking exception into the standard lib distro, then perhaps a useful project for someone would be to write a Haskell program that traverses the source for an app and builds an appropriate static library containing the object code for all non-LGPL modules, with debug info stripped etc so that it's obfuscated (this being the object that must also be distributed to satisfy the linkage requirements of the LGPL licences for the other modules + GMP math lib). The program would also make a list of all the LGPL modules and create a text file containing the total merged licence agreement that one needs to distribute with one's exe.

Note however that unfortunately this might not solve the problem in the face of whole-program optimization, unless the LGPL conditions would be satisfied by the ability to build a non-optimized app, but I've a feeling (though I'm certainly not a lawyer or legal expert) that this might unfortunately be a bit optimistic.

I wonder what deviation from the code in DFAEngine.hs would be legally regarded as being "different" code, so we could make the modifications and put a BSD3 licence on it.

Should licences and other legal stuff (the ghosts of Ancient Rome which had their rightful place at a much earlier stage of human history) apply to type definitions, mathematical insights, functions etc in the first place? (Shakespeare's play "The Merchant of Venice" says it all.)

On a more positive note, I note that the European Parliament voted (last year iirc) that software patents are just a lot of rubbish and are null and void in Europe, so at least that's one tender bud of common sense that's managed to burst through the asphalt.

Regards, Brian.

--
Freedom has no strings attached. Laws originate all the human misery on the planet.
http://www.metamilk.com