
I have a string that needs to be split/tokenized based on a delimiter. This is easy with 'break' if the delimiter is a single character (e.g. break isSpace "this is a string"), but I can't see any way of using it for a delimiter with multiple characters.
In this case, I have a string containing multiple fields separated by *two* blank lines (\n\n). I can't just break on the newline character, as single newline characters can appear inside each field.
Any idea how I can do this without too much hassle?
-James (jamesd@mena.org.au)

G'day all. On Wed, Apr 02, 2003 at 11:26:46AM +1000, jamesd@mena.org.au wrote:
In this case, I have a string containing multiple fields separated by *two* blank lines (\n\n). I can't just break on the newline character, as single newline characters can appear inside each field.
Any idea how I can do this without too much hassle?
Here's some code I wrote some time ago which does Knuth-Morris-Pratt string searching:
http://haskell.org/wiki/wiki?RunTimeCompilation
Note that there are a couple of differences between matchKMP and break which you will no doubt discover.
Cheers,
Andrew Bromage
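For readers who only want a `break`-like function for a multi-character delimiter, the following is a minimal sketch. It is not the matchKMP code mentioned above: it does a naive prefix check at every position instead of Knuth-Morris-Pratt, and `breakOn` is a hypothetical name chosen for illustration.

```haskell
import Data.List (isPrefixOf)

-- | Split a list at the first occurrence of a delimiter.
-- Like 'break', the second component still begins with the delimiter.
-- Naive search: O(n * m) in the worst case, unlike KMP.
-- (Hypothetical helper, not part of any library mentioned in this thread.)
breakOn :: Eq a => [a] -> [a] -> ([a], [a])
breakOn delim = go
  where
    -- Delimiter never found: everything belongs to the first component.
    go [] = ([], [])
    go s@(x:xs)
      | delim `isPrefixOf` s = ([], s)
      | otherwise            = let (before, after) = go xs in (x : before, after)
```

With this definition, `breakOn "\n\n" "a\nb\n\nc"` evaluates to `("a\nb","\n\nc")`, so the single newline inside the first field is left alone.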

On Wed, Apr 02, 2003 at 11:26:46AM +1000, jamesd@mena.org.au wrote:
In this case, I have a string containing multiple fields separated by *two* blank lines (\n\n). I can't just break on the newline character, as single newline characters can appear inside each field.
Any idea how I can do this without too much hassle?
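Here's a somewhat stupid, and somewhat compact, way of doing this, assuming that three blank lines count as an identical delimiter. It doesn't even pretend to attempt to solve the general case of multi-character delimiters, but instead breaks on empty lines.

    import Data.List (groupBy)

    -- 'break' returns a pair, not a list of fields, so group runs of empty
    -- and non-empty lines instead and drop the empty runs.
    tok s = map unlines $ filter (not . all null) $ groupBy (\a b -> null a == null b) $ lines s

(warning: untested code!)
David Roundy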

I have a string that needs to be split/tokenized based on a delimiter. This is easy with 'break' if the delimiter is a single character (e.g. break isSpace "this is a string"), but I can't see any way of using it for a delimiter with multiple characters.
In this case, I have a string containing multiple fields separated by *two* blank lines (\n\n). I can't just break on the newline character, as single newline characters can appear inside each field.
Any idea how I can do this without too much hassle?
There's a split function that does this in lambdabot's cvs tree:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/haskell-libs/libs/lambdabot/U...
Here's a demo:
Prelude Util> split "foo" "bazfoobarfooblipp"
["baz","bar","blipp"]
--
Shae Matijs Erisson - 2 days older than RFC0226
#haskell on irc.freenode.net - We Put the Funk in Funktion
10 PRINT "HELLO" 20 GOTO 10 ; putStr $ fix ("HELLO\n"++)
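Since the link above is truncated, here is a minimal sketch of a function that reproduces the demo. It is not lambdabot's implementation: `splitOnDelim` is a hypothetical name, the search is a naive prefix check, and the behaviour on edge cases (empty delimiter, trailing delimiter) is an assumption rather than something the thread specifies.

```haskell
import Data.List (isPrefixOf)

-- | Split a list on every occurrence of a non-empty delimiter,
-- dropping the delimiter itself. Delimiter comes first, as in the demo.
-- (Hypothetical helper; not the lambdabot 'split'.)
splitOnDelim :: Eq a => [a] -> [a] -> [[a]]
splitOnDelim []    _ = error "splitOnDelim: empty delimiter"
splitOnDelim delim s = go s
  where
    n = length delim
    -- End of input closes the current (possibly empty) field.
    go [] = [[]]
    go str@(x:xs)
      -- Delimiter found: end the current field and skip past the delimiter.
      | delim `isPrefixOf` str = [] : go (drop n str)
      -- Otherwise prepend this element to the field currently being built.
      | otherwise = case go xs of
          (field : rest) -> (x : field) : rest
          []             -> [[x]]  -- unreachable: go never returns []
```

Under these assumptions, `splitOnDelim "foo" "bazfoobarfooblipp"` gives `["baz","bar","blipp"]`, and `splitOnDelim "\n\n" "a\nb\n\nc"` gives `["a\nb","c"]`, which is the behaviour asked for in the original question.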
participants (4)
- Andrew J Bromage
- David Roundy
- jamesd@mena.org.au
- Shae Matijs Erisson