Regular Expression to Determinate Finite Automata translator

Hi, I am a Haskell newbie. I have coded a Regular Expression to Determinate Finite Automata translator. Algorithm from the Dragon Book. Would someone eyeball the code and give me suggestions please. I have not done anything on character classes yet though. And the parsing is a bit of a hack. What I am not sure about is having to have multiple versions of similar datatype, each with variations in order to enumerate and generate followPos set. Is there a better way of implementing this ? Many thanks in advance, Aaron

Some comments: - You can run your code thru HLint, here it gives me 27 suggestions. - Why don't you derive the Show instance for RE instead of writing it by yourself? - Note that do x do y ... is the same as do x y ... - You can parametrize RE as data RE s p = Epsilon | Leaf Char s p | Selection (RE s p) (RE s p) | Sequence (RE s p) (RE s p) | Kleene (RE s p) | Optional (RE s p) | End s deriving (Show) type RE1 = RE () () type RE2 = RE State () type RE3 = RE State Pos Cheers! =) -- Felipe.

The simplest way to make a recogniser out of a RE is to use one of the available parsing libraries: module RE where import Text.ParserCombinators.UU import Text.ParserCombinators.UU.Examples data RE = Epsilon | Leaf Char | Selection RE RE | Sequence RE RE | Kleene RE | Optional RE | End re_to_fsm :: RE -> Parser String re_to_fsm re = case re of Leaf c -> lift <$> pSym c Selection re1 re2 -> re_to_fsm re1 <|> re_to_fsm re2 Sequence re1 re2 -> (++) <$> re_to_fsm re1 <*> re_to_fsm re2 Kleene re -> concat <$> pList (re_to_fsm re) Optional re -> re_to_fsm re `opt` "" End -> pure "" t = re_to_fsm ((Kleene (Leaf 'a') `Sequence` Kleene (Leaf 'b')) `Selection` (Kleene (Leaf 'a') `Sequence` (Kleene (Leaf 'c') ))) t1 = run t "aaabbb" t2 = run t "aaaaccccccc" t3 = run t "aaddcc" test = run (re_to_fsm (Kleene (Leaf 'a') `Sequence` Kleen (Left 'b')) "aaabbb" *RE> t1 -- -- > Result: "aaabbb" -- *RE> t2 -- -- > Result: "aaaaccccccc" -- *RE> t3 -- -- > Result: "aacc" -- > Correcting steps: -- > Deleted 'd' at position 2 expecting one of ['a', 'c', 'a', 'b'] -- > Deleted 'd' at position 3 expecting 'c' -- *RE> On 22 jul 2010, at 20:51, Aaron Gray wrote:
Hi,
I am a Haskell newbie. I have coded a Regular Expression to Determinate Finite Automata translator. Algorithm from the Dragon Book.
Would someone eyeball the code and give me suggestions please.
I have not done anything on character classes yet though. And the parsing is a bit of a hack.
What I am not sure about is having to have multiple versions of similar datatype, each with variations in order to enumerate and generate followPos set.
Is there a better way of implementing this ?
Many thanks in advance,
Aaron
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (3)
-
Aaron Gray
-
Felipe Lessa
-
S. Doaitse Swierstra