On Sun, Jan 23, 2011 at 5:23 PM, Stephen Tetley
<stephen.tetley@gmail.com> wrote:
> I don't think you can do this "simply" as you think you would always
> have to build a parse tree.
Isn't it enough to maintain a stack of open parens, brackets, char- and string-terminators and escape chars? Below is my attempt at solving the problem without an expression parser.
> In practice, if you follow the skeleton syntax tree style you might
> find "not caring" about the details of syntax is almost as much work
> as caring about them. I've tried a couple of times to make a skeleton
> parser that does paren nesting and little else, but always given up
> and just used a proper parser as the skeleton parser never seemed
> robust.
Indeed, I doubt that the implementation below is robust, and it's too tricky to be easily maintainable. I include it for reference. Let me know if you spot an obvious mistake.

splitTLC :: String -> [String]
splitTLC = parse ""

type Stack = String

parse :: Stack -> String -> [String]
parse _  ""     = []
parse st (c:cs) = next c st $ parse (updStack c st) cs

next :: Char -> Stack -> [String] -> [String]
next c []    xs = if c==',' then [] : xs else c <: xs
next c (_:_) xs = c <: xs

infixr 0 <:

(<:) :: Char -> [String] -> [String]
c <: []     = [[c]]
c <: (x:xs) = (c:x):xs

updStack :: Char -> Stack -> Stack
updStack char stack =
  case (char,stack) of
    -- char is an escaped character
    (_   , '\\':xs) -> xs        -- the next character is not escaped
    -- char is the escape character
    ('\\', xs)      -> '\\':xs   -- push it on the stack
    -- char is the string terminator
    ('"' , '"':xs)  -> xs        -- closes current string literal
    ('"' , '\'':xs) -> '\'':xs   -- ignored inside character
    ('"' , xs)      -> '"':xs    -- opens a new string
    -- char is the character terminator
    ('\'', '\'':xs) -> xs        -- closes current character literal
    ('\'', '"':xs)  -> '"':xs    -- ignored inside string
    ('\'', xs)      -> '\'':xs   -- opens a new character
    -- parens and brackets
    (_   , '"':xs)  -> '"':xs    -- are ignored inside strings
    (_   , '\'':xs) -> '\'':xs   -- and characters
    ('(' , xs)      -> '(':xs    -- new opening paren
    (')' , '(':xs)  -> xs        -- closing paren
    ('[' , xs)      -> '[':xs    -- opening bracket
    (']' , '[':xs)  -> xs        -- closing bracket
    -- other characters don't modify the stack (ignoring record syntax)
    (_   , xs)      -> xs
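
To make the intended behaviour easier to check, here is a condensed restatement of the same stack-based idea that loads on its own. It is only a sketch: the names splitTop, go, cons and push are hypothetical and not part of the code above. It is meant to follow the same rules (split on a comma only when the stack is empty), not to be more robust.

```haskell
-- Hypothetical condensed variant of the splitter above; not the original code.
-- The stack holds currently open delimiters: '(' '[' '"' '\'' and a pending '\\'.
splitTop :: String -> [String]
splitTop = go []
  where
    go _  []     = []
    go st (c:cs)
      | null st && c == ',' = [] : rest      -- top-level comma: start a new field
      | otherwise           = c `cons` rest  -- otherwise extend the current field
      where rest = go (push c st) cs

    -- prepend a character to the first field, creating it if necessary
    cons c []     = [[c]]
    cons c (x:xs) = (c:x) : xs

    push _    ('\\':st) = st                 -- c was escaped: drop the escape marker
    push '\\' st        = '\\' : st          -- remember a pending escape
    push c    (q:st)
      | q `elem` "\"'"  = if c == q then st else q : st  -- inside a literal
    push c    st
      | c `elem` "\"'([" = c : st            -- opening delimiter
    push ')'  ('(':st)  = st                 -- matching closing paren
    push ']'  ('[':st)  = st                 -- matching closing bracket
    push _    st        = st                 -- everything else leaves the stack alone

-- splitTop "f (a, b), [c, d]"  evaluates to  ["f (a, b)"," [c, d]"]
-- splitTop "\"e,f\", g"        evaluates to  ["\"e,f\""," g"]
-- splitTop "a,"                evaluates to  ["a"]
```

As the last sample shows, a trailing top-level comma does not produce a final empty field, and whitespace after a comma stays with the following field. The same holds for splitTLC as written above.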