
Hi David,
What you are wanting to do is query and transform a reasonably large
AST. Fortunately there are solutions, generic programming should do
exactly what you want without too much hassle. I'd personally
recommend taking a look at the Uniplate library, starting by reading
the Haskell Workshop 2007 paper (its an academic paper, but for you
the really useful stuff is in Section 2 which is more tutorial style)
- http://www-users.cs.york.ac.uk/~ndm/uniplate/
I'm sure I could solve your particular queries using Uniplate, but
I'll leave it for you to take a look and figure out. If you can't
figure out how to do it after reading section 2 of the paper mail back
and I'll take a closer look.
Thanks
Neil
2009/3/3 David Miani
Hi, I'm working on a Haskell library for interacting with emacs org files. For those that do not know, an org file is a structured outline style file that has nested headings, text, tables and other elements. For example:
* Heading 1 Some text, more text. This is a subelement of Heading 1 1. You can also have list 1. and nested lists 2. more...
** Nested Heading (subelement of Heading 1) text... (subelement of Nested Heading)
** Another level 2 heading (subelement of Heading 1) | Desc | Value | |-----------+--------------------------------------| | Table | You can also have tables in the file | | another | row | | seperator | you can have seps as well, eg | |-----------+--------------------------------------|
* Another top level heading There are many more features, see orgmode.org
My library enables read and write access to a subset of this format (eg lists aren't parsed atm).
The data structures used for writing are:
data OrgFile = OrgFile [OrgFileElement] data OrgFileElement = Table OrgTable | Paragraph String | Heading OrgHeading
-- heading level, title, subelements data OrgHeading = OrgHeading Int String [OrgFileElement]
data OrgTable = OrgTable [OrgTableRow]
data OrgTableRow = OrgTableRow [String] | OrgTableRowSep
To write a file you contruct a OrgFile out of those elements, and pass it to a writeOrgFile func. Eg:
writeOrg $ OrgFile [Heading (OrgHeading 1 "h1" [Paragraph "str"])] would produce: * h1 str
I was going to use the same data structures for reading an org file, but it quickly became apparent that this would not be suitable, as you needed the position of the file of an element to be able to report errors. Eg if you needed to report an error that a number was expected, the message "'cat' is not a number" is not very useful, but "Line 2031: 'cat' is not a number" is. So the data structures I used were:
data FilePosition = FilePosition Line Column
data WithPos a = WithPos { filePos :: FilePosition, innerValue :: a }
data OrgTableP = OrgTableP [WithPos OrgTableRow]
data OrgFileElementP = TableP OrgTableP | ParagraphP String | HeadingP OrgHeadingP
data OrgHeadingP = OrgHeadingP Int String [WithPos OrgFileElementP]
data OrgFileP = OrgFileP [WithPos OrgFileElementP]
Finally there is a function readOrg, which takes a string, and returns an OrgTableP.
Now, this all works as expected (files are correctly being parsed and written), however I am having a lot of trouble trying to come up with a decent API to work with this. While writing an OrgFile is fairly easy, reading (and accessing inner parts) of an org file is very tedious, and modifying them is horrendous.
For example, to read the description line for the project named "Project14" in the file:
* 2007 Projects ** Project 1 Description: 1 Tags: None ** Project 2 Tags: asdf,fdsa Description: hello * 2008 Projects * 2009 Projects ** Project14 Tags: RightProject Description: we want this
requires the code:
type ErrorS = String listToEither str [] = Left str listToEither _ (x:_) = Right x
get14 :: OrgFileP -> Either ErrorS String get14 (OrgFileP elements) = getDesc =<< (getRightProject . concatProjects) elements where concatProjects :: [WithPos OrgFileElementP] -> [OrgHeadingP] concatProjects [] = [] concatProjects ((WithPos _ (HeadingP h)) : rest) = h : concatProjects rest concatProjects (_ : rest) = concatProjects rest
getRightProject :: [OrgHeadingP] -> Either ErrorS OrgHeadingP getRightProject = listToEither "Couldn't find project14" . filter (\(OrgHeadingP _ name _) -> name == "Project14")
getDesc :: OrgHeadingP -> Either ErrorS String getDesc (OrgHeadingP _ _ children) = case filter paragraphWithDesc (map innerValue children) of [] -> Left $ show (filePos $ head children) ++ ": Couldn't find desc in project" ((ParagraphP str):_) -> Right str _ -> error "should not be possible"
paragraphWithDesc :: OrgFileElementP -> Bool paragraphWithDesc (ParagraphP str) = str =~ "Description" paragraphWithDesc _ = False
If you think that is bad, try writing a function that adds the Tag "Hard" to Project2 :(
What I really need is a DSL that would allow sql like queries on an OrgFileP. For example: select (anyHeading `next` headingWithName "Project14" `withFailMsg` "couldn't find p14" `next` paragraphMatchingRegex "Description" `withFailMsg` "no desc") org `output` paragraphText
would return a String OR select (anyHeading `next` headingWithName "Project2" `next` paragraphMatchingRegex "Tag:") org `modify` paragraphText (++ ",Hard")
would return an OrgFile, with the new Hard tag added.
However, I don't know if this is even possible, how to do it, or if there is a better alternative to this. I would really apreciate any hints with regards to this. It would be useful to know if there are other libraries that also face this problem, and how they solved it.
Finally, I would be grateful for any other advice regarding my code. One thing that has bugged me is my solution for having file position info - my solution never seemed very elegant.
Thanks, David
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe