Haskell XSLT interpreter?


S. Alexander Jacobson

11 Feb 2006

7:51 p.m.

Has anyone written a pure Haskell XSLT interpreter? If not, how difficult would it be to do so?

-Alex-

______________________________________________________________
S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com


Neil Mitchell

11 Feb

8:26 p.m.

Hi,

I don't know of any, but there may well be one; I've never looked.

It probably wouldn't be that difficult to do, since XSLT is a functional language. There is probably lots of code in HaXml you could reuse (since the syntax for XSLT is XML). The only slightly taxing thing would be that XSLT is not pure (see the document function), so you may have to put most of it in the IO Monad.

This would be very handy since Yhc uses XSLT to do some stuff (for example, http://www-users.cs.york.ac.uk/~ndm/yhc/bytecodes.html) and currently the choices seem to be MSXSL (which is great for me on Windows, but sucks a bit for others), or Xalan, which is very slow. Having a Haskell XSLT is something I considered doing before, but never got round to...

Thanks

Neil

On 11/02/06, S. Alexander Jacobson wrote:

> Has anyone written a pure haskell xslt interpreter? If not, how difficult would it be to do so?
>
> -Alex-

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Colin Paul Adams

8:32 p.m.

"Neil" == Neil Mitchell writes:

Neil> Hi, I don't know of any, but there may well be, I've never
Neil> looked.

Neil> It probably wouldn't be that difficult to do, since XSLT is
Neil> a functional language. There is probably lots of code in
Neil> HaXml you could reuse (since the syntax for XSLT is
Neil> XML). The only slightly taxing thing would be that XSLT is
Neil> not pure (see the document function), so you may have to put
Neil> most of it in the IO Monad.

In what way is the document() function not pure?

In XSLT 2.0 at least, it is defined in such a way that it can be implemented as a pure function.

That is, the static context is defined to include a mapping from URIs to document nodes, and document() returns those document nodes that its argument nodes map to.

--
Colin Adams
Preston Lancashire

Neil Mitchell

12 Feb

1:22 p.m.

> In what way is the document() function not pure?

See [http://www.w3schools.com/xsl/func_document.asp]. In particular their example:

<xsl:value-of select="document('celsius.xml')/celsius/result[@value=$value]"/>

i.e. this function needs to load the file celsius.xml. Note that instead of celsius.xml there could be an arbitrarily complex expression here that builds up a filename.

> That is, the static context is defined to include a mapping from URIs to document nodes, and document() returns those document nodes that its argument nodes map to.

I think I understand what you are saying, but I might have missed something. While you can consider it as a mapping from URI to document node, there are an infinite number of URIs, and therefore an infinite amount of context. A more practical implementation would be to retrieve the document nodes as they are requested, and then cache them - but this means having IO to do the initial retrieval.

Thanks

Neil
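The retrieve-then-cache approach described above might look roughly like the following. This is only a minimal sketch, not code from any existing XSLT processor: `URI`, `Document`, `Cache`, `newCache` and `fetchCached` are hypothetical names, a document is represented as a plain `String`, and the actual loading action is passed in by the caller.

```haskell
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')
import qualified Data.Map as Map

-- Hypothetical stand-ins: a real processor would use a URI type and a node tree.
type URI = String
type Document = String

-- A mutable table of the documents retrieved so far.
type Cache = IORef (Map.Map URI Document)

newCache :: IO Cache
newCache = newIORef Map.empty

-- | Return the document for a URI, calling the supplied loader only on a
-- cache miss and remembering the result for later requests.
fetchCached :: (URI -> IO Document) -> Cache -> URI -> IO Document
fetchCached load cache uri = do
  known <- readIORef cache
  case Map.lookup uri known of
    Just doc -> return doc
    Nothing -> do
      doc <- load uri
      modifyIORef' cache (Map.insert uri doc)
      return doc
```

The result type is `IO Document`, which is exactly the point being made: anything that calls `fetchCached` during the transformation ends up in IO as well.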

Colin Paul Adams

2:20 p.m.

"Neil" == Neil Mitchell writes:

>> In what way is the document() function not pure?

Neil> See [http://www.w3schools.com/xsl/func_document.asp]. In
Neil> particular their example:

Neil> <xsl:value-of
Neil> select="document('celsius.xml')/celsius/result[@value=$value]"/>

Neil> i.e. this function needs to load the file celsius.xml.

No - it needs to retrieve the resource named by the relative URI 'celsius.xml'. If the base URI for the xsl:value-of element is a file URI, then this will indeed be a file. But that does not NECESSARILY imply any I/O.

>> That is, the static context is defined to include a mapping
>> from URIs to document nodes, and document() returns those
>> document nodes that its argument nodes map to.

Neil> I think I understand what you are saying, but might have
Neil> missed slightly. While you can consider it as a mapping from
Neil> URI to document node, there are an infinite number of URI's,
Neil> and therefore an infinite amount of context. A more
Neil> practical implementation would be to retrieve the document
Neil> nodes as they are requested, and then cache them - but this
Neil> means having IO to do the initial retrieve.

It does, in practice - but this might not be done as part of the transformation - it might be done as part of the preparation.

Likewise, in theory, parsing the initial XML document, and serialising the result, are not necessarily done by the XSLT processor.

XSLT is defined as a transformation from an initial instance of the XPath data model (i.e. a tree of nodes in memory) to a new instance of the XPath data model.

In practice, an XSLT processor will usually parse an XML file and serialise the result to a file (or network socket), but this is technically not part of the transformation.

--
Colin Adams
Preston Lancashire
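Read this way, document() is just a lookup in a table that was filled in before the transformation started. A minimal sketch of that reading follows; the names `StaticContext`, `availableDocuments` and `document` are illustrative assumptions, not taken from the XSLT specification or from any library, and a document is again a plain wrapped `String`.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-ins for URIs and document nodes.
type URI = String
newtype Document = Document String deriving (Eq, Show)

-- The part of the static context that matters here: a finite mapping
-- from URIs to documents, prepared before the transformation runs.
newtype StaticContext = StaticContext
  { availableDocuments :: Map.Map URI Document }

-- | A pure document(): no I/O, only a lookup in the prepared mapping.
-- A URI that was not prepared in advance simply has no document.
document :: StaticContext -> URI -> Maybe Document
document ctx uri = Map.lookup uri (availableDocuments ctx)
```

The cost of this design is that every document the stylesheet might ask for has to be known and loaded during the preparation step.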

Neil Mitchell

2:35 p.m.

> It does, in practice - but this might not be done as part of the transformation - it might be done as part of the preparation.

It can't be done before any transformations, since the URI can be a function, and the result of that function might be defined in terms of transformations and the nodes in the active document.

It can't be done after, since the document node loaded by a document() call might have transformations applied to it.

It might be possible to have a pure transformation function that returns a partially solved result, requiring IO, which then has the IO bit done and the remaining transformation applied. However, it does seem a requirement that IO and transformation are somehow interleaved.

Thanks

Neil
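One way to picture such a "partially solved result" is a pure value that is either finished or a request for a document together with a function that continues once the document is supplied. The sketch below is only an illustration of that idea; `Transform`, `NeedDocument`, `lookupResult` and `runWith` are made-up names, and documents are plain `String`s.

```haskell
-- Hypothetical stand-ins for URIs and document nodes.
type URI = String
type Document = String

-- | A transformation that is either complete, or paused until some
-- driver supplies the document for a URI computed so far.
data Transform a
  = Done a
  | NeedDocument URI (Document -> Transform a)

-- | A pure example: the URI is built from the input, and the rest of the
-- work depends on whatever document comes back.
lookupResult :: String -> Transform String
lookupResult scale =
  NeedDocument (scale ++ ".xml") $ \doc ->
    Done ("loaded " ++ show (length doc) ++ " characters")

-- | The driver interleaves fetching and transforming. The monad is chosen
-- by the caller: IO for real retrieval, something pure for testing.
runWith :: Monad m => (URI -> m Document) -> Transform a -> m a
runWith _ (Done x) = return x
runWith fetch (NeedDocument uri k) = fetch uri >>= runWith fetch . k
```

For example, `runWith readFile (lookupResult "celsius")` would read `celsius.xml` from disk, while `lookupResult` itself never mentions IO.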

Lennart Augustsson

8:30 p.m.

Neil Mitchell wrote:

>> In what way is the document() function not pure?
>
> See [http://www.w3schools.com/xsl/func_document.asp]. In particular their example:
>
> <xsl:value-of select="document('celsius.xml')/celsius/result[@value=$value]"/>
>
> i.e. this function needs to load the file celsius.xml. Note that instead of celsius.xml there could be an arbitrary complex expression here, that builds up a filename.
>
>> That is, the static context is defined to include a mapping from URIs to document nodes, and document() returns those document nodes that it's argument nodes map to.
>
> I think I understand what you are saying, but might have missed slightly. While you can consider it as a mapping from URI to document node, there are an infinite number of URI's, and therefore an infinite amount of context. A more practical implementation would be to retrieve the document nodes as they are requested, and then cache them - but this means having IO to do the initial retrieve.

A function you could have in Haskell that would make some of these things pure is

getFileReader :: IO (FilePath -> String)

which would return a (semantically pure) function that maps file names to file contents.

-- Lennart

Graham Klyne

13 Feb

3:49 p.m.

S. Alexander Jacobson wrote:

> Has anyone written a pure haskell xslt interpreter? If not, how difficult would it be to do so?

(Ah, another cool project idea that fell by the wayside <sigh>!)

Back when I was doing more web work in Haskell, inventing a translation of XSLT into Haskell was one of the ideas I was gestating. Unfortunately (or not), a day job came along and distracted me from that.

...

Without reading in detail, I notice subsequent debate about how to write a pure function that deals with XML constructs that might perform IO. This was one of the problems I encountered when working on HaXml: I wanted to have options to use the parser in "pure" mode (String -> XML), but also to be able to support full XML that may require I/O (XML defines an internal subset that doesn't require processors to perform I/O). In the event, I cheated and used unsafePerformIO. But it did occur to me that by parameterizing the XML processing function with a polymorphic function to turn an entity declaration into a string, like this:

getEntityString :: Monad m => decl -> m String

then the dependency on IO could itself be parameterized. For "pure" use, an identity monad could be used, which the calling program could safely unwrap. But if external entity support is required, then the type 'm' must be (or incorporate) an IO, so the value returned to the calling program would only be accessible within an IO monad.

I feel sure this must be a known Haskell idiom for this kind of problem, but I can't say that I've noticed it anywhere. Or is there a snag I didn't notice?

#g

--
Graham Klyne
For email: http://www.ninebynine.org/#Contact
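A small sketch of that parameterization, to make the idea concrete. Only the shape of getEntityString comes from the message above; the `EntityDecl` type, `expandAll`, `pureResolver` and `ioResolver` are hypothetical, and the choice to expand external entities to the empty string in pure mode is an assumption made purely for illustration.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical entity declarations: text given inline, or held in a file.
data EntityDecl
  = Internal String
  | External FilePath

-- | Processing that only knows it runs in some monad m; whether that
-- involves I/O is decided by the resolver passed in.
expandAll :: Monad m => (EntityDecl -> m String) -> [EntityDecl] -> m String
expandAll getEntityString decls = fmap concat (mapM getEntityString decls)

-- | "Pure" mode: the Identity monad, with external entities expanded to
-- nothing (an assumption of this sketch, not necessarily what a parser should do).
pureResolver :: EntityDecl -> Identity String
pureResolver (Internal s) = return s
pureResolver (External _) = return ""

-- | Full mode: external entities are read from disk, so the result is in IO.
ioResolver :: EntityDecl -> IO String
ioResolver (Internal s) = return s
ioResolver (External path) = readFile path
```

A caller can write `runIdentity (expandAll pureResolver decls)` to get a plain `String` back, whereas `expandAll ioResolver decls` can only be consumed inside IO.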

Chris Kuklewicz

7:01 p.m.

Graham Klyne wrote:

> S. Alexander Jacobson wrote:
>
>> Has anyone written a pure haskell xslt interpreter? If not, how difficult would it be to do so?
>
> (Ah, another cool project idea that fell by the wayside <sigh>!)
>
> Back when I was doing more web work in Haskell, inventing a translation of XSLT into Haskell was one of the ideas I was gestating. Unfortunately (or not), a day job came along and distracted me from that.
>
> ...
>
> Without reading in detail, I notice subsequent debate about how to write a pure function that deals with XML constructs that might perform IO. This was one of the problems I encountered when working on HaXML: I wanted to have options to use the parser in "pure" mode (String -> XML), but also to be able to support full XML that may require I/O (XML defines an internal subset that doesn't require processors to perform I/O). In the event, I cheated and used unsafePerformIO. But it did occur to me that by parameterizing the XML processing function with a polymorphic function to turn an entity declaration into a string, like this:
>
> getEntityString :: Monad m => decl -> m String
>
> then the dependency on IO could itself be parameterized. For "pure" use, an identity monad could be used, which the calling program could safely unwrap. But if external entity support is required, then the type 'm' must be (or incorporate) an IO, so the value returned to the calling program would only be accessible within an IO monad.
>
> I feel sure this must be a known Haskell idiom for this kind of problem, but I can't say that I've noticed it anywhere. Or is there a snag I didn't notice?
>
> #g

The Haskell Zipper-based file server/OS, as discussed at http://lambda-the-ultimate.org/node/1036 does the (Monad m) trick. The processing is largely done without knowing which Monad it is, so IO is impossible. But it can call-out to the controller, requesting an operation that requires IO. The Zipper example uses partial continuations, which may be overkill for the XML processing since you don't have interacting parallel threads.

Johan Jeuring

15 Feb

9:03 a.m.

> Has anyone written a pure haskell xslt interpreter? If not, how difficult would it be to do so?

A master's student of mine implemented XSLT in Haskell a couple of years ago. I've uploaded his thesis to

http://www.cs.uu.nl/~johanj/MSc/danny.pdf

If you're interested in the code, mail me. His implementation is partially in Haskell, partially in attribute grammar code (which generates Haskell using the Utrecht AG system). The code hasn't been used since 2001, so it might contain some bitrot.

-- Johan
