
There doesn't seem to be any option to make Pandoc produce actual MathML output. Is there a reason for this? (The only option I can see is to spit out raw LaTeX plus a 70KB JavaScript program to transform this into MathML at the client end, which seems a little silly to me. There's also no way to style the raw LaTeX differently in case JavaScript is unavailable.)
Also, while Markdown *almost* does what I want, there are a few small constructs it doesn't have. For example, I'd like to have some way to denote a "term" the first time I use it. I could just use italics, but I'd prefer some way to visually indicate that this isn't just an emphasised word, it's a new technical term. In HTML, I'd use a style class, and in LaTeX I'd define a new command. But I can't see a way to do something that will still allow Pandoc to generate correct LaTeX *and* correct HTML from a single Markdown source... Any hints?

On Sun, Oct 12, 2008 at 6:21 AM, Andrew Coppin wrote:
Also, while Markdown *almost* does what I want, there are a few small constructs it doesn't have. For example, I'd like to have some way to denote a "term" the first time I use it. I could just use italics, but I'd prefer some way to visually indicate that this isn't just an emphasised word, it's a new technical term. In HTML, I'd use a style class,
Markdown allows arbitrary HTML tags, so you can just put the terms in
a <dfn> element.
I don't know if that will work with the LaTeX conversion. Markdown is
specifically designed to produce HTML, so it's not clear to me how
Pandoc does any of the non-HTML output formats.
--
Dave Menendez

David Menendez wrote:
Markdown allows arbitrary HTML tags, so you can just put the terms in a <dfn> element.
I don't know if that will work with the LaTeX conversion. Markdown is specifically designed to produce HTML, so it's not clear to me how Pandoc does any of the non-HTML output formats.
Well, no... Markdown is a way of marking up text in a way which is still moderately readable to human beings. You can turn it into any markup format in principle. The trouble is, as soon as Pandoc doesn't understand the markup, you can't really expect it to handle the translation any more...
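To make the point above concrete, here is a minimal sketch of one source being rendered to both HTML and LaTeX. The `Inline` type, the `Term` constructor and the `\term` command are all hypothetical illustrations, not Pandoc's actual data model or API; `\term` is assumed to be defined by the author in the LaTeX preamble. Note what happens to the raw HTML fragment: the HTML writer passes it through, but the LaTeX writer has nothing sensible to do with it.

```haskell
-- Hypothetical miniature document model; this is NOT Pandoc's real API.
data Inline
  = Str String      -- plain text
  | Emph String     -- ordinary emphasis
  | Term String     -- first use of a technical term
  | RawHtml String  -- HTML that the converter does not understand
  deriving (Show, Eq)

-- HTML writer: terms become <dfn>, raw HTML passes through untouched.
toHtml :: Inline -> String
toHtml (Str s)     = escapeHtml s
toHtml (Emph s)    = "<em>" ++ escapeHtml s ++ "</em>"
toHtml (Term s)    = "<dfn>" ++ escapeHtml s ++ "</dfn>"
toHtml (RawHtml h) = h

-- LaTeX writer: terms use \term (assumed to be declared in the preamble).
-- Raw HTML has no translation here, so it is silently dropped.
-- Escaping of LaTeX special characters is omitted to keep the sketch short.
toLatex :: Inline -> String
toLatex (Str s)     = s
toLatex (Emph s)    = "\\emph{" ++ s ++ "}"
toLatex (Term s)    = "\\term{" ++ s ++ "}"
toLatex (RawHtml _) = ""

-- Escape the characters that are special in HTML text.
escapeHtml :: String -> String
escapeHtml = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc c   = [c]

renderHtml, renderLatex :: [Inline] -> String
renderHtml  = concatMap toHtml
renderLatex = concatMap toLatex
```

With a dedicated `Term` node both writers can do the right thing; with only a raw `<dfn>` fragment, the LaTeX side loses the information.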

On Sun, Oct 12, 2008 at 11:45 AM, Andrew Coppin wrote:
David Menendez wrote:
Markdown allows arbitrary HTML tags, so you can just put the terms in a <dfn> element.
I don't know if that will work with the LaTeX conversion. Markdown is specifically designed to produce HTML, so it's not clear to me how Pandoc does any of the non-HTML output formats.
Well, no... Markdown is a way of marking up text in a way which is still moderately readable to human beings. You can turn it into any markup format in principle. The trouble is, as soon as Pandoc doesn't understand the markup, you can't really expect it to handle the translation any more...
The first sentence on the Markdown web page is: "Markdown is a
text-to-HTML conversion tool for web writers."
http://daringfireball.net/projects/markdown/
The Markdown syntax guide states: "Markdown's syntax is intended for
one purpose: to be used as a format for writing for the web. ... For
any markup that is not covered by Markdown's syntax, you simply use
HTML itself."
Markdown text may contain arbitrary HTML blocks. Any attempt to
produce LaTeX or PDF from Markdown must therefore be able to convert
arbitrary HTML.
--
Dave Menendez

+++ Andrew Coppin [Oct 12 08 11:21 ]:
There doesn't seem to be any option to make Pandoc produce actual MathML output. Is there a reason for this?
1. Nobody has written the LaTeX -> MathML code yet, and I've been too lazy. Anyone who is interested in doing this should get in touch.
2. Not all browsers can process MathML. The current system (using the LaTeXMathML.js javascript) has the advantage of "falling back" to raw LaTeX in browsers that don't support MathML.
John

John MacFarlane wrote:
+++ Andrew Coppin [Oct 12 08 11:21 ]:
There doesn't seem to be any option to make Pandoc produce actual MathML output. Is there a reason for this?
1. Nobody has written the LaTeX -> MathML code yet, and I've been too lazy. Anyone who is interested in doing this should get in touch.
Well, I'd certainly be "interested". I use mathematics *a lot* in my writing. Presumably modifying a large program like Pandoc is intractably difficult though? It strikes me that perhaps using LaTeX to enter mathematical markup is rather against the spirit of Markdown. Surely there should be an option to include raw LaTeX, but a more "natural" encoding that covers "most" mathematics would be nice also. Of course, that means somebody has to design it first...
2. Not all browsers can process MathML. The current system (using the LaTeXMathML.js javascript) has the advantage of "falling back" to raw LaTeX in browsers that don't support MathML.
It's been a while since I looked, but I believe the spec provides a way to provide an "alternative" block of XML, similar to the 'alt' tag in the <img> element, for precisely this reason. (And if there was a math converter, rather than raw LaTeX you could provide something a little easier on the eyes given what raw Unicode + plain HTML can do...) MathML has the advantage that it's machine-readable as well as human-readable. That probably doesn't matter right now, but maybe it will someday.
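For reference, the closest mechanisms I am aware of in MathML are the `alttext` attribute on `<math>` and an `<annotation>` child inside `<semantics>`, which can carry the original TeX alongside the presentation markup. The sketch below is only an illustration of that wrapping; the function name `withTeXFallback` is made up for this example, and how a given browser actually treats these fallbacks is not something the code demonstrates.

```haskell
-- Wrap presentation MathML together with its TeX source.
-- The TeX is stored twice: in the alttext attribute and in an
-- <annotation> element with the conventional "application/x-tex" encoding.
-- The name withTeXFallback is hypothetical, chosen for this sketch.
withTeXFallback :: String -> String -> String
withTeXFallback tex mathml =
  "<math xmlns=\"http://www.w3.org/1998/Math/MathML\" alttext=\""
    ++ escapeXml tex ++ "\">"
    ++ "<semantics>"
    -- <semantics> expects a single presentation child, hence the <mrow>.
    ++ "<mrow>" ++ mathml ++ "</mrow>"
    ++ "<annotation encoding=\"application/x-tex\">"
    ++ escapeXml tex
    ++ "</annotation>"
    ++ "</semantics></math>"

-- Escape characters that are special in XML text and attribute values.
escapeXml :: String -> String
escapeXml = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc '"' = "&quot;"
    esc c   = [c]
```

The second argument is assumed to be already well-formed MathML; only the TeX string is escaped.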

1. Nobody has written the LaTeX -> MathML code yet, and I've been too lazy. Anyone who is interested in doing this should get in touch.
Well, I'd certainly be "interested". I use mathematics *a lot* in my writing. Presumably modifying a large program like Pandoc is intractably difficult though?
Just write a separate library that parses LaTeX input and returns MathML output. Pandoc could then use this library. So you wouldn't need to know anything about pandoc's internals. Just write a function
teXMathToMathML :: String -> String.
This would be a great contribution! You could get a head start by looking at the LaTeXMathML.js code.
It strikes me that perhaps using LaTeX to enter mathematical markup is rather against the spirit of Markdown. Surely there should be an option to include raw LaTeX, but a more "natural" encoding that covers "most" mathematics would be nice also. Of course, that means somebody has to design it first...
I think it makes good sense to use LaTeX, which is already designed to be natural but flexible, and is already known by most mathematicians. My guess is that in designing a more natural format, one would eventually reinvent something like LaTeX...
John
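As a rough idea of what a function with the suggested type `teXMathToMathML :: String -> String` might look like, here is a toy sketch. It is not taken from LaTeXMathML.js or from any existing library, and it only handles a tiny, assumed subset of TeX math: single-letter identifiers, integers, single-character operators, `{...}` groups, `^` and `_` scripts, and four commands (`\alpha`, `\beta`, `\pi`, `\times`). Every helper name apart from `teXMathToMathML` is invented for the example.

```haskell
import Data.Char (isAlpha, isDigit, isSpace)

-- Convert a tiny subset of TeX math to presentation MathML.
teXMathToMathML :: String -> String
teXMathToMathML src =
  "<math xmlns=\"http://www.w3.org/1998/Math/MathML\">"
    ++ mrow (fst (atoms src)) ++ "</math>"

-- Parse atoms until end of input or an unmatched '}'.
atoms :: String -> ([String], String)
atoms s = case dropWhile isSpace s of
  ""             -> ([], "")
  rest@('}' : _) -> ([], rest)
  rest           -> let (a, r)     = scripted rest
                        (more, r') = atoms r
                    in (a : more, r')

-- An atom followed by any number of ^ or _ scripts.
-- x_i^2 becomes nested msub/msup rather than msubsup.
scripted :: String -> (String, String)
scripted s = go (atom s)
  where
    go (base, r) = case dropWhile isSpace r of
      ('^' : r') -> let (e, r'') = atom (dropWhile isSpace r')
                    in go (tag "msup" (base ++ e), r'')
      ('_' : r') -> let (e, r'') = atom (dropWhile isSpace r')
                    in go (tag "msub" (base ++ e), r'')
      _          -> (base, r)

-- One atom: braced group, command, number, identifier or operator.
atom :: String -> (String, String)
atom ('{' : r)  = let (xs, r') = atoms r in (mrow xs, drop 1 r') -- skip '}'
atom ('\\' : r) = let (name, r') = span isAlpha r in (command name, r')
atom s@(c : r)
  | isDigit c = let (ds, r') = span isDigit s in (tag "mn" ds, r')
  | isAlpha c = (tag "mi" [c], r)
  | otherwise = (tag "mo" (escape c), r)
atom ""         = (mrow [], "")

-- Known commands map to character references; unknown ones keep their name.
command :: String -> String
command name = case lookup name table of
  Just el -> el
  Nothing -> tag "mi" name
  where
    table = [ ("alpha", tag "mi" "&#x3B1;"), ("beta", tag "mi" "&#x3B2;")
            , ("pi", tag "mi" "&#x3C0;"), ("times", tag "mo" "&#xD7;") ]

-- A single element is returned as-is; anything else is wrapped in <mrow>.
mrow :: [String] -> String
mrow [x] = x
mrow xs  = tag "mrow" (concat xs)

tag :: String -> String -> String
tag name body = "<" ++ name ++ ">" ++ body ++ "</" ++ name ++ ">"

escape :: Char -> String
escape '<' = "&lt;"
escape '>' = "&gt;"
escape '&' = "&amp;"
escape c   = [c]
```

For instance, `teXMathToMathML "e^{i\\pi}"` yields an `<msup>` whose base is `<mi>e</mi>` and whose exponent is an `<mrow>` holding `i` and the character reference for pi. A real converter would need fractions, roots, arrays, environments and proper error reporting, which is where most of the work lies.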

John MacFarlane wrote:
1. Nobody has written the LaTeX -> MathML code yet, and I've been too lazy. Anyone who is interested in doing this should get in touch.
Well, I'd certainly be "interested". I use mathematics *a lot* in my writing. Presumably modifying a large program like Pandoc is intractably difficult though?
Just write a separate library that parses LaTeX input and returns MathML output. Pandoc could then use this library. So you wouldn't need to know anything about pandoc's internals. Just write a function
teXMathToMathML :: String -> String.
This would be a great contribution! You could get a head start by looking at the LaTeXMathML.js code.
OK. I'll give that a go at some point...
I think it makes good sense to use LaTeX, which is already designed to be natural but flexible, and is already known by most mathematicians.
Seems like a valid argument.
My guess is that in designing a more natural format, one would eventually reinvent something like LaTeX...
I would dispute that. I don't think anybody will claim that "\DeclareMathOperator{\erf}{erf}" is natural or intuitive, nor the low-level trickery required to correctly typeset arrays and so forth. (Look at how LaTeX typesets tables. Now look at how Markdown does it. Yeah.) Even so, designing something better is probably a research project [since typeset mathematics uses *so* many obscure symbols and advanced typesetting conventions, and ASCII is woefully unable to cope]. Using LaTeX is probably a very useful step in the right direction.

2008/10/17 Andrew Coppin wrote:
It strikes me that perhaps using LaTeX to enter mathematical markup is rather against the spirit of Markdown. Surely there should be an option to include raw LaTeX, but a more "natural" encoding that covers "most" mathematics would be nice also. Of course, that means somebody has to design it first...
Here's something along those lines, which I found recently on the W3C MathML software page:
http://www1.chapman.edu/~jipsen/asciimath.html
It's a converter from an ASCII syntax to Presentation MathML, written in JavaScript to allow mathematical notation on web pages to be converted to MathML in browsers that support it, or kept as ASCII in browsers that don't. There's a specification of the ASCII syntax which would be a good starting point if you want to write another implementation:
http://www1.chapman.edu/~jipsen/mathml/asciimathsyntax.html
Presumably this can't express everything that MathML can (and it doesn't deal with Content MathML), so it would be useful to support MathML in the source, like Markdown allows inline HTML, or LaTeX.
Andy
participants (4): Andrew Coppin, Andy Smith, David Menendez, John MacFarlane