GSoC Proposal: Pandoc improvements including EPUB 3.0 reader

I'm looking to submit a proposal to (mainly) add an EPUB reader to pandoc. I've spent the last few weeks getting to know the code base and wrote the proposal over the last few days. I would really appreciate any comments on it, as well as any further suggestions or things to look out for! Looking forward to doing some hacking on pandoc independent of this!

Full proposal: https://www.dropbox.com/s/tdiimqa8mj22vq3/gsoc.pdf

Has anyone looked into MathML -> LaTeX conversion? It would be nice to have this in the EPUB parser to deal with embedded equations.

Below is a sketch outline of the suggested implementation.

*Embedded Base64 images*
- Replace Target in the Image constructor with a new constructor which can either be a Target as before or a base64 encoding.
- Update the HTML5 reader to read embedded images successfully.

*EPUB 3.0 reader*
- Utilise the HTML parser with rawTags enabled
- Extract additional information about structure by walking over the AST
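
To make the embedded Base64 image idea a bit more concrete, here is a minimal sketch of what the new type might look like. It is not taken from the proposal PDF or from pandoc itself: the names ImageSource, Linked, Embedded and fromSrc are hypothetical, and Target is redefined locally as a (URL, title) pair so that the snippet compiles with only base.

```haskell
import Data.List (stripPrefix)

-- | Local stand-in for pandoc's Target, assumed here to be (URL, title).
type Target = (String, String)

-- | Hypothetical replacement for the Target field of the Image constructor:
-- either a link as before, or image data carried inline.
data ImageSource
  = Linked Target           -- ^ image referenced by URL, as before
  | Embedded String String  -- ^ media type and base64-encoded payload
  deriving (Eq, Show)

-- | Strip a suffix from a string, if present (base only offers stripPrefix).
stripSuffix :: String -> String -> Maybe String
stripSuffix suf s = reverse <$> stripPrefix (reverse suf) (reverse s)

-- | Classify the src attribute of an HTML img element (hypothetical helper).
-- A data URI of the form "data:<media type>;base64,<payload>" becomes
-- Embedded; anything else is kept as an ordinary Linked target.
-- The payload is not decoded or validated here.
fromSrc :: String -> String -> ImageSource
fromSrc src title =
  case stripPrefix "data:" src of
    Just rest
      | (meta, ',' : payload) <- break (== ',') rest
      , Just mediaType <- stripSuffix ";base64" meta
      -> Embedded mediaType payload
    _ -> Linked (src, title)
```

With these definitions, fromSrc "data:image/png;base64,iVBORw0KGgo=" "" gives Embedded "image/png" "iVBORw0KGgo=", while an ordinary src such as "cover.png" is passed through as Linked ("cover.png", ""). Keeping the link case as its own constructor would let existing writers continue to pattern match on a Target, and only writers that care about inline data would need to handle Embedded.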
participants (1)
-
Matthew Pickering