
Is it possible to automate this process in Haskell, rather than manually
clicking and downloading?
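Something like the following is what I have in mind; it is only a rough sketch, and the URL below is just a placeholder for wherever the generated PDF ends up:

    import Network.HTTP
    import Network.URI (parseURI)
    import Data.Maybe (fromJust)
    import qualified Data.ByteString as B

    -- Fetch a URL as raw bytes and write them straight to disk. PDFs are
    -- binary, so the String-based getRequest/getResponseBody pair is avoided.
    downloadBinary :: String -> FilePath -> IO ()
    downloadBinary url path = do
        let uri = fromJust (parseURI url)           -- assumes a well-formed URL
        rsp <- simpleHTTP (defaultGETRequest_ uri)  -- ByteString-typed request
        body <- getResponseBody rsp
        B.writeFile path body

    main :: IO ()
    main = downloadBinary "http://example.org/article.pdf" "article.pdf"

Using the ByteString interface here should avoid mangling the binary payload.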
Thank You
Mukesh Tiwari
On Thu, Sep 8, 2011 at 6:11 PM, Max Rabkin wrote:
This doesn't answer your Haskell question, but Wikipedia has PDF-generation facilities ("Books"). Take a look at http://en.wikipedia.org/wiki/Help:Book (for single articles, just use the "download PDF" option in the sidebar).
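If you do want to script it for single articles, the print-friendly HTML can be fetched directly via MediaWiki's printable=yes query parameter; a rough sketch (the article title is only an example):

    import Network.HTTP (urlEncode)

    -- Build the URL of the print-friendly version of an article.
    printableUrl :: String -> String
    printableUrl title =
        "http://en.wikipedia.org/w/index.php?title=" ++ urlEncode title
            ++ "&printable=yes"

    main :: IO ()
    main = putStrLn (printableUrl "Haskell_(programming_language)")

The print view strips the site chrome, which should make a later HTML-to-PDF step simpler.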
--Max
On Thu, Sep 8, 2011 at 14:34, mukesh tiwari wrote:
Hello all,
I am trying to write a Haskell program that downloads HTML pages from Wikipedia, including images, and converts them into PDF. I wrote a small script:
    import Network.HTTP
    import Data.Maybe
    import Data.List

    main = do
        x <- getLine
        htmlpage <- getResponseBody =<< simpleHTTP (getRequest x)  -- open url
        -- print . words $ htmlpage
        let ind_1   = fromJust . (\n -> findIndex (n `isPrefixOf`) . tails $ htmlpage) $ "<!-- content -->"
            ind_2   = fromJust . (\n -> findIndex (n `isPrefixOf`) . tails $ htmlpage) $ "<!-- /content -->"
            tmphtml = drop ind_1 $ take ind_2 htmlpage
        writeFile "down.html" tmphtml
and it is working fine, except that some symbols are not rendered as they should be. Could someone please suggest how to accomplish this task?
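I suspect the odd symbols come from character encoding: the String-based interface yields one Char per byte, so UTF-8 multi-byte characters (Wikipedia pages are UTF-8) come out garbled. Would keeping the page as raw bytes end to end, along these lines, be the right fix? This is only a rough sketch:

    import Network.HTTP
    import Network.URI (parseURI)
    import Data.Maybe (fromJust)
    import qualified Data.ByteString as B
    import qualified Data.ByteString.Char8 as BC

    main :: IO ()
    main = do
        url <- getLine
        let uri = fromJust (parseURI url)           -- assumes a well-formed URL
        rsp <- simpleHTTP (defaultGETRequest_ uri)  -- fetch as ByteString
        page <- getResponseBody rsp                 -- raw bytes, nothing re-encoded
        -- Slice out the part between the content markers, still as bytes.
        let (_, fromStart) = B.breakSubstring (BC.pack "<!-- content -->") page
            (body, _)      = B.breakSubstring (BC.pack "<!-- /content -->") fromStart
        B.writeFile "down.html" body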
Thank you
Mukesh Tiwari