Hello all
I am trying to write a Haskell program that downloads HTML pages from Wikipedia, including images, and converts them to PDF. I wrote a small script:
import Network.HTTP
import Data.Maybe
import Data.List

main = do
  x <- getLine                                                  -- URL to fetch
  htmlpage <- getResponseBody =<< simpleHTTP ( getRequest x )   -- open url
  -- print . words $ htmlpage
  -- offset of a marker comment in the page; fromJust fails if the marker is missing
  let indexOf n = fromJust . findIndex ( n `isPrefixOf` ) . tails $ htmlpage
      ind_1 = indexOf "<!-- content -->"
      ind_2 = indexOf "<!-- /content -->"
      -- keep only the text from the first marker up to the second
      tmphtml = drop ind_1 $ take ind_2 htmlpage
  writeFile "down.html" tmphtml
It works fine, except that some symbols are not rendered as they should be. Could someone please suggest how to accomplish this task?
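
As an aside, the marker lookup in the script could probably be written as a pure function that returns Maybe instead of calling fromJust, so that a page without the markers does not crash the program. This is only a sketch: the names indexOf and between are hypothetical helpers, not part of any library, and it does not address the rendering problem.

```haskell
import Data.List (findIndex, isPrefixOf, tails)

-- | Offset of the first occurrence of a marker in a string, if any.
-- (indexOf is a hypothetical helper name.)
indexOf :: String -> String -> Maybe Int
indexOf marker = findIndex (marker `isPrefixOf`) . tails

-- | Text from the start marker (inclusive) up to the end marker (exclusive).
-- Returns Nothing if either marker is missing or the end marker comes first.
-- (between is a hypothetical helper name.)
between :: String -> String -> String -> Maybe String
between start end page = do
  i <- indexOf start page
  j <- indexOf end page
  if i <= j
    then Just (drop i (take j page))
    else Nothing
```

In main, the result of between "<!-- content -->" "<!-- /content -->" htmlpage could then be matched with a case expression, writing the file for Just and printing a message for Nothing.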
Thank you
Mukesh Tiwari