
On Monday 06 June 2011, 20:19:49, Manfred Lotz wrote:
Hi all, I'm a bit lost. I'm having a problem with xml light but as a Haskell newbie it could be of course that the problem is sitting between my keyboard and chair.
Not really. Yes, you're doing something wrong, but it's a bit delicate and nothing obvious. It's the unobvious behaviour of lazy IO.
I tried hard to create a minimal example.
Here is a minimal xmltest.hs:
<-----------------------------snip--------------------------------> module Main where
import System.Environment.UTF8 import qualified System.IO.UTF8 as U
import System.IO import Text.XML.Light
data CTest = CTest { ctName :: String , ctLocation :: String } deriving (Show,Read,Eq,Ord)
getXmlContent :: Handle -> IO CTest getXmlContent inh = do xml <- U.hGetContents inh
Problem, part 1
let content = parseXMLDoc xml case content of Just c -> do let name = case findChild (unqual "name") c of Nothing -> "<unknown>" Just n' -> strContent n' let path = case findChild (unqual "location") c of Nothing -> "<unknown>" Just path' -> case findAttr (unqual "path") path' of Nothing -> "<unknown>" Just p -> p return CTest { ctName=name, ctLocation=path} _ -> fail "not expected"
readXmlFile :: FilePath -> IO CTest readXmlFile f = do inh <- U.openBinaryFile f ReadMode xml <- getXmlContent inh hClose inh
Problem, part 2, the real problem. The point is that getXmlContent doesn't really parse the file yet, it returns a thunk saying how to get the result from the file contents. Therefore, it doesn't need to read the entire file, just enough of it to find out whether parseXMLDoc returns a Just contents or a Nothing. Then you close the handle, explicitly. That means, it's closed immediately, leaving the unread portion of the file unread. If the bla bla is long enough, the location tag is in the unread part, and when finally the search for that tag is forced by printing, the tag is not contained in the input. If you leave out the call to hClose, leaving the closing to hGetContents when it reaches the end of the file, the file contents is not truncated, and the location found. But then the file handle might remain half-closed longer than you wish (you could run out of file handles if you open a lot of files without explicitly closing them before the next [bunch] is opened). If you force the result before closing the handle, enough of the file is read to find the desired elements, and you make sure that there's no leaking file handle -- unless there's an exception in getXmlContent, so that the hClose is never reached. To prevent that, use Control.Exception.bracket, or use withBinaryFile, which does that for you.
return xml
doSomething :: Show a => a -> IO () doSomething xml = print xml
main :: IO () main = do args <- getArgs xml <- readXmlFile $ head args doSomething xml <-----------------------------snap-------------------------------->
Question: What is my error?