Can someone help? The program below works fine with small files but when I try to use it on the one I need to (about 3 million lines of data) it produces no output. The hard disk is hammered - I assume this is the run time system paging. My suspicion is that the program is trying to read in the whole file before processing it. Is this correct? If so, how do I make the program lazy so that it processes a line at time? By the way, the MD5 function which I use and is included as part of HSLIBS has the type String -> IO String. The MD5 algorithm really is a function and should have type String -> String. Do people agree and if so how do I get it changed? Dominic. -- Compile with ghc -o test test.hs -static -package util -- under Windows. module Main(main) where import IO(openFile, hPutStr, IOMode(ReadMode,WriteMode,AppendMode)) import MD5 import Char -- showHex and showHex' convert the hashed values to -- human-readable hexadecimal strings. showHex :: Integer -> String showHex = map hexDigit . map (fromInteger . (\x -> mod x 16)) . takeWhile (/=0) . iterate (\x -> div x 16) . toInteger hexDigit x | (0 <= x) && (x <= 9) = chr(ord '0' + x) | (10 <= x) && (x <=16) = chr(ord 'a' + (x-10)) | otherwise = error "Outside hexadecimal range" powersOf256 = 1 : map (*256) powersOf256 showHex' x = showHex $ sum (zipWith (*) (map ((\x -> (mod x 16)*16 + (div x 16)) . toInteger . ord) x) powersOf256) -- The type Anon and function anonymize hide the anonymisation -- process. In this case, it's a hash function -- digest :: String -> IO String which implements MD5. type Anon a = IO a class Anonymizable a where anonymize :: a -> Anon a -- MyString avoids overlapping instances of Strings -- with the [Char] data MyString = MyString String deriving Show instance Anonymizable MyString where anonymize (MyString x) = do s <- digest x return ((MyString . showHex') s) instance Anonymizable a => Anonymizable [a] where anonymize xs = mapM anonymize xs filename = "ldif1.txt" fileout = "ldif.out" readAndWriteAttrVals = do h <- openFile fileout WriteMode s <- readFile filename a <- anonymize((map MyString) (lines s)) hPutStr h (unlines (map (\(MyString x) -> x) a)) main = readAndWriteAttrVals ------------------------------------------------------------------------------------------------- 21st century air travel http://www.britishairways.com
participants (1)
-
Steinitz, Dominic J