
Hi Michael,
Michael Schober wrote:
[...] I took the liberty to modify the output a little bit to my needs - maybe a future reader will find it helpful, too. It's attached below.
I played around with your example a bit and wondered whether it could be implemented using just the basic Haskell Platform modules and functions. So, as an exercise, I rolled my own directory traversal and duplicate finder functions. This is what I came up with:

- walkDirWith: walks a given directory, applying a given function to a Handle for each file it finds. That function returns a value of some arbitrary type r, and the overall result is an association list pairing each r value with its file path.
- filePathMap: I think roughly analogous to your duplicates function. It groups the paths by their r value in a Map.
- main: in the third line of the main function, I use hFileSize as an example of a function that takes a Handle and returns an IO value, in this case IO Integer. A hash function could easily be put in here instead. The last line pretty-prints the Map in a tree-like format.

    import System.IO
    import System.Environment (getArgs)
    import System.Directory (doesDirectoryExist, getDirectoryContents)
    import Control.Monad (mapM)
    import Control.Applicative ((<$>))
    import System.FilePath ((</>))
    import qualified Data.Map as M

    walkDirWith :: FilePath
                -> (Handle -> IO r)
                -> IO [(r, FilePath)]
                -> IO [(r, FilePath)]
    walkDirWith path f walkList = do
      isDir <- doesDirectoryExist path
      if isDir
        then do
          paths <- getDirectoryContents path
          concat <$> mapM (\p -> walkDirWith (path </> p) f walkList)
                          [p | p <- paths, p /= ".", p /= ".."]
        else do
          rValue <- withFile path ReadMode f
          ((:) (rValue, path)) <$> walkList

    filePathMap :: Ord r => [(r, FilePath)] -> M.Map r [FilePath]
    filePathMap pathPairs =
      foldl (\theMap (r, path) -> M.insertWith' (++) r [path] theMap)
            M.empty
            pathPairs

    main :: IO ()
    main = do
      [dir] <- getArgs
      fileSizes <- walkDirWith dir hFileSize $ return []
      putStr . M.showTree $ filePathMap fileSizes

Obviously there's no right or wrong way to do it, but I'm wondering what you think.

Regards,
Yawar
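
P.S. To sketch what putting a different function in place of hFileSize might look like: below is a toy checksum built only from base. It is not a real hash function, just a stand-in with the same Handle -> IO r shape. The names checksum and checksumString are made up for illustration, and the constants are arbitrary. The main here only checksums files named on the command line; it is not the directory walker.

```haskell
import Control.Exception (evaluate)
import Data.Char (ord)
import Data.List (foldl')
import System.Environment (getArgs)
import System.IO

-- Toy polynomial checksum over the characters of a string.
-- Hypothetical stand-in for a real hash; the constants are arbitrary.
checksumString :: String -> Int
checksumString = foldl' step 7
  where
    step acc c = (acc * 31 + ord c) `mod` 1000000007

-- Same shape as hFileSize (Handle -> IO r), so it could be handed to a
-- function like walkDirWith in its place.
checksum :: Handle -> IO Int
checksum h = do
  hSetBinaryMode h True        -- read raw bytes, no newline or encoding translation
  contents <- hGetContents h   -- lazy read; nothing is consumed yet
  -- Force the whole fold now: withFile closes the handle as soon as this
  -- action returns, so the result must be fully evaluated before then.
  evaluate (checksumString contents)

-- Print the checksum of every file given on the command line.
main :: IO ()
main = do
  paths <- getArgs
  mapM_ printChecksum paths
  where
    printChecksum p = do
      c <- withFile p ReadMode checksum
      putStrLn (show c ++ "  " ++ p)
```

With something like this, the third line of main in the program above would read `checksums <- walkDirWith dir checksum $ return []`. A checksum this weak will collide for distinct files, so I'd treat equal keys only as candidates for duplicates and compare the contents afterwards.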