
Bryan O'Sullivan wrote:
apfelmus wrote:
Thank you for the review and constructive criticism. Although highly amusing, some of what you write is elliptical enough in style that I'm having trouble following a few of your points. Doubly so since I've thus far gone years without paying attention to Control.Applicative
Oui, you're right. I'm going to elaborate, although critique incompréhensible is a must, otherwise it would quickly become obvious that the critic has absolutely no clue ;)
The RecursionPredicate decides whether to recurse into or a sub-directory or not (it could be mentioned explicitly that the predicate is only invoked on directories).
Good Point!
Cette soupe is much too monadic! In particular, we have the isomorphism
(FileInfo -> a) ≅ FindClause a
witnessed by the function pair
(f,g) = ((`liftM` fileInfo), evalClause)
Of course, the encapsulation was made to allow a formulation like
extension ==? ".hs"
which does not mention the FileInfo parameter. But alas, this can also be achieved more naturally by appealing to the
instance Control.Applicative ((->) a)
The intention is to present the predicates (FileInfo -> a) in a way that does not mention the parameter over and over. I mean, one could write the example as \info -> (extension info == ".hs") || (extension info == ".lhs") but threading the parameter around will quickly become a nuisance with more complex queries. One solution is to use the Reader monad, and in essence that's what System.FilePath.Find currently does. But I think that a) The Reader is better seen as applicative functor than as monad b) Making (FindClause a) a mere type synonym of (FileInfo -> a) has the benefit that the user can choose whether he wants to use monads or applicative functors via the respective instances or whether he does not. Programming with applicative functors is like programming with monads but you're only allowed to use return and `ap`. The example can be written for monads as return (||) `ap` (return (== ".hs") `ap` extension) `ap` (return (== ".lhs") `ap` extension) or for general applicative functors as (||) <$> ((== ".hs") <$> extension) <*> ((== ".lhs") <$> extension) In the end, one will probably use the custom combinators (==?) and (||?) anyway, so that it doesn't matter whether it's monad or applicative or whatever. But not making (FindClause a) opaque gives more freedom to the user. Note that providing functionality by making things instances of very powerful classes should be documented explicitly. A recent question on haskell-cafe was about Data.Tree and one can argue that the innocent looking instances for Foldable, Functor and Traversable are maybe 75% of the functionality that Data.Tree offers :)
Abstracting the concrete representation away into FindClause hinders reuse of already existing functionality from System.FilePath,
Yes, that's unfortunate.
The monad could make sense if the predicate might involve not only the file status but also looking inside the file.
How would you suggest doing so? Just a simple unsafeInterleaveIO readFile?
Returning all files with a certain magic number would be such a use case but System.FilePath.Find currently does not offer zis possibilité.
I'm certainly all for improving it, so this kind of suggestion is most welcome.
I guess that using (FileInfo -> IO Bool) and providing a specialized (FileInfo -> Bool) version is probably a safe way. But with IO, the traversal could even change files in place, which is probably not a good idea. Maybe unsafePerformIO is the best solution, because you may safely close the file after having evaluated predicate (unsafePerformIO $ hGetContents handle) (fileinfo) to weak head normal form, i.e. True or False. I think it fits perfectly.
Un autre point is that the library offers a function fold that is almost a foldl' over all files in a directory tree, but not quite: here, the fold goes over FileInfos but the list is a simple list of FilePaths. One has to foldM over the latter and get the file infos again to achieve the same effect.
But the FilePath is embedded in a FileInfo, so a double traversal is unneeded.
Yes. I've not been clear, I mean the other way round. Using System.FilePath.Find.fold gives you both file status and file path but the ought-to-be equivalent approach of using foldl' on the list returned by find only gives the file path but no file status. So, the suggestion is to make find return a list of FileInfo find :: ... -> IO [FileInfo] If you only want a list of file paths, you can always do liftM (map filepath) . find ... Of course, this leads to the question whether find should be factored even more into generate & prune find r p dir = map filepath . filter p . directoryWalk r dir with the intention that System.FilePath.Find only exports directoryWalk. Which leads to the question whether directoryWalk can be factored as well which in turn leads to the next question:
Le maître de cuisine thinks that these quirks stem from the fact that the file system lacks a purely functional design. The observation is that given persistence, reading does not need to be monadic. In other words, directory tree traversal might well return a list of files
data File = File { contents :: String, status :: FileStatus }
with contents and status. More generally, the traversal could be a side-effect-free traversal of a pure data structure.
The darcs authors already tried this, and gave up on the idea. Once you have a pure data structure, you start developing notions that it makes sense to manipulate it, and then all is lost once you turn your mind to applying those manipulations to the real world.
Yes, the data structure ought to be read-only. But by making it an abstract data type with proper access functions, maybe this goal can be achieved. In any case, with such a data structure, directoryWalk can be factored as well. In particular, I have something in mind, based on the following Question: "Given a directory tree, what are you going to do with it?". Answers : "Well, I rename every file" (=> map) "Calculate total size" (=> fold) The assumption is that everything you'll ever do with a huge directory tree is to map or fold it. So, here comes my crazy speculation: make the directory tree a phantom type data DirTree a type DirectoryTree = DirTree FileInfo and implement Functor, Foldable and Traversable for it. So, printing out all files would become Data.Foldable.mapM_ (putStrLn . filepath) :: DirTree FileInfo -> IO () In conclusion, there are several ways to generalize an interface. One way is to add more options and parameters to a function. But the other way is to shatter a monolithic function into tiny pieces that can be reassembled and composed at will. I think that the latter is the spirit and the source of power of functional programming. Regards, apfelmus