
On 4/22/12 6:30 PM, Alvaro Gutierrez wrote:
On Sun, Apr 22, 2012 at 4:45 PM, Brandon Allbery wrote:
One reason: modules serve multiple purposes; one of these is namespacing, and in the case of interfaces to foreign libraries that may force a division that would otherwise not exist.
Interesting. Could you elaborate on what the other purposes are, and perhaps point to an instance of the foreign library case?
The main purpose of namespacing (IMO) is to separate concerns and make it easier to figure out how a project fits together. The primary goal of modules is to resolve namespacing issues.
Consider one of my own libraries (chosen randomly via Safari's URL autocompletion): http://hackage.haskell.org/package/bytestring-lexing When I inherited this package there were the Data.ByteString.Lex.Double and Data.ByteString.Lex.Lazy.Double modules, which were separated because they provide the same API but for strict vs lazy ByteStrings. Both of those modules are concerned with lexing floating-point numbers. I inherited the package because I wanted to publicize some code I had for lexing integers in various formats. Since that's quite a different task than lexing floating-point numbers, I put it in its own module: Data.ByteString.Lex.Integral.
When dealing with FFI code, because of the impedance mismatch between Haskell and imperative languages like C, it's clear that there's going to be some massaging of the API beyond simply declaring FFI calls. As such, clearly we'd like to have separate modules for doing the low-level binding vs presenting a high-level API. Moreover, depending on what you're interfacing with, you may be forced to have multiple low-level modules. For example, if you use Google protocol buffers via the hprotoc package, then it will generate a separate module for each buffer type. That's fine, but usually it's not something you want to foist on your users.
On the other hand, the main purpose of packages or libraries is as a unit of distribution, code reuse, and separate compilation. Even with the Haskell culture of making small libraries, most worthwhile units of distribution/reuse/compilation tend to be larger than a single namespace/concern.
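The split between a low-level binding and a high-level API mentioned above can be sketched roughly as follows. This is only an illustration, not code from any of the packages named here: it binds C's strlen, the names c_strlen and cStringLength are made up, and both layers are shown in one file even though a real package would normally put them in separate modules.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Low-level layer: a direct binding to C's strlen, exposing C types.
-- In a real package this would sit in its own module (hypothetical name:
-- Text.Example.Internal) that most users never import.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- High-level layer: hides marshalling and C types behind Haskell ones.
-- withCString copies the String into a temporary NUL-terminated buffer
-- (using the current locale encoding) and frees it afterwards.
cStringLength :: String -> IO Int
cStringLength s = withCString s (fmap fromIntegral . c_strlen)
```

Loaded into GHCi, cStringLength can be called like any other IO action; callers never see CString or CSize.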
Thus, it makes sense to have more than one module per package, because otherwise we'd need some higher level mechanism in order to manage the collections of package-modules which should be considered a single unit (i.e., clients will almost always want the whole bunch of them).
However, centralization is prone to bottlenecks and systemic failure. As such, while it would be nice to ensure that a given module is provided by only one package, there is no mechanism in place to enforce this (except at compile time for the code that links the conflicting modules together). With few exceptions, it's considered bad form to knowingly use the same module name as is being used by another package. In part, it's bad form because egos are involved; but it's also bad form because there's poor technical support for resolving namespace collisions for module names. In GHC you can use -XPackageImports, which is workable but conflates issues of code with issues of provenance, which the Haskell Report intentionally keeps separate. However, until better technical support is implemented (not just for GHC, but also jhc, UHC, ...) it's best to follow social practice.
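For reference, a minimal sketch of what -XPackageImports looks like in source. The package name "base" is used here only so the snippet compiles without extra dependencies; with two packages that really do export the same module name, the string would name whichever one you mean. The helper functions are hypothetical and exist only so the imports are used.

```haskell
{-# LANGUAGE PackageImports #-}

-- The string literal names the package that must provide the module.
-- This is where provenance leaks into the source code.
import "base" Data.Char (toUpper)
import "base" Data.List (sort)

-- Hypothetical helper: upper-case a single string.
shout :: String -> String
shout = map toUpper

-- Hypothetical helper: upper-case every string, then sort the results.
sortedShout :: [String] -> [String]
sortedShout = sort . map shout
```

Note that the package name sits inside the module itself, so switching providers means editing every such import rather than a build file.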
I'm confused as to how type families vs. fundeps play a role here -- as far as I can tell both are compiler extensions that do not provide modules.
Both TFs (or rather associated types) and fundeps aim to solve the same problem. Namely: when using multi-parameter type classes, it is often desirable to declare that one parameter is wholly determined by the other parameters, either for semantic reasons or (more often) to help type inference. Since they both aim to solve the same problem, this raises a new problem: for some given type class, do I implement it with TF/ATs or with fundeps?
Some people tried to solve the new issue by implementing it both ways in separate packages, but reusing the same module names. (Witness for example mtl-2 aka monads-fd, vs monads-tf.) In practice, that didn't work out so well. Part of the reason for failure is that although fundeps and TF/ATs are formally equivalent in theory, in practice the implementation of TF/ATs has (had?) been missing some necessary machinery, and consequently the TF/AT versions were not as powerful as the original fundep versions. Though the butterfly dependency issues certainly didn't help.
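To make the comparison concrete, here is a sketch of one small class written both ways. The class and method names (CollectionFD, CollectionTF, Elem, and so on) are invented for illustration and are not taken from mtl, monads-fd, or monads-tf.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}

-- Fundep version: two class parameters, with the dependency c -> e
-- stating that the container type determines the element type.
class CollectionFD c e | c -> e where
  insertFD :: e -> c -> c
  toListFD :: c -> [e]

instance CollectionFD [a] a where
  insertFD = (:)
  toListFD = id

-- Associated-type version: one class parameter, with the element type
-- computed from the container type by the type function Elem.
class CollectionTF c where
  type Elem c
  insertTF :: Elem c -> c -> c
  toListTF :: c -> [Elem c]

instance CollectionTF [a] where
  type Elem [a] = a
  insertTF = (:)
  toListTF = id
```

In both versions the element type is recovered from the container type, so a call like toListFD (insertFD 1 xs) needs no extra annotation once the type of xs is known. The two classes are still distinct, though, so code written against one does not work with the other.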
I'm interested to see examples where two or more well-known yet unrelated modules clash under the same name; I can't imagine them coexisting in public very long -- wouldn't the confusion among users (e.g. when looking for documentation) be enough to either reconcile the modules or change one of the names?
That's not much of a problem in practice. There are lots of different books with a Chapter 1, but rarely is there any confusion about which one is meant. The same is true of module names in packages.
-- 
Live well,
~wren