Amen to Iavor's proposal!

Hierarchical decomposition leads to arbitrary and thus unguessable decisions, because many such decompositions are possible.  This problem nearly always happens, as Clay Shirky illustrates at http://www.shirky.com/writings/ontology_overrated.html .  Iavor has given some examples. Data vs Control provides some more.  Another, as Wolfgang hinted at, is UI vs Graphics.  These two notions overlap, with neither being more specific than the other.

Module hierarchy tries to give ontology and collision-avoidance.  Ontology is a failure as we've seen (and inevitably so, as Clay Shirky demonstrates).  Collision-avoidance has failed also, as Iavor pointed out, since packages can easily have module name collisions (e.g., I had a Data.Fun at one point).  However, we already prohibit collisions of package names, so we can get module uniqueness by using the package name as the top-level portion of every module in a package.  Beyond that requirement, package implementors can use whatever organization style they like.
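
To make the uniqueness argument concrete, here is a minimal sketch. The types, the function names, and the packages "Alpha" and "Beta" are all made up for illustration; none of this is Cabal's or Hackage's actual machinery. It assumes each package name is already usable as a module name component and is unique.

```haskell
import Data.List (group, sort)

-- Hypothetical types for this sketch; not taken from Cabal or any real tool.
type PackageName = String
type ModuleName  = String

-- | Module names claimed by more than one package.
-- Assumes no single package lists the same module twice.
collisions :: [(PackageName, [ModuleName])] -> [ModuleName]
collisions pkgs =
  [ m | (m : _ : _) <- group (sort (concatMap snd pkgs)) ]

-- | Put the package name in front of every module the package provides.
-- Assumes the package name is a valid, capitalised module name component.
prefixWithPackage :: (PackageName, [ModuleName]) -> (PackageName, [ModuleName])
prefixWithPackage (pkg, mods) = (pkg, [ pkg ++ "." ++ m | m <- mods ])

-- | Two made-up packages that both want the module name Data.Fun.
examplePackages :: [(PackageName, [ModuleName])]
examplePackages =
  [ ("Alpha", ["Data.Fun", "Data.Alpha.Util"])
  , ("Beta",  ["Data.Fun", "Text.Beta"])
  ]
```

In GHCi, `collisions examplePackages` reports the shared Data.Fun, while `collisions (map prefixWithPackage examplePackages)` comes back empty, since unique package names force unique prefixed module names.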

   - Conal

On Fri, Jun 19, 2009 at 12:08 AM, Iavor Diatchki <iavor.diatchki@gmail.com> wrote:
Hi,
I agree with Johan that the name hierarchy should be changed.  The
current approach has a number of drawbacks.  In no specific order:

 * Trying to use a single hierarchy to classify modules is inaccurate
because many modules could logically belong in multiple locations.  We
have many examples that demonstrate this in the current hierarchy:
Text is not Data; the HTTP protocol is under Network, but XML is under
Text even though both are text based protocols; URLs are under Network
(and so are neither Data nor Text), file operations are under
System.IO but Network operations are in their own name space.  This is
not because the authors of the packages were not careful in selecting
the names.  The problem is that for many modules there isn't a single
name that describes its content.

 * The current naming convention makes it harder to understand
programs (independent of overly long import names like
Network.Protocol.Http.Cookies, which could be just as well described
as Protocol.Network.Http.Cookies).  The real problem with readability
is that looking at the imports of a module does not give any
indication of what package the modules come from, which makes it hard
to understand the dependencies of the module and, more pragmatically,
makes it hard to look up documentation for the module contents.

 * The current naming convention does not scale because each package
may introduce modules that are placed all over the name hierarchy.
For example, the utf-8 library redefines some IO operations so it has
modules under System.IO, it provides some ByteString support so it
also has modules under Data.ByteString, and finally it also deals with
text, so it has modules under Text.Codec.  This is a problem because
it is hard for package writers to avoid name collisions, without
knowing the modules in all available packages.
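
As a side note on the readability point above: GHC's PackageImports extension already lets an import state which package a module comes from. It is not the renaming discussed in this thread, only a related mechanism, and the function below is a made-up example that uses nothing outside the base package.

```haskell
{-# LANGUAGE PackageImports #-}

-- Each import names the package that provides the module,
-- so the dependency is visible right in the import list.
import "base" Data.Char (toUpper)
import "base" Data.List (sort)

-- | Hypothetical helper that uses both imports: sort the words,
-- then upper-case each one.
shout :: [String] -> [String]
shout = map (map toUpper) . sort
```

This makes the origin of each module visible to the reader, but it does nothing to shorten the module names or to prevent two packages from choosing the same one.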

I think that a better way to organize our programs is to prefix the
modules in a package with the package name.  This will avoid the name
collision issue (or at least, greatly simplify it, because packages
that are uploaded to Hackage need to have different names).  It would
also make the dependencies of a module quite obvious.  It would also
make our import lists much simpler.  For example, we would write
"import HaXml" instead of "import Text.XML.HaXml", or "import
Parsec.Char" instead of "import Text.ParsingCombinators.Parsec.Char".
If classifying modules according to their purpose is necessary (and I
am not sure that it is, if we can do it at the package level), then we
could think of a more suitable mechanism to achieve that goal than the
hierarchical names.

-Iavor

On Tue, Jun 16, 2009 at 7:45 AM, Ian Lynagh<igloo@earth.li> wrote:
> On Fri, Jun 12, 2009 at 10:46:07PM +0200, Johan Tibell wrote:
>>
>> Perhaps it's time to overhaul the hierarchy. Some top level module
>> namespaces like Network have become very crowded. Network is such a generic
>> name that it conveys very little information today when most software has a
>> network component. I suggest that parts of it be broken out into new top
>> level modules. As a first step I suggest we create a new Http (and not HTTP
>> with all caps please) module where we can have:
>>
>> Http.Client
>> Http.Server
>> Http.UrlEncoding
>> Http.Cookies
>> etc.
>
> I don't follow the logic. If Network is crowded, doesn't that mean we
> should be aiming to subdivide it, e.g. moving
>    Network.Http.*
> to
>    Network.Protocol.Http.*
> (FSVO "Protocol"; could be "Tcp", or something else entirely)?
>
> If we move everything up to the root then the root will be even more
> crowded than Network is.
>
>
> Thanks
> Ian
>
> _______________________________________________
> Libraries mailing list
> Libraries@haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>