Re: new Library Infrastructure spec.

7 Jun 2004

      At 17:22 06/06/04 -0400, Isaac Jones wrote:
...
Graham Klyne  writes:
...
General:
After a first reading, I find myself a little confused about the
module naming invariant (sect 2.2), and subsequent comments about
prefixes (e.g. 4.2.1, 4.2.3).  I think that the term "prefix" is used
exclusively to refer to file location, but the name suggests something
"deeper" is intended.
True.  Can you suggest a wording that would be more clear?  For
instance, the explanation of the flags in section 4.2 say that they
"specifies where the installed files for the package should be
installed".
I'm replying without context properly "swapped in", so my apologies if 
these comments seem a bit disconnected...

If the term prefix is used purely to indicate the file location, then I 
suggest calling it a "path name prefix".  I think my confusion may be that 
I'm not clear about the interaction (if any) between path names and module 
names.  With current Haskell systems, my experience is that a hierarchical 
module name corresponds with the trailing components of the path where the 
module is stored.  So there's a leading path component that might be 
interpreted as a "prefix".

It's difficult to make concrete suggestions about something where I'm not 
confident of the actual facts, but I think it would help to add a sub-section 
2.x (say, just after 2.2) that explains the relationship between module 
names, file names and packages; e.g. (and the details here may be wrong):

[[
2.x Packages, file names and module names

In [some|many] Haskell implementations, hierarchical module names are 
related to the location of the module files that the Haskell compiler uses 
for the module definition.  For example, when a compiler imports a module, 
it uses specified file location(s) as the starting point for a search to 
locate the module definition file.  Suppose the specified file location is 
<path>, then the compiler may look for module "Foo.Bar.Baz" in location:
     <path>/Foo/Bar/Baz
(or similar, depending on the host system's directory and file naming syntax).

When a package is installed, the value of <path> is a "package path name 
prefix", or just "prefix", that indicates where the package files are 
installed.  Subdirectories of this prefix location correspond to the module 
hierarchy declared within the package.  Module names are not themselves 
affected by the package path name prefix used, so different versions of a 
module hierarchy may be stored in different locations.
]]
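
To make the intended mapping concrete, here is a minimal sketch of how a path name prefix and a hierarchical module name might combine into a search location. The function names are mine, not the spec's, and the '/' separator and the absence of a file extension are assumptions taken from the example above; real compilers differ in the details.

```haskell
import Data.List (intercalate)

-- | Split a hierarchical module name on dots:
-- "Foo.Bar.Baz" becomes ["Foo", "Bar", "Baz"].
splitModuleName :: String -> [String]
splitModuleName name = case break (== '.') name of
  (component, [])       -> [component]
  (component, _ : rest) -> component : splitModuleName rest

-- | Join a path name prefix and a module name, using '/' as the separator
-- (an assumption; the host system may use a different syntax).
-- Only the prefix varies between installations; the module name does not.
modulePath :: FilePath -> String -> FilePath
modulePath prefix name = intercalate "/" (prefix : splitModuleName name)
```

For example, modulePath "/usr/local/lib/foo-1.0" "Foo.Bar.Baz" gives "/usr/local/lib/foo-1.0/Foo/Bar/Baz", and a second prefix would give a second location for the same module name.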
...
What if I change it to: "specifies where the installed files for the
package should be installed (ie the location of the files
themselves)."
[Sorry, I've lost context to respond to that particular point;  I hope the 
above thoughts help.]
...
I'm also (mildly) concerned that the invariant
might create problems for distributing and installing large libraries
(cf. the problem noted in section 3.1.2).
While appreciating the desire noted in 2.2 for avoidance of
complexity, I can't help wondering if there's a "minimal" approach to
qualifying module names that would allow module version-mixing in a
program.  I'm thinking of something like a "hidden" qualification of
each module name that is derived from the particular source of the
module (location, hash of content, or something else).  Then,
different versions of a module might be made visible to different
source files without prohibiting multiple versions appearing in a
program.  The details of such differentiation would be entirely
compiler dependent, so I think the design space is not unduly
prejudiced.
These issues involve tricky compiler symbol naming issues which I
won't comment on (maybe one of the compiler hackers can explain).
Perhaps in the future, we can relax these requirements, however.
I was peripherally aware of some of that discussion, but I thought the 
biggest problems may have been that the earlier proposals forced a 
particular naming architecture on the compiler implementations in an area 
which is quite implementation sensitive.  I was hoping that the "hidden" 
qualification might mitigate that difficulty.  But, in truth, if there is 
scope for future relaxation of the framework, I think that this issue 
should properly be deferred.

One day, if Haskell truly succeeds as we'd like, I think the ability to 
have different versions of a package within a single program will be an 
important consideration.  For example, one of the problems with Microsoft's 
DLL architecture has been the breakage caused when one application upgrades 
a DLL which is used by other applications.  Maintaining total backward 
compatibility can be really hard;  the alternative (AFAICT) is to allow 
some components to continue to use an older version as newer versions are 
introduced.

(A random thought that crosses my mind is source preprocessing, so that, e.g.:

    import Foo.Bar.Baz

might be replaced with

    import Foo.Bar.Baz.V10 as Foo.Bar.Baz

as a package is installed.

That would (maybe) be messy to do, but it suggests a possible approach in 
the Haskell compiler front-end which would not impact the complex code 
analysis phases that come later?  Maybe a stop-gap might be to incorporate 
appropriate functionality into cpphs?  I don't really like all this, I'm 
just trying to offer ideas.)
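
As a rough illustration of the preprocessing idea, the following sketch rewrites plain import lines from a table of versioned module names. The VersionMap table and both function names are hypothetical, and the sketch deliberately handles only the bare "import M" form; qualified imports, import lists, existing "as" clauses and literate-source markers are left untouched.

```haskell
-- | Hypothetical table mapping a module name to the versioned module
-- that should stand in for it.
type VersionMap = [(String, String)]

-- | Rewrite one source line.  Only the plain form "import M" is handled;
-- every other line is returned unchanged.
rewriteImport :: VersionMap -> String -> String
rewriteImport versions line =
  case words line of
    ["import", name]
      | Just versioned <- lookup name versions
      -> "import " ++ versioned ++ " as " ++ name
    _ -> line

-- | Apply the rewrite to every line of a source file.
rewriteSource :: VersionMap -> String -> String
rewriteSource versions = unlines . map (rewriteImport versions) . lines
```

With the table [("Foo.Bar.Baz", "Foo.Bar.Baz.V10")], the line "import Foo.Bar.Baz" becomes "import Foo.Bar.Baz.V10 as Foo.Bar.Baz", as in the example above. A real tool would need a proper parser rather than a line-by-line match.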
...
Section 1.2 and elsewhere:
I find the use of Unix scripting features may be less easy to operate
on (say) Windows systems.  I think Simon Marlow's suggestion may be
better:
(snip runhaskell explanation)
I just commented on this in my reply to Keith.  This scripting feature
is a convenience for Unix and has no real effect on the behavior for
windows.  In windows, the script must somehow be executed.  In any
system, the user may choose to compile the module by hand.
OK, maybe then play down that aspect slightly?  E.g. in section 1.2, after 
the example:
[[
The Haskell program imports a main program from Distribution.Simple, which 
implements the HPS simple build infrastructure.  The first line is present 
for Unix systems to ensure that, when it is run, the file is interpreted by 
the program runhugs.  On other operating systems, different mechanisms may 
be used to ensure that the file is interpreted as a Haskell program.
]]

Hmmm... I'm still not entirely happy with this.  It seems that the 
"universal" setup.lhs format is being adapted to serve the needs of a 
single operating system.  Indeed, to serve a particular Haskell system (Hugs).
...
One thing that the windows community might do is write a program which
reads the first line of an .hs or .lhs file to see if it's a #! line,
and if so executes it with the given compiler, if not, drops you into
an interpreter, or an editing environment or whatever.
Isn't this akin to what Simon suggested, except that his suggestion was 
independent of any particular operating system?
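
For what it's worth, the first-line check itself is small and need not depend on the operating system. A minimal sketch follows; the function name is my own invention, and the sample "#!/usr/bin/env runhugs" line used below is only an assumed form of the script header.

```haskell
import Data.List (stripPrefix)

-- | If the first line of a script starts with "#!", return the command and
-- its arguments; otherwise return Nothing.  What the caller does in the
-- Nothing case (interpreter, editor, ...) is left open.
shebangCommand :: String -> Maybe [String]
shebangCommand contents =
  case lines contents of
    (firstLine : _) ->
      case stripPrefix "#!" firstLine of
        Just rest | not (null (words rest)) -> Just (words rest)
        _                                   -> Nothing
    [] -> Nothing
```

Given a file beginning "#!/usr/bin/env runhugs", this returns Just ["/usr/bin/env", "runhugs"]; a file with no such line gives Nothing. Deciding which installed Haskell system to run is the harder part, and this sketch does not address it.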
...
I just don't know what the default double-click behavior for windows
should be, really.  Can you have a right-click "Open With" context
menu?  Can windows prompt you with the options (run, load in
interpreter, or edit)?
(For Windows, it's normally based on the file extension and registry 
entries;  this may be problematic if there are multiple Haskell 
installations on a single machine.)

[...]
...
So what's my concrete suggestion?  Maybe (a) to emphasize the Setup
test option (sect 4.2), and (b) define a default location in the
install path for a  test suite, or an entry in the package description
(4.1, and elsewhere) specifying a command to run the test suite.
It occurs to me that this is one of the tasks that can be implemented
somewhat independently of most of the system.
Yes, I think it's mostly a documentation thing, maybe with a couple of 
helpful defaults to minimize the barriers to actually shipping self-test code.
...
Section 3.1.1, final para:
I found this paragraph really hard to understand.  It seems to be
contradicting itself (...are separate...all use the same user
packages...), but maybe I'm just not understanding.
It says that since the package database depends on the compiler and
version, you should be careful if you have two installations of the
same compiler and the same version.
Let's say that a user mounts his home directory on his linux machine
and on his solaris machine.  He also uses ghc-6.2 on both machines.
His user packages are installed in ~/ghc-6.2-packages (or something)
and are compiled for Linux.  Now he had better be careful using the
solaris version of the compiler, because his packages are compiled for
Linux, and they're in a spot where the solaris compiler will find
them!
Your 1st paragraph above is much clearer.  Let's see if I can make it fit 
to replace the text I commented on...

[[
The package database takes account of the compiler and version, but not the 
host system.
So, if there are multiple installations of the same compiler/version on 
different hosts, they should not share a library package installation.
]]

If that catches the essence of what you are saying, details can be 
elaborated as needed by subsequent sentences.
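
To show the collision in miniature, here is a sketch of a user package directory keyed only by compiler and version, modelled on the "~/ghc-6.2-packages (or something)" example. Both function names and the directory layouts are illustrative assumptions, and the host-qualified variant is purely hypothetical; neither is taken from the spec.

```haskell
-- | Per-user package directory keyed only by compiler and version,
-- following the "~/ghc-6.2-packages (or something)" example.
userPackageDir :: FilePath -> String -> String -> FilePath
userPackageDir home compiler version =
  home ++ "/" ++ compiler ++ "-" ++ version ++ "-packages"

-- | Hypothetical variant that also includes a host tag, so that a shared
-- home directory yields a different location on each kind of host.
hostPackageDir :: FilePath -> String -> String -> String -> FilePath
hostPackageDir home host compiler version =
  home ++ "/" ++ host ++ "/" ++ compiler ++ "-" ++ version ++ "-packages"
```

With a shared home directory, userPackageDir "/home/u" "ghc" "6.2" is "/home/u/ghc-6.2-packages" on both the Linux and the Solaris machine, which is exactly the situation described; the host-qualified variant would keep the two apart.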

#g

------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact