
Simon,
... At the moment, the only packages you can add in this way are:
ALUT, HGL, HUnit, OpenAL, OpenGL, QuickCheck, X11, cgi, fgl, haskell-src, html, mtl, network, parsec, time, xhtml
... instead include smaller and more "fundamental" packages: ByteString, regexps, Collections, Edison, Filepath, Time, networking, web.
I agree with your original idea and Bulat's: the libraries packaged with a compiler should, as a general rule, be those that support the core features of the language or the usage of the language itself. Not far off from that are things like being able to work with the operating system--these are essentially related to I/O--because without them your machinations would only be able to operate on themselves. Without getting philosophical about what is and is not necessary for a language to "speak," I'm just looking at the big libraries generally packaged with mature languages like Ada and C++.

So here is my vote. Top priority: those you already mentioned, with Bulat's vote for Edison, ByteString, Filepath and Time; I vote for fgl (in my experience fgl is as useful in its own way as Data.Map), mtl, and haskell-src. So the core list might be:

* base, haskell98, template-haskell, stm, mtl, haskell-src
* ByteString, Edison, fgl--only because for Edison and fgl there are no reasonable alternatives, so these are in a sense "standard"
* readline, unix, Win32, Filepath and Time (they really complement unix and Win32)
* Cabal--though this is an extraordinary convenience, it is not strictly necessary (Haskell-GHC users could always be forced to include everything outside of a standard system directory. I don't mean a GHC-system directory--there are far too many language-specific directory structures crawling around; it's like trying to hide from Wal-Mart.)

... well, that's how I had started the email, but I think your original idea is right: stick with only what you absolutely need and with what is part of the Haskell "standard" library. base, haskell98, template-haskell, stm, mtl, haskell-src and Cabal are fine; the rest can go together in a special "distribution."

Why add mtl and haskell-src? Those seem to be situations where you would have to install a specific library (mtl, say) and specially integrate it into your build of another. At some point the dependencies merit the adoption of a "standard." Extra-Haskell parsing is a good example of leaving things out: HaXml's parser (HuttonMeijerWallace, PolyLazy) is roughly interchangeable with Parsec in many ways--though reasonable people may differ on which is actually better--and installing or uninstalling either would be easy.

Things that are missing... There are some library systems that really should come standard and therefore need development. The most specific I can think of are deep-core Haskell debugging libraries (would you think of grabbing gcc without gdb?)--a GHC "Hat." Things that allow specialised code, such as haskell-src-exts's dynamic loading modules, would also be good--this is like having dynamic libraries for Haskell.

Sorry for the length; I've been literally bugged-out with getting a commercially and GPL-usable comparable replacement for GMP going... Personally I would love to give GHC a built-in interface to a real high-level math library (vectors, arbitrary-precision Reals?), but that would be as bad as forcing all gcc users to uninstall a hypothetical gcc-BLAS before they could install their own system-tuned or more precise library. I have been realising more and more that Integer is not a math library but a data type (hence the bitwise operations). -Pete
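Peter's closing point--that Integer behaves as a data type, complete with bitwise operations--can be seen directly: GHC's Integer has a Data.Bits instance, so bit-level operations work on arbitrary-precision values. A minimal sketch (the values are illustrative, not from the thread):

    import Data.Bits (shiftL, testBit, (.&.), (.|.))

    main :: IO ()
    main = do
      let n = (1 `shiftL` 100) .|. 0xFF :: Integer  -- a 101-bit value
      print (testBit n 100)   -- True: bit 100 is set
      print (n .&. 0xF0)      -- 240: masks bits within the low byte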

Hello Peter, Wednesday, August 23, 2006, 10:00:00 AM, you wrote:
... At the moment, the only packages you can add in this way are:
ALUT, HGL, HUnit, OpenAL, OpenGL, QuickCheck, X11, cgi, fgl, haskell-src, html, mtl, network, parsec, time, xhtml
... instead include smaller and more "fundamental" packages: ByteString, regexps, Collections, Edison, Filepath, Time, networking, web.
Top priority: those you already mentioned, with Bulat's vote for Edison, ByteString, Filepath and Time; I vote for fgl (in my experience fgl is as useful in its own way as Data.Map), mtl, and haskell-src.
Sorry, but you misquoted me and seem to have misinterpreted the whole idea:

1. Simon suggests that there is a core GHC distribution. It should be the GHC _compiler_ itself and contain only libraries whose implementation is closely tied to the compiler version (such as Template Haskell) or required to build GHC itself (such as regexp-posix). On package-friendly OSes it should be the only GHC distribution, with the intent that all other libraries required for compilation of a concrete program will be installed automagically on demand.

2. For Windows-like OSes, where users prefer to see larger monolithic installations, we should include more libraries in a "standard distribution". I suggested excluding the graphics/sound libs from the list above, i.e. leaving only HUnit, QuickCheck, cgi, fgl, haskell-src, html, mtl, network, parsec, time, xhtml, and adding more "fundamental" libraries: ByteString, regexps, Collections, Edison, Filepath, Time, networking, web.

My main idea is that the libraries I suggest including are very small, so they will not make the installer substantially larger. The rule of thumb is that bundled libraries should be much smaller than the 'base' lib and implement one of the following:

* data structures and algorithms (highest priority)
* web/networking
* interfacing with the OS, such as file operations (lowest priority)

I think these are the kinds of libraries most widely used. Moreover, excluding the large and rarely needed graphics libs will allow us to cut down the GHC installer by about 20%, while "selling" a basic GHC without these libraries will not buy us any more size reduction.

3. We can also create a "larger installer" which includes libraries and tools that are also highly popular but too big to "sell" to everyone who wants to download GHC. I mean in the first place graphics, databases and RAD tools; concretely, wxHaskell, gtk2hs, db libs, VisualHaskell and EclipseFP.

And last: all the libraries installed except the core ones will be easily upgradable without upgrading GHC, which will boost their development. On the other side, when we upgrade GHC itself, it should be possible to keep the versions of these libraries that are already installed.

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bulat Ziganshin
1. Simon suggests that there is a core GHC distribution. It should be the GHC _compiler_ itself and contain only libraries whose implementation is closely tied to the compiler version [...]
2. For Windows-like OSes, where users prefer to see larger monolithic installations, we should include more libraries in a "standard distribution".
I'd like to see this (an extra-libs package) be more than just a band-aid for package-management deficiencies. A standard library install could encourage the use of those libraries, and perhaps also work as a recommendation of specific libraries. I'd certainly encourage (virtual) Debian/RPM packages that depend on the standard package bundle, so that it is easy to get the standard libraries for a default installation, and to get coherent installations on different machines.
I suggested excluding the graphics/sound libs from the list above
I'd prefer to include libraries if they are stable, useful, and provide unique functionality. I'm a bit ambivalent about including libraries with overlapping functionality--while I expect opposing this to be futile, I do think it is better to focus effort on a single standard implementation. Oh, and while this is currently a GHC issue, it would be nice if other systems supported the same set of libraries. -k -- If I haven't seen further, it is by standing in the footprints of giants

Bulat Ziganshin wrote:
On the other side, when we upgrade GHC itself, it should be possible to keep the versions of these libraries that are already installed.
Now *that* is the tricky part. It's something I believe is important and I'd like to see GHC support this in the future. Just replacing GHC without upgrading libraries (or RTS) should be possible, but we have to be careful not to modify any shared knowledge between GHC and the RTS. If the RTS is upgraded, we have to be careful about things that the RTS knows about the base package. If the base package is upgraded without also replacing the other libraries... this is where it gets really tricky. Binary dependencies between library code tend to be very deep due to cross-module inlining and optimisations, so right now the chances of upgrading base without replacing everything else are almost zero. To be able to do this I believe we have to track very carefully the API/ABI that a package is exposing, so that we can be sure that a replacement is truly compatible. This may mean restricting optimisations across package boundaries. Cheers, Simon
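To make the coupling Simon describes concrete, here is a hypothetical sketch (the module, the package names and the function are invented for illustration). With -O, GHC records small functions' unfoldings in the .hi file and copies them into clients, so a client's object code can embed the library's old definition:

    -- A hypothetical library module, say in package mylib-1.0.
    module MyLib (clamp) where

    {-# INLINE clamp #-}
    clamp :: Int -> Int
    clamp x = max 0 (min 255 x)

    -- A client compiled with -O takes clamp's unfolding from MyLib.hi:
    --
    --   module Client where
    --   import MyLib (clamp)
    --
    --   pixel :: Int -> Int
    --   pixel = clamp . (* 2)
    --
    -- If mylib-1.1 changes clamp's body, Client's object code still
    -- contains the 1.0 definition; the ABI tracking described above
    -- would have to detect this, or the inlining would have to be
    -- suppressed across the package boundary.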

Simon Marlow wrote:
If the base package is upgraded without also replacing the other libraries... this is where it gets really tricky. Binary dependencies between library code tend to be very deep due to cross-module inlining and optimisations, so right now the chances of upgrading base without replacing everything else are almost zero. To be able to do this I believe we have to track very carefully the API/ABI that a package is exposing, so that we can be sure that a replacement is truly compatible. This may mean restricting optimisations across package boundaries.
I think it would be a great pity to sacrifice any optimizations if it were at all possible to simply reconfigure/rebuild/reinstall all the existing packages on one's system after upgrading GHC. As long as there were a tool to do this automatically, it wouldn't be a big deal. Perhaps packages which are not in the GHC core but which depend on the RTS (perhaps indirectly, via other C libs they link to which call into Haskell) could be explicitly marked as such, so the user could manually check whether the package had been tested with the latest GHC, or whether there was an updated version to download (hopefully this could be automated using a central database of all known packages). Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com
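A rough sketch of the kind of tool Brian suggests, assuming each non-core package lives in its own source directory with a standard Setup.hs (the directory names and package list here are hypothetical):

    -- Rebuild every listed package against the freshly installed GHC.
    import System.Cmd (system)
    import System.Directory (getCurrentDirectory, setCurrentDirectory)

    packages :: [FilePath]
    packages = ["fgl-src", "mtl-src", "parsec-src"]  -- hypothetical checkouts

    rebuild :: FilePath -> IO ()
    rebuild dir = do
      top <- getCurrentDirectory
      setCurrentDirectory dir
      mapM_ (system . ("runhaskell Setup.hs " ++))
            ["configure", "build", "install"]
      setCurrentDirectory top

    main :: IO ()
    main = mapM_ rebuild packages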

Hello Simon, Wednesday, August 23, 2006, 2:26:56 PM, you wrote:
Just replacing GHC without upgrading libraries (or RTS) should be possible
IMHO this is not very useful. Typically, bug fixes are spread over the compiler, RTS and libraries, so upgrading only the compiler seems a very strange idea, and upgrading to a major new version (6.4.2 -> 6.6) will be impossible anyway.
If the base package is upgraded without also replacing the other libraries... this is where it gets really tricky. Binary dependencies between library code tend to be very deep due to cross-module inlining and optimisations, so right now the chances of upgrading base without replacing everything else are almost zero. To be able to do this I believe we have to track very carefully the API/ABI that a package is exposing, so that we can be sure that a replacement is truly compatible. This may mean restricting optimisations across package boundaries.
I think the better way is to supply the non-core libs in source form and just recompile them in this case. So, eventually, a Windows installation should become the equivalent of installing core GHC and then downloading/compiling all the bundled libs. The benefit of packaging these libs together with core GHC would just be that some "standard library" exists and is automatically downloaded with any Windows GHC installation; in all other ways these libs should be no different from "non-standard" ones.

For package-based Unixes, I propose making packages for all libs that are considered "standard" a requirement for considering a GHC port to that OS complete. So GHC developers will not be bothered (at least, in theory :D) with library distribution/support/upgrading problems, but nevertheless we will retain some "standard libraries" set, which is supposed to be supported by many OS/compiler combinations.

As a consequence of this idea, non-core libs should be installed in source form and then compiled by the usual Cabal procedure. On installation of a new GHC version, existing libs should be recompiled for it; or alternatively, GHC should compile each library when it is first used (with this GHC installation), like JHC does.

The ability to make at least minor GHC upgrades without recompiling the installed libraries (major upgrades are definitely impossible because of .hi format changes) seems great, but it means that we can't inline even the definition of 'head' outside of the 'base' lib :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
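For reference, the "usual Cabal procedure" Bulat mentions amounts to a trivial Setup.hs per library plus three commands; a sketch (exact flags may vary per package):

    -- Setup.hs -- the whole build script for a simple Cabal package.
    -- Reinstalling after a GHC upgrade is just re-running:
    --
    --   runhaskell Setup.hs configure
    --   runhaskell Setup.hs build
    --   runhaskell Setup.hs install
    import Distribution.Simple (defaultMain)

    main :: IO ()
    main = defaultMain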

On Wednesday, 23 August 2006 at 15:23, Bulat Ziganshin wrote:
[...] I think the better way is to supply the non-core libs in source form and just recompile them in this case. So, [...]
Nice theory, but this doesn't work at all in practice: The majority of the packages mentioned so far are not purely Haskell, so one needs tons of development tools and C libraries, headers, etc. (all in a consistent state, of course) to compile those packages, which is a bit tricky on *nices and a huge task on WinDoze. And there might even be packages where Joe User can't get the development tools without signing an NDA...

Regarding the sets of packages: this discussion comes up again and again, but it does not really focus on the main point, namely that we simply don't know what is important for a specific user. Some people can't live without XML parsing, others want fancy graphics and sound, others depend on elaborate data structures but are otherwise happy with putStrLn. So the only sane way to proceed is as Simon suggested: ship GHC with the packages needed to compile and run "Hello, world!" plus those which are intimately tied to GHC, and ship all the rest separately.

Cross-module optimizations introduce a lot of dependencies, but they are one of the strong points of GHC and are *absolutely* necessary to get decent performance for some applications. I would very much object to giving this up for more stable ABIs. This is not a real problem on mature platforms with sensible package management systems, like most modern Linux distros, where one could easily upgrade GHC and automatically upgrade all the Haskell packages needed. Of course WinDoze is a bit different here. Therefore I opt for two different WinDoze installers; everything in between doesn't make sense IMHO:

* A set of highly modularized, small, separate installers for GHC/core packages and each non-core package.
* As an alternative, a "sumo"/"omnibus"/<whatever you call it> installer containing everything from the central darcs repo, plus probably even more. This can have the option to install only a subset of the contained packages.

The best option would of course be some kind of "net installer", just like Cygwin's setup.exe, but this is of course something for the future. Meanwhile, the first set of small installers should make people happy who have only a limited amount of disk space and a slow internet connection, while the "sumo" installer should make people with modern machines and high-speed ADSL/cable-modem/T1 happier. Let's not forget that we live in a world where patches regularly exceed 100MB, downloadable game demos are >1GB, and disks with >200GB are common even in cheap new computers. In such a setting, it is hard to argue that it is "much better" to surf the Net for an hour to get all the packages one wants instead of downloading and installing everything in a single click within minutes... Cheers, S.

Hi
Nice theory, but this doesn't work at all in practice: The majority of the packages mentioned so far are not purely Haskell, so one needs tons of development tools and C libraries, headers, etc. (all in a consistent state, of course) to compile those packages, which is a bit tricky on *nices and a huge task on WinDoze.
I believe it's spelt Windows :) GHC ships with a large chunk of gcc and assorted stuff on Windows, and you can happily use GHC to compile up C programs.
And there might even be packages where Joe User can't get the development tools without signing an NDA...
I'm not convinced a project that is open source should be shipping things that people can't build; that kind of goes against the whole open-source thing, and is just plain annoying.
* A set of highly modularized, small, separate installers for GHC/core packages and each non-core package.
* As an alternative, a "sumo"/"omnibus"/<whatever you call it> installer containing everything from the central darcs repo, plus probably even more. This can have the option to install only a subset of the contained packages.
The best option would of course be some kind of "net installer", just like Cygwin's setup.exe, but this is of course something for the future.
Once cabal/hackage is finished, something like this probably becomes quite easy to do--so it might not be that far in the future.
Meanwhile, the first set of small installers should make people happy who have only a limited amount of disk space and a slow internet connection, while the "sumo" installer should make people with modern machines and high-speed ADSL/cable-modem/T1 happier. Let's not forget that we live in a world where patches regularly exceed 100MB, downloadable game demos are >1GB, and disks with >200GB are common even in cheap new computers. In such a setting, it is hard to argue that it is "much better" to surf the Net for an hour to get all the packages one wants instead of downloading and installing everything in a single click within minutes...
Just because I have a fast machine doesn't mean I want to spend all that time downloading GHC. And people now have fast net connections and can bittorrent movies all day, making hard disk space precious once more. I agree with Bulat that it's sensible to try to keep some focus towards small and light, since those are the things that impress developers, who are our target market. Thanks Neil

Hi Bulat,
Sorry, but you misquoted me and seem to have misinterpreted the whole idea:
Sorry about that. I put the emphasis on your mention of "fundamental," almost to the exclusion of compiler builds. My point was that there are two design considerations when you are designing a compiler system: (1) the compiler itself (which practically everyone agrees should optimally be stand-alone); and (2) the core language libraries.

Domain-specific languages like Mozart, research languages or languages with only one compiler, such as Clean, and languages which lack good FFI support tend to be complete language packages. Relatively mature languages such as Fortran, C, C++, Ada and ML (SML/NJ, MLton) have "standard library" systems; I think Haskell is moving in that direction. The library systems in Haskell have gone beyond the Haskell98 standard of "core" functionality with extensions, such as GHC-specific code (and libraries integrated with it), TH, MTL and Arrows. What is "standard" is more properly a matter for Haskell-prime but may be (and has been) implemented in Haskell compiler systems, especially the "biggies," GHC and nhc98.

As compilers become more plentiful, organisations and even individuals may move away from the Microsoft/Borland/CodeWarrior core distributions and introduce their own separate compiler for a language, as Intel, Sun and IBM have for C, C++ and Fortran, and Comeau and Digital Mars for C++, among many others. Haskell hasn't gotten that far yet--once JHC and Yhc are production-ready, they might fill that position.
2. For windows-like OSes where users prefer to see larger monolithic installations we should include more libraries in "standard distribution".
I seriously believe the reason for standard distributions on Windows is the extreme difficulty of getting things to build correctly and work together. Once you have reached that beautiful point where almost everything is balanced and relatively stable--quick! Package it before it breaks again! Package distributions for OS X and mostly-GUI Linux distributions are a convenience; they aren't practically necessary as they are with Windows. Imagine trying to tie GHC distributions to a capricious system like MinGW--which may have older versions of gcc installed.
* data structures and algorithms (highest priority) * web/networking * interfacing with OS such as file operations (lowest priority)
--web/networking? When I first wrote that email last night I agreed with you that including web and networking tools would be good, as a kind of uniform interface to varying low-level system libraries, but these are the kinds of libraries that are easily installed separately from a distribution and may be volatile enough to merit separate update cycles. For HTML, XML and cgi networking in particular there are several stand-alone, stable libraries to choose from.
I think these are the kinds of libraries most widely used. ... the graphics libs will allow us to cut down the GHC installer by about 20%, while "selling" a basic GHC without these libraries will not buy us any more size reduction
I agree. Utility is, however, a very relative term. Core language facilities, such as general Haskell-tuned data structures, are not, because at some point all programs need them to function.
3. We can also create a "larger installer" which includes libraries and tools that are also highly popular but too big to "sell" to everyone who wants to download GHC. I mean in the first place graphics, databases and RAD tools; concretely, wxHaskell, gtk2hs, db libs, VisualHaskell and EclipseFP
Separately maintained, of course; that would give some freedom to Porters :)
And last: all the libraries installed except the core ones will be easily upgradable without upgrading GHC... boost their development.
That is the hope, I guess. The unfortunate problem with Haskell is rampant, rapid bit-rot: some programs written or last maintained in 2003--only three years ago!--are already outdated. GreenCard is a prime example of this. My point in emphasising a somewhat standard compiler-and-core-libraries setup was to encourage widespread support and maintenance of "new-standard" libraries, and to ensure that, by forcing the compiler to build with them, they would not be left behind. -Peter
participants (7)
- Brian Hulley
- Bulat Ziganshin
- Ketil Malde
- Neil Mitchell
- Peter Tanski
- Simon Marlow
- Sven Panne