
Dear library folks, Having finished my current batch of compiler hacking I'd like to kickstart the libraries discussion again. As you may recall, we were stuck on the layout of the library hierarchy - below I've appended a new version (relative to Malcolm's last version) with some changes which I've listed first. I hope we can take this and converge on something which we're all happy with. I've also included an attempt at a set of guidelines for the contents of the top-level categories, as requested by one or two people: this should make it easier to place new libraries, and it's also useful to see some of the places where categories overlap. We should also write down guidelines on naming (not only for libraries themselves but also for functions, types and classes) at some point. There are several process-related issues to agree on before we can start actually writing code, but this message is too long already so I'll try to tackle them in separate mails tomorrow. Cheers, Simon Changes to the hierarchy - Data and Structures merged. The distinction between the two isn't clear: eg. IORef and STRef could be considered structures, as could Complex, Maybe, Either, List, PackedString etc. Put them all under Data - I think Data.Array, Data,Trees etc. look quite natural. - Moved Control back up to the top level (was previously in System). I'll give way on the Lang issue, but I think we'll have to expand the remit of Data to include not only data types themselves but also operations and classes over data (we already have Data.Memo), so now Dynamic and Generics fit in there too. Having a 'Type' hierarchy doesn't seem appropriate, because most libraries in Data also define types and it would just create confusion. - Moved Monad back into Control. - Added hypothetical Network.Protocol.{HTTP,FTP,SMTP,...} - Added Data.Bool, Data.Tuple - Added Text.Show & Text.Read - Moved Prelude.ShowFunctions to Text.Show.Functions. - Clarified that Numeric exports the Haskell 98 numeric classes (Num, Integral, Real etc.). - Renamed URL back to URI. - Capitalised NHC (hope that's OK). - Algebra & Numeric could be merged (as suggested by Dylan Thurston in the previous thread), I haven't made any changes here yet though. Numeric seems an appropriate place for the existing numeric operations and classes, but 'Math' would be less so. - There are still a few things left in the Prelude that don't have a home elsewhere: curry, uncurry, id, const, (.), asTypeOf, seq, ($!). ------------------------------------------------------------------------ ----- Hierarchy guidelines Control Libraries which provide functions, types or classes whose purpose is primarily to express control structure. Data Libraries which provide data types, operations over data types, or type classes, except for libraries for which one of the other more specific categories is appropriate. Algebra ? (someone more knowledgable please fill this in) Database Libraries for providing access to or operations for building databases. Debug Support for debugging Haskell programs. FileFormat Support for reading and/or writing various file formats (except language source which lives in Language, and textual file formats which are catered for in Text). Foreign Interaction with code written in a foreign programming language. Graphics Libraries for producing graphics or providing graphical user interfaces. Language Libraries for operating on or generating source code in various programming languages, including parsers, pretty printers, abstract syntax definitions etc. Numeric Functions and classes which provide operations over numeric data. Network Libraries for communicating over a network, including implementations of network protocols and standards. System Libraries for communication with the system on which the Haskell program is running (including the runtime system). Text Libraries for parsing and generating data in a textual format (including structured textual formats such as XML, HTML, but not including programming language source, which lives in Language). Others: GHC, NHC, Edison Further top-level names will be allocated on an as-needed basis. ------------------------------------------------------------------------ ----- Prelude -- Haskell98 Prelude (mostly just re-exports other parts of the tree). Control Exception -- (opt, inc. error & undefined) Concurrent -- as hslibs/concurrent CVar -- these could all be moved under Data Chan MVar Merge QSem QSemN SampleVar Semaphore Parallel -- as hslibs/concurrent/Parallel Strategies Monad -- Haskell 98 Monad library ST -- ST defaults to Strict variant? Strict -- renaming for ST Lazy -- renaming for LazyST Either -- monad libraries State Error etc. Data Bits Bool -- &&, ||, not, otherwise Tuple -- fst, snd Char -- H98 Complex -- H98 Dynamic Either Int Maybe -- H98 List -- H98 PackedString Ratio -- H98 Word IORef STRef Binary -- Haskell binary I/O Digest MD5 ... -- others (CRC ?) Array -- Haskell 98 Array library Overloaded -- (opt) IArray - GHC's overloaded arr libs Mutable -- (opt) MArray IO -- mutable arrays in the IO/ST monads ST Trees AVL RedBlack BTree Queue Bankers FIFO Collection Graphs FiniteMap Set Memo -- (opt) Unique Algebra DomainConstructor -- formerly DoCon Geometric -- formerly BasGeomAlg Database MySQL PostgreSQL ODBC Debug Trace Quickcheck Observe -- choose a default amongst the variants Textual -- Andy Gill's release 1 ToXmlFile -- Andy Gill's XML browser variant GHood -- Claus Reinke's animated variant Edison -- (opt, uses multi-param type classes) Prelude -- large self-contained packages should have Collection -- their own hierarchy? Like a vendor branch. Queue -- Or should the whole Edison tree be placed ... -- under [Data.]Structures? FileFormat -- 'Codec' might be a more accurate name? Compression Gzip Bzip2 Graphics Jpeg Ppm Png Audio Wav Mp3 Video Mpeg QuickTime Avi Foreign Ptr StablePtr ForeignPtr -- rename to FinalisedPtr? to void confusion with Foreign.Ptr Storable Marshal Alloc Array Errors Utils C Types Errors Strings Graphics UI Gtk FranTk Fudgets CleanIO Drawing HOpenGL Format -- use FileFormat.Graphics instead Language Haskell -- hslibs/hssource Syntax Abstract Core Lexer Parser Pretty Python? C? Numeric -- exports std. H98 numeric type classes DSP FastFourierTransform Noise Oscillator Network -- won't need to be optional (will use FFI only) Socket -- redesign (merged Socket, SocketPrim, BSD) URI -- general URI parsing CGI -- one in hslibs is ok? Protocol HTTP FTP SMTP System -- Interaction with the "system" IO -- H98 + IOExts - IOArray - IORef Directory Select Console GetOpt Readline Posix -- redesigned, use FFI only IO -- there was a suggestion to split Posix into Process -- separate chunks like IO + Process Win32 -- the full win32 operating system API Mem -- rename from cryptic 'GC' WeakPointer -- (opt) StableName -- (opt) Time -- H98 + extensions Locale -- H98 CPUTime -- H98 -- split H98 "System" (too generic) into: Exit Environment (getArgs, getProgName, getEnv ...) Text Read Show Functions -- optional instance of Show for functions. Regex -- previously RegexString PrettyPrinter -- default (HughesPJ?) HughesPJ Wadler ... Html -- HTML combinator lib Xml Combinators Parse Pretty Types ParserCombinators -- no default Parsec Hutton_Meijer ... GHC Primitives UnboxedTypes ... NHC Stuff

Wed, 23 May 2001 17:43:01 +0100, Simon Marlow
below I've appended a new version (relative to Malcolm's last version) with some changes which I've listed first.
I will only say that I like it. -- __("< Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/ \__/ ^^ SYGNATURA ZASTÊPCZA QRCZAK

Simon Marlow
below I've appended a new version (relative to Malcolm's last version) with some changes which I've listed first.
Marcin Kowalczyk:
I will only say that I like it.
Me: Likewise. I'm very happy with the broad structure now. There are only minor nits left (such as already mentioned by other people, e.g. FileFormat.Graphics -> FileFormat.Image). Regards, Malcolm

On Wed, May 23, 2001 at 05:43:01PM +0100, Simon Marlow wrote:
Data Complex -- H98
This seems decidedly out of place here: complex numbers are not interesting as a data structure, but as a mathematical construct. (I.e., the instance of Num is the interesting part.) Part of the motivation for a "Mathematics" hierarchy is to have a natural place where, e.g., "Complex" would belong.
Data Int
This seems somewhat less out of place, since, IIRC, the functions in this library deal with low-level representation.
Algebra DomainConstructor -- formerly DoCon Geometric -- formerly BasGeomAlg
Numeric -- exports std. H98 numeric type classes DSP FastFourierTransform Noise Oscillator
You make a good point that these current libraries do not fit all that well under "Mathematics". Maybe we should keep "Numeric" and create a new "Mathematics" hierarchy which would contain: - the std. H98 classes. [Incidentally, I agree with Sergio Mechvelliani that the "Num" class is badly named. I would vastly prefer to call the class "Ring".] - the Complex class - Mechvelliani's classes - arbitrary precision arithmetic - matrix classes Maybe "Computational" would be a better name than "Numeric"? Best, Dylan Thurston

Dylan mentions "matrix classes" and then concludes:
Maybe "Computational" would be a better name than "Numeric"?
I've been trying to demonstrate that there more to "matrix" than just data structure or just simple linear equation solver. It is easy to come up with the latter, or with a set of primitive product operations. But people seem to be ignoring the fact that there is much more to linear algebra than those trivialities. Eigenproblems for example... Dense cases, sparse cases and all that Pandorra box of non-trivial engineering tools. These things are still alive, thank you very much, and the papers are still written about them every day. Neither the problems nor the papers are trivial. Yet they are being ignored in your hierarchy. Please, mark at least some spot for them. And do not tie them to "matrix" because "matrix" is just one specific representation of operator algebra -- as I am showing in some modules of mine, for example. I do not care what name you choose. Computational? Linear algebra? But please do not ignore them altogether. Jan

On Wed, May 23, 2001 at 05:43:01PM +0100, Simon Marlow wrote:
Edison -- (opt, uses multi-param type classes) Prelude -- large self-contained packages should have Collection -- their own hierarchy? Like a vendor branch. Queue -- Or should the whole Edison tree be placed ... -- under [Data.]Structures?
In this particular instance, I'd like to use the Edison structures, and I'd like them to be available in a standard place. But maybe this just means there should be copies of some of these libraries under the Data hierarchy. Best, Dylan Thurston

Here's a concrete proposal for a Mathematics subtree. Following Jan Skibinski's comment about linear algebra and some thought, I convinced myself that there is no natural division between numerical/computational libraries and more "pure" mathematics. Description: Mathematics Libraries for mathematical structures or algorithms. ------ Mathematics Prelude -- The so-called "numeric" hierarchy from -- the H98 Prelude Complex -- Standard H98 Ratio -- Standard H98 DSP FastFourierTransform Noise Oscillator LinearAlgebra -- Provides a simple-minded matrix implementation Sparse LUDecomposition -- Various algorithms/representations go here, ... -- Somebody help me with a good hierarchy. DomainConstructor -- Mechveliani's library ExactReal -- Various people have done good work here. -- Unfortunately, no-one seems to have published -- their source code in library format.

Sorry for the lengthy discourse, but I was unable to cut it down even after I re-read it twice :-). On Fri, 25 May 2001, Dylan Thurston wrote:
Here's a concrete proposal for a Mathematics subtree. Following Jan Skibinski's comment about linear algebra and some thought, I convinced myself that there is no natural division between numerical/computational libraries and more "pure" mathematics.
Good! I'll start with some critical comments first, taking one of the items from your list as an example.
Description: Mathematics Libraries for mathematical structures or algorithms.
------ Mathematics Prelude -- The so-called "numeric" hierarchy from -- the H98 Prelude Complex -- Standard H98 Ratio -- Standard H98 DSP FastFourierTransform
Why here? Unless you are referring to one specific FFT implementation (Donatio?) the FFT can be looked upon as just one of many tools of linear algebra. Why FFT? What so sanctimonious about it? Nothing; it is popular because of its speed, and it is versatile because of its popularity (I admit, I exaggerate quite a bit here to make my point :-)). In fact, DFT is one of many possible vector transformations, and FFT is its fast implementation. In this case - the unitary transformation. But you can implement it on many levels: using streams of bits, blocks of complex numbers (vectors, matrices), etc. But other unitary transformations have their uses too. Have you ever heard about Walsh transformation, with its (orthogonal) basis vectors made of (+1, -1) components, such as: [1,1,1,1,1,1,1,1], [1,1,1,1,-1,-1,-1,-1], etc? Geophysists used to employ it (or still do it?) for analysis of their seismic data. Incidentally, there is also a Walsh-Hadamard transform, a basic mixing transformation for quantum computations. How about Hilbert transformation? You can use it to upgrade real signals to their complex equivalents, such as sin k*n -> exp i*k*n. Imagine a plot of your original real signal in X-T coordinates, suddenly augmented by Y-T "wing", as by some miracle. Very useful, believe me! Sorry for ruining the FFT altar, but I think we should keep certain things in a proper perspective. I believe there will be many implementations of FFT in Haskell libraries - all depending on the context. DSP, images, generic linear algebra, etc. would all want their own versions of FFT, because of specific representation they use. And that brings me to the second point I want to make. If I could describe an algorithm, such as FFT, in generic terms than I would have and example of a fine generic algorithm. Unfortunately, more often than not I must be datatype specific. And that means some choice of Matrix and Vector. A very controversial issue. How to design these two to satisfy everyone? I do not think it's possible at all. For a naive user of graphics those are 2D or 3D entities (or 4D for implementation reasons). And real too! Represented by arrows so to speak. Some other users would like to have it n-dimensional but Double, some need Complex Double, some would be only happy with a generic "a". Then there is a question of underlying implementations. Arrays? If yes, then what kind of arrays? Would lists be satisfactory and if yes with what size limitations? Or maybe I can forget about lists and arrays and choose something else altogether (as I do it in QuantumVector module)? Facing all those possible choices, should I be arrogant enough to claim the generic names Vector and Matrix for something that I choose to implement my own way? [We once sidestepped at least this controversial naming issue by calling the things BasicVector and BasicMatrix, respectively. For the same reason I have introduced the QuantumVector, not just THE Vector]. ------------------------------------------------------ But all those remarks were somewhat pessimistic and destructive. So here are some constructive ideas: + Keep low profile regarding Matrix and Vector concepts. Do not build anything around Matrix and Vector, but rather think algorithmically instead. If you must use the two, treat them more or less as your private stuff. For example, my module Eigenproblem casts several algoritms in terms of abstract Hilbert operators and Hilbert spaces. This is as far as I was able to go abstract-wise for a moment and I still might improve on it one day. But that module relies on the lower level implementation module LinearAlgoritms, which implements everything as lists. Why lists? Because I still do not like Haskell Arrays as they are and because I am lazy and lists are easy to work with. But that's not the point. What is important are the ideas implemented in those modules, which can be easily re-worked for any data structure, Haskell arrays included. + Expect that each proposed linear algebra library covers a significant chunk of such "territory" because only then it can be proven useful and consistent. If a library just defines a Matrix, it is useless by a virtue of arguments presented before. If it goes only as far as LUDecomposition it is naive and simple minded -- good only for solving linear equations. If it inverts the equations and also finds eigenvalues of symmetric matrices it is a step in right direction. If it handles eigenproblems (eigenvalues and eigenvectors - all or some) of Hermitian operators, nonsymmetric real operators, up to any kind of operator of small and medium size problems it is useful. If it handles big eigenproblems (n > 1000 or so) - no matter what technology (using "sparse" matrices perhaps) it uses, it is very good. If it provides Singular Value Decomposition on a top of all of the above than it should be considered excellent. + Of course, any good linear algebra library would provide many tools for handling the tasks which I outlined above. But those are not the goals in themselves, they are just the tools. What I have in mind here are all those Choleskys and LUDecompositions, and triangularizations and tridiagonalization procedures, and Hessenberg matrices, and QR or QL decompositions, etc. Do not measure the value of a library by a number of such tools appearing in it. Value its consistency and ability to achieve the final solution. ---------------------------------------------------------- Summarizing: LinearAlgebra appearing in hierarchy (after Dylan) Mathematics ... LinearAlgebra NonlinearAlgebra (perhaps) .... is just one element which should be clearly marked as very important. But I'd suggest not to mark any of its specifics, because that's very much the design and implementation choice. But we could specify what should be expected from a good linear algebra library, as I tried to describe it above. Jan
participants (5)
-
Dylan Thurston
-
Jan Skibinski
-
Malcolm Wallace
-
Marcin 'Qrczak' Kowalczyk
-
Simon Marlow