
There now follows a small grab-bag of miscellaneous but related thoughts. (Which probably means that this post will spawn a 2000-message thread discussing one tiny side-issue, rather than the main thrust of the message...)

First of all, benchmarking. We have The Great Language Shootout, which tests how fast solutions implemented in various programming languages can be, given unbounded amounts of time, energy, expertise, and compiler modifications. Which is interesting, from a certain point of view. In another sense, it's less interesting. For example, a C program can be as fast as any Haskell program, since compiling Haskell entails transforming it *into* C. (Unless you're seriously going to suggest that GHC's native code generator is any match for the might of a half-decent C compiler...)

That got me thinking about a more interesting question: not *how fast* can it go, but *how easily* can it go fast? People have written fast C programs and fast Haskell programs, but how easy is it to write a "typical" program and have it go fast, without spending years optimising it?

With that in mind, I set off with a plan to do some small-scale benchmarking. Pick a problem, solve it in both Haskell and C++, in the "simplest, most obvious way", and then apply "the simplest, most obvious optimisations". Measure performance and compare.

There are benchmarks that just output some data. ("Find all the prime numbers less than 1000...") I dislike these for a couple of reasons, not least because you can precompute the correct answer and just write a program which outputs that. (!) So I decided that all my benchmark programs should take a variable-sized data file as input, and save the results as output.

There's an old maxim of running benchmarks multiple times, to ensure that whatever time you got wasn't just a fluke. But then, maybe the first run pulls the input file into the file cache, and subsequent runs go faster. If your input data was randomly generated, perhaps you chose a particularly optimal or pessimal data set by accident. So I decided that each test run should use a different input file of approximately the same characteristics (size, random distribution, etc.)

So I ended up picking benchmarks like "sort the lines of this file into ascending order", "count the number of unique words in this file", "produce a histogram of byte values", "compute some statistics of this list of numbers", etc. The tasks are all extremely simple, so that there is some hope of it being possible to also implement them in C++.

One benchmark turned out to be particularly interesting: I call it "byte histogram". The task is simple:

- Open a binary input file.
- Read a stream of bytes from it.
- Count how many times each of the 256 possible byte values appears.

The test inputs are binary files of various sizes, some with a uniform distribution, some with variously skewed distributions (so that some bytes have a vastly higher count than others).

Assuming we have some suitable import statements, it's quite easy to do this:

  bytes <- BS.readFile "Input.bin"
  let out = BS.foldl' (\ map byte -> MAP.insertWith (+) byte 1 map) MAP.empty bytes
  writeFile "Output.csv" (unlines $ map (\(k,v) -> show k ++ "," ++ show v) $ MAP.toAscList out)

(There is some slight trickiness if you want bytes with zero frequency to still be listed.)

All of this *works* perfectly - i.e., it produces the correct answers. It's also astronomically slow, and it's very easy to make it eat many gigabytes of RAM while it runs.
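For concreteness, the whole naive program might look roughly like this. (This is only a sketch: the abbreviations BS and MAP are my guess at the "suitable import statements" - lazy ByteString and Data.Map - and I've renamed the fold's accumulator so it doesn't shadow Prelude's map.)

  import qualified Data.ByteString.Lazy as BS
  import qualified Data.Map as MAP

  -- Naive byte histogram: fold over the bytes, counting occurrences
  -- in a Data.Map. This is the slow, space-hungry version under discussion.
  main :: IO ()
  main = do
    bytes <- BS.readFile "Input.bin"
    let out = BS.foldl' (\ m byte -> MAP.insertWith (+) byte (1 :: Int) m) MAP.empty bytes
    writeFile "Output.csv"
      (unlines $ map (\(k, v) -> show k ++ "," ++ show v) $ MAP.toAscList out)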
(If the program starts actually swapping, then you will truly know what "slow" is!)

OK, so what happens if we replace Data.Map with Data.IntMap? Well, it goes slightly faster, and consumes slightly less RAM. Inputs which didn't quite complete before will run to completion now. But performance is still abysmal.

Enough with this tree silliness. What happens if you use an array? Given that the entire program (apart from the last 0.02% of it) spends its time notionally mutating the data, a mutable array would be the logical choice. Dial up an IOArray Word8 Int and the structure of the program changes slightly, but it's still only a handful of lines of code. And the performance? Still glacial. In particular, suppose we have

  inc :: IOArray Word8 Int -> Word8 -> IO ()
  inc array byte = do
    count <- readArray array byte
    let count' = count + 1
    writeArray array byte count'

in the main loop. The program is giving us the right answer, but using absurd amounts of time and space to get it. Now watch this:

  inc :: IOArray Word8 Int -> Word8 -> IO ()
  inc array byte = do
    count <- readArray array byte
    let count' = count + 1
    count' `seq` writeArray array byte count'

And now, suddenly, memory usage becomes constant, regardless of input size, and run-time is slashed. The program goes from taking 50 seconds to process a certain file to taking only 0.02 seconds. And from taking 400 MB of RAM to taking... less RAM than I can actually measure properly. If we now replace the IOArray with an IOUArray then the seq call becomes superfluous, and there's a small performance increase, but nothing major.

None of this should be any surprise to anybody who actually knows how Haskell works. For anybody not yet expert enough to see the problem: the "inc" function doesn't read the array, see that a given byte has a count of 5, and overwrite that with a 6. It sees a 5 and overwrites it with a 5+1. By the end of the main loop, every single array cell contains 1+(1+(1+(1+(1+(1+....)))), which takes a stackload of RAM to hold. That explains why RAM usage is so absurd. On top of all that, creating all this data takes time, and the spiralling RAM usage makes the GC go crazy trying to find garbage to get rid of (when of course there is none). And finally, at the end, you have to unwind all these expressions. (Curiously, I haven't seen a stack overflow yet...)

Adding the seq call forces 5+1 to actually turn into 6, not eventually but *right now*, and so each array cell ends up actually containing a fully-evaluated integer. (No doubt this change in strictness also triggers an avalanche of optimisation opportunities, but the basic thing of importance is that we're not uselessly delaying evaluation now.) As I say, none of this is a surprise to anybody who actually knows how to get performance out of Haskell.

The strict version is actually faster than the C++ version. (I imagine I'm doing something horribly wrong with C++, of course...) Not drastically faster, but faster. Which probably just means that ByteString is doing nicer buffering than whatever the C++ standard library is called.

The important observation is this: One tiny, almost insignificant change can transform a program from taking 50 seconds and 400 MB of RAM into one that takes 0.02 seconds and 0.1 MB of RAM. And, at least in this case, the simpler version is the slow one. To say that Haskell is "slow" is both a little bit vague, and not really backed up by facts.
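I haven't shown the main loop itself, so here is a rough end-to-end sketch of how the fast variant might be put together. (The driver loop, the unpack-to-list walk and the use of getAssocs are my own filler, not the original code; the point is just that with an IOUArray the counts are stored unboxed, so every write forces the count and no thunk chain can build up.)

  import qualified Data.ByteString.Lazy as BS
  import Data.Array.IO (IOUArray, getAssocs, newArray, readArray, writeArray)
  import Data.Word (Word8)

  -- With an unboxed IOUArray the element must be a plain Int, so the
  -- new count is evaluated on every write; no seq needed.
  inc :: IOUArray Word8 Int -> Word8 -> IO ()
  inc array byte = do
    count <- readArray array byte
    writeArray array byte (count + 1)

  main :: IO ()
  main = do
    bytes  <- BS.readFile "Input.bin"
    counts <- newArray (minBound, maxBound) 0 :: IO (IOUArray Word8 Int)
    mapM_ (inc counts) (BS.unpack bytes)      -- the main loop
    pairs  <- getAssocs counts
    writeFile "Output.csv" (unlines [ show k ++ "," ++ show v | (k, v) <- pairs ])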
In about 5 minutes flat, I managed to write a Haskell program that's very simple, and yet faster than a comparably simple C++ program (and C++ is supposed to be "fast"). So it's not that Haskell is "slow". It's that Haskell is *tricky*. Tiny, tiny little changes that look innocuous can have vast effects on performance. And this is a nice little example of that effect.

As we (probably?) all know, strictness is a subtle and fickle thing. Consider:

  or :: Bool -> Bool -> Bool
  or True  True  = True
  or True  False = True
  or False True  = True
  or False False = False

Overlooking the obvious name clash, this seems like a perfectly legitimate definition for the logical-OR function. But now try this:

  xs `contains` x = foldr or False $ map (x ==) xs

A nice, pretty Haskell definition. But oh dear - this does not have the performance you would expect at all! But if you change the definition above to the apparently equivalent

  or True  _ = True
  or False x = x

then "contains" becomes much faster. If you saw these two functions side by side, you might well wonder what the hell the difference is. You might even wonder if the compiler would transform one to the other (although I'm not sure in which direction). But there is, in fact, a very, very big difference: one is lazier than the other. For counting bytes, lazy = bad. For searching for matches, lazy = good. Similarly, if you take the byte histogram program and replace the lazy ByteString with a strict one, RAM usage spikes sharply. In all, laziness is a tricky, tricky little hobbitses.

I wrote a C++ program, in the obvious way, and it was very, very fast. I wrote a Haskell program in several comparably obvious ways, and they were all cripplingly slow. Until I added the magic "seq", and suddenly got blinding speed. So while it is not correct to say "Haskell is slow", one could justifiably say "C++ is fast more reliably than Haskell".

Consider for a moment the original implementation with Data.Map. Adding a seq or two here will do no good at all; seq reduces to WHNF. What we are wanting is NF, and I can see no way at all of doing that. (Other than doing something very expensive like looking up every possible key, which is far more operations than actually necessary.)

I wrote my own little BST algorithm, customised to the fact that there are always exactly 256 keys of type Word8. It was also nauseatingly slow. And then I did something: I inserted a few strictness annotations on the constructor fields. And (as you already worked out), performance increased drastically. I can do that if *I* define the data structure. But if it's in a library, I am powerless to make it more strict (or less, if that were what I wanted).

That got me thinking... What would happen if, instead of "Integer", we had two types, "evaluated Integer" and "possibly unevaluated Integer"? What if the strictness or otherwise of a data structure were exposed at the type level? The general idea here is that anything declared as having type "evaluated Integer" can never contain an unevaluated expression. It must always contain an actual integer. This of course implies that you might be able to unbox it or do other interesting things with it, but it also says something about evaluation. The principal question is "what happens if you assign an unevaluated Integer to an evaluated Integer?" Is that legal? Do you have to explicitly evaluate it first? Or does that happen implicitly?
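(A quick aside, to make the "contains" difference from a few paragraphs up concrete - a small self-contained illustration, with the functions renamed to dodge the Prelude clash. The four-clause definition pattern-matches on *both* arguments, so foldr has to walk the entire list before it can return anything; the two-clause definition never looks at its second argument once the first is True, so foldr can stop at the first match.)

  -- Strict in both arguments: every clause matches on both of them.
  strictOr :: Bool -> Bool -> Bool
  strictOr True  True  = True
  strictOr True  False = True
  strictOr False True  = True
  strictOr False False = False

  -- Lazy in the second argument: a True on the left wins immediately.
  lazyOr :: Bool -> Bool -> Bool
  lazyOr True  _ = True
  lazyOr False x = x

  containsStrict, containsLazy :: Eq a => [a] -> a -> Bool
  xs `containsStrict` x = foldr strictOr False $ map (x ==) xs
  xs `containsLazy`   x = foldr lazyOr   False $ map (x ==) xs

  main :: IO ()
  main = do
    print ([1 ..] `containsLazy` (3 :: Integer))      -- True, returns immediately
    -- print ([1 ..] `containsStrict` (3 :: Integer)) -- never returns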
Presumably "evaluated Integer" means evaluated to WHNF only - which for an integer is the same thing as NF, but for some more complex structure it might not be. If the type system worked like this, I could do "Data.Map with evaluated integer keys and possibly evaluated integer values", and it would hold its values lazily. Or I could say "Data.Map with evaluated integer keys and evaluated integer values", and the structure would now magically be strict. Which is to say, the type system would ensure that any data passed into the structure was evaluated first, and the compiler could do any applicable optimisations based on the assumption of data being already evaluated. (For primitive types, that might be unboxing or passing in registors. For ADTs, it might just mean skipping an evaluation check.) Currently, if you want a strict list, you have to implement one yourself. But is that strict in the spine, or the elements, or what? You have to implement every combination that you want. And, let us not forget, then go implement all the functions in Data.List over your new data structure. Not fun. If, on the other hand, strictness where in the type system, data structures would be *parameterised* over the level of strictness required. I have no idea what the syntax for that would look like, and backwards compatibility looks like a nightmare. You would need to answer questions like "what does it mean for a function to return an evaluated type?" and "is evaluation implicit or explicit?" But it seems like a rather interesting idea. (Let us not forget: *Everything* is trivial for the person who doesn't have to write the implementation...)

To illustrate your prediction about the side-issues: On Thursday 03 February 2011 22:10:51, Andrew Coppin wrote:
Consider for a moment the original implementation with Data.Map. Adding a seq or two here will do no good at all; seq reduces to WHNF. What we are wanting is NF, and I can see no way at all of doing that.
Check out Data.Map.insertWith'
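Something like this (only a sketch of how it would slot into your fold; insertWith' forces the combined value before it is stored, so the map holds evaluated counts rather than chains of (+) thunks):

  import qualified Data.ByteString.Lazy as BS
  import qualified Data.Map as MAP
  import Data.Word (Word8)

  histogram :: BS.ByteString -> MAP.Map Word8 Int
  histogram = BS.foldl' (\ m byte -> MAP.insertWith' (+) byte 1 m) MAP.empty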

On 03/02/2011 09:37 PM, Daniel Fischer wrote:
To illustrate your prediction about the side-issues:
On Thursday 03 February 2011 22:10:51, Andrew Coppin wrote:
Consider for a moment the original implementation with Data.Map. Adding a seq or two here will do no good at all; seq reduces to WHNF. What we are wanting is NF, and I can see no way at all of doing that.
Check out Data.Map.insertWith'
*facepalm* Wouldn't that still mean that the spine of the map is lazy, though?

On 03/02/2011 09:37 PM, Daniel Fischer wrote:
To illustrate your prediction about the side-issues:
On Thursday 03 February 2011 22:10:51, Andrew Coppin wrote:
Consider for a moment the original implementation with Data.Map. Adding a seq or two here will do no good at all; seq reduces to WHNF. What we are wanting is NF, and I can see no way at all of doing that.
Check out Data.Map.insertWith'
Random fact: Data.Map.insertWith' exists. Data.IntMap.insertWith' does *not* exist.

On Sat, Feb 5, 2011 at 4:16 PM, Andrew Coppin
Random fact: Data.Map.insertWith' exists. Data.IntMap.insertWith' does *not* exist.
The containers library is a mess. For example, Data.Map has 10+ functions (e.g. adjust) that don't have strict counterparts, even though the strict version is most likely what you want. Some functions do allow you to force the value before it's inserted into the map, just because you can piggy-back on the evaluation of a constructor, e.g.

  update :: Ord k => (a -> Maybe a) -> k -> Map k a -> Map k a

  f = update g someKey someMap
    where g = (\ v -> let v' = v + 1 in v' `seq` Just v')

since the implementation must evaluate the result of g to decide whether to remove the element or not. Johan

Random fact: Data.Map.insertWith' exists. Data.IntMap.insertWith' does *not* exist.
The containers library is a mess.
I'm inclined to agree. In particular, I get strange looks from people in the OOP community when I say I'm using a language that doesn't have any abstractions at all for dealing polymorphically with containers. In general, it's almost impossible to write a Haskell function that will work with a list, an (immutable) array, a hash table or a map, polymorphically. I guess it's the case that containers has been there so long now that changing it would break everything. Still, I find myself hungry for something better. Then again, the Prelude itself leaves several things to be desired...

On Sun, Feb 6, 2011 at 8:48 PM, Andrew Coppin
In particular, I get strange looks from people in the OOP community when I say I'm using a language that doesn't have any abstractions at all for dealing polymorphically with containers. In general, it's almost impossible to write a Haskell function that will work with a list, an (immutable) array, a hash table or a map, polymorphically.
I guess it's the case that containers has been there so long now that changing it would break everything. Still, I find myself hungry for something better.
Then again, the Prelude itself leaves several things to be desired...
I'm working on a new map data type at the moment, which is 2x faster than Data.Map for all key types I've tried (i.e. Ints, Strings, and ByteStrings). As part of that work I might try to define a map class using associated data types. Data.Map can be made an instance of that class without breaking any old code. I think one reason we haven't seen a type class for containers is that it isn't easy to create one using vanilla type classes (see Simon PJ's paper on the topic.) Johan
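PS - roughly, the general shape could be something like the sketch below. (Only an illustrative sketch, not the actual design; the class and method names are invented. The associated type lets each instance pick its own key type and constraints, which is what vanilla type classes make awkward.)

  {-# LANGUAGE TypeFamilies #-}
  import Prelude hiding (lookup)
  import qualified Data.Map as M

  -- A minimal map interface with an associated key type.
  class MapLike m where
    type Key m
    empty  :: m v
    insert :: Key m -> v -> m v -> m v
    lookup :: Key m -> m v -> Maybe v

  -- Data.Map fits without any change to the existing library.
  instance Ord k => MapLike (M.Map k) where
    type Key (M.Map k) = k
    empty  = M.empty
    insert = M.insert
    lookup = M.lookup

  main :: IO ()
  main = print (lookup (1 :: Int) (insert 1 "one" (empty :: M.Map Int String)))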

On 06.02.2011 23:29, Johan Tibell wrote:
On Sun, Feb 6, 2011 at 8:48 PM, Andrew Coppin
wrote: In particular, I get strange looks from people in the OOP community when I say I'm using a language that doesn't have any abstractions at all for dealing polymorphically with containers. In general, it's almost impossible to write a Haskell function that will work with a list, an (immutable) array, a hash table or a map, polymorphically.
I think one reason we haven't seen a type class for containers is that it isn't easy to create one using vanilla type classes (see Simon PJ's paper on the topic.)
Well, Foldable and Traversable provide a set of generic operations for containers. Although they are quite limited - the container must be polymorphic (e.g. no IntMap) and the element parameter must be allowed to be of any type (e.g. no unboxed vectors) - both are still quite useful. Also there is a container-classes package [1] which provides a set of type classes for containers.

[1] http://hackage.haskell.org/package/container-classes
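For example (an illustrative sketch), a single Foldable-based definition works unchanged for lists, Data.Map, Data.Sequence and so on - but there is no way to use it with a monomorphic container such as ByteString or IntSet:

  import Data.Foldable (Foldable, foldl')
  import qualified Data.Map as M

  -- Count the elements satisfying a predicate, for any Foldable container.
  count :: Foldable f => (a -> Bool) -> f a -> Int
  count p = foldl' (\n x -> if p x then n + 1 else n) 0

  main :: IO ()
  main = do
    print (count even [1, 2, 3, 4 :: Int])                     -- a list
    print (count even (M.fromList [(1, 10), (2, 11 :: Int)]))  -- a Map's values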

On Sun, Feb 6, 2011 at 10:12 PM, Alexey Khudyakov
Well, Foldable and Traversable provide a set of generic operations for containers. Although they are quite limited - the container must be polymorphic (e.g. no IntMap) and the element parameter must be allowed to be of any type (e.g. no unboxed vectors) - both are still quite useful.
I looked into providing instances for these but IIRC the performance was really bad. Someone needs to look at inlining/specialization for these.
Also there is a container-classes package which provides a set of type classes for containers.
I'd like to avoid MPTC and fundeps if possible. Johan

On 07.02.2011 00:37, Johan Tibell wrote:
Also there is a container-classes package which provides a set of type classes for containers.
I'd like to avoid MPTC and fundeps if possible.
I think haskell2010's type system is just not expressive enough to create a sufficiently generic interface. It's not possible to create a type class which will work for both ByteStrings (or IntSet) and lists.

I think haskell2010's type system is just not expressive enough to create a sufficiently generic interface. It's not possible to create a type class which will work for both ByteStrings (or IntSet) and lists.
It seems that most people agree: The reason why we don't have container classes is that it's difficult to define them in a completely type-safe manner. (The OOP people, of course, just don't bother trying. They use typecasts everywhere...) Do associated types solve this? Or are there still problems?

On Mon, Feb 7, 2011 at 10:01 PM, Andrew Coppin
I think haskell2010's type system is just not expressive enough to create a sufficiently generic interface. It's not possible to create a type class which will work for both ByteStrings (or IntSet) and lists.
It seems that most people agree: The reason why we don't have container classes is that it's difficult to define them in a completely type-safe manner.
(The OOP people, of course, just don't bother trying. They use typecasts everywhere...)
Do associated types solve this? Or are there still problems?
Duncan showed me a definition using associated types, which I have unfortunately forgotten. Johan

On 7 February 2011 08:12, Alexey Khudyakov
Also there is a container-classes package which provides a set of type classes for containers.
Don't use that package: it's broken. I did start on a re-write that works (and there are similar libraries around that I believe do work), but gave up because it was a) too fiddly, and b) I decided that the use case I had for them wasn't worth it (as it would just make that library more complicated and difficult to use, possibly also less efficient). If anyone wants it I can certainly send them what I've got though. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On 2/6/11 4:53 PM, Ivan Lazar Miljenovic wrote:
On 7 February 2011 08:12, Alexey Khudyakov
wrote: Also there is a container-classes package which provides a set of type classes for containers.
Don't use that package: it's broken. I did start on a re-write that works (and there are similar libraries around that I believe do work), but gave up because it was a) too fiddly, and b) I decided that the use case I had for them wasn't worth it (as it would just make that library more complicated and difficult to use, possibly also less efficient). If anyone wants it I can certainly send them what I've got though.
Any chance you could push a 0.0.0.1 version which tells people this, so that they know to avoid it? -- Live well, ~wren

On 7 February 2011 12:30, wren ng thornton
On 2/6/11 4:53 PM, Ivan Lazar Miljenovic wrote:
On 7 February 2011 08:12, Alexey Khudyakov
wrote: Also there is a container-classes package which provides a set of type classes for containers.
Don't use that package: it's broken. I did start on a re-write that works (and there are similar libraries around that I believe do work), but gave up because it was a) too fiddly, and b) I decided that the use case I had for them wasn't worth it (as it would just make that library more complicated and difficult to use, possibly also less efficient). If anyone wants it I can certainly send them what I've got though.
Any chance you could push a 0.0.0.1 version which tells people this, so that they know to avoid it?
Didn't think about that; I'll try to do so tonight. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Hi,

Thanks for bringing up these issues. It's something we could be better at addressing as a community.

First, we need to stop pretending that you can use Haskell effectively without first learning to reason about program evaluation order. Learning how is not terribly difficult, but there's very little material on how to do it [1]. Some of us have learnt it the hard way by experimentation or by talking to people who do understand lazy evaluation [2] (the Simons, Don, and Bryan to name a few). At the very least we need to teach people how to tell which arguments a pure function is strict in by looking at its definition.

Second, many of our core data structures are lazy, but most uses are strict. That keeping a simple map of counters is tricky should tell us that something is wrong (you need to use insertWith'). It wouldn't be if Data.Map was strict in the values. Many strictness related problems people have are due to common data types like Maybe, tuples, and arrays being lazy. This is rarely what you want.

1. I tried to start creating some. For example, by giving a high performance Haskell talk at CUFP. Unfortunately the talk wasn't taped.

2. Haskell is non-strict, which doesn't necessarily imply lazy evaluation. However, lazy evaluation is what we actually deal with.

Johan
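PS - as a tiny example of the kind of reasoning I mean (names invented for illustration): a function is strict in an argument when every equation forces it, and lazy in an argument that some equation never looks at.

  -- Strict in its first argument: both equations pattern-match on it,
  -- so the Maybe must be evaluated before either equation can be chosen.
  -- Lazy in its second argument: when the first is a Just, the default
  -- is never looked at.
  orElse :: Maybe Int -> Int -> Int
  orElse (Just x) _   = x
  orElse Nothing  def = def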

On 03/02/2011 10:15 PM, Johan Tibell wrote:
First, we need to stop pretending that you can use Haskell effectively without first learning to reason about program evaluation order.
Writing a program in *any* language without understanding the performance implications of different language constructs is unlikely to produce performant code. OTOH, Haskell is unusual in just how easily seemingly trivial alterations of a few characters can cause utterly *gigantic* performance differences in both time and space.
Learning how is not terribly difficult, but there's very little material on how to do it [1]. Some of us have learnt it the hard way by experimentation or by talking to people who do understand lazy evaluation [2] (the Simons, Don, and Bryan to name a few).
I think perhaps the problem is that Haskell's execution model is so utterly *implicit*. In some imperative language, if you call an expensive function, it will be expensive. In Haskell, if you call an expensive function and then never touch the result, it's cheap. Touch the result once, and you might get fusion. Touch it twice and suddenly the space complexity of the program changes. So simply adding a debug statement can utterly transform your program's runtime. What it comes down to is: tiny changes sometimes have profound effects. Best of all, there is little tool support for detecting these effects. If you change your program and it suddenly slows down, you need to go look at what you just changed. But if you write a brand new program and it's drop-dead slow, where do you start looking? (Assuming you're not writing a micro-benchmark.)
At the very least we need to teach people how to tell which arguments a pure function is strict in by looking at its definition.
That's not necessarily tractable. It depends on what other functions you call. Many functions have obvious strictness properties, but very few have *documented* strictness properties.
That keeping a simple map of counters is tricky should tell us that something is wrong.
Agreed.
Many strictness related problems people have are due to common data types like Maybe, tuples, and arrays being lazy. This is rarely what you want.
I'm not sure that's 100% true. You might have a function that returns a tuple, and on occasion you're only actually interested in one of the two results. But that looks like the exception rather than the rule. Lazy lists are rather useful, but it looks like strict lists would be useful too. The number of times I've written !String and then thought "hey, wait, that's not helping me..."

On 05/02/2011 15:35, Andrew Coppin wrote:
At the very least we need to teach people how to tell which arguments a pure function is strict in by looking at its definition.
That's not necessarily tractable. It depends on what other functions you call. Many functions have obvious strictness properties, but very few have *documented* strictness properties.
It's often struck me that this information is clearly part of the interface to a function, given that correct operation of calls to that function may depend on it, yet we (implicitly) pretend that it's not (by rarely documenting it). Would it not be both incredibly useful and possible to automatically document information of this sort using tools like haddock?

Obviously, entirely accurate indicators of laziness are not going to be computable, but it seems like we could at least get information of the type "argument never forced", "argument always forced", "argument may be forced". It seems like this would be a huge step forward - data structures like Map which are pure containers for their elements would be clearly indicated as such in their documentation. Jimbo

On 7 February 2011 10:16, Jimbo Massive
It's often struck me that this information is clearly part of the interface to a function, given that correct operation of calls to that function may depend on it, yet we (implicitly) pretend that it's not (by rarely documenting it).
Would it not be both incredibly useful and possible to automatically document information of this sort using tools like haddock?
Interesting point, but excepting that it's adding more complexity to the Haskell type system, the Clean way of putting strictness information into the type system seems preferable, don't you think?

On 07/02/2011 11:40, Stephen Tetley wrote:
Interesting point, but excepting that it's adding more complexity to the Haskell type system, the Clean way of putting strictness information into the type system seems preferable, don't you think?
If we were starting from a clean sheet (no pun intended) then yes, I would say this is unquestionably preferable. Given the amount of Haskell code out in the world, I'd expect people to argue against it, on the basis that fiddling with the types is quite a major change. (Though I would not necessarily be one of those people) Regards, Jimbo

On Mon, Feb 7, 2011 at 1:36 PM, Jimbo Massive
On 07/02/2011 11:40, Stephen Tetley wrote:
Interesting point, but excepting that it's adding more complexity to the Haskell type system, the Clean way of putting strictness information into the type system seems preferable, don't you think?
If we were starting from a clean sheet (no pun intended) then yes, I would say this is unquestionably preferable.
Given the amount of Haskell code out in the world, I'd expect people to argue against it, on the basis that fiddling with the types is quite a major change. (Though I would not necessarily be one of those people)
I dunno. As a language extension, would - let's call it BangTypes - be backwards-incompatible in any way? As far as I understand it, 'banged' types would accept only evaluated values, whereas 'unbanged' types would accept either evaluated or unevaluated ones -- exactly as it is now. So, given that none of the types in currently-existing Haskell code are banged, the effect on them of enabling the extension should be more or less nil. And I think code with banged types would still be completely interoperable with code without (at least if evaluation happened implicitly wherever required) -- passing a value of a banged type to a function expecting an unbanged one would have no special effect, while passing a value of an unbanged type to a function expecting a banged one would merely result in it being evaluated.

The potentially disruptive effect I can think of is that currently authors of data structures have full control over their strictness (for good or ill), whereas with this extension their users would be able to instantiate them to various varieties of strictness themselves. I'm not sure if the implementations of the data structures (or other external uses thereof) tend to depend on their assumed strictness, and would break were it different? If that were the case it might indeed be problematic, forcing people to distinguish between 'bang-safe' and 'bang-unsafe' code. But I don't know if this is the case (or, for that matter, whether it's even possible for it to be the case...).

One thing I'm unclear on is what precisely the meaning of banging a type would be (or what the meaning is in Clean). Would !a indicate that values of type !a must be evaluated to WHNF, that constructors of !a which occur in a's definition recursively would be strict (as if they had been declared with a bang pattern), or both? (Or something else entirely?) You would need the second property to be able to specify a spine-strict but element-lazy list (as opposed to merely a non-bottom list) as ![a]; you would need the first for it to have any effect on non-recursive types. Or would it have the first meaning for type variables, and both meanings for concrete types?
Regards, Jimbo
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Work is punishment for failing to procrastinate effectively.

On 8/02/2011, at 3:47 AM, Gábor Lehel wrote:
I dunno. As a language extension, would - let's call it BangTypes - be backwards-incompatible in any way?
Let's look at an intermediate step first, BangFunctions. What this does is to say that there are two versions of "->":

  t1 -> t2     What we have right now, which might be evaluated or not.
  !t1 -> t2    The function wants its argument evaluated.

Suppose f is a value of this type. Then a use of f is rather like a use of (\x -> x `seq` f x), and we can already write that. Now if you write

  f :: !t1 -> t2
  f p1 = e1
  ...
  f pn = en

you're making the function strict whether it would have been strict or lazy. But again, with BangPatterns we can already do that:

  f !(p1) = e1
  ...
  f !(pn) = en

The advantage of BangPatterns is that they can be precisely and selectively located. The advantages of BangFunctions include
- the forced strictness is part of the function's (published) *interface*, not its implementation
- the question of what happens if some patterns for an argument are banged and some are not does not arise (I *think* that this can avoid some mistakes)
- it's compatible with BangTypes but simpler.

So in some sense there is (now) nothing new here *except* putting the information where people can easily see it.

BangTypes could be rather more complicated. Clean 2 offers lazy, head strict spine lazy, head lazy spine strict, head and spine strict, head unboxed spine lazy, head unboxed spine strict for lists, which are all different types; it also offers strictness-polymorphic lists. I never actually made use of this because having so many kinds of list made my head spin. Roughly speaking, Clean 1 had BangFunctions, Clean 2 BangTypes. One of the things that makes me wary of BangPatterns is that it seems as though it's headed in a BangTypes kind of direction.

Oh, by the way, I got the Clean syntax wrong. ![a] means "lazy list evaluated to WHNF"; [a!] means "value is spine strict".
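For comparison, here is what we can already write today with BangPatterns (an invented example). The strictness is real, but it lives in the implementation; nothing in the type Int -> [Int] -> Int tells the caller about it, which is exactly what a published !Int -> [Int] -> Int signature would fix.

  {-# LANGUAGE BangPatterns #-}

  -- The bang forces the accumulator on every call.
  sumAcc :: Int -> [Int] -> Int
  sumAcc !acc []       = acc
  sumAcc !acc (x : xs) = sumAcc (acc + x) xs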

2011/2/7 Richard O'Keefe
On 8/02/2011, at 3:47 AM, Gábor Lehel wrote:
I dunno. As a language extension, would - let's call it BangTypes - be backwards-incompatible in any way?
Let's look at an intermediate step first, BangFunctions.
What this does is to say that there are two versions of "->":
t1 -> t2
What we have right now, which might be evaluated or not.
!t1 -> t2
The function wants its argument evaluated. Suppose f is a value of this type. Then a use of f is rather like a use of (\x -> x `seq` f x) and we can already write that.
Now if you write
f :: !t1 -> t2
f p1 = e1
...
f pn = en
you're making the function strict whether it would have been strict or lazy. But again, with BangPatterns we can already do that:
f !(p1) = e1
...
f !(pn) = en
The advantage of BangPatterns is that they can be precisely and selectively located.
The advantages of BangFunctions include - the forced strictness is part of the function's (published) *interface*, not its implementation - the question of what happens if some patterns for an argument are banged and some are not does not arise (I *think* that this can avoid some mistakes) - it's compatible with BangTypes but simpler.
So in some sense there is (now) nothing new here *except* putting the information where people can easily see it.
Perhaps the compiler could even do inference, showing bangs on those parameters in which the function is (detectably) always strict? This vaguely reminds me of type elaboration in Disciple (at least, that's where I encountered it), in that even if you manually supply a type signature for your function as Foo -> Bar without bangs, if in practice it is strict in the Foo parameter, the type is actually !Foo -> Bar. And presumably the compiler could infer that -- the strictness properties are orthogonal in a sense to the rest of the type, so underspecifying them is not an error, and the compiler could fill the additional information in automatically where available. (Again this is only if evaluation were to happen implicitly where required by the types, so that !Foo -> Bar would equivalently mean "applying this function to the argument results in the argument being evaluated" whether the bang was supplied or inferred.) Is there any sensible meaning for bangs on return types? I've been trying to think this through but not necessarily succeeding. Perhaps instead of all this inquiring and conjecturing it might be easier to just read up on how Clean does it... :-)
BangTypes could be rather more complicated. Clean 2 offers lazy, head strict spine lazy, head lazy spine strict, head and spine strict, head unboxed spine lazy, head unboxed spine strict for lists, which are all different types; it also offers strictness-polymorphic lists. I never actually made use of this because having so many kinds of list made my head spin. Roughly speaking, Clean 1 had BangFunctions, Clean 2 BangTypes.
This does seem a bit excessive. As a start, I don't remember anyone asking for control over (un)boxedness, so hopefully we could jettison that part of it? The dichotomy between head-strict (WHNF) and spine-strict seems potentially more meaningful though. I could easily imagine wanting to specify a type as being WHNF (so as to avoid accumulating thunks) while still allowing it to be spine-lazy; are there any use cases for the reverse, not-necessarily-evaluated but spine-strict-once-evaluated? If there aren't, we could use ! for head-strict, and !! for both head- and spine-strict (to borrow syntax I saw proposed for 'hyperstrictness' somewhere). Maybe this is still more complexity than desirable (ideally there would be only !...), but the problems with insufficient predictability of and control over evaluation seem very real, so if (_if_) this would go a significant way towards alleviating that problem (in particular not having to duplicate code N times for each desired combination of strictness and laziness in container classes seems like it would be a win), then maybe the extra complexity would be worth it.
One of the things that makes me wary of BangPatterns is that it seems as though it's headed in a BangTypes kind of direction.
Oh, by the way, I got the Clean syntax wrong. ![a] means "lazy list evaluated to WHNF"; [a!] means "value is spine strict".
-- Work is punishment for failing to procrastinate effectively.

Gábor Lehel
Is there any sensible meaning for bangs on return types? I've been trying to think this through but not necessarily succeeding.
Not knowing Clean in any detail, I've always just thought that a type signature of, say:

  something :: !Foo -> Bar

would mean the same as, in Haskell:

  something :: Foo -> Bar
  something foo = foo `seq` ...

In this case, there's no point to a strict return type, since it would boil down to "x `seq` x", which is just "x". But it seems that a lot of these discussions are about considering Foo and !Foo distinct types, which would mean that you can no longer, say, add a strict and a lazy integer - at least not with the current Num instance. I find this line of thought very confusing.
This does seem a bit excessive. As a start, I don't remember anyone asking for control over (un)boxedness, so hopefully we could jettison that part of it?
Uh, you mean like in IOUArrays, the UNPACK pragma, or -funbox-strict-fields? Unboxing is an important optimization, but perhaps the current feature set suffices. -k -- If I haven't seen further, it is by standing in the footprints of giants

2011/2/8 Ketil Malde
Gábor Lehel
writes: Is there any sensible meaning for bangs on return types? I've been trying to think this through but not necessarily succeeding.
Not knowing Clean in any detail, I've always just thought that a type signature of, say:
something :: !Foo -> Bar
would mean the same as, in Haskell:
something :: Foo -> Bar
something foo = foo `seq` ...
In this case, there's no point to a strict return type, since it would boil down to "x `seq` x", which is just "x".
Yeah, this is what I keep arriving at as well, I'm not sure if there's another option I might be missing...
But it seems that a lot of these discussions are about considering Foo and !Foo distinct types, which would mean that you can no longer, say, add a strict and a lazy integer - at least not with the current Num instance. I find this line of thought very confusing.
As I would ideally imagine it, again, strictness would be entirely orthogonal to the 'normal' part of the type. So you could combine Foo and !Foo completely freely as if the ! had never been there. !Foo would be a subtype of Foo, so to speak -- !Foo representing evaluated values, and Foo representing either evaluated or unevaluated values. The ! would be an instruction to the compiler/runtime, "make sure the Foo is evaluated by this point", and information for the programmer, "the Foo is certain to be evaluated by this point". Adding !s would only ever result in evaluation happening, and not ever a type error. The advantage over bang patterns would be that the time/place of evaluation could be controlled by the user rather than/in addition to the implementer of a function/type (or at least more flexibly and easily than it can be done now), it would be more visible, obvious, and certain, and perhaps type inference/elaboration could even be done. But I'm minimally well-versed in compilers and type theory, so if someone sees something fundamentally wrong with this idea, please enlighten me.
This does seem a bit excessive. As a start, I don't remember anyone asking for control over (un)boxedness, so hopefully we could jettison that part of it?
Uh, you mean like in IOUArrays, the UNPACK pragma, or -funbox-strict-fields? Unboxing is an important optimization, but perhaps the current feature set suffices.
Yeah, I meant within the current context. I don't recall hearing complaints that control over unboxing is currently insufficient or that unboxing is insufficiently predictable. (But if there have been, feel free to fill me in...)
-k -- If I haven't seen further, it is by standing in the footprints of giants
-- Work is punishment for failing to procrastinate effectively.

Necroing this thread because I just noticed there's a (rather old!)
bug report which covers much of the same ground and doesn't seem to
have been mentioned by anyone:
http://hackage.haskell.org/trac/ghc/ticket/1349
2011/2/8 Gábor Lehel
2011/2/8 Ketil Malde
: Gábor Lehel
writes: Is there any sensible meaning for bangs on return types? I've been trying to think this through but not necessarily succeeding.
Not knowing Clean in any detail, I've always just thought that a type signature of, say:
something :: !Foo -> Bar
would mean the same as, in Haskell:
something :: Foo -> Bar
something foo = foo `seq` ...
In this case, there's no point to a strict return type, since it would boil down to "x `seq` x", which is just "x".
Yeah, this is what I keep arriving at as well, I'm not sure if there's another option I might be missing...
But it seems that a lot of these discussions are about considering Foo and !Foo distinct types, which would mean that you can no longer, say, add a strict and a lazy integer - at least not with the current Num instance. I find this line of thought very confusing.
As I would ideally imagine it, again, strictness would be entirely orthogonal to the 'normal' part of the type. So you could combine Foo and !Foo completely freely as if the ! had never been there. !Foo would be a subtype of Foo, so to speak -- !Foo representing evaluated values, and Foo representing either evaluated or unevaluated values. The ! would be an instruction to the compiler/runtime, "make sure the Foo is evaluated by this point", and information for the programmer, "the Foo is certain to be evaluated by this point". Adding !s would only ever result in evaluation happening, and not ever a type error. The advantage over bang patterns would be that the time/place of evaluation could be controlled by the user rather than/in addition to the implementer of a function/type (or at least more flexibly and easily than it can be done now), it would be more visible, obvious, and certain, and perhaps type inference/elaboration could even be done.
But I'm minimally well-versed in compilers and type theory, so if someone sees something fundamentally wrong with this idea, please enlighten me.
This does seem a bit excessive. As a start, I don't remember anyone asking for control over (un)boxedness, so hopefully we could jettison that part of it?
Uh, you mean like in IOUArrays, the UNPACK pragma, or -funbox-strict-fields? Unboxing is an important optimization, but perhaps the current feature set suffices.
Yeah, I meant within the current context. I don't recall hearing complaints that control over unboxing is currently insufficient or that unboxing is insufficiently predictable. (But if there have been, feel free to fill me in...)
-k -- If I haven't seen further, it is by standing in the footprints of giants
-- Work is punishment for failing to procrastinate effectively.
-- Work is punishment for failing to procrastinate effectively.

On 17/03/2011 12:11 PM, Gábor Lehel wrote:
Necroing this thread because I just noticed there's a (rather old!) bug report which covers much of the same ground and doesn't seem to have been mentioned by anyone:
Right. So somebody else came up with an idea similar to mine, but since nobody could agree on anything more than a rough idea, nothing actually got done...(?)

On Thu, Mar 17, 2011 at 10:37 PM, Andrew Coppin
On 17/03/2011 12:11 PM, Gábor Lehel wrote:
Necroing this thread because I just noticed there's a (rather old!) bug report which covers much of the same ground and doesn't seem to have been mentioned by anyone:
Right. So somebody else came up with an idea similar to mine, but since nobody could agree on anything more than a rough idea, nothing actually got done...(?)
Well, I also got the sense that it would be more than a little nontrivial to implement. Or at least, someone was talking about how it would be a good topic for writing a thesis about if someone managed to figure how to do it. I suspect that if someone were to actually put in the effort they would be afforded the privilege of getting to nail down the details. That's just completely baseless speculation, though.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Work is punishment for failing to procrastinate effectively.

Right. So somebody else came up with an idea similar to mine, but since nobody could agree on anything more than a rough idea, nothing actually got done...(?)
Well, I also got the sense that it would be more than a little nontrivial to implement. Or at least, someone was talking about how it would be a good topic for writing a thesis about if someone managed to figure how to do it. I suspect that if someone were to actually put in the effort they would be afforded the privilege of getting to nail down the details. That's just completely baseless speculation, though.
That sounds like a good description for just about every imaginable extension to Haskell that does something interesting! :-D

On Thu, Mar 17, 2011 at 11:14 PM, Andrew Coppin
Right. So somebody else came up with an idea similar to mine, but since nobody could agree on anything more than a rough idea, nothing actually got done...(?)
Well, I also got the sense that it would be more than a little nontrivial to implement. Or at least, someone was talking about how it would be a good topic for writing a thesis about if someone managed to figure how to do it. I suspect that if someone were to actually put in the effort they would be afforded the privilege of getting to nail down the details. That's just completely baseless speculation, though.
That sounds like a good description for just about every imaginable extension to Haskell that does something interesting! :-D
Turns out people *have* already written papers related to the topic. Stefan Holdermans: Making Stricterness More Relevant http://www.holdermans.nl/pubs/assets/holdermans10making-slides.pdf http://www.holdermans.nl/pubs/assets/holdermans10making.pdf Max Bolingbroke, Simon Peyton Jones: Types are Calling Conventions http://www.cl.cam.ac.uk/~mb566/papers/tacc-hs09.pdf The latter is particularly fascinating.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Work is punishment for failing to procrastinate effectively.

On 2/8/11 6:00 AM, Ketil Malde wrote:
This does seem a bit excessive. As a start, I don't remember anyone asking for control over (un)boxedness, so hopefully we could jettison that part of it?
Uh, you mean like in IOUArrays, the UNPACK pragma, or -funbox-strict-fields? Unboxing is an important optimization, but perhaps the current feature set suffices.
As another issue to bear in mind, there is a big difference between strict types and unpointed types (either as types themselves, or as function spaces). Semantically we have lots of reasons for wanting unpointed types--- that is, types which do not have bottom as an inhabitant. And it is clear that pointed and unpointed versions are different types[1]. One of the major issues looming here is the fact that unpointed types may not form domains, which wreaks havoc for the domain theory semantics people often use for laziness. But at the same time, removing bottom can make it much easier to reason about things.

Strict types are, semantically, the same as unpointed types. And since Haskell is non-total, strict types are the only possible exact implementation of unpointed types (though decent type-checkable approximations may exist). However, operationally, strict types and unpointed types are quite different. In strict types we know that the value has been evaluated to WHNF (or some other normal form) and therefore we can avoid the computational cost of checking to ensure that it's been evaluated. Whereas unpointed types may not have been evaluated already; we simply know that when we do evaluate them we won't get bottom. Thus, unpointed types can allow us to have our laziness and... er, eat it too.

The arguments for having unpointed types (as opposed to strict types) are the same as the arguments for having laziness in the first place. Conversely, the arguments for having strict types (as opposed to unpointed types) are the same as the arguments for having strictness annotations of any kind. Both have their place, but we should be clear about what our goals are before choosing one or the other. Personally I'd like to have both, because they fill different needs, and because it's easy to automate the conversion between them[2].

[1] Though conversion between them is easy. From unpointed to pointed is just a forgetful functor; from pointed to unpointed is the monad of evaluation.

[2] Functions of strict arguments can be lifted to functions of unpointed arguments by simple wrappers to force argument evaluation. With strictness analysis, it can be possible to optimize the wrapper away and to call the strict-typed version directly. This is sort of like the SPECIALIZE pragmas. We can also lift functions of unpointed arguments into functions of pointed arguments by changing the return type of the function to allow for bottom to be returned. Note that this requires that we distinguish pointed and unpointed types, not just pointed and unpointed function spaces. This is a natural consequence of the fact that evaluation is a monad, but it makes things get really hairy really quickly. Then again, that complexity may be unavoidable since we'd like to be able to have functions return strict and unpointed values just like they can return unboxed values.

-- Live well, ~wren

On 2011-02-11 02:06, wren ng thornton wrote:
And it is clear that pointed and unpointed versions are different types[1]. ... [1] Though conversion between them is easy. From unpointed to pointed is just a forgetful functor; from pointed to unpointed is the monad of evaluation.
I'm unskilled with categories. For the monad of evaluation, don't the category's objects need to be strict types? There was an old thread in which Luke Palmer looked at an implementation of (>>=) that uses seq to evaluate the left operand. He showed that it's not a monad. It would be nice to use a language with rich monads like Haskell, but with an evaluation monad that fits together with a variety of monad transformers. I think this requires strict types. Adding them to Haskell may not be achievable.
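(Sketching the usual argument from memory, with the seq-ing bind written as a plain function rather than a Monad instance; this is my reconstruction, not Luke Palmer's exact post.)

  -- A "bind" that evaluates its left operand with seq:
  bindE :: a -> (a -> b) -> b
  bindE x f = x `seq` f x

  -- Left identity would require  bindE a f == f a  for every a and f, but:
  --   bindE undefined (\_ -> "ok")   diverges, because seq hits bottom
  --   (\_ -> "ok") undefined         is "ok"
  -- so the seq-ing bind cannot satisfy the monad laws.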

Hi, For what it's worth I saw the problems in your counting examples right away, without reading the explanatory text below. Hopefully that means that it's possible to learn how to spot such things without resorting to e.g. running the program or reading Core *gasp*. Johan

On Thursday 03 February 2011 23:19:31, Johan Tibell wrote:
Hi,
For what it's worth I saw the problems in your counting examples right away, without reading the explanatory text below.
Yes, they were pretty obvious with enough experience. For beginners I expect it to be a rather insidious trap.
Hopefully that means that it's possible to learn how to spot such things without resorting to e.g. running the program or reading Core *gasp*.
Within limits. Detrimental laziness or strictness can be arbitrarily well hidden.

Daniel Fischer wrote:
On Thursday 03 February 2011 23:19:31, Johan Tibell wrote:
Hi,
For what it's worth I saw the problems in your counting examples right away, without reading the explanatory text below.
Yes, they were pretty obvious with enough experience. For beginners I expect it to be a rather insidious trap.
Hopefully that means that it's possible to learn how to spot such things without resorting to e.g. running the program or reading Core *gasp*.
Within limits. Detrimental laziness or strictness can be arbitrarily well hidden.
I am a relative newcomer to Haskell, but I think I have a reasonable understanding of the execution model - enough to fix performance issues in simple code like the example given. However, one of the Haskell projects I work on is Ben Lippmeier's DDC compiler. That's about 50,000 lines of Haskell code, and finding performance issues there is really difficult. Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/

On Fri, Feb 4, 2011 at 12:38 AM, Erik de Castro Lopo
However, one of the Haskell projects I work on is Ben Lippmeier's DDC compiler. Thats about 50000 lines of Haskell code and finding performance issues there is really difficult.
Right. It can still be tricky. I think we can get rid of a large number of strictness issues by using strict data structures more often; this should help beginners in particular. For the rest, better tooling would help. For example, a lint tool that marked up code with the strictness information inferred by the compiler would be useful. If I had time to write one I would make the output look like HPC html reports, with one color for strict function arguments and one color for lazy function arguments. Johan

On 04/02/2011 07:30 AM, Johan Tibell wrote:
Right. It can still be tricky. I think we can get rid of a large number of strictness issues by using strict data structures more often, this should help beginners in particular. For the rest better tooling would help. For example, a lint tool that marked up code with the strictness information inferred by the compiler would be useful. I had time to write one I would make the output look like HPC html reports, with one color for strict function arguments and one color for lazy function arguments.
There's the RTS switch that makes it spit out heap profiling information. However, determining what the hell this data actually means is well beyond my powers of comprehension. I keep hoping that the mechanism used by ThreadScope will eventually allow you to compile a program with profiling, run it, and observe absolutely everything about its execution - how many cores it's using, how much RAM is allocated to each generation, etc.

Then again, if you could actually single-step through a Haskell program's execution, most strictness issues would become quite shallow. Indeed, when I first learned Haskell, the very concept of laziness ever being "detrimental" was incomprehensible to me. I couldn't imagine why you would ever want to turn it off. But then I built a simple program that single-steps through Haskell(ish) expressions, and suddenly discovered that foldl' needs to exist...
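The canonical single-step example (illustrative code, not the program I wrote back then): with plain foldl, nothing ever forces the accumulator, so you build the entire (((0+1)+2)+3)+... thunk chain before a single addition happens.

  import Data.List (foldl')

  lazySum, strictSum :: [Int] -> Int
  lazySum   = foldl  (+) 0   -- builds a million-deep chain of (+) thunks
  strictSum = foldl' (+) 0   -- forces the accumulator at each step, constant space

  main :: IO ()
  main = do
    print (strictSum [1 .. 1000000])   -- fine
    print (lazySum   [1 .. 1000000])   -- in an unoptimised build this can
                                       -- overflow the stack, and at best
                                       -- wastes a pile of memory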

On 6 February 2011 02:40, Andrew Coppin
Then again, if you could actually single-step through a Haskell program's execution, most strictness issues would become quite shallow.
You can single-step through a Haskell program's execution with the GHCi debugger. It can provide considerable insight into evaluation order. A proper stack tracer would make it even more useful. Cheers, Bernie.

On Thu, Feb 3, 2011 at 3:38 PM, Erik de Castro Lopo
I am a relative newcomer to Haskell, but I think I have a reasonable understanding of the execution model. Enough to fix performance issues in simple code like the example given.
However, one of the Haskell projects I work on is Ben Lippmeier's DDC compiler. Thats about 50000 lines of Haskell code and finding performance issues there is really difficult.
Lately I've been trying to go the other direction: make a large section of formerly strict code lazy. I fully agree that once code size gets big these problems get a lot harder. You have to be very careful when passing around state that you don't do anything that causes too much to be evaluated at the wrong time. E.g., you can put back the final result of a mapAccumL into a StateT, but you want to make sure it doesn't get looked at until the output of the mapAccumL would be evaluated. Even something seemingly innocuous like bundling it into a newtype will cause the mapAccumL to run over the entire list and evaluate a bunch of stuff too early, and probably wind up with a bunch of lag. And the only way I can think of to find out what order these things are running is to throw an infinite list at them and see if they hang, or put in a bunch of unsafePerformIO logging... not very satisfying in either case, especially when trying to inspect things can change the evaluation order.

Sometimes I wonder what this would look like if I were generating incremental output with python yields... while getting the data dependencies right is essential in any case, I'm suspicious it would be easier to think about and understand in a strict language. But there's definitely a knack to be learned, and I think I might eventually get better at it. For example, I realized that the criteria to make something non-strict wrt data dependency are the same as trying to parallelize. Sometimes it's easier to think "what do I have to do to make these two processes run in parallel", and that's the same thing I have to do to make them interleave with each other lazily.

Lately I've been trying to go the other direction: make a large section of formerly strict code lazy.
There used to be a couple of tools trying to make suggestions when a function could be made less strict (Olaf Chitil's StrictCheck and another whose name escapes my memory at the moment). Often, it comes down to some form of eta-expansion - making information available earlier [(\x->f x) tells us we have a function without having to evaluate f, (\p->(fst p,snd p)) marks a pair without needing to evaluate p, and so on].
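(A tiny, contrived demonstration of that point - nothing here comes from a real tool, it just shows that the eta-expanded forms are defined even when the original values are not:

import Control.Exception (evaluate)

main :: IO ()
main = do
  let f = undefined :: Int -> Int
      p = undefined :: (Int, Int)
  _ <- evaluate (\x -> f x)      -- fine: a lambda is already a value
  _ <- evaluate (fst p, snd p)   -- fine: the pair cell exists, its components are thunks
  putStrLn "both eta-expanded forms are in WHNF without touching f or p"
  -- evaluate f or evaluate p would throw, since f and p themselves are undefined

The eta-expanded versions commit to "this is a function" / "this is a pair" up front, which is precisely the extra information being made available earlier.)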
I fully agree that once code size gets big these problems get a lot harder. You have to be very careful, when passing state around, that you don't do anything which causes too much to be evaluated at the wrong time.
Generally, the trick is to develop an intuition early, before growing the program in size;-) However, as long as you can run your big code with small data sets, you might want to try GHood, maintained on Hackage thanks to Hugo Pacheco: http://hackage.haskell.org/package/GHood http://community.haskell.org/~claus/GHood/ (currently unavailable:-( The latter URL has a few examples and a paper describing how GHood can be useful for observing relative strictness, i.e., when data at one probe is forced relative to when it is forced at another. (It seems that citeseerx has a copy of the paper, at least: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.132.1397 ) For instance, if you put one probe on the output and one on the input, you can see which parts of the input are forced to produce which parts of the output. As I said, small data sets help, but since you put in probes manually and selectively, large code size should not interfere with observations (though it might hinder intuition:-).
But there's definitely a knack to be learned, and I think I might eventually get better at it. For example, I realized that the criteria to make something non-strict wrt data dependency are the same as trying to parallelize. Sometimes it's easier to think "what do I have to do to make these two processes run in parallel" and that's the same thing I have to do to make them interleave with each other lazily.
Sometimes, I think of non-strict evaluation as "spark a thread for everything, then process the threads with a single processor".. Claus

On 2/5/11 4:26 AM, Claus Reinke wrote:
Lately I've been trying to go the other direction: make a large section of formerly strict code lazy.
There used to be a couple of tools trying to make suggestions when a function could be made less strict (Olaf Chitil's StrictCheck and another whose name escapes my memory at the moment).
Perhaps you're thinking of the chasing bottoms[1] library? [1] http://hackage.haskell.org/package/ChasingBottoms -- Live well, ~wren

For what it's worth I saw the problems in your counting examples right away, without reading the explanatory text below.
Yes, they were pretty obvious with enough experience. For beginners I expect it to be a rather insidious trap.
Beginners or anybody coding Haskell while not completely awake. ;-)
Hopefully that means that it's possible to learn how to spot such things without resorting to e.g. running the program or reading Core *gasp*.
Within limits. Detrimental laziness or strictness can be arbitrarily well hidden.
I agree. Unfortunately... :-/

On 4/02/2011, at 10:10 AM, Andrew Coppin wrote:
The important observation is this: One tiny, almost insignificant change can transform a program from taking 50 seconds and 400 MB of RAM into one that takes 0.02 seconds and 0.1 MB of RAM. And, at least in this case, the simpler version is the slow one.
To say that Haskell is "slow" is both a little bit vague, and not really backed up by facts. In about 5 minutes flat, I managed to write a Haskell program that's very simple, and yet faster than a comparably simple C++ program (and C++ is supposed to be "fast"). So it's not that Haskell is "slow". It's that Haskell is *tricky*. Tiny, tiny little changes that look innocuous can have vast effects on performance. And this is a nice little example of that effect.
This seems to me to be the heart of the message, so maybe this reply is on-topic. Back in the days when systems other than Wintel and maybe sort of intel Linux were supported by Clean, I used to really love one of the features of the Clean compiler. One simple command line switch and the compiler would list the names of all your top level functions together with their types, and the types included strictness. (Also uniqueness, not relevant to Haskell.) The GHC documentation says the information is in the interface files, but they are binary now, and I can't find it there.
That got me thinking... What would happen if, instead of "Integer", we had two types, "evaluated Integer" and "possibly unevaluated Integer"? What if the strictness or otherwise of a data structure were exposed at the type level?
Oh, you mean like "!Int" and "Int" in Clean? I used to find bang *types* rather easier to deal with than I now do bang *patterns*.
Currently, if you want a strict list, you have to implement one yourself. But is that strict in the spine, or the elements, or what?
Spine strict: ![t]. Spine and element strict: ![!t].
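(For comparison, a sketch of those two flavours in today's Haskell, using strictness annotations on constructor fields - the type names are invented:

data SpineStrictList a = SNil | SCons a !(SpineStrictList a)   -- spine strict, like ![t]
data StrictList a = Nil | Cons !a !(StrictList a)              -- spine and element strict, like ![!t]

)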
I have no idea what the syntax for that would look like,
Clean?

On Thu, Feb 3, 2011 at 11:40 PM, Richard O'Keefe
Back in the days when systems other than Wintel and maybe sort of intel Linux were supported by Clean, I used to really love one of the features of the Clean compiler. One simple command line switch and the compiler would list the names of all your top level functions together with their types, and the types included strictness. (Also uniqueness, not relevant to Haskell.)
The GHC documentation says the information is in the interface files, but they are binary now, and I can't find it there.
ghc --show-iface HI_FILE

The strictness signatures are a bit hard to parse though. Having a cheat sheet would be nice.

Johan

On 4/02/2011, at 8:26 PM, Johan Tibell wrote:
--show-iface HI_FILE
sort.hs has 50 top level functions. One of them is main, and the others are all reachable from it.

ghc -O sort.hs
ghc --show-iface sort.hi

The only functions listed are variants of main. Dropping -O leaves one variant of main only; the other 49 functions have disappeared.

It is, by the way, something of a nuisance that if you do

ghc sort.hs
...
ghc -O sort.hs

it insists "compilation NOT required" despite the fact that the previous compilation was with materially *different* options. (GHC 6.12.3)

Given this clue I was able to get the information I wanted by exporting all the functions. This is the wrong interface. If a programmer is concerned that strictness analysis might not be finding something they thought was strict, they would ideally like to have *all* the functions reported, but at a minimum, all of the top level functions.
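(Presumably the workaround amounts to dropping the export list - e.g. changing

module Main (main) where

to

module Main where

so that every top-level binding is exported and therefore described, strictness signature and all, in the .hi file.)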

Johan Tibell
The GHC documentation says the information is in the interface files, but they are binary now, and I can't find it there.
ghc --show-iface HI_FILE
The strictness signatures are a bit hard to parse though. Having a cheat sheet would be nice.
Am I the only one who keeps thinking it'd be great to have this incorporated into my Emacs haskell-mode? Along with a nice interface to Core, please. And a pony. -k -- If I haven't seen further, it is by standing in the footprints of giants

That got me thinking... What would happen if, instead of "Integer", we had two types, "evaluated Integer" and "possibly unevaluated Integer"? What if the strictness or otherwise of a data structure were exposed at the type level?
Oh, you mean like "!Int" and "Int" in Clean? I used to find bang *types* rather easier to deal with than I now do bang *patterns*.
I have no idea what the syntax for that would look like,
Clean?
I didn't think Clean supported laziness at all? I thought it was a strict language.

On 5 February 2011 16:21, Andrew Coppin
I didn't think Clean supported laziness at all? I thought it was a strict language.
"CLEAN is a practical applicable general-purpose lazy pure functional programming language suited for the development of real world applications." [1] Haskell en Clean are very much alike. You can even compile Haskell 98 code with the latest (experimental) Clean compiler and having it interact with Clean code and vice-versa [2]. The main difference is Clean's use of uniqueness typing. 1 - http://clean.cs.ru.nl/download/Clean20/doc/CleanLangRep.2.1.pdf 2 - http://www.cs.ru.nl/~thomas/publications/groj10-exchanging-sources-between.p...

On 06/02/2011 09:13 AM, Roel van Dijk wrote:
Haskell and Clean are very much alike.
From what I could determine from a basic Clean introduction, Clean is very *unlike* Haskell, having a far more verbose and low-level syntax. (E.g., the compiler can't even determine whether a binding is recursive or not for itself. You have to say that manually.) It seems a very unnecessarily complicated and messy language - which makes the name rather ironic.
You can even compile Haskell 98 code with the latest (experimental) Clean compiler and having it interact with Clean code and vice-versa [2]. The main difference is Clean's use of uniqueness typing.
As I say, I thought the main difference is that Clean is strict (which is why you can get good performance). Uniqueness typing is an interesting idea that looks like it might be useful for more than mere I/O.

On 6 February 2011 19:41, Andrew Coppin
compiler can't even determine whether a binding is recursive or not for itself. You have to say that manually.) It seems a very unnecessarily complicated and messy language - which makes the name rather ironic.
Erm - nope. Sure you haven't mixed up OCaml and Clean to get OCClaeman? Clean is a very clean language. The only things I got tripped up on were uniqueness and strictness annotations (thanks to Richard O'Keefe, above in the thread, for useful clarification).

On 7/02/2011, at 8:41 AM, Andrew Coppin wrote:
On 06/02/2011 09:13 AM, Roel van Dijk wrote:
Haskell and Clean are very much alike.
From what I could determine from a basic Clean introduction, Clean is very *unlike* Haskell, having a far more verbose and low-level syntax. (E.g., the compiler can't even determine whether a binding is recursive or not for itself. You have to say that manually.)
I have no idea what you are talking about here. Clean is _very_ Haskell-like, including typeclasses. Here's the first few lines of code from a Clean file I wrote in 1998.

// This is a 'data' declaration.
:: ArrTree a = ArrEmpty | ArrLeaf a | ArrNode a (ArrTree a) (ArrTree a)

// The parentheses were not necessary
empty :: (ArrTree a)
empty = ArrEmpty

asize :: (ArrTree a) -> Int
asize (ArrEmpty) = 0
asize (ArrLeaf _) = 1
asize (ArrNode _ l r) = 1 + asize l + asize r

// In Haskell it would be Int -> (ArrTree a) -> Bool.
// Leaving the first arrow out means that both arguments
// must be present in each rule.
// 'if' is a function.
known :: Int (ArrTree a) -> Bool
known i ArrEmpty = False
known i (ArrLeaf _) = i == 1
known i (ArrNode x l r) = i == 1 || known (i/2) (if (i mod 2 == 0) l r)

fetch :: Int (ArrTree a) -> a
fetch i (ArrLeaf x)
  | i == 1 = x
fetch i (ArrNode x l r)
  | i == 1 = x
  | i mod 2 == 0 = fetch (i/2) l
  | otherwise = fetch (i/2) r

As for the compiler being unable to determine whether a binding is recursive, I cannot find any such restriction in the Clean 2.1.1 manual and don't remember one in Clean 1. Here's an example straight out of the manual:

primes :: [Int]
primes = sieve [2..]
where
  sieve :: [Int] -> [Int]
  sieve [pr:r] = [pr:sieve (filter pr r)]

  filter :: Int [Int] -> [Int]
  filter pr [n:r]
    | n mod pr == 0 = filter pr r
    | otherwise = [n:filter pr r]

Clean uses [head : tail] where Haskell uses (head : tail). sieve and filter are both recursive (local) bindings, and the compiler manages just FINE.
It seems a very unnecessarily complicated and messy language - which makes the name rather ironic.
It would be if true. There _are_ complexities in Clean, just as there are in Haskell. For the most part, they are the same complexities (laziness, type classes, type inference, generic programming).
As I say, I thought the main difference is that Clean is strict
Wrong.
(which is why you can get good performance). Uniqueness typing is an interesting idea, that looks like it might be useful for more than mere I/O.
It has been much used for arrays and records...

Haskell and Clean are very much alike.
From what I could determine from a basic Clean introduction, Clean is very *unlike* Haskell, having a far more verbose and low-level syntax. (E.g., the compiler can't even determine whether a binding is recursive or not for itself. You have to say that manually.)
I have no idea what you are talking about here. Clean is _very_ Haskell-like, including typeclasses.
Here's the first few lines of code from a Clean file I wrote in 1998.
I clearly have my languages mixed up. The language I'm thinking of required all variables (even top-level ones) to be declared with "let" - unless the definition is recursive, in which case you have to say "letrec" (i.e., the compiler is too stupid to deduce this automatically). Apparently that isn't Clean...

On 7 February 2011 22:00, Andrew Coppin
I clearly have my languages mixed up.
The language I'm thinking of required all variables (even top-level ones) to be declared with "let" - unless the definition is recursive, in which case you have to say "letrec" (i.e., the compiler is too stupid to deduce this automatically). Apparently that isn't Clean...
You are not necessarily wrong. Clean, like Haskell, is a moving target. To quote the paper "Exchanging Sources Between Clean and Haskell" [1]: The year of 1987 was a founding one for two pure, lazy, and strongly typed functional programming languages. Clean (Brus et al., 1987) was presented to the public for the first time and the first steps towards a common functional language, later named Haskell, were taken (Hudak et al., 2007). Clean was conceived at the Radboud University Nijmegen as a core language that is directly based on the computational model of functional term graph rewriting to generate efficient code. It also serves as an intermediate language for the compilation of other functional languages (Koopman and Nöcker, 1988; Plasmeijer and van Eekelen, 1993). For these reasons, it deliberately used a sparse syntax (van Eekelen et al., 1990): “... at some points one can clearly recognize that [..] Clean is a compromise between a functional programming language and an intermediate language used to produce efficient code. For instance, a minimal amount of syntactic sugar is added in [..] Clean.”. Later, the core language was sugared. The Clean of 1987—1994 sounds a lot like the language you are talking about. 1 - http://www.cs.ru.nl/~thomas/publications/groj10-exchanging-sources-between.p...

On 8/02/2011, at 10:43 AM, Roel van Dijk wrote:
On 7 February 2011 22:00, Andrew Coppin
wrote: I clearly have my languages mixed up.
The language I'm thinking of required all variables (even top-level ones) to be declared with "let" - unless the definition is recursive, in which case you have to say "letrec" (i.e., the compiler is too stupid to deduce this automatically). Apparently that isn't Clean...
You are not necessarily wrong. The Clean of 1987—1994 sounds a lot like the language you are talking about.
1 - http://www.cs.ru.nl/~thomas/publications/groj10-exchanging-sources-between.p...
No, it doesn't. Here's an example from "Clean - A Language for Functional Graph Rewriting", T.H. Brus, M.C.J.D. van Eekelen, M.O. van Leer, M.J. Plasmeijer, a 1987 paper which I _think_ is the first one about Clean:

Start stdin -> Double (Add (Succ Zero) Zero);
Double a -> Add a a;
Add Zero n -> n |
Add (Succ m) n -> Succ (Add m n);

You will notice
- an entire absence of 'let'
- an entire absence of any 'letrec'

You'll also discover that Clean was originally
- thought of as an intermediate language
- a subset of something called LEAN

Clean 1 adopted Haskell-like syntax, but it was lazy from the beginning.

On 8/02/2011, at 10:00 AM, Andrew Coppin wrote:
I clearly have my languages mixed up.
The language I'm thinking of required all variables (even top-level ones) to be declared with "let" - unless the definition is recursive, in which case you have to say "letrec" (i.e., the compiler is too stupid to deduce this automatically). Apparently that isn't Clean...
That sounds like an ML-family language, possibly OCAML. However, this is *not* a question of stupidity. It's a question of scope rules.

Example 1:
let f x = 1;;
let f x = if x = 0 then 0 else f x;;
f 3;;
This answers 1.

Example 2:
let f x = 1;;
let rec f x = if x = 0 then 0 else f x;;
f 3;;
This goes into an infinite loop.

If you don't like redeclaration, which is rather useful in an interactive top level, try nested:
let f x = 1;;
let g y = let f x = if x = 0 then 0 else f x in f (f y);;
vs
let f x = 1;;
let g y = let rec f x = if x = 0 then 0 else f x in f (f y);;

The distinction between let and letrec predates OCAML. Scheme does it too.

The distinction between let and letrec predates OCAML. Scheme does it too.
Haskell's choice of recursive-everywhere bindings is nice for the syntax, but occasionally error prone. I don't actually use explicit recursion too often, because there are functions for that, but I still occasionally typo a variable name and make a circular reference, say by naming a function's output the same as one of its inputs. If they're the same type then you get that least friendly of crashes: an error-less hang with no indication of where it is. It means you sometimes can't reuse variable names when it would be better to shadow the old one, and sometimes wind up with tricks like 'x <- return $ f x'. So there's a case to be made for letrec too.
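(A made-up example of the trap: reusing the argument's name on the left of a let makes the right-hand side refer to the new binding rather than the argument.

bump :: Int -> Int
bump x = let x = x + 1 in x   -- loops: GHC may report <<loop>>, or just sit there

bump' :: Int -> Int
bump' x = let y = x + 1 in y  -- what was actually meant

GHC's name-shadowing warning (part of -Wall) should flag this particular spelling, though it can't help with the subtler too-early/too-late evaluation problems described above.)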

On 6/02/2011, at 4:21 AM, Andrew Coppin wrote:
I didn't think Clean supported laziness at all? I thought it was a strict language.
Absolutely not. Back in the Good Old Days the one thing that used to keep on tripping me up is that

Haskell: if e0 then e1 else e2   <=>   Clean: if e0 e1 e2

Now, in what kind of language can 'if' be a perfectly ordinary function?
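(The answer, of course, is a lazy one. In Haskell you can define it yourself - if' here is just an illustrative name:

if' :: Bool -> a -> a -> a
if' True  t _ = t
if' False _ e = e

main :: IO ()
main = putStrLn (if' (2 > 1) "yes" (error "never evaluated"))

Under lazy evaluation the untaken branch is simply never demanded, so nothing special is needed to make 'if' an ordinary function.)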

On Thu, Feb 3, 2011 at 11:40 PM, Richard O'Keefe
On 4/02/2011, at 10:10 AM, Andrew Coppin wrote:
The important observation is this: One tiny, almost insignificant change can transform a program from taking 50 seconds and 400 MB of RAM into one that takes 0.02 seconds and 0.1 MB of RAM. And, at least in this case, the simpler version is the slow one.
To say that Haskell is "slow" is both a little bit vague, and not really backed up by facts. In about 5 minutes flat, I managed to write a Haskell program that's very simple, and yet faster than a comparably simple C++ program (and C++ is supposed to be "fast"). So it's not that Haskell is "slow". It's that Haskell is *tricky*. Tiny, tiny little changes that look innocuous can have vast effects on performance. And this is a nice little example of that effect.
This seems to me to be the heart of the message, so maybe this reply is on-topic.
Back in the days when systems other than Wintel and maybe sort of intel Linux were supported by Clean, I used to really love one of the features of the Clean compiler. One simple command line switch and the compiler would list the names of all your top level functions together with their types, and the types included strictness. (Also uniqueness, not relevant to Haskell.)
The GHC documentation says the information is in the interface files, but they are binary now, and I can't find it there.
That got me thinking... What would happen if, instead of "Integer", we had two types, "evaluated Integer" and "possibly unevaluated Integer"? What if the strictness or otherwise of a data structure were exposed at the type level?
Oh, you mean like "!Int" and "Int" in Clean? I used to find bang *types* rather easier to deal with than I now do bang *patterns*.
Currently, if you want a strict list, you have to implement one yourself. But is that strict in the spine, or the elements, or what?
Spine strict: ![t]. Spine and element strict: ![!t].
This does seem like a very appealing idea (to me), especially if evaluation were to happen implicitly wherever the types require it. Are there any major obstacles or philosophical objections to it other than available time and manpower?
I have no idea what the syntax for that would look like,
Clean?
-- Work is punishment for failing to procrastinate effectively.

On 02/03/2011 03:10 PM, Andrew Coppin wrote:
(Unless you're seriously going to suggest that GHC's native code generator is any match for the might of a half-decent C compiler...)
I don't know enough about the native code generator to make a claim like that, but we're not comparing the native code generator against a C compiler; we're comparing it against a C code generator whose output is fed through a C compiler. These are very different things, and I think it gives the native code generator an edge. My own observations from immediately before the C backend was deprecated were that the native code generator averaged slightly better-performing code. - Jake
participants (18)
- Alexey Khudyakov
- Andrew Coppin
- Bernie Pope
- Claus Reinke
- Daniel Fischer
- Erik de Castro Lopo
- Evan Laforge
- Gábor Lehel
- Ivan Lazar Miljenovic
- Jake McArthur
- Jimbo Massive
- Johan Tibell
- Ketil Malde
- Richard O'Keefe
- Roel van Dijk
- Scott Turner
- Stephen Tetley
- wren ng thornton