Re: [Haskell-cafe] Batteries included (Was: GHC is a monopoly compiler)

Can someone please define what exactly a "batteries included" standard library is? IMHO that Python-Haskell comparison is unfair. Although both claim to be general-purpose languages, the focus in Haskell certainly has been on language research for most of its life. I recently hacked together a web client in Python, my first project in that language. Documentation is excellent. Yet I am still horrified I had to use a language that provides so few static guarantees to control megawatt machines.
What puts me off Haskell nowadays is the direct result of Haskell's roots in language research: Often when I come across a package that does what I need, it uses the conduit, lens or another idiom, which are like a language in a language to learn. In milder ways Python seems to suffer the same problem. So please, developers: Write more batteries, but make them expose a neat lambda calculus interface if possible that can be combined freely with other batteries.
Regards, Olaf

Hi Olaf!
I believe I was the one to recently bring up the subject of "batteries included". I think Wikipedia has a good treatment of it - https://en.wikipedia.org/wiki/Python_(programming_language)#Libraries
Python has a large standard library, commonly cited as one of Python's greatest strengths,[77] providing tools suited to many tasks. This is deliberate and has been described as a "batteries included"[29] Python philosophy. For Internet-facing applications, many standard formats and protocols (such as MIME and HTTP) are supported. Modules for creating graphical user interfaces, connecting to relational databases, pseudorandom number generators, arithmetic with arbitrary precision decimals,[78] manipulating regular expressions, and doing unit testing are also included.
Happily, there are efforts underway to fix this problem for Haskell.
In particular, I think the team of people working on the new
foundation library has a very good chance of delivering on the promise
of having batteries included for Haskell. I implore the community to
give foundation a try and pitch in to make it happen. Want to make
sure we don't screw up new-base? Help guide foundation's development.
Here is the link:
https://github.com/haskell-foundation/foundation/
Why is the Python <-> Haskell comparison unfair? We need to be
realistic in our self-analysis, and comparing to a very successful
language can indicate what we need to do to make Haskell even more
popular.
Fairness does not matter. The world is not fair; it is made up of
people who have a wide range of opinions and attitudes. Many in the
community would love to share Haskell with more people. Why? Because
it is such a valuable thing to learn, and one of the best general
purpose programming languages out there.
Why identify Haskell as a research language, when it has so clearly
grown far beyond that vision? Even if it is considered a research
language, isn't it good for your research to reach the widest audience
possible? This means that the fantastic ideas that make up Haskell /
GHC can reach a wider audience. Some members of that audience are
going to be designing future languages. Do we really want these
future language designers to do that without learning Haskell? I
would prefer that they learn it.
I have recently had a discussion, which I find quite disturbing, where
prominent community members are essentially saying that they are
content with (avoiding success (at all costs)), rather than (avoiding
(success at all costs)). Take a look:
https://www.reddit.com/r/haskell/comments/54gm70/haskell_respect_spj/d83otar
Frankly, these statements frighten and confuse me. It seems blatantly
unreasonable to me, to make the statement that we should be apathetic
towards marketing and lowering the barrier to entry for Haskell.
Sincerely,
Michael
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

There is nothing of merit in Python libraries to be learned. Please don't ruin Haskell to the point of Python.

On 28.09.2016 at 08:29, Tony Morris wrote:
> There is nothing of merit in Python libraries to be learned.
That's almost true, but not 100%. E.g. does Haskell have doctests? I.e. you can write example code in the API-level docs, and there is tooling that can extract them, run them, and report whether the examples still work. There's also the old motto of "nothing is completely useless, it can still serve as a bad example".
> Please don't ruin Haskell to the point of Python.
The Python stdlib is a collection of things people needed. So if you want a list of batteries that Haskell might be missing, Python's stdlib is a good shopping list. You don't want to copy the library API structure, but that danger is negligible. Python is even more imperative than C++ or Java, it's dynamically typed, and with these concept differences, what's a good library design in Python that leverages all the things that Python is good at is almost automatically neither desirable nor even possible in Haskell.

On 28/09/16 17:06, Joachim Durchholz wrote:
> On 28.09.2016 at 08:29, Tony Morris wrote:
>> There is nothing of merit in Python libraries to be learned.
> That's almost true, but not 100%. E.g. does Haskell have doctests? I.e. you can write example code in the API-level docs, and there is tooling that can extract them, run them, and report whether the examples still work.
I've been doing that for years, with the exception that doing so in Haskell is far superior to doing so in Python, for reasons too long to list. https://hackage.haskell.org/package/doctest
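For readers who have not seen the package linked above, here is a minimal sketch of what a doctest can look like in Haskell. It assumes the `>>>` example convention that the doctest package reads from Haddock comments; the function `doubleAll` is made up purely for illustration.

```haskell
module Main (main, doubleAll) where

-- | Double every element of a list.
--
-- Lines starting with @>>>@ are examples. The line after each example is
-- the output the tool expects to see when it evaluates the expression.
--
-- >>> doubleAll [1, 2, 3]
-- [2,4,6]
--
-- >>> doubleAll []
-- []
doubleAll :: [Int] -> [Int]
doubleAll = map (* 2)

-- A plain entry point so the file also compiles and runs on its own.
main :: IO ()
main = print (doubleAll [1, 2, 3])
```

Pointing the doctest tool at this file should evaluate both examples and report a failure if the printed result ever stops matching the comment, which is the "report whether the examples still work" part of the question.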
> There's also the old motto of "nothing is completely useless, it can still serve as a bad example".
>> Please don't ruin Haskell to the point of Python.
> The Python stdlib is a collection of things people needed. So if you want a list of batteries that Haskell might be missing, Python's stdlib is a good shopping list.
Where is the useful bit? I have only heard of it, never actually seen it.
> You don't want to copy the library API structure, but that danger is negligible. Python is even more imperative than C++ or Java, it's dynamically typed, and with these concept differences, what's a good library design in Python that leverages all the things that Python is good at is almost automatically neither desirable nor even possible in Haskell.

What are the advantages of batteries included?
1) It's easier for newcomers to find the appropriate library for common tasks.
2) Consistency between different programs, as they will mostly use the same standard libraries for common tasks.
Do these require the batteries to be installed with the compiler? Do they require the batteries to be installed as a monolithic unit at all? An "official" wiki page pointing to the recommended packages for common tasks would seem to achieve the purpose almost as well, but without the accompanying problems.
As for sets of specific library versions that are known to work well together, Stackage has already solved that.

On 28.09.2016 at 09:09, Tony Morris wrote:
> On 28/09/16 17:06, Joachim Durchholz wrote:
>> On 28.09.2016 at 08:29, Tony Morris wrote:
>>> There is nothing of merit in Python libraries to be learned.
>> That's almost true, but not 100%. E.g. does Haskell have doctests? I.e. you can write example code in the API-level docs, and there is tooling that can extract them, run them, and report whether the examples still work.
> I've been doing that for years, with the exception that doing so in Haskell is far superior to doing so in Python, for reasons too long to list.
Ah, sweet.
> for reasons too long to list.
Can somebody with similarly long working experience in Haskell doctests provide such a list? It would be helpful in more than one way: it would help advocate Haskell, and to the kind of audience that is interested in high quality, so it's a double win; and it would help other language communities improve their doctest ecosystems, which I think is what most multi-language people would very much like to happen.
Regards, Jo

m% cabal install doctest
cabal: /usr/bin/ar: permission denied
m% file /usr/bin/ar; ls -l /usr/bin/ar
/usr/bin/ar: Mach-O 64-bit executable x86_64
-rwxr-xr-x 1 root wheel 18160 14 Jan 2016 /usr/bin/ar
What am I doing wrong?

On Wed, Sep 28, 2016 at 6:14 PM, Richard A. O'Keefe wrote:
> m% cabal install doctest
> cabal: /usr/bin/ar: permission denied
https://github.com/haskell/cabal/issues/2653#issuecomment-121997407
You have a cabal-install built against an older version of the unix package, which was doing an access() call that runs afoul of System Integrity Protection.
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On Tue, Sep 27, 2016 at 5:25 PM, Olaf Klinke wrote:
> Can someone please define what exactly a "batteries included" standard library is? IMHO that Python-Haskell comparison is unfair. Although both claim to be general-purpose languages, the focus in Haskell certainly has been on language research for most of its life.
It means that useful libraries that most projects use come with it: in a Haskell context, that would be things like text, unordered-containers, and split. It's also a bikeshed, though: for example, H-P (the Haskell Platform) insisted on adding OpenGL because it was felt to be important to include some kind of GUI, but the Haskell GUI story is so fragmented and messed up that it doesn't seem really worth it --- and OpenGL basically got the nod solely on being readily portable to Linux/Windows/macOS without having to hunt down extra native libraries, not because it was actually being used by itself. This led to endless discussion on the libraries list.
The complications are:
- libraries change quickly and dependents tend to start requiring the new versions just as quickly, rendering the batteries included obsolete almost immediately;
- in the opposite direction, some libraries that come with the compiler because it uses them (notably, containers) are effectively frozen because ghc-api, TH, etc. will break.
Additionally, and causing the above complications to often be fatal, ghc's own library story --- specifically cross-module inlining, which is essential for performance --- results in things that would sensibly be hidden internal details, and therefore safe from ABI breakage, often leaking out into the .hi file, making the internals part of the public ABI. This is really the primary cause of ghc's library nightmares, and why other languages usually don't have these kinds of problems (but older C and C++ code sometimes did, by exposing internals in header file macros, again for speed; original mh (not nmh) and KDE 1/2 (but not 3.x or later) are examples). But you can't really generate code from Haskell with any reasonable performance unless you resort to either cross-module inlining (ghc) or whole-program compilation (jhc). :(
Tools like cabal-install and stack have to do all sorts of otherwise "nonsensical extra work" to try to avoid running headlong into this.
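To make the inlining point a little more concrete, here is a minimal sketch; it is not taken from any real library, and `sumSquares` is a made-up name. It assumes GHC's `INLINABLE` pragma, which asks the compiler to keep a function's definition available in the interface file so that importing modules can inline or specialise it. Everything sits in one module here only so that the file runs on its own; the effect described above shows up when the function lives in a separate library module.

```haskell
module Main (main) where

-- Imagine this function living in a library module. With INLINABLE, GHC
-- records its unfolding (its definition) in that module's .hi file, so a
-- client compiled against the library may copy the body into its own code.
-- The consequence is that changing the body, even with an unchanged type,
-- changes what clients were compiled against.
{-# INLINABLE sumSquares #-}
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0

main :: IO ()
main = print (sumSquares [1, 2, 3 :: Int])
```

If the function were in its own module, something like `ghc --show-iface` on the resulting .hi file should show the recorded unfolding, which is the "internal detail leaking into the interface" described above.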
-- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On 2016-09-28 00:32, Brandon Allbery wrote:
>> Can someone please define what exactly a "batteries included" standard library is?
> The complications are:
> - libraries change quickly and dependents tend to start requiring the new versions just as quickly, rendering the batteries included obsolete almost immediately;
> - in the opposite direction, some libraries that come with the compiler because it uses them (notably, containers) are effectively frozen because ghc-api or TH or etc. will break.
This makes me think the "batteries included" metaphor is the wrong line of thinking for Haskell. Our environment is more like "direct access to the power lines", thanks to hackage and cabal. The potential downside is that "powerlines at your fingertips" can lead to cases of "plug and fry (your brain)". But then why should the environment be dumbed down to low-voltage batteries when the high-voltage-network is part of what got it to where it is?
MarLinn

> why should the environment be dumbed down to low-voltage batteries
Well, "Hello world" should be simple even for someone who has never heard of Haskell before. So there is no harm in providing the batteries: open the box, skip the instructions, turn it on. On the other hand, it is reasonable to expect to be able to switch between the batteries and the mains, or even to replace the power supply unit. I don't see why all these options should not be available, or why they would conflict.

On 28.09.2016 at 07:47, MarLinn via Haskell-Cafe wrote:
> The potential downside is that "powerlines at your fingertips" can lead to cases of "plug and fry (your brain)". But then why should the environment be dumbed down to low-voltage batteries when the high-voltage-network is part of what got it to where it is?
You already said it: Because people will fry their brain. "Powerlines at your fingertips" doesn't apply to Haskell; that's C++ and (if you will) C as well. Haskell is more like household AC: All the power that you need on a day-to-day basis, but carefully shielded, safeguarded and geared towards useful work instead of fry-your-brain accidents. You can disable the safeguards (unsafePerformIO), and then you'd better know what you're doing, though it's still not a case of nasal demons.
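As a small illustration of "disabling the safeguards", here is a sketch that assumes the escape hatch meant here is `unsafePerformIO` from `System.IO.Unsafe`. It shows the well-known top-level mutable variable idiom; the names `counter` and `bump` are made up for this example.

```haskell
module Main (main) where

import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A global mutable counter. unsafePerformIO hides the IO action from the
-- type system, so the usual guarantees no longer apply. The NOINLINE pragma
-- is needed so the compiler does not duplicate the newIORef call, which
-- would silently create more than one counter.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- Increment the counter and return its new value.
bump :: IO Int
bump = do
  modifyIORef' counter (+ 1)
  readIORef counter

main :: IO ()
main = do
  _ <- bump
  n <- bump
  print n
```

The program still behaves predictably, which fits the "not a case of nasal demons" remark, but correctness now depends on a pragma and on discipline rather than on the type checker.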

On 27.09.2016 at 23:25, Olaf Klinke wrote:
> Can someone please define what exactly a "batteries included" standard library is?
Anything you'll typically need is already available. For some value of "typically need", so it's slightly squishy - here's a list of batteries I'd like included:
- Reading/writing files
- Reading over HTTP (reliably - HTTP is surprisingly complex)
- Search&replace in text streams
- Easy-to-use string->string maps
- JSON parsing and printing (bonus points for YAML)
- GUI stuff
- Website stuff
- Sending mails
- Solid ecosystem:
  - build system,
  - library directory,
  - no-brainer automated testing support. (Complicated testing means more bugs in test code than in production code - this diverges.)
> IMHO that Python-Haskell comparison is unfair. Although both claim to be general-purpose languages, the focus in Haskell certainly has been on language research for most of its life.
I do not think that Python actually comes with all batteries included. And in some areas support is pretty bad.
> I recently hacked together a web client in python, my first project in that language. Documentation is excellent. Yet I am still horrified I had to use a language that provides so few static guarantees to control megawatt machines.
That, and the idea that class and function declarations are executable statements. Circular dependencies are "handled" by passing partially-initialized objects around; the Python interpreter handles this with no problems, but programmers have fun because nobody assumes incomplete initialization.
I.e. Python's language semantics is broken by design in pretty fundamental areas.
That said, it's good for banging something together quickly. Been there, done that, got the t-shirt. Just don't do anything that you need a team for with it, the lack of guarantees will really start to hurt.
> What puts me off Haskell nowadays is the direct result of Haskell's roots in language research: Often when I come across a package that does what I need, it uses the conduit, lens or another idiom, which are like a language in a language to learn. In milder ways Python seems to suffer the same problem.
I think Python shares that problem with Lisp: it's so easy to add another meta-idiom that too many people actually do this, and most don't even think about composability or guarantees.
> So please, developers: Write more batteries, but make them expose a neat lambda calculus interface if possible that can be combined freely with other batteries.
I sense a conflict of objectives here. Having many batteries pushes you towards wide APIs. However, the wider an API, the harder it is to make it combinable. More surface that must be made to match.
Making an API that's feature-complete *and* narrow is really hard and takes a huge amount of designer and programmer time, plus the willingness to lose most of your existing user base for an unproven idea of improvement. This road is a really hard one, and you need corporate backing or personal obsession to follow it.

Batteries included is a bad idea when the community is this divided.
Relative no-brainer topics in other communities like how text should
be represented are highly contentious.
I'd sooner see some basic test cases hammered out and integrated into
base before a real attempt at this is launched. Something that would
help the Cabal devs.
-- Chris Allen Currently working on http://haskellbook.com

"CA" == Christopher Allen
writes:
CA> Batteries included is a bad idea when the community is this divided.
CA> Relative no-brainer topics in other communities like how text should be
CA> represented are highly contentious.
Our community is not "divided", it simply has a large number of interests, meaning there is rarely a single standard library choice that benefits all groups equally: academic, pedagogic, hobbyist, commercial, etc. And exactly because we're not divided, we naturally avoid favoring the needs of one group over another.
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2

| Our community is not "divided", it simply has a large number of
| interests, meaning there is rarely a single standard library choice
| that benefits all groups equally: academic, pedagogic, hobbyist,
| commercial, etc. And exactly because we're not divided, we naturally
| avoid favoring the needs of one group over another.
Bravo John! You have re-framed the challenge (for challenge it is) in a constructive way, one that acknowledges or even celebrates our differences, and encourages us to work together. Thank you.
Simon
| -----Original Message-----
| From: Haskell-Cafe [mailto:haskell-cafe-bounces@haskell.org] On Behalf
| Of John Wiegley
| Sent: 28 September 2016 01:16
| To: Christopher Allen

I am new to this mailing list and this was the first conversation I took note of. I study computer science at university and have become very interested in Haskell privately, devoting a large portion of my free time to it. I dream of finding a job with Haskell for my industry placement next year or at least after I finish university. I also have hopes of some day working in a world where the ideas and knowledge incubated in Haskell break the shackles of current OOrthodoxy.
While I am aware that I haven't contributed anything to the community yet, I still thought it might be helpful if I made you aware that the tone of some parts of this conversation was rather bewildering to me, because to date I have perceived the community around Haskell to be made up mainly of extremely devoted and respectful individuals. I believe many newcomers would feel similarly. The style of this conversation honestly cast a temporary doubt over the noble notions that Haskell has been standing for in my mind, as I came under the impression that this must be the usual tone around here. I dearly hope that this is not the case.
I am somewhat relieved by John Wiegley's and SPJ's words, as well as by some other conversations on this mailing list that seem to be rather more respectful. This may seem like another good reason to “avoid success at all cost”, namely to keep trolls away from this amazing project; but please don't let bashers endanger Haskell's success. Haskell needs an open-minded and educated community, which I am striving to be part of.
Vilem

Hi,
On Wednesday, 28.09.2016, at 15:06 +0100, Vilem-Benjamin Liepelt wrote:
> The style of this conversation honestly cast a temporary doubt over the noble notions that Haskell has been standing for in my mind, as I came under the impression that this must be the usual tone around here.
> I dearly hope that this is not the case.
It is not. We are usually very helpful and friendly people. Occasionally a thread like this pops up, but I guess many start ignoring it (if only due to the sheer size). It is a bit unfortunate that this is your first impression, but I’m confident that you’ll get better impressions as time goes on.
Greetings and enjoy Haskell,
Joachim
--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • https://www.joachim-breitner.de/
XMPP: nomeata@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org

That's so great to hear, thanks, Joachim! Best wishes, Vilem

Relative no-brainer topics in other communities like how text should be represented are highly contentious.
Actually, choosing a good representation for text / strings is not a no-brainer, and the difficulty can be traced to several features of the Haskell language that are simply not present in other languages. Some of these features are:

1) Pattern matching. We can extract the first character and subsequent characters of a `String` and distinguish cases while doing so:

    isPrefixOf []     _      = True
    isPrefixOf (x:xs) []     = False
    isPrefixOf (x:xs) (y:ys) = x == y && isPrefixOf xs ys

This is not possible in, say, Python.

2) Parametric polymorphism. The `isPrefixOf` function above is actually polymorphic. It works not only for `String`, but for any kind of list whose elements can be compared for equality.

3) Lazy data structures. We can represent text of *infinite* length!

    cycle "Haskell" :: String

And we can use them in a streaming fashion

    interact (unlines . take 10 . lines) :: IO ()

See also https://wiki.haskell.org/Simple_Unix_tools

You cannot do this with a standard Python string.

Also, it's not like other languages all agree on their preferred method of representing strings: NULL-terminated (C) vs "length-byte-first" (Pascal) comes to mind.

Best regards
Heinrich Apfelmus
-- http://apfelmus.nfshost.com

Christopher Allen wrote:
Batteries included is a bad idea when the community is this divided. Relative no-brainer topics in other communities like how text should be represented are highly contentious.
I'd sooner see some basic test cases hammered out and integrated into base before a real attempt at this is launched. Something that would help the Cabal devs.
On Tue, Sep 27, 2016 at 6:14 PM, Joachim Durchholz wrote:
On 27.09.2016 at 23:25, Olaf Klinke wrote:
Can someone please define what exactly a "batteries included" standard library is?
Anything you'll typically need is already available. For some value of "typically need", so it's slightly squishy - here's a list of batteries I'd like included:
- Reading/writing files
- Reading over HTTP (reliably - HTTP is surprisingly complex)
- Search&replace in text streams
- Easy-to-use string->string maps
- JSON parsing and printing (bonus points for YAML)
- GUI stuff
- Website stuff
- Sending mails
- Solid ecosystem:
  - build system,
  - library directory,
  - no-brainer automated testing support. (Complicated testing means more bugs in test code than in production code - this diverges.)
IMHO that Python-Haskell comparison is unfair.
Although both claim to be general-purpose languages, the focus in Haskell certainly has been on language research for most of its life.
I do not think that Python actually comes with all batteries included. And in some areas support is pretty bad.
I recently hacked together a web client in python, my first project in that language. Documentation is excellent. Yet I am still horrified I had to use a language that provides so few static guarantees to control megawatt machines.
That, and the idea that class and function declarations are executable statements. Circular dependencies are "handled" by passing partially-initialized objects around; the Python interpreter handles this with no problems, but programmers have fun because nobody assumes incomplete initialization.
I.e. Python's language semantics is broken by design in pretty fundamental areas.
That said, it's good for banging something together quickly. Been there, done that, got the t-shirt. Just don't do anything that you need a team for with it, the lack of guarantees will really start to hurt.
What puts me off Haskell nowadays is the direct result of Haskell's roots in language research: Often when I come across a package that does what I need, it uses the conduit, lens or another idiom, which are like a language in a language to learn. In milder ways Python seems to suffer the same problem.
I think Python shares that problem with Lisp: it's so easy to add another meta-idiom that too many people actually do this, and most don't even think about composability or guarantees.
So please, developers: Write more batteries, but make them expose a neat lambda calculus interface if possible that can be combined freely with other batteries.
I sense a conflict of objectives here. Having many batteries pushes you towards wide APIs. However, the wider an API, the harder it is to make it combinable. More surface that must be made to match.
Making an API that's feature-complete *and* narrow is really hard and takes a huge amount of designer and programmer time, plus the willingness to lose most of your existing user base for an unproven idea of improvement. This road is a really hard one, and you need corporate backing or personal obsession to follow it.
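As an illustration of the three points Heinrich Apfelmus lists earlier in this message (pattern matching, parametric polymorphism, laziness), here is a minimal sketch that uses only the Prelude. The `Eq a` constraint and the helper names `endless` and `firstLines` are assumptions added for this illustration; they are not part of the original message.

```haskell
-- Pattern matching on the list structure. The Eq constraint is what
-- lets us compare elements, so this works for any list of comparable items.
isPrefixOf :: Eq a => [a] -> [a] -> Bool
isPrefixOf []     _      = True
isPrefixOf (_:_)  []     = False
isPrefixOf (x:xs) (y:ys) = x == y && isPrefixOf xs ys

-- A text of infinite length; only the demanded part is ever built.
endless :: String
endless = cycle "Haskell"

-- Streaming: keep the first n lines of a (possibly unbounded) input.
firstLines :: Int -> String -> String
firstLines n = unlines . take n . lines
```

In GHCi, `isPrefixOf "Hask" endless` evaluates to `True` even though `endless` never ends, because only its first four characters are demanded. The same function also accepts, say, lists of `Int`.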

When parametricity isn't an option, UTF-8 is taken for granted, and the
language is ambiently strict, it's quite a bit more straightforward
what you want to do, even if the actual implementation is not trivial.
I'm not saying the decision should be trivial for us, I'm saying that
any attempt to unify the disparate things Haskell programmers want is
going to be much harder than it would be with other languages and
ecosystems.
Also, I use this library for my work regularly:
https://github.com/snoyberg/mono-traversable/blob/17eebcfa2e96e923270e128467...
With Text (Strict | Lazy) and ByteString (Strict | Lazy)
This suits me fine, but I know people for whom mono-traversable is a hard-no.
-- Chris Allen Currently working on http://haskellbook.com

On 29.09.2016 at 16:43, Heinrich Apfelmus wrote:
Also, it's not like other languages all agree on their preferred method of representing strings: NULL-terminated (C) vs "length-byte-first" (Pascal) comes to mind.
Each language does define its preferred string representation.

On 30/09/16 4:18 AM, Joachim Durchholz wrote:
Each language does define its preferred string representation.
Java again: it has *two* string representations baked into the language.

The Smalltalk system I use most has
- read-only strings (preferred)
- unique read-only strings
- mutable strings
- substrings (positionable read-only slices)
- extensible strings
- streams over strings
- lazy concatenations of strings
- read-only byte arrays viewed as strings
- mutable byte arrays viewed as strings

Other Smalltalks typically have four or more concrete kinds of string plus streams over strings; the substring, extensible string, and lazy concatenation libraries I use could be ported to them.

On 30.09.2016 at 04:16, Richard A. O'Keefe wrote:
On 30/09/16 4:18 AM, Joachim Durchholz wrote:
Each language does define its preferred string representation.
Java again: it has *two* string representations baked into the language.
There is a single standard representation. I'm not even aware of a second one, and I've been programming Java for quite a while now.

Unless you mean StringBuilder/StringBuffer (that would be three String types then). However, these classes are by no means "preferred" in practice: the vast majority of APIs demands and returns String objects.

Even then, Java has its preferred string representation nailed down pretty strongly: a hidden array of 16-bit Unicode code points, referenced by a descriptor object (the actual String), immutable.
The Smalltalk system I use most has
- read-only strings (preferred)
- unique read-only strings
- mutable strings
- substrings (positionable read-only slices)
- extensible strings
- streams over strings
- lazy concatenations of strings
- read-only byte arrays viewed as strings
- mutable byte arrays viewed as strings
Ah, Smalltalk. I haven't looked at that in ages. I'll give you that these classes all exist, but I am not sure whether a Smalltalk programmer would consider them all equivalent or not.

FWIW, C++ has:
- char* and const char*, inherited from C
- wchar_t*, const wchar_t*
- the above, but with an explicit length passed along as a separate argument
- std::string
- std::wstring (is that what it's called?)
- various string implementations, provided by platform APIs and frameworks
(QString, LPTCHAR, and other nonsense)
And they all suck - most are really just byte arrays, some try to implement
Unicode but fall short, and the ones that do it mostly right are specific
to a sub-ecosystem. It's a mess.
And do I need to mention PHP? That one doesn't have a useful string type at
all, and also lacks the language feature to build it yourself - you're
stuck with broken semantics either way, best you can hope for is that they
are only mildly broken and you can get away with it.
C, by the way, shares C++'s problem, except that it doesn't even come with
a string type that does bounds checking.
And finally: while Haskell makes you choose between "byte array", "string",
and "list of code points", this isn't really awfully different from
languages like Java or C#, where you make a similar choice (string?
StringBuilder? byte[]?), except that the default is saner (for historical
reasons). Well, that, and that there are lazy flavors of the packed string
and bytestring types, which has nothing to do with string type choices and
everything with defaulting to and leveraging non-strict semantics.

On 30.09.2016 at 08:44, Tobias Dammers wrote:
FWIW, C++ has:
- char* and const char*, inherited from C
- wchar_t*, const wchar_t*
- the above, but with an explicit length passed along as a separate argument
- std::string
- std::wstring (is that what it's called?)
- various string implementations, provided by platform APIs and frameworks (QString, LPTCHAR, and other nonsense)
And they all suck - most are really just byte arrays, some try to implement Unicode but fall short, and the ones that do it mostly right are specific to a sub-ecosystem. It's a mess.
Yep - preferred type was the zero-terminated byte array, and after that things have diverged.
And do I need to mention PHP? That one doesn't have a useful string type at all,
It's still a string type :-)
And finally: while Haskell makes you choose between "byte array", "string", and "list of code points", this isn't really awfully different from languages like Java or C#, where you make a similar choice (string? StringBuilder? byte[]?),
At least in Java, you don't really choose, circumstances dictate.

Strings are immutable. Nice semantics, O(N^2) for N concatenations. The vast majority of APIs use this type; it is by far the most strongly preferred.

StringBuilder is mutable. In practice, people use it as a scratchpad to construct Strings if they need a loop. The majority of cases are local variables; libraries whose purpose is constructing a large output string tend to have a collect-the-output buffer and pass that around internally, but don't expose it to callers (maybe to callbacks, though I haven't seen that done).

byte[] for string manipulation is a really itchy hair shirt; you don't do that unless very strong reasons compel you to. I am aware of exactly two use cases: password storage (to be able to wipe the data ASAP), and converting from and to external byte streams that carry text.

So it's all straightforward, and String is really the preferred use case. There's a lot of things that Java doesn't get quite right, but string handling is not one of these :-)
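The String/StringBuilder split described above has a close analogue in Haskell, sketched here purely as an illustration. This is Haskell, not Java, and the names `Builder`, `emptyBuilder`, `fromStr`, `append`, `build`, `concatNaive` and `concatBuilder` are made up for this sketch. Left-nested `++` recopies the accumulated prefix on every step, much like repeated concatenation of immutable Strings, while a difference-list builder plays the scratchpad role.

```haskell
-- A builder is a function that prepends its text to whatever comes after it.
newtype Builder = Builder (String -> String)

emptyBuilder :: Builder
emptyBuilder = Builder id

fromStr :: String -> Builder
fromStr s = Builder (s ++)

-- Appending two builders is function composition: O(1), no copying yet.
append :: Builder -> Builder -> Builder
append (Builder f) (Builder g) = Builder (f . g)

-- Materialise the final string once, at the end.
build :: Builder -> String
build (Builder f) = f ""

-- Left-nested (++): each step walks the whole accumulated prefix again,
-- so N appends of similar-sized pieces cost O(N^2) overall.
concatNaive :: [String] -> String
concatNaive = foldl (++) ""

-- Same result via the builder: every piece is traversed once.
concatBuilder :: [String] -> String
concatBuilder = build . foldl append emptyBuilder . map fromStr
```

Both functions return the same string; they differ only in how much copying happens along the way, which is the reason a loop in Java reaches for StringBuilder rather than String.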

On 30/09/16 7:17 PM, Joachim Durchholz wrote:
There is a single standard representation. [for strings in Java] I'm not even aware of a second one, and I've been programming Java for quite a while now. Unless you mean StringBuilder/StringBuffer (that would be three String types then).
StringBuffer is just a synchronized version of StringBuilder.
However, these classes are by no means "preferred" in practice: the vast majority of APIs demands and returns String objects.
The Java *compiler* prefers StringBuilder: when you write a string concatenation expression in Java the compiler creates a StringBuilder behind the scenes. I'm counting a class as "preferred" if the compiler *has* to know about it and generates code involving it without the programmer explicitly mentioning it.
Even then, Java has its preferred string representation nailed down pretty strongly: a hidden array of 16-bit Unicode code points, referenced by a descriptor object (the actual String), immutable.
As already noted, that representation changed internally. And that change is actually relevant to this thread.

The representation that _used_ to be used was

    (char[] array, offset, length, hash)

Amongst other things, this meant that taking a substring cost O(1) time and O(1) space, because you just had to allocate and initialise a new "descriptor object" sharing the underlying array.

Since Java 1.7 the representation is

    (char[] array, hash)

Amongst other things, this means that taking a substring n characters long now costs O(n) time and O(n) space.

If you are working in a loop like

    while (there is more input) {
        read a chunk of input
        split it into substrings
        process some of the substrings
    }

the pre-Java-1.7 representation is perfect. If you *retain* some of the substrings, however, you retain the whole chunk. That was easy to fix by doing

    retain(new String(someSubstring))

instead of

    retain(someSubstring)

but you had to *know* to do it.

(Another solution would be to have a smarter garbage collector that knew about string sharing and could compact strings. I wrote such a collector for XPL many years ago. It's quite easy to do a stop-and-copy garbage collector that does that. But that's not the state of the art in Java garbage collection, and I'm not sure how well string compaction would fit into a more advanced collector.)

The Java 1.7-and-later representation is *safer*. Depending on your usage, it may either save a lot of memory or bloat your memory use.

The point is that there is no one-size-fits-all string representation; being given only one forces you to either write your own additional representation(s) or to use a representation which is not really suited to your particular purpose.
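The trade-off between the two layouts can be made concrete with a toy model. The sketch below is written in Haskell for illustration only; it is not Java's actual implementation, all names (`SharedStr`, `OwnedStr`, `substringShared`, `substringOwned`, `compact`, and so on) are hypothetical, and an ordinary `String` stands in for the `char[]` buffer. It models which characters stay logically reachable, not real heap behaviour or the hash field.

```haskell
-- Older layout: a descriptor (offset, length) into a shared backing buffer.
data SharedStr = SharedStr
  { backing :: String  -- stands in for the shared char[] array
  , offset  :: Int
  , len     :: Int
  }

-- Newer layout: every string owns exactly its own characters.
newtype OwnedStr = OwnedStr String

fromChunk :: String -> SharedStr
fromChunk s = SharedStr s 0 (length s)

-- Substring by adjusting the descriptor only; the buffer is shared.
substringShared :: Int -> Int -> SharedStr -> SharedStr
substringShared begin end (SharedStr b o _) =
  SharedStr b (o + begin) (end - begin)

-- Substring by copying the requested characters into a fresh string.
substringOwned :: Int -> Int -> OwnedStr -> OwnedStr
substringOwned begin end (OwnedStr s) =
  OwnedStr (take (end - begin) (drop begin s))

toStringShared :: SharedStr -> String
toStringShared (SharedStr b o n) = take n (drop o b)

-- How many characters stay reachable if we retain this value.
retainedShared :: SharedStr -> Int
retainedShared = length . backing

retainedOwned :: OwnedStr -> Int
retainedOwned (OwnedStr s) = length s

-- Analogue of the new String(someSubstring) workaround:
-- copy the slice out so the large buffer can be dropped.
compact :: SharedStr -> SharedStr
compact = fromChunk . toStringShared
```

Taking characters 7 to 12 of the 12-character chunk "hello, world" yields "world" in both models, but the shared slice keeps all 12 characters reachable until `compact` is applied, whereas the owned copy keeps only 5.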

On 03.10.2016 at 01:20, Richard A. O'Keefe wrote:
The Java *compiler* prefers StringBuilder: when you write a string concatenation expression in Java the compiler creates a StringBuilder behind the scenes. I'm counting a class as "preferred" if the compiler *has* to know about it and generates code involving it without the programmer explicitly mentioning it.
Then Haskell's preferred representation of additive types would be the updatable record. Or machine integers are preferably stored in registers because that's where every new integer is created, RAM is second class... I think that's stretching things too far.

There are more indicators against your theory:

1) During the lifetime of a program, the vast majority of textual data is stored in String objects. StringBuilders are just temporary and are discarded once the String object is built. (That's quantitative, not qualitative.)

2) The compiler does NOT have to know. Straight from the Java spec:

15.18.1. [...] To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.

Moreover, the entire paragraph is a non-authoritative remark.
Even then, Java has its preferred string representation nailed down pretty strongly: a hidden array of 16-bit Unicode code points, referenced by a descriptor object (the actual String), immutable.
As already noted, that representation changed internally.
Yes, Java 7 changed that to prevent memory leaks from happening.
And that change is actually relevant to this thread.
I have been thinking about that argument and do not think it is valid in a Java context. Java programmers are used to unexpected performance changes, mostly due to changes in the garbage collector. It's also just a single function that changed behaviour, and definitely not the most common one even if it's pretty important.
The representation that _used_ to be used was (char[] array, offset, length, hash) Amongst other things,
Not really...
this meant that taking a substring cost O(1) time and O(1) space, because you just had to allocate and initialise a new "descriptor object" sharing the underlying array.
"You" never had. This all happened behind the scenes, an implementation detail.
If you are working in a loop like

    while (there is more input) {
        read a chunk of input
        split it into substrings
        process some of the substrings
    }

the pre-Java-1.7 representation is perfect. If you *retain* some of the substrings, however, you retain the whole chunk. That was easy to fix by doing

    retain(new String(someSubstring))

instead of

    retain(someSubstring)

but you had to *know* to do it.
Okay, now I get the point.

It's a pretty specialized kind of code though. Usually you don't care much about how much of some input you retain, because more than 50% of the input strings are retained anyway (if you even do retain strings). It did have the potential for a memory leak, but we're getting into a pretty special corner case here.

Plus it still does not change a bit about the fact that String is the standard representation in Java, not StringBuffer nor byte[]. The programmer(!) isn't confused about which one to select, and that was the point originally made. Diving into implementation details just to prove that wrong isn't going to change the fact that the impression that Java's string representations are confusing was just the result of first impressions without actual practice.
(Another solution would be to have a smarter garbage collector that knew about string sharing and could compact strings. I wrote such a collector for XPL many years ago. It's quite easy to do a stop-and-copy garbage collector that does that. But that's not the state of the art in Java garbage collection,
Agreed.
and I'm not sure how well string compaction would fit into a more advanced collector.)
Since Java's standard use case is long-running server programs, most if not all Java GCs are copying collectors nowadays. So, this would be a good fit in principle. It might have unfavorable trade-offs with other use cases though. It's quite possible that they implemented this, benchmarked it, and found they couldn't get it up to competitive speed.
The point is that there is no one-size-fits-all string representation; being given only one forces you to either write your own additional representation(s) or to use a representation which is not really suited to your particular purpose.
I haven't read anybody complain about Java's string representation yet. That does not mean that nobody does (I'm pretty sure that there are complaints), it just doesn't concern people much in practice.

Most Java programmers don't deal with this; they use a library like JAXML or Jackson for parsing (XML and JSON, respectively), get good-enough performance, and move on.

Some people used to complain that 16-bit characters are a waste of memory, but even that isn't considered a big problem - essentially, the alternatives are out of sight and out of mind.

(It would be interesting to see what happened in a language where the standard string representation is UTF-8. Given that Unicode requires a minimum of three bytes for a codepoint nowadays, the UTF-16 advantage of "character count = storage cell count" has vanished anyway.)
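The closing remark about code points no longer fitting into one 16-bit cell can be checked with a small sketch. It counts storage cells per code point under the standard UTF-16 and UTF-8 encoding rules; the function names are illustrative only. A Haskell `Char` is a full Unicode code point, so `length` on a `String` counts code points.

```haskell
import Data.Char (ord)

-- Number of 16-bit code units needed for one code point in UTF-16.
utf16Units :: Char -> Int
utf16Units c
  | ord c >= 0x10000 = 2  -- outside the Basic Multilingual Plane: surrogate pair
  | otherwise        = 1

-- Number of bytes needed for one code point in UTF-8.
utf8Bytes :: Char -> Int
utf8Bytes c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where n = ord c

utf16Length :: String -> Int
utf16Length = sum . map utf16Units

utf8Length :: String -> Int
utf8Length = sum . map utf8Bytes
```

For the two-code-point string "a\x1D11E" (the letter a followed by U+1D11E, MUSICAL SYMBOL G CLEF), `length` gives 2, `utf16Length` gives 3 and `utf8Length` gives 5, so neither encoding has one storage cell per character once such code points appear.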

On 30/09/16 3:43 AM, Heinrich Apfelmus wrote:
Also, it's not like other languages all agree on their preferred method of representing strings: NULL-terminated (C) vs "length-byte-first" (Pascal) comes to mind.
Just in support of that claim, Java *changed* its implementation of strings. Originally, someString.substring(beginIndex, endIndex) took O(1) time and space whatever the values of someString, beginIndex, and endIndex. These days it takes O(endIndex - beginIndex) time and space. And yes, that DID mean that the performance characteristics of many Java programs changed without their authors knowing or intending it.
participants (16)
- Brandon Allbery
- Christopher Allen
- Heinrich Apfelmus
- Imants Cekusins
- Joachim Breitner
- Joachim Durchholz
- John Wiegley
- MarLinn
- Michael Sloan
- Olaf Klinke
- Peter
- Richard A. O'Keefe
- Simon Peyton Jones
- Tobias Dammers
- Tony Morris
- Vilem-Benjamin Liepelt