
Hey all,
In the earlier haskell-cafe discussion of IsString, someone mentioned that it would be nice to abandon [Char] as the blessed string type in Haskell. I've thought about this on and off for a while now, and think that the fact that [Char] is the default string type is a really big issue (for example, it gives beginners the idea that Haskell is incredibly slow, because everything that involves string processing is using linked lists).
I am not proposing anything, but am curious as to what already has been discussed:
1. Has the possibility of migrating away from [Char] been investigated before?
2. What gains could we see in ease of use, performance, etc, if [Char] was deprecated?
3. What could replace [Char], while retaining the same ease of use for writing string manipulation functions (pattern matching, etc)?
4. Is there any sort of migration path that would make this change feasible in mainline Haskell in the medium term (2-5 years)?
Thanks! I would welcome any references or links that my cursory googling has failed to find.
-- Andrew

On 05/18/2015 11:44 PM, Andrew Gibiansky wrote:
> Hey all,
> In the earlier haskell-cafe discussion of IsString, someone mentioned that it would be nice to abandon [Char] as the blessed string type in Haskell. I've thought about this on and off for a while now, and think that the fact that [Char] is the default string type is a really big issue (for example, it gives beginners the idea that Haskell is incredibly slow, because everything that involves string processing is using linked lists).
> I am not proposing anything, but am curious as to what already has been discussed:
Just my few cents below.
> 1. Has the possibility of migrating away from [Char] been investigated before?
Migrating away to what?
> 2. What gains could we see in ease of use, performance, etc, if [Char] was deprecated?
Deprecated in favour of what?
> 3. What could replace [Char], while retaining the same ease of use for writing string manipulation functions (pattern matching, etc)?
ViewPatterns could let us imitate the pattern match at least, but you'd still have to reconstruct from a list on the RHS. Basically, to me you're asking whether we can work with lists, using the usual list things including constructors and presumably all list-y functions, but without using lists… We want either one or the other ;). But depending on what we want, and if you're willing to give up the : and [] syntax, one could probably do well here anyway. In any scenario, though, you'd be breaking every piece of code that has ever used String.
> 4. Is there any sort of migration path that would make this change feasible in mainline Haskell in the medium term (2-5 years)?
I don't think we'll ever see String changed in any significant way by default unless great pains are taken to do so. As I mentioned, you'd probably be breaking everything using String. If you don't want to work with a list of characters then use a different thing, most likely Text.
Honestly, if your only motivation is that beginners might get the wrong idea, I don't think anything along the lines of your questions is even worth considering. Usually it takes a few minutes at most to tell a newbie in #haskell that they probably want Text or whatever, if they even care. Trying very hard to change what we already have and still make it as accessible as it is today is simply something I can't justify in any way I try.
> Thanks! I would welcome any references or links that my cursory googling has failed to find.
> -- Andrew
-- Mateusz K.
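For what it's worth, the ViewPatterns idea from the reply above might look roughly like the sketch below. It assumes the `text` package; `capitalize` is a made-up example function, not something proposed in the thread.

```haskell
{-# LANGUAGE ViewPatterns #-}

import           Data.Char (toUpper)
import qualified Data.Text as T

-- | Upper-case the first character of a Text, matching on it much as one
-- would match (c:rest) on a [Char]. 'capitalize' is a hypothetical example.
capitalize :: T.Text -> T.Text
capitalize (T.uncons -> Just (c, rest)) = T.cons (toUpper c) rest  -- non-empty case
capitalize _                            = T.empty                  -- empty case
```

The left-hand side reads almost like a list pattern, but the right-hand side has to rebuild the value with `T.cons` rather than `(:)`, which is the asymmetry pointed out above.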

I've noticed improved list performance in GHC 7.10.1. In GHC 7.8, a simple
"sum" function worked faster on a "Stream" than a List, a Stream being a
data type with a state and successor function. The List version was around
10 times slower than the stream version when it came to summing Ints.
However GHC 7.10.1 compiles the list away, so the list version, stream
version, and accumulating parameter recursive function version now all run
in the same time.
If GHC continues to learn to optimise away lists effectively, [Char] may
not be a performance issue after all.
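A rough sketch of the three variants being compared, for anyone who wants to try this themselves. The exact `Stream` type used in the measurement was not posted, so the definition below (a hidden state plus a successor function) and the names `upTo`, `sumStream`, `sumList` and `sumAcc` are assumptions.

```haskell
{-# LANGUAGE BangPatterns              #-}
{-# LANGUAGE ExistentialQuantification #-}

-- | Assumed shape of the Stream type: a hidden state together with a
-- successor function that yields the next element and the next state.
data Stream a = forall s. Stream s (s -> Maybe (a, s))

-- | The stream 1, 2, ..., n.
upTo :: Int -> Stream Int
upTo n = Stream 1 (\i -> if i > n then Nothing else Just (i, i + 1))

-- | Sum a stream with a strict accumulator.
sumStream :: Stream Int -> Int
sumStream (Stream s0 next) = go 0 s0
  where
    go !acc s = case next s of
      Nothing      -> acc
      Just (x, s') -> go (acc + x) s'

-- | List version: relies on the compiler to optimise the list away.
sumList :: Int -> Int
sumList n = sum [1 .. n]

-- | Hand-written accumulating-parameter version.
sumAcc :: Int -> Int
sumAcc n = go 0 1
  where
    go !acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)
```

All three compute the same result; how their running times compare depends on the GHC version and optimisation flags, so the code alone says nothing about that.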
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Hi,
On Tuesday, 19.05.2015, at 15:19 +1000, Clinton Mead wrote:
> However GHC 7.10.1 compiles the list away, so the list version, stream version, and accumulating parameter recursive function version now all run in the same time.
Glad to hear that (I believe I am partly responsible for that). But
> If GHC continues to learn to optimise away lists effectively, [Char] may not be a performance issue after all.
is too optimistic. [Char] will never be a good choice for efficient string manipulation; the optimizations you mention only work in very specific circumstances (at least: lists used exactly once, consumed by a “good consumer” and produced by a “good producer”). The sufficiently smart compiler continues to be a utopia. (Which does not stop us from working towards it.)
Greetings, Joachim
-- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata@debian.org
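To make the "used exactly once" condition concrete, here is a small sketch with made-up function names. Whether fusion actually fires for the first function depends on the GHC version and optimisation level, so treat the comments as the expected behaviour rather than a guarantee.

```haskell
-- | The intermediate list is produced once and consumed once, so it is a
-- candidate for being optimised away by list fusion.
sumDoubled :: Int -> Int
sumDoubled n = foldr (+) 0 (map (* 2) [1 .. n])

-- | The same list is consumed twice (once for the sum, once for the
-- length), so it is shared and would normally be built in memory.
sumAndCount :: Int -> (Int, Int)
sumAndCount n = (foldr (+) 0 xs, length xs)
  where
    xs = map (* 2) [1 .. n]
```

Typical string-processing code looks much more like the second function than the first: the same string is inspected, split and passed around several times.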

On 15-05-18 06:44 PM, Andrew Gibiansky wrote:
> Hey all,
> In the earlier haskell-cafe discussion of IsString, someone mentioned that it would be nice to abandon [Char] as the blessed string type in Haskell. I've thought about this on and off for a while now, and think that the fact that [Char] is the default string type is a really big issue (for example, it gives beginners the idea that Haskell is incredibly slow, because everything that involves string processing is using linked lists).
> I am not proposing anything, but am curious as to what already has been discussed:
> 1. Has the possibility of migrating away from [Char] been investigated before?
No, not seriously as far as I'm aware. That ship sailed a long time ago. Still, as I have actually thought about that, I'll give you an outline of a possible process.
> 2. What gains could we see in ease of use, performance, etc, if [Char] was deprecated?
They could be very significant for any code that took advantage of the new type, but the existing code would not benefit that much. But then, any new Haskell code can already use Text where performance matters.
> 3. What could replace [Char], while retaining the same ease of use for writing string manipulation functions (pattern matching, etc)?
You would not have exactly the same ease of use. The options would lie between two extremes. At one end, you can have a completely opaque String type with fromChars/toChars operations and nothing else. At the other end, you'd implement all operations useful on strings so there would never be any need to convert between String and [Char].
The first extreme would be mostly useless from the performance point of view, but with some GHC magic perhaps it could be made a viable upgrade path. The compiler would have to automatically insert the implicit fromChars/toChars conversion whenever necessary, and I expect that some of the existing Haskell code would still be broken.
Once you have an opaque String type, you can think about improving the performance. A more efficient instance of Monoid String would be a good start, especially since it wouldn't break backward compatibility. Unfortunately that is the only [Char] instance in wide use that can be easily optimized. Perhaps Foldable could be made to work with even more compiler magic, but I doubt it would be worth the effort.
If you add more operations on String that don't require
> 4. Is there any sort of migration path that would make this change feasible in mainline Haskell in the medium term (2-5 years)?
Suppose GHC 7.12 were to bring Text into the core libraries, change Prelude to declare type String = Text, and sprinkle some magic compiler dust to make the explicit Text <-> [Char] conversions unnecessary.
The existing Haskell code would almost certainly perform worse overall. The only improved operations would be mappend on String, and possibly the string literal instantiation.
I don't think there's any chance of getting this kind of change proposal accepted today. You'd have to make the pain worth the gain. The only viable path is to ensure beforehand that the change improves more than just the mappend operation.
In other words, you'd have to get today's String to instantiate more classes in common with tomorrow's String, and you'd have to get the everyday Haskell code to use those classes instead of list manipulations.
The first tentative step towards the String type change would then be either the mono-traversable or my own monoid-subclasses package. They both define new type classes that are instantiated by both [Char] and Text. The main difference is that the former builds upon the Foldable foundation, the latter upon Monoid. They are both far from being a complete replacement for list manipulations. But any new code that used their operations would see a big improvement from the String type change.
Here, then, is the five-year plan you're asking for:
Year one: Agree on the ideal set of type classes to bridge the gap between [Char] and Text.
Year two: Bring the new type classes into the Prelude. Have all relevant types instantiate them. Everybody's updating their code in delight to use the new class methods.
Year three: GHC issues warnings about using List-specific [], ++, null, take, drop, span, etc, on String. Everybody's furiously updating their code.
Year four: Add Text to the core libraries. The GHC magic to make the Text <-> [Char] conversions implicit is implemented and ready for testing but requires a pragma.
Year five: Update the Haskell language report. Flip the switch.
So there. How feasible does that sound?
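The "classes in common" idea from the plan above could be sketched as follows. This is only an illustration under assumptions: the class `StringLike` and the method `dropWhileSL` are hypothetical and are not the actual API of mono-traversable or monoid-subclasses; `fromChars`/`toChars` borrow the names used in the message, and the Text instance assumes the `text` package.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import           Data.Char (isSpace)
import qualified Data.Text as T

-- | Hypothetical bridging class, instantiated by both today's String
-- ([Char]) and a possible future one (Text).
class Monoid s => StringLike s where
  fromChars   :: [Char] -> s
  toChars     :: s -> [Char]
  dropWhileSL :: (Char -> Bool) -> s -> s

instance StringLike [Char] where
  fromChars   = id
  toChars     = id
  dropWhileSL = dropWhile

instance StringLike T.Text where
  fromChars   = T.pack
  toChars     = T.unpack
  dropWhileSL = T.dropWhile

-- | Written once against the class, usable at either representation.
greet :: StringLike s => s -> s
greet name = fromChars "Hello, " `mappend` dropWhileSL isSpace name
```

Code written like `greet` would not care whether String is [Char] or Text, which is what years one to three of the plan are meant to achieve; code that pattern matches on `(:)` directly would still need rewriting.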

Mario, thanks for that great writeup.
The switch can only happen if there's a way to make the old code somehow
transparently work the same or better in the new setup.
Maybe some GHC magic could bring the string operations to Prim Ops and
transparently switch the underlying representation to Text from [Char].
Basically, Text would have to become a built-in primitive, not a library.
Michał

Mario,
Thank you for that detailed write-up. That's exactly the sort of thing I was looking for.
I imagine a path like the one you describe is possible, but very, very difficult, and likely the effort could be better spent elsewhere.
I imagine an alternate route (that would have immediate gains in the near future, and wouldn't be a long-term transition plan) would be to have a `text-base` package, which exports everything `base` does, but with `Text` in place of `String`. You would then base your packages on that instead of on `base`, thus ensuring you do not rely on []-manipulation for `String` (you should still have full compatibility with normal `base`).
Anyway, hard choices all around, for no 100% clear gain, so I personally do not envision this happening any time soon. Oh well...
-- Andrew
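A very small sketch of what such a `text-base` could feel like from the user's side. No such package is being described here as existing; everything below is an assumption for illustration, squeezed into a single file that hides a few Prelude names and redefines them over Text (using the `text` package). `normalizeSpaces` is a made-up example function.

```haskell
import           Prelude hiding (String, lines, unlines, words, unwords)
import qualified Data.Text as T

-- | In this hypothetical setup, String means Text.
type String = T.Text

lines :: String -> [String]
lines = T.lines

unlines :: [String] -> String
unlines = T.unlines

words :: String -> [String]
words = T.words

unwords :: [String] -> String
unwords = T.unwords

-- | User code that only relies on the string-level operations compiles
-- unchanged against either definition of String.
normalizeSpaces :: String -> String
normalizeSpaces = unwords . words
```

A function like `normalizeSpaces` never touches `(:)` or `[]`, so it would build against either `base` or the hypothetical `text-base`; anything that pattern matches on characters would fail to compile, which is the point of the exercise.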

Having just finished converting my Haskell shell-scripting tool from String to Text/ByteString, might I suggest that such a change would create fewer problems after a Prelude rework to something like ClassyPrelude? Using ClassyPrelude meant that a lot of the code that worked with String worked just fine with Text and ByteString. I had more fixes due to having used partial functions than due to no longer having lists of chars.

One direction that this thread has *COMPLETELY* overlooked is the following:
could we use pattern synonyms or something along those lines to make the
migration to Text or the like more seamless?
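As a rough idea of what that could look like: the sketch below defines two bidirectional pattern synonyms over Text, so that code can both match and construct in a list-like style. The names `Empty`, `:<` and `shout` are chosen here for illustration only; the sketch assumes the `text` package and a GHC recent enough to support pattern synonym signatures and COMPLETE pragmas.

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns    #-}

import           Data.Char (toUpper)
import qualified Data.Text as T

infixr 5 :<

-- | Plays the role of [] for Text.
pattern Empty :: T.Text
pattern Empty <- (T.null -> True)
  where
    Empty = T.empty

-- | Plays the role of (:) for Text, both as a pattern and as a constructor.
pattern (:<) :: Char -> T.Text -> T.Text
pattern c :< rest <- (T.uncons -> Just (c, rest))
  where
    c :< rest = T.cons c rest

{-# COMPLETE Empty, (:<) #-}

-- | List-style recursion, written against Text.
shout :: T.Text -> T.Text
shout Empty       = Empty
shout (c :< rest) = toUpper c :< shout rest
```

This recovers the familiar syntax, but not the cost model: `T.cons` copies the whole Text, so a character-by-character recursion like `shout` is quadratic. Pattern synonyms could ease the syntactic side of a migration without making list-style algorithms a good fit for Text.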
On Fri, May 22, 2015 at 1:55 PM, Mike Meyer
Having just finished converting my Haskell shell-scripting tool from Strjng to Text/ByteString, might I suggest that such a change would create fewer problems after a Prelude rework to something like ClassyPrelude? Using ClassyPrelude meant that a lot of the code that worked with String worked just fine with Text and ByteString. I had more fixes due to having used partial function than with no longer having List's of chars.
On Fri, May 22, 2015 at 12:37 PM, Andrew Gibiansky < andrew.gibiansky@gmail.com> wrote:
Mario,
Thank you for that detailed write-up. That's exactly the sort of thing I was looking for.
I imagine a path like the one you describe is possible, but very, very difficult, and likely the effort could be better spent elsewhere.
I imagine an alternate route (that would have immediate gains in the near future, and wouldn't be a long-term transition plan) would be to have a `text-base` package, which exports everything `base` does, exporting `Text` instead of `String`. Then base packages off that instead of `base`, thus ensuring you do not rely on []-manipulation for `String` (you should still have full compatibility with normal `base`).
Anyway, hard choices all around, for no 100% clear gain, so I personally do not envision this happening any time soon. Oh well...
-- Andrew
On Fri, May 22, 2015 at 6:07 PM, Michal Antkiewicz < mantkiew@gsd.uwaterloo.ca> wrote:
Mario, thanks for that great writeup.
The switch can only happen if there's a way to make the old code somehow transparently work the same or better in the new setup.
Maybe some GHC magic could turn the string operations into primops and transparently switch the underlying representation from [Char] to Text. Basically, Text would have to become a built-in primitive, not a library.
Michał
On Fri, May 22, 2015 at 10:29 AM, Mario Blažević wrote:
On 15-05-18 06:44 PM, Andrew Gibiansky wrote:
Hey all,
In the earlier haskell-cafe discussion of IsString, someone mentioned that it would be nice to abandon [Char] as the blessed string type in Haskell. I've thought about this on and off for a while now, and think that the fact that [Char] is the default string type is a really big issue (for example, it gives beginners the idea that Haskell is incredibly slow, because everything that involves string processing is using linked lists).
I am not proposing anything, but am curious as to what already has been discussed:
1. Has the possibility of migrating away from [Char] been investigated before?
No, not seriously as far as I'm aware. That ship sailed a long time ago. Still, as I have actually thought about that, I'll give you an outline of a possible process.
2. What gains could we see in ease of use, performance, etc, if [Char]
was deprecated?
They could be very significant for any code that took advantage of the new type, but the existing code would not benefit that much. But then, any new Haskell code can already use Text where performance matters.
3. What could replace [Char], while retaining the same ease of use for
writing string manipulation functions (pattern matching, etc)?
You would not have the same ease of use exactly. The options would lie between two extremes. At one end, you can have a completely opaque String type with fromChars/toChars operations and nothing else. At the other end, you'd implement all operations useful on strings so there would never be any need to convert between String and [Char].
The first extreme would be mostly useless from the performance point of view, but with some GHC magic perhaps it could be made a viable upgrade path. The compiler would have to automatically insert the implicit fromChars/toChars conversion whenever necessary, and I expect that some of the existing Haskell code would still be broken.
Once you have an opaque String type, you can think about improving the performance. A more efficient instance of Monoid String would be a good start, especially since it wouldn't break backward compatibility. Unfortunately that is the only [Char] instance in wide use that can be easily optimized. Perhaps Foldable could be made to work with even more compiler magic, but I doubt it would be worth the effort.
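As a minimal illustration of the opaque end of that spectrum, the sketch below hides the representation behind fromChars/toChars and gives it a cheaper Monoid instance. The type name `PackedString` and the difference-list representation are assumptions chosen only to keep the example within base; this is not how Text is implemented, and it assumes a GHC where Semigroup is a superclass of Monoid.

```haskell
-- Hypothetical opaque string type. The representation is a difference list
-- (a function that prepends the characters), picked only for illustration.
newtype PackedString = PackedString ([Char] -> [Char])

-- The only ways in and out of the opaque type.
fromChars :: [Char] -> PackedString
fromChars cs = PackedString (cs ++)

toChars :: PackedString -> [Char]
toChars (PackedString f) = f []

-- Appending is function composition, so each (<>) is O(1) and a
-- left-nested chain of appends no longer re-traverses the left operand.
instance Semigroup PackedString where
  PackedString f <> PackedString g = PackedString (f . g)

instance Monoid PackedString where
  mempty = PackedString id
```

Everything other than mappend still has to go through toChars, which is the point being made above: an opaque type on its own improves very little for existing code.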
If you add more operations on String that don't require
4. Is there any sort of migration path that would make this change
feasible in mainline Haskell in the medium term (2-5 years)?
Suppose GHC 7.12 were to bring Text into the core libraries, change Prelude to declare type String = Text, and sprinkle some magic compiler dust to make the explicit Text <-> [Char] conversions unnecessary.
The existing Haskell code would almost certainly perform worse overall. The only improved operations would be mappend on String, and possibly the string literal instantiation.
I don't think there's any chance to get this kind of change proposal accepted today. You'd have to make the pain worth the gain. The only viable path is to ensure beforehand that the change improves more than just the mappend operation.
In other words, you'd have to get today's String to instantiate more classes in common with tomorrow's String, and you'd have to get the everyday Haskell code to use those classes instead of list manipulations.
The first tentative step towards the String type change would then be either the mono-traversable or my own monoid-subclasses package. They both define new type classes that are instantiated by both [Char] and Text. The main difference is that the former builds upon the Foldable foundation, the latter upon Monoid. They are both far from being a complete replacement for list manipulations. But any new code that used their operations would see a big improvement from the String type change.
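To give a feel for what such a bridging class might look like, here is a small sketch. The class `StringLike` and its methods `unconsChar` and `consChar` are hypothetical names invented for this example; they are not the actual API of mono-traversable or monoid-subclasses, both of which are considerably richer.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.Char (toUpper)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical class capturing the little that this example needs from a
-- string type: taking the first character off, and putting one on.
class Monoid s => StringLike s where
  unconsChar :: s -> Maybe (Char, s)
  consChar   :: Char -> s -> s

-- Today's String.
instance StringLike [Char] where
  unconsChar []     = Nothing
  unconsChar (c:cs) = Just (c, cs)
  consChar          = (:)

-- Tomorrow's String.
instance StringLike Text where
  unconsChar = T.uncons
  consChar   = T.cons

-- Code written against the class works unchanged with either representation.
capitalize :: StringLike s => s -> s
capitalize s = case unconsChar s of
  Nothing        -> s
  Just (c, rest) -> consChar (toUpper c) rest
```

A function like `capitalize` would keep compiling whether String means [Char] or Text, which is the property the migration depends on.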
Here, then, is the five-year plan you're asking for:
Year one: Agree on the ideal set of type classes to bridge the gap between [Char] and Text.
Year two: Bring the new type classes into the Prelude. Have all relevant types instantiate them. Everybody's updating their code in delight to use the new class methods.
Year three: GHC issues warnings about using List-specific [], ++, null, take, drop, span, etc, on String. Everybody's furiously updating their code.
Year four: Add Text to the core libraries. The GHC magic to make the Text <-> [Char] conversions implicit is implemented and ready for testing but requires a pragma.
Year five: Update Haskell language report. Flip the switch.
So there. How feasible does that sound?

Carter, that is a very good suggestion! I imagine that a combination of PatternSynonyms and OverloadedLists could be used to completely abstract the list notation. This would have to happen to *all* Haskell source code, though.
Right now we have the following elements of list syntax:
- Types, e.g. [Char]
- Construction via literals, [1, 2, 3]
- Construction via cons, 1 : 2 : []
- Enumerations, [1..3], [1..], [1,3..]
- Pattern matching, let (x:xs) = [1..]
These are currently somewhat handled by
- OverloadedLists: Construction via literals
- Enum typeclass: Enumerations
We could handle the others partially, by allowing PatternSynonyms to replace (:) somehow, both for construction and pattern matching. The [Char] usage does not need to be replaced. Then String could be changed to something else easily.
Looking at it this way, it makes the proposal seem much more doable. These could be bundled into a single extension -XPackedString, or something like that. It would be interesting to formulate this into a full-fledged proposal, if only as an exploratory venture (I certainly do not have the time to follow through on this myself).
-- Andrew
On Fri, May 22, 2015 at 11:57 PM, Carter Schonwald < carter.schonwald@gmail.com> wrote:
one direction that this thread has *COMPLETELY* overlooked is the following:
could we use pattern synonyms or something along those lines to make the migration to Text or the like more seamless?
participants (8)
- Andrew Gibiansky
- Carter Schonwald
- Clinton Mead
- Joachim Breitner
- Mario Blažević
- Mateusz Kowalczyk
- Michal Antkiewicz
- Mike Meyer