Proposal: unsnoc, unsafeInit and unsafeLast for the bytestring library

Hi. I propose to add the missing counterparts of `uncons` and `unsafeHead/Tail` functions to the bytestring library. Discussion deadline: 2 weeks The patch is attached.

On Mon, Apr 2, 2012 at 12:42 PM, Mikhail Vorozhtsov
Hi.
I propose to add the missing counterparts of `uncons` and `unsafeHead/Tail` functions to the bytestring library.
Discussion deadline: 2 weeks
The patch is attached.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
+1 on unsnoc in particular, I know I've been surprised a few times that it's not there. I haven't specifically missed the other two, but I agree that they should be added. Michael

Hi Mikhail,
I've never used these functions in my code. So I cannot judge their
usefulness. Having inspected your patch, I see the following points
for optimization:
1. The clarity of the code of Data.ByteString.unsnoc would profit from
using your 'unsafeInit and unsafeLast' functions.
2. It might well be that some other places could profit from your
'unsafeInit' and 'unsafeLast' functions; e.g., the 'init' and 'last'
functions. It would be great if you also updated other related
definitions.
3. The implementation of Data.ByteString.Lazy.unsnoc is too strict in
my opinion. I would expect it to return 'Just
<combined-init-and-last-computation>', as soon as it has proven that
the ByteString is non-empty. To avoid unnecessary code duplication, it
might make sense to inline only the wrapper and let GHC decide what it
wants to do for the actual worker.
best regards,
Simon
2012/4/2 Mikhail Vorozhtsov
Hi.
I propose to add the missing counterparts of `uncons` and `unsafeHead/Tail` functions to the bytestring library.
Discussion deadline: 2 weeks
The patch is attached.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On 3 April 2012 13:35, Simon Meier
Hi Mikhail,
I've never used these functions in my code. So I cannot judge their usefulness. Having inspected your patch, I see the following points for optimization:
3. The implementation of Data.ByteString.Lazy.unsnoc is too strict in my opinion. I would expect it to return 'Just <combined-init-and-last-computation>', as soon as it has proven that the ByteString is non-empty. To avoid unnecessary code duplication, it might make sense to inline only the wrapper and let GHC decide what it wants to do for the actual worker.
Actually, I think it doesn't make sense to provdide unsnoc for lazy bytestring at all. It's a bit of a silly operation for lazy bytestrings for the same reason as for lists. It's reasonable for strict bytestrings because you have the same access to the end of the string as the beginning, but the same is not true of lazy bytestrings. Duncan

2012/4/3 Duncan Coutts
On 3 April 2012 13:35, Simon Meier
wrote: Hi Mikhail,
I've never used these functions in my code. So I cannot judge their usefulness. Having inspected your patch, I see the following points for optimization:
3. The implementation of Data.ByteString.Lazy.unsnoc is too strict in my opinion. I would expect it to return 'Just <combined-init-and-last-computation>', as soon as it has proven that the ByteString is non-empty. To avoid unnecessary code duplication, it might make sense to inline only the wrapper and let GHC decide what it wants to do for the actual worker.
Actually, I think it doesn't make sense to provdide unsnoc for lazy bytestring at all. It's a bit of a silly operation for lazy bytestrings for the same reason as for lists. It's reasonable for strict bytestrings because you have the same access to the end of the string as the beginning, but the same is not true of lazy bytestrings.
I agree. I don't see a usecase for unsnoc on lazy bytestrings in production code. It nevertheless might make sense to provide it for one-off scripting code and for interface consistency. The necessarily bad implementation shows in the runtime given in the comment. Hence, I'd marginally vote for inclusion of the lazy bytestring unsnoc. @mikkhail: Note that at least the runtime for Data.ByteString.Lazy.Char8 is wrong. It should be O(n/c) instead of O(1). best regards, Simon

On Tue, Apr 3, 2012 at 2:50 PM, Simon Meier
2012/4/3 Duncan Coutts
: Actually, I think it doesn't make sense to provdide unsnoc for lazy bytestring at all. It's a bit of a silly operation for lazy bytestrings for the same reason as for lists. It's reasonable for strict bytestrings because you have the same access to the end of the string as the beginning, but the same is not true of lazy bytestrings.
I agree. I don't see a usecase for unsnoc on lazy bytestrings in production code.
It nevertheless might make sense to provide it for one-off scripting code and for interface consistency. The necessarily bad implementation shows in the runtime given in the comment. Hence, I'd marginally vote for inclusion of the lazy bytestring unsnoc.
Agreed on the interface consistency comment; also it's worth remembering that what type of bytestring you use is sometimes the choice of the library you want to work with, rather than yours :) An explicit warning in the docs, maybe, but keep it in there nonetheless. yrs, Ben

On 04/03/2012 08:50 PM, Simon Meier wrote:
2012/4/3 Duncan Coutts
: [snip] @mikkhail: Note that at least the runtime for Data.ByteString.Lazy.Char8 is wrong. It should be O(n/c) instead of O(1). Fixed.

On Tue, 2012-04-03 at 21:48 +0700, Mikhail Vorozhtsov wrote:
On 04/03/2012 08:50 PM, Simon Meier wrote:
2012/4/3 Duncan Coutts
: [snip] @mikkhail: Note that at least the runtime for Data.ByteString.Lazy.Char8 is wrong. It should be O(n/c) instead of O(1). Fixed.
Ok, I've applied this but with a simpler implementation of unsnoc for lazy bytestrings. The proposed implementation tries to work using a single traversal of the list of chunks, but it doesn't actually achieve this since it builds a chain of closures that's equivalent to the list of chunks. Doing it by accumulating and reversing would be equivalent. Given that we cannot actually avoid making two traversals I've gone for the simplest implementation: just call init and last. Duncan

On 3 April 2012 13:35, Simon Meier
wrote: Hi Mikhail,
I've never used these functions in my code. So I cannot judge their usefulness. Having inspected your patch, I see the following points for optimization:
3. The implementation of Data.ByteString.Lazy.unsnoc is too strict in my opinion. I would expect it to return 'Just <combined-init-and-last-computation>', as soon as it has proven that the ByteString is non-empty. To avoid unnecessary code duplication, it might make sense to inline only the wrapper and let GHC decide what it wants to do for the actual worker.
Actually, I think it doesn't make sense to provdide unsnoc for lazy bytestring at all. It's a bit of a silly operation for lazy bytestrings for the same reason as for lists. It's reasonable for strict bytestrings because you have the same access to the end of the string as the beginning, but the same is not true of lazy bytestrings. Hm, why `(,) <$> init <*> last` makes less sense than `init` and `last`
On 04/03/2012 08:00 PM, Duncan Coutts wrote: themselves?

On 04/03/2012 07:35 PM, Simon Meier wrote:
Hi Mikhail,
I've never used these functions in my code. So I cannot judge their usefulness. Having inspected your patch, I see the following points for optimization:
1. The clarity of the code of Data.ByteString.unsnoc would profit from using your 'unsafeInit and unsafeLast' functions. The same can be said about `uncons`. `unsnoc` is pretty much a product of copy-and-paste.
2. It might well be that some other places could profit from your 'unsafeInit' and 'unsafeLast' functions; e.g., the 'init' and 'last' functions. It would be great if you also updated other related definitions. Good point. I updated the patch.
3. The implementation of Data.ByteString.Lazy.unsnoc is too strict in my opinion. I would expect it to return 'Just <combined-init-and-last-computation>', as soon as it has proven that the ByteString is non-empty. To avoid unnecessary code duplication, it might make sense to inline only the wrapper and let GHC decide what it wants to do for the actual worker. Done. Thanks for the review!

Hi Duncan. On 04/02/2012 04:42 PM, Mikhail Vorozhtsov wrote:
Hi.
I propose to add the missing counterparts of `uncons` and `unsafeHead/Tail` functions to the bytestring library.
Discussion deadline: 2 weeks
The deadline has passed. The responses are positive, with the exception of your comment on lazy `uncons` (2:1 votes in favor). I attached the last version of the patch.
participants (5)
-
Ben Millwood
-
Duncan Coutts
-
Michael Snoyman
-
Mikhail Vorozhtsov
-
Simon Meier