
Would this be an appropriate place to propose adding mapM_ (and then possibly mapM) to bytestring library? Was it suggested before? If yes, why was it rejected?

Artyom Kazak wrote:
Would this be an appropriate place to propose adding mapM_ (and then possibly mapM) to bytestring library?
Err, mapM is defined as:

    mapM :: Monad m => (a -> m b) -> [a] -> m [b]

in that the second parameter is a list. ByteStrings are not lists, they're chunks of bytes. Besides, you can just do:

    let bs = "This is a bytestring"
    mapM someFunctions $ BS.unpack bs

or define a function:

    mapBSM :: Monad m => (Char -> m a) -> ByteString -> m [a]
    mapBSM f bs = mapM f $ BS.unpack bs

and similar for mapBSM_.

Erik

--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

On Sun, 01 Sep 2013 15:18:32 +0400, Erik de Castro Lopo
Err, mapM is defined as:
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
Yeah, the signatures for mapM_ and mapM would respectively be

    mapM_ :: Monad m => (Word8 -> m b) -> ByteString -> m ()
    mapM  :: Monad m => (Word8 -> m Word8) -> ByteString -> m ByteString

After all, the signature for ByteString-y `map` too requires a function of type `Word8 -> Word8`, so it seems perfectly reasonable for mapM to exist.
in that the second parameter is a list. ByteStrings are not lists, they're chunks of bytes. Besides, you can just do:
    let bs = "This is a bytestring"
    mapM someFunctions $ BS.unpack bs
or define a function:
    mapBSM :: Monad m => (Char -> m a) -> ByteString -> m [a]
    mapBSM f bs = mapM f $ BS.unpack bs
and similar for mapBSM_.
There’s even a RULE which would optimise this. However, it still would be *four times slower* than necessary if the monad in question is IO or ST.

On Sun, 1 Sep 2013, Artyom Kazak wrote:
On Sun, 01 Sep 2013 15:18:32 +0400, Erik de Castro Lopo
wrote: Err, mapM is defined as:
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
Yeah, the signatures for mapM_ and mapM would respectively be
mapM_ :: Monad m => (Word8 -> m b) -> ByteString -> m ()
This one should not be a big problem, but could be done by calling mapM_ on the result of ByteString.unpack.
mapM :: Monad m => (Word8 -> m Word8) -> ByteString -> m ByteString
This one can become very inefficient. Think of m = [], then many ByteStrings have to be constructed by repeated ByteString.cons, which requires a lot of copying. For lists, "cons" aka (:) is efficient because the tails are shared.
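
For concreteness, the unpack-based versions under discussion might look roughly like this (a sketch; neither function is part of the bytestring API):

    import qualified Data.ByteString as BS
    import Data.Word (Word8)

    -- mapM_ via a plain list traversal of the unpacked bytes
    bsMapM_ :: Monad m => (Word8 -> m b) -> BS.ByteString -> m ()
    bsMapM_ f = mapM_ f . BS.unpack

    -- mapM via unpack/pack; in a non-deterministic monad such as [] every
    -- result string gets materialised, which is where the copying cost
    -- mentioned above comes from
    bsMapM :: Monad m => (Word8 -> m Word8) -> BS.ByteString -> m BS.ByteString
    bsMapM f s = do
        ws <- mapM f (BS.unpack s)
        return (BS.pack ws)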

On Sun, Sep 1, 2013 at 6:16 AM, Henning Thielemann
mapM :: Monad m => (Word8 -> m Word8) -> ByteString -> m ByteString
This one can become very inefficient. Think of m = [], then many ByteStrings have to be constructed by repeated ByteString.cons, which requires a lot of copying. For lists, "cons" aka (:) is efficient because the tails are shared.
You don't need to cons -- you can allocate one ByteString and write into it unsafely (just like map does). See http://hackage.haskell.org/packages/archive/lens/3.9.0.3/doc/html/src/Contro.... Shachaf
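
A minimal sketch of that approach, specialised to IO and written against Data.ByteString.Internal and Data.ByteString.Unsafe (not part of the bytestring API itself):

    import qualified Data.ByteString as BS
    import qualified Data.ByteString.Internal as BI
    import qualified Data.ByteString.Unsafe as BU
    import Data.Word (Word8)
    import Foreign.Storable (pokeByteOff)

    -- Allocate the result once and fill it in place, byte by byte,
    -- in the spirit of Data.ByteString.map.
    mapMIO :: (Word8 -> IO Word8) -> BS.ByteString -> IO BS.ByteString
    mapMIO f s = BI.create len (\dst -> go dst 0)
      where
        len = BS.length s
        go dst i
            | i >= len  = return ()
            | otherwise = do
                b <- f (BU.unsafeIndex s i)
                pokeByteOff dst i b
                go dst (i + 1)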

On Sun, 01 Sep 2013 17:16:49 +0400, Henning Thielemann
Yeah, the signatures for mapM_ and mapM would respectively be
mapM_ :: Monad m => (Word8 -> m b) -> ByteString -> m ()
This one should not be a big problem, but could be done by calling mapM_ on the result of ByteString.unpack.
As I’ve said, mapM_ through ByteString.unpack is four times slower than the hand-written version. I find it unacceptable that a simple counting sort *can’t* be written efficiently for ByteString without importing Data.ByteString.Unsafe, Foreign.Ptr and friends.
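
For illustration, here is roughly the counting sort in question, written against a hypothetical bsMapM_ primitive (passed in as an argument, since bytestring exports no such function):

    import qualified Data.ByteString as BS
    import Data.Word (Word8)
    import Data.Array.IO (IOUArray, newArray, readArray, writeArray, getElems)

    -- Counting sort over a strict ByteString. The histogram loop over the
    -- input is the part that wants a mapM_-like primitive; the rest is cheap.
    countingSort :: ((Word8 -> IO ()) -> BS.ByteString -> IO ())  -- the hypothetical bsMapM_
                 -> BS.ByteString -> IO BS.ByteString
    countingSort bsMapM_ s = do
        counts <- newArray (0, 255) 0 :: IO (IOUArray Word8 Int)
        bsMapM_ (\b -> readArray counts b >>= writeArray counts b . (+ 1)) s
        cs <- getElems counts
        return $ BS.concat [BS.replicate n b | (b, n) <- zip [0 ..] cs]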

On Sun, 1 Sep 2013, Artyom Kazak wrote:
On Sun, 01 Sep 2013 17:16:49 +0400, Henning Thielemann
wrote: Yeah, the signatures for mapM_ and mapM would respectively be
mapM_ :: Monad m => (Word8 -> m b) -> ByteString -> m ()
This one should not be a big problem, but could be done by calling mapM_ on the result of ByteString.unpack.
As I’ve said, mapM_ through ByteString.unpack is four times slower than the hand-written version. I find it unacceptable that a simple counting sort *can’t* be written efficiently for ByteString without importing Data.ByteString.Unsafe, Foreign.Ptr and friends.
A possible solution might be fusion rules for ByteString.unpack and mapM_.
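
Roughly, such a rule might look like the sketch below. Note the caveat Artyom raises next: the right-hand side still needs a hand-written loop (here a simple uncons-based one), and rules that match an overloaded function like Prelude's mapM_ are fragile in practice, because the method tends to be inlined or specialised before the rule can fire.

    import qualified Data.ByteString as B
    import Data.Word (Word8)

    -- A hypothetical hand-written loop; not part of bytestring's API.
    bsMapM_ :: Monad m => (Word8 -> m ()) -> B.ByteString -> m ()
    bsMapM_ f = go
      where
        go s = case B.uncons s of
                 Nothing      -> return ()
                 Just (w, s') -> f w >> go s'

    {-# RULES
    "mapM_/B.unpack" forall f s. mapM_ f (B.unpack s) = bsMapM_ f s
      #-}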

On Sun, 01 Sep 2013 22:55:10 +0400, Henning Thielemann
As I’ve said, mapM_ through ByteString.unpack is four times slower than the hand-written version. I find it unacceptable that a simple counting sort *can’t* be written efficiently for ByteString without importing Data.ByteString.Unsafe, Foreign.Ptr and friends.
A possible solution might be fusion rules for ByteString.unpack and mapM_.
Except that such rules would require a hand-written version of mapM_ anyway. I agree that it can be written, it’s just that I don’t see why it should be in some obscure place instead of Data.ByteString.

On Sun, 1 Sep 2013, Artyom Kazak wrote:
On Sun, 01 Sep 2013 22:55:10 +0400, Henning Thielemann
wrote: A possible solution might be fusion rules for ByteString.unpack and mapM_.
Except that such rules would require a hand-written version of mapM_ anyway.
I agree that it can be written, it’s just that I don’t see why it should be in some obscure place instead of Data.ByteString.
This rule should of course be part of Data.ByteString. RULES are similar to class instances, and like orphan instances, orphan rules are a bad idea.

On Sun, 01 Sep 2013 23:27:20 +0400, Henning Thielemann
On Sun, 1 Sep 2013, Artyom Kazak wrote:
On Sun, 01 Sep 2013 22:55:10 +0400, Henning Thielemann
wrote: A possible solution might be fusion rules for ByteString.unpack and mapM_.
Except that such rules would require a hand-written version of mapM_ anyway.
I agree that it can be written, it’s just that I don’t see why it should be in some obscure place instead of Data.ByteString.
This rule should of course be part of Data.ByteString. RULES are similar to class instances and like orphan instances, orphan rules are a bad idea.
It still doesn’t solve the problem quite the way adding mapM_ does. Rules aren’t documented anywhere; why would the programmer expect `mapM_ . unpack` to fuse? Moreover, it violates the existing structure of the bytestring package. For instance, functions like `last` and `maximum` could be implemented as rules in the same way, but they’re included in Data.ByteString (and for good reason, I’d say).

A simpler subproblem is that there's no ByteString.*.Unsafe.Internal module that
lets bytestring library clients (at their own risk) add new fusion rules.
Isn't this "private fusion framework" situation, where library clients can't
add new things that fuse well, a recurrent problem that makes it difficult for
downstream users to have their own performant extensions?

On Sun, Sep 1, 2013 at 3:58 AM, Artyom Kazak
Would this be an appropriate place to propose adding mapM_ (and then possibly mapM) to bytestring library?
Was it suggested before? If yes, why was it rejected?
For what it's worth, this is exported by lens in Data.ByteString.Lens.bytes (or Data.ByteString.{Lazy,Strict}.Lens.bytes). The version exported by lens by default traverses each chunk in a balanced tree shape rather than left-to-right (which is probably not relevant to most use cases, actually -- the result is the same). It also works around http://ghc.haskell.org/trac/ghc/ticket/7556, which you'll want to watch for if you're writing this yourself. Note that you can construct mapM_ from mapM as discussed in another libraries@ thread recently (or by using e.g. mapMOf_). Shachaf
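
For reference, a small usage sketch of that traversal (assuming the lens package; traverseOf_ is re-exported by Control.Lens):

    import Control.Lens (traverseOf_)
    import Data.ByteString.Lens (bytes)
    import qualified Data.ByteString as BS

    -- A mapM_-style loop over the bytes of a strict ByteString,
    -- using the 'bytes' traversal from lens.
    printBytes :: BS.ByteString -> IO ()
    printBytes = traverseOf_ bytes print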

Personally, I would support its inclusion.
It is an annoying repetition -- but then so is the entire bytestring API --
so there is no real point in just randomly omitting this one combinator.
-Edward
On Sun, Sep 1, 2013 at 6:58 AM, Artyom Kazak
Would this be an appropriate place to propose adding mapM_ (and then possibly mapM) to bytestring library?
Was it suggested before? If yes, why was it rejected?

So, I have written several implementations of mapM_:

  * bsMapM_gen — generic, works for any monad
  * bsMapM_short — essentially (\f s -> mapM_ f $ unpack s)
  * bsMapM_IO — hand-written version specifically for IO

Generic and hand-written versions don’t differ much. The overhead seems to be coming from inlinePerformIO (am I right here? Also, am I using inlinePerformIO legitimately?), which is only needed when we’re not in the IO monad.

    {-# SPECIALISE ... IO #-}
    {-# SPECIALISE ... ST #-}
    bsMapM_gen :: Monad m => (Word8 -> m a) -> ByteString -> m ()
    bsMapM_gen f s = unsafePerformIO $ unsafeUseAsCStringLen s mapp
      where
        mapp (ptr, len) = return $ go 0
          where
            go i | i == len  = return ()
                 | otherwise = let !b = inlinePerformIO $ peekByteOff ptr i
                               in f b >> go (i+1)

The short version relies on fusion of `unpack` and `mapM_`. Its advantage is that even when compiled without optimisations, it’s still fast. (Question: would the same happen to other versions, when put into Data.ByteString module? I suppose packages like bytestring are compiled with optimisations, so it probably would.)

    {-# SPECIALISE ... IO #-}
    {-# SPECIALISE ... ST #-}
    bsMapM_shortIO :: (Word8 -> IO a) -> ByteString -> IO ()
    bsMapM_shortIO f s = mapM_ f (unpack s)

Finally, the IO-specialised version. It’s faster than generic version (and, similarly, an ST-specialised version using unsafeIOToST would be just as fast), so I assume a SPECIALISE pragma involving bsMapM_IO and bsMapM_ST should be present. (Question: are there other monads for which unsafeIOToMonad exists?)

    bsMapM_IO :: (Word8 -> IO a) -> ByteString -> IO ()
    bsMapM_IO f s = unsafeUseAsCStringLen s mapp
      where
        mapp (ptr, len) = go 0
          where
            go i | i == len  = return ()
                 | otherwise = peekByteOff ptr i >>= f >> go (i+1)

A-and here’s a table comparing performance of all three functions. All timings are in milliseconds.

             ghci       ghc       ghc -O    ghc -O2
          +----------+----------+----------+----------+
    gen   |   380    |    85    |    4.1   |    4.0   |
    short |    45    |    46    |   17.2   |   16.5   |
    IO    |   434    |    92    |    2.4   |    2.4   |
          +----------+----------+----------+----------+

Here’s the code I used. (Question: have I messed up anything?)

    import qualified Data.ByteString as BS
    import Data.Random
    import System.Random
    import System.IO.Unsafe
    import Control.Monad
    import Data.IORef
    import Criterion.Main
    import BSMaps

    --a bytestring consisting of 65536 random bytes
    testCase = BS.pack $ fst $
               flip sampleState (mkStdGen 8) $
               replicateM (2^16) stdUniform

    --sums elements of a bytestring, using given mapM_
    sumIO :: ((Word8 -> IO ()) -> BS.ByteString -> IO ()) -> BS.ByteString -> Word8
    sumIO f s = unsafePerformIO $ do
        sm <- newIORef 0
        f (modifyIORef' sm . (+)) s
        readIORef sm

    --runs the tests
    main = defaultMain [
        bench "IO"    $ whnf (sumIO bsMapM_IO)    testCase,
        bench "short" $ whnf (sumIO bsMapM_short) testCase,
        bench "gen"   $ whnf (sumIO bsMapM_gen)   testCase]

Finally, if there isn’t anything wrong, what are my next steps to see this included into next version of bytestring?

The maintainers for bytestring are still listed as Don Stewart and Duncan
Coutts on the package, and it doesn't seem to fall under the list of core
packages per http://www.haskell.org/haskellwiki/Library_submissions, so I
suppose it would come down to talking one of them into taking the patch.
It seems odd that a fundamental package like this is omitted from the
Library_submissions page though, as the older
http://trac.haskell.org/haskell-platform/wiki/PackageMaintainers page on
the trac shows it as maintained by GHC Central.
-Edward

I think the important issue here is whether we want to add monadic
versions of functions in bytestring (where it makes sense)? It would
yield better performance than using `unpack`, but at the cost of lots
of code duplication.
We have the same issue in containers where you could for example want
an `updateM` function, that would let you decide whether to update a
value by performing some side effect. You could simulate this using a
combination of `lookup` and `insert`, but that requires twice the
number of traversals of the data structure.
Right now I'm of the mind that the extra traversals (and using unpack
in the case of ByteString) is better than the code duplication.
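
For concreteness, the lookup-plus-insert simulation described above might look like this for Data.Map (updateM itself remains hypothetical):

    import qualified Data.Map as M

    -- Simulate a monadic update with a lookup followed by an insert or
    -- delete: at least two traversals of the map instead of one.
    updateViaLookupInsert :: (Monad m, Ord k)
                          => (v -> m (Maybe v)) -> k -> M.Map k v -> m (M.Map k v)
    updateViaLookupInsert f k m =
        case M.lookup k m of
            Nothing -> return m
            Just v  -> do
                mv <- f v
                return $ case mv of
                    Nothing -> M.delete k m
                    Just v' -> M.insert k v' m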

I personally find myself on the other side of the divide, for two reasons,
especially if someone else is willing to do the work to write and test the
method.
1.) The bytestring API is already fairly noisy in that it really needs to
be imported qualified. Finding such a common function *missing* from
an API that is so all-inclusive is pretty jarring from a user perspective,
so it would seem to follow the principle of least surprise to include it if
possible.
2.) Bytestring was built from the ground up for speed. A factor of 4-8x
speed difference even before dipping into inlinePerformIO tricks isn't
negligible here.
-Edward
On Thu, Sep 5, 2013 at 7:07 PM, Johan Tibell
I think the important issue here is whether we want to add monadic versions of functions in bytestring (where it makes sense)? It would yield better performance than using `unpack`, but at the cost of lots of code duplication.
We have the same issue in containers where you could for example want an `updateM` function, that would let you decide whether to update a value by performing some side effect. You could simulate this using a combination of `lookup` and `insert`, but that requires twice the number of traversals of the data structure.
Right now I'm of the mind that the extra traversals (and using unpack in the case of ByteString) is better than the code duplication.

I can see that argument. I did the opposite in unordered-containers,
which has many near-identical traversals in the name of speed. I must
admit it's a pain to maintain. Changes typically have to happen in at
least 2 places and the code feels cut-n-pasty.
On Thu, Sep 5, 2013 at 4:36 PM, Edward Kmett
I personally find myself on the other side of the divide, especially if someone else is willing to do the work to write and test the method for two reasons.
1.) The bytestring API is already fairly noisy in that it really needs to be imported qualified already. Finding such a common function missing from an API that is so all-inclusive is pretty jarring from a user perspective, so it would seem to follow the principle of least surprise to include it if possible.
2.) Bytestring was built from the ground up for speed. A factor of 4-8x speed difference even before dipping into inlinePerformIO tricks isn't negligible here.
-Edward

On Fri, 06 Sep 2013 03:07:23 +0400, Johan Tibell
I think the important issue here is whether we want to add monadic versions of functions in bytestring (where it makes sense)? It would yield better performance than using `unpack`, but at the cost of lots of code duplication.
We have the same issue in containers where you could for example want an `updateM` function, that would let you decide whether to update a value by performing some side effect. You could simulate this using a combination of `lookup` and `insert`, but that requires twice the number of traversals of the data structure.
Right now I'm of the mind that the extra traversals (and using unpack in the case of ByteString) is better than the code duplication.
This code already is in bytestring — as part of `sort`. So no code would actually be duplicated, just moved. (And we'd get sortBy practically for free!)

Then again, bytestring already *is* code duplication: raw loops everywhere; the same raw loops with minor changes; peekByteOff and pokeByteOff; calls into C for 30% speed improvements (with the loop-y Haskell versions left commented out nearby); and much more. I find myself wondering whether, by carefully choosing a set of basic primitives, the code of bytestring could be made ten times shorter.

On Thu, 2013-09-05 at 18:51 -0400, Edward Kmett wrote:
The maintainers for bytestring are still listed as Don Stewart and Duncan Coutts on the package, and it doesn't seem to fall the list of core packages per http://www.haskell.org/haskellwiki/Library_submissions so I suppose it would come down to talking one of them into taking the patch.
Yes, I'm happy to consider pull requests.

https://github.com/haskell/bytestring/

I should say that I start off sceptical about any additions to this API. As Raymond Chen says, each new feature starts off with -100 points.

For mapM etc, personally I think a better solution would be if ByteString and Text and other specialised containers could be an instance of Foldable/Traversable. Those classes define mapM etc but currently they only work for containers that are polymorphic in their elements, so all specialised containers are excluded. I'm sure there must be a solution to that (I'd guess with type families) and that would be much nicer than adding mapM etc to bytestring itself. We would then just provide efficient instances for Foldable/Traversable.

Duncan

On Wed, 2013-09-11 at 14:35 +0100, Duncan Coutts wrote:
For mapM etc, personally I think a better solution would be if ByteString and Text and other specialised containers could be an instance of Foldable/Traversable.
Yes!
Those classes define mapM etc but currently they only work for containers that are polymorphic in their elements, so all specialised containers are excluded.
Indeed :-( Hence my question a couple of days ago.
I'm sure there must be a solution to that (I'd guess with type families) and that would be much nicer than adding mapM etc to bytestring itself. We would then just provide efficient instances for Foldable/Traversable.
I looked into this when this thread started, but bumped into at least one 'issue': when defining (OTOH) something like

    import qualified Data.ByteString as BS
    import qualified Data.ByteString.Char8 as BS8

    class Foldable t where
        type Elem t :: *
        foldr :: (Elem t -> b -> b) -> b -> t -> b

    instance Foldable BS.ByteString where
        type Elem BS.ByteString = Word8
        foldr = BS.foldr

    instance Foldable BS8.ByteString where
        type Elem BS8.ByteString = Char
        foldr = BS8.foldr

which fails because BS.ByteString and BS8.ByteString are the same type.

Nicolas

On Wed, 2013-09-11 at 15:56 +0200, Nicolas Trangez wrote:
On Wed, 2013-09-11 at 14:35 +0100, Duncan Coutts wrote:
For mapM etc, personally I think a better solution would be if ByteString and Text and other specialised containers could be an instance of Foldable/Traversable.
Yes!
Those classes define mapM etc but currently they only work for containers that are polymorphic in their elements, so all specialised containers are excluded.
Indeed :-( Hence my question a couple of days ago.
I'm sure there must be a solution to that (I'd guess with type families) and that would be much nicer than adding mapM etc to bytestring itself. We would then just provide efficient instances for Foldable/Traversable.
I looked into this when this thread started, but bumped into at least one 'issue': when defining (OTOH) something like

    import qualified Data.ByteString as BS
    import qualified Data.ByteString.Char8 as BS8

    class Foldable t where
        type Elem t :: *
        foldr :: (Elem t -> b -> b) -> b -> t -> b

    instance Foldable BS.ByteString where
        type Elem BS.ByteString = Word8
        foldr = BS.foldr

    instance Foldable BS8.ByteString where
        type Elem BS8.ByteString = Char
        foldr = BS8.foldr

which fails because BS.ByteString and BS8.ByteString are the same type.
Right. I fear there's very little one could do about that. We do have one single type that is a specialised container for two types. That's fairly unusual. I'd probably just use Word8, especially since the Char stuff is a bit iffy: it's really "Char8", not a full Char.

Duncan

On Wed, Sep 11, 2013 at 7:15 AM, Duncan Coutts wrote:

Right. I fear there's very little one could do about that. We do have one single type that is a specialised container for two types. That's fairly unusual. I'd probably just use Word8, especially since the Char stuff is a bit iffy: it's really "Char8", not a full Char.

It happens in containers and unordered-containers as well, and I suspect we will see it any time we want both a lazy and a strict (in the container sense) version of a data type.

-- Johan

We can't just upgrade those classes to be type-family based, as this
destroys their support for polymorphic recursion.

We could write another class, though.

The question is whether that class belongs in base or in some other package,
as it takes us pretty far afield of anything purporting to be portable.

On 09/11/13 10:28, Edward Kmett wrote:
We can't just upgrade those classes to be type family based, as this destroys their support for polymorphic recursion.
We could write another class though.
The question is whether that class belongs in base or in some other package, as it takes us pretty far afield of anything purporting to be portable.
Not necessarily. If you get rid of the item type, the resulting class is standard Haskell:

    class Monoid m => FactorialMonoid m where
        foldMap :: Monoid n => (m -> n) -> m -> n

With this approach you're folding over the atomic factors of the ByteString, i.e., ByteStrings containing one byte each, rather than over the bytes themselves. The same interface covers strict and lazy ByteString, and ByteString.Char8 as well.

Yes, abstracting away the item type would have some performance impact, as foldMap would need to construct the singleton ByteStrings. I'm hoping they could be deforested away.
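
As a rough illustration of that last point (a sketch, not the real class), a foldMap over the one-byte factors of a strict ByteString could be written as follows; the BS.singleton calls are exactly the construction cost mentioned above:

    import qualified Data.ByteString as BS
    import Data.Monoid (Monoid, mappend, mempty)

    -- Fold over the one-byte factors of a strict ByteString.
    foldMapBytes :: Monoid n => (BS.ByteString -> n) -> BS.ByteString -> n
    foldMapBytes f = BS.foldr (\w acc -> f (BS.singleton w) `mappend` acc) mempty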

On Wed, 11 Sep 2013, Duncan Coutts wrote:
For mapM etc, personally I think a better solution would be if ByteString and Text and other specialised containers could be an instance of Foldable/Traversable. Those classes define mapM etc but currently they only work for containers that are polymorphic in their elements, so all specialised containers are excluded. I'm sure there must be a solution to that (I'd guess with type families) and that would be much nicer than adding mapM etc to bytestring itself. We would then just provide efficient instances for Foldable/Traversable.
I'd prefer to keep bytestring simple with respect to the number of type extensions. Since you must implement ByteString.mapM anyway, you can plug this into an instance definition of Traversable ByteString.

mapM_ is actually implemented in terms of Foldable, not Traversable, and its implementation in terms of folding a ByteString is, in my experience doing so inside lens, actually rather slow, and isn't much faster than the naive version that was suggested at the start of this discussion.

But as we're not monomorphizing Foldable/Traversable, this isn't a thing that is able to happen anyway.

-Edward

On Wed, Sep 11, 2013 at 2:25 PM, Henning Thielemann <lemming@henning-thielemann.de> wrote:
On Wed, 11 Sep 2013, Duncan Coutts wrote:
For mapM etc, personally I think a better solution would be if
ByteString and Text and other specialised containers could be an instance of Foldable/Traversable. Those classes define mapM etc but currently they only work for containers that are polymorphic in their elements, so all specialised containers are excluded. I'm sure there must be a solution to that (I'd guess with type families) and that would be much nicer than adding mapM etc to bytestring itself. We would then just provide efficient instances for Foldable/Traversable.
I'd prefer to keep bytestring simple with respect to the number of type extensions. Since you must implement ByteString.mapM anyway, you can plug this into an instance definition of Traversable ByteString.

I agree with everything Edward has said already. I went through a similar
chain of reasoning a few years ago when I started using ListLike, which
provides a FoldableLL class (although it uses fundeps as ListLike predates
type families). ByteString can't be a Foldable instance, nor do I think
most people would want it to be.
Even though I would also like to see mapM_ in bytestring, it's probably
faster to have a library with a separate monomorphic Foldable class. So I
just wrote one:
https://github.com/JohnLato/mono-foldable
http://hackage.haskell.org/package/mono-foldable
Petr Pudlak has done some work in this area. A big problem is that
foldM/mapM_ are typically implemented in terms of Foldable.foldr (or
FoldableLL), but this isn't always optimal for performance. They really
need to be part of the type class so that different container types can
have specialized implementations. I did that in mono-foldable, using
Artyom's map implementation (Artyom, please let me know if you object to
this!)
pull requests, forks, etc all welcome.
John L.
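
A minimal sketch of that shape of class, with mapM_ as a class method so that instances can give a specialised definition (names here are illustrative, not mono-foldable's actual API):

    {-# LANGUAGE TypeFamilies #-}

    import qualified Data.ByteString as BS
    import Data.Word (Word8)

    class MonoFoldable t where
        type Elem t :: *
        ofoldr :: (Elem t -> b -> b) -> b -> t -> b

        -- part of the class so instances can override it with a faster loop
        omapM_ :: Monad m => (Elem t -> m ()) -> t -> m ()
        omapM_ f = ofoldr (\x rest -> f x >> rest) (return ())

    instance MonoFoldable BS.ByteString where
        type Elem BS.ByteString = Word8
        ofoldr = BS.foldr
        -- omapM_ could be overridden here with a pointer-based loop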

Actually you can implement them in terms of foldMap, and they suffer
somewhat less and become slightly more defined for infinite traversals, but
there is still some suffering to be had performance-wise.
Snoyman has also been pursuing a monomorphic Foldable class, and there is
one (more or less) in both my monoids and reducers packages, so this is an
oft reinvented idea. ;)
On Wed, Sep 11, 2013 at 4:05 PM, John Lato
I agree with everything Edward has said already. I went through a similar chain of reasoning a few years ago when I started using ListLike, which provides a FoldableLL class (although it uses fundeps as ListLike predates type families). ByteString can't be a Foldable instance, nor do I think most people would want it to be.
Even though I would also like to see mapM_ in bytestring, it's probably faster to have a library with a separate monomorphic Foldable class. So I just wrote one:
https://github.com/JohnLato/mono-foldable http://hackage.haskell.org/package/mono-foldable
Petr Pudlak has done some work in this area. A big problem is that foldM/mapM_ are typically implemented in terms of Foldable.foldr (or FoldableLL), but this isn't always optimal for performance. They really need to be part of the type class so that different container types can have specialized implementations. I did that in mono-foldable, using Artyom's map implementation (Artyom, please let me know if you object to this!)
pull requests, forks, etc all welcome.
John L.
On Wed, Sep 11, 2013 at 1:29 PM, Edward Kmett
wrote: mapM_ is actually implemented in terms of Foldable, not Traversable, and its implementation in terms of folding a ByteString is actually rather slow in my experience doing so inside lens and isn't much faster than the naive version that was suggested at the start of this discussion.
But as we're not monomorphizing Foldable/Traversable, this isn't a think that is able to happen anyways.
-Edward
On Wed, Sep 11, 2013 at 2:25 PM, Henning Thielemann < lemming@henning-thielemann.de> wrote:
On Wed, 11 Sep 2013, Duncan Coutts wrote:
For mapM etc, personally I think a better solution would be if
ByteString and Text and other specialised containers could be an instance of Foldable/Traversable. Those classes define mapM etc but currently they only work for containers that are polymorphic in their elements, so all specialised containers are excluded. I'm sure there must be a solution to that (I'd guess with type families) and that would be much nicer than adding mapM etc to bytestring itself. We would then just provide efficient instances for Foldable/Traversable.
I'd prefer to keep bytestring simple with respect to the number of type extensions. Since you must implement ByteString.mapM anyway, you can plug this into an instance definition of Traversable ByteString.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

| The overhead
| seems to be coming from inlinePerformIO (am I right here? Also, am

Why do you need all this 'unsafe' stuff? I think because bytestrings work internally using side effects. But why can't you use runST?

If for some reason you really need IO, it's presumably not because you are really doing I/O. It's a kind of specialised application. The overheads from unsafePerformIO and friends are partly related (I think) to the bad IO-ish things that might happen inside.

So perhaps you (plural) can look carefully at what you are really trying to do, and propose new primops or whatever to support it at low cost.

Don't give *me* the answers! I'm just pointing out that there is no fundamental reason for this to run slower than it would in C, and if it does it might be worth digging a bit.

Simon

ByteString itself works through a ForeignPtr rather than a ByteArray#, and
it does a lot of 'inlinePerformIO' tricks to avoid the MVar implicit in a
full unsafePerformIO, as well as to eke out some extra performance benefits
over and above unsafeDupablePerformIO that I forget the details of.

Text, which came later, on the other hand does work with ByteArray# and
MutableByteArray#s in a much more principled fashion.
There is, however, a case for keeping some ability to work with
ForeignPtr's in a ByteString style, because there are a lot of tricks that
live on the ByteString stack for doing things like mmap'ing that can't work
with the ByteArray# approach.
-Edward
On Fri, Sep 6, 2013 at 3:11 AM, Simon Peyton-Jones
| The overhead | seems to be coming from inlinePerformIO (am I right here? Also, am
Why do you need all this 'unsafe' stuff? I think because bytestrings work internally using side effects. But why can't you use runST?
If for some reason you really need IO, it's presumably not because you are really doing I/O. It's a kind of specialised application. The overheads from unsafePerformIO and friends are partly related (I think) to the bad IO-ish things that might happen inside.
So perhaps you (plural) can look carefully at what you are really trying to do, and propose new primops or whatever to support it at low cost.
Don't give *me* the answers! I'm just pointing out that there is no fundamental reason for this to run slower than it would in C, and if it does it might be worth digging a bit.
Simon
participants (12)
- Artyom Kazak
- Carter Schonwald
- Duncan Coutts
- Edward Kmett
- Erik de Castro Lopo
- Henning Thielemann
- Johan Tibell
- John Lato
- Mario Blažević
- Nicolas Trangez
- Shachaf Ben-Kiki
- Simon Peyton-Jones