
I propose to add optimized {to,from}Strict conversion functions between strict and lazy ByteStrings to the Data.ByteString.Lazy API. Discussion deadline: 2 weeks from now (12 November) = Current State = The current Data.ByteString.Lazy API doesn't provide direct conversion functions to/from single strict ByteStrings Currently, there are only `fromChunks` and `toChunks`, by which convert to/from a list of strict ByteStrings. A possible reference implementation of the missing conversion functions is: fromStrict = BL.fromChunks . (:[]) and toStrict = B.concat . BL.toChunks == The Issues == The lack of `fromStrict`/`toStrict` in the Data.ByteString.Lazy API has the following issues: - Convenience: If the single-strict-bytestring conversion is often needed, one tends to define module- or package-local helper functions for convenience/readability to perform the desired conversion. This violates the DRY principle. - Principle of least suprise: Might be confusing to users new to `Data.ByteString.Lazy` why there is no direct conversion. - Symmetry with `Data.Text.Lazy` API which does provide such single-strict-text conversion functions (`fromStrict`/`toStrict`) - Performance: The above provided "naive" `toStrict` definition has a roughly 2 to 4 times higher overhead than a manually fused version (which was kindly provided by Bas van Dijk -- whom I'd like to thank for providing me with the optimized versions of toStrict and fromStrict) -- see end of this mail for criterion benchmark code and results = Proposed Enhancement = Enhance the Data.ByteString.Lazy API by adding the following conversion functions (suggestions for improvements are highly welcome): -- see benchmark code at end of mail for the qualified imports -- |/O(n)/ Convert a strict ByteString into a lazy ByteString. fromStrict :: B.ByteString -> BL.ByteString fromStrict = flip BLI.chunk BLI.Empty -- |/O(n)/ Convert a lazy ByteString into a strict ByteString. toStrict :: BL.ByteString -> B.ByteString toStrict lb = BI.unsafeCreate len $ go lb where len = BLI.foldlChunks (\l sb -> l + B.length sb) 0 lb go BLI.Empty _ = return () go (BLI.Chunk (BI.PS fp s l) r) ptr = withForeignPtr fp $ \p -> do BI.memcpy ptr (p `plusPtr` s) (fromIntegral l) go r (ptr `plusPtr` l) == Benchmark Code & Results == ------------------------------------------------------------------------ {-# LANGUAGE OverloadedStrings #-} import Criterion import Criterion.Main import qualified Data.ByteString as B import qualified Data.ByteString.Internal as BI import qualified Data.ByteString.Lazy as BL import qualified Data.ByteString.Lazy.Internal as BLI import Foreign.ForeignPtr import Foreign.Ptr toStrict1 :: BL.ByteString -> B.ByteString toStrict1 = B.concat . BL.toChunks toStrict2 :: BL.ByteString -> B.ByteString toStrict2 lb = BI.unsafeCreate len $ go lb where len = BLI.foldlChunks (\l sb -> l + B.length sb) 0 lb go BLI.Empty _ = return () go (BLI.Chunk (BI.PS fp s l) r) ptr = withForeignPtr fp $ \p -> do BI.memcpy ptr (p `plusPtr` s) (fromIntegral l) go r (ptr `plusPtr` l) main :: IO () main = do let lbs1 = "abcdefghij" lbs2 = BL.fromChunks (replicate 10 "abcdefghij") lbs3 = BL.fromChunks (replicate 1000 "abcdefghij") -- force evaluation of lbs{1,2,3} and verify validity print $ toStrict1 lbs1 == toStrict2 lbs1 print $ toStrict1 lbs2 == toStrict2 lbs2 print $ toStrict1 lbs3 == toStrict2 lbs3 defaultMain [ bgroup "toStrict" [ bench "simple #1" $ whnf toStrict1 lbs1 , bench "simple #2" $ whnf toStrict1 lbs2 , bench "simple #3" $ whnf toStrict1 lbs3 , bench "optimized #1" $ whnf toStrict2 lbs1 , bench "optimized #2" $ whnf toStrict2 lbs2 , bench "optimized #3" $ whnf toStrict2 lbs3 ] ] {- True True True warming up estimating clock resolution... mean is 2.302557 us (320001 iterations) found 2039 outliers among 319999 samples (0.6%) 1658 (0.5%) high severe estimating cost of a clock call... mean is 54.99870 ns (14 iterations) found 1 outliers among 14 samples (7.1%) 1 (7.1%) low mild benchmarking toStrict/simple #1 mean: 28.96077 ns, lb 28.89527 ns, ub 29.01562 ns, ci 0.950 std dev: 305.8466 ps, lb 262.1008 ps, ub 345.6136 ps, ci 0.950 benchmarking toStrict/simple #2 mean: 487.0739 ns, lb 486.7939 ns, ub 487.4713 ns, ci 0.950 std dev: 1.699232 ns, lb 1.262363 ns, ub 2.457099 ns, ci 0.950 benchmarking toStrict/simple #3 mean: 55.06322 us, lb 54.91370 us, ub 55.20236 us, ci 0.950 std dev: 741.6239 ns, lb 656.3273 ns, ub 846.6403 ns, ci 0.950 benchmarking toStrict/optimized #1 mean: 48.67522 ns, lb 48.65188 ns, ub 48.70237 ns, ci 0.950 std dev: 129.3192 ps, lb 111.3761 ps, ub 165.4819 ps, ci 0.950 benchmarking toStrict/optimized #2 mean: 178.6342 ns, lb 178.5480 ns, ub 178.7276 ns, ci 0.950 std dev: 457.4436 ps, lb 409.2746 ps, ub 519.8267 ps, ci 0.950 benchmarking toStrict/optimized #3 mean: 13.01866 us, lb 13.00734 us, ub 13.03549 us, ci 0.950 std dev: 70.09916 ns, lb 52.18012 ns, ub 97.77226 ns, ci 0.950 -}