Performance best practices

Dear Cafe:

I'm getting to the point in my Haskell learning curve where I'm becoming more interested in the finer points of programming for performance, and I would be grateful for some ideas about the tools and options available for exploring this. To make this concrete, let me give a simple example. Here are two different ways of coding the same list-splitting function:

    import Data.List (inits, tails)

    splits1, splits2 :: [a] -> [([a], [a])]
    splits1 xs = zip (inits xs) (tails xs)

    splits2 []         = [([], [])]
    splits2 xs@(x:xs') = ([], xs) : [(x:as, bs) | (as, bs) <- splits2 xs']

For example, splits1 [1,2,3] is

    [([],[1,2,3]),([1],[2,3]),([1,2],[3]),([1,2,3],[])]

How do these functions compare in their actual time and space usage? It seems that, asymptotically, tails is O(n) while inits is O(n^2) in both time and space (when fully evaluated), and splits2 is also O(n^2). But how would I go about comparing them in actual usage to get a more precise comparison? What profiling or run-time statistics could I gather to make a decision about which one to use in a situation where performance mattered (assuming, for the sake of discussion, that such a situation existed for this simple example)? And, out of curiosity, is there an even more efficient way to code this function?

Thanks,

Todd Wilson
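Before benchmarking, it is worth a quick sanity check that the two definitions really agree. A minimal base-only check (the helper name `agree` is introduced here for illustration; a property-testing library such as QuickCheck would do this more thoroughly):

```haskell
import Data.List (inits, tails)

splits1, splits2 :: [a] -> [([a], [a])]
splits1 xs = zip (inits xs) (tails xs)

splits2 []         = [([], [])]
splits2 xs@(x:xs') = ([], xs) : [(x:as, bs) | (as, bs) <- splits2 xs']

-- both definitions should produce identical results on any input
agree :: Eq a => [a] -> Bool
agree xs = splits1 xs == splits2 xs
```

For instance, `agree []`, `agree [1]`, and `agree [1..10]` all evaluate to True.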

I think the best way to test the performance is to use a benchmarking library such as criterion (http://www.serpentine.com/criterion/):

    import Criterion.Main
    import Data.List (inits, tails)

    splits1, splits2, splits3 :: [a] -> [([a], [a])]
    splits1 xs = zip (inits xs) (tails xs)

    splits2 []         = [([], [])]
    splits2 xs@(x:xs') = ([], xs) : [(x:as, bs) | (as, bs) <- splits2 xs']

    splits3 = go id
      where
        go prefix xs = (prefix [], xs) : case xs of
          x:xs' -> go (prefix . (x :)) xs'
          []    -> []

    main = defaultMain
      [ bench "splits1" $ nf splits1 input
      , bench "splits2" $ nf splits2 input
      , bench "splits3" $ nf splits3 input
      ]
      where
        input :: [Int]
        input = [1..1000]

It gives the following results on my computer:

    benchmarking splits1
    time                 4.693 ms   (4.625 ms .. 4.752 ms)
                         0.999 R²   (0.998 R² .. 0.999 R²)
    mean                 4.725 ms   (4.694 ms .. 4.758 ms)
    std dev              97.37 μs   (77.95 μs .. 123.3 μs)

    benchmarking splits2
    time                 12.10 ms   (11.96 ms .. 12.25 ms)
                         0.999 R²   (0.999 R² .. 1.000 R²)
    mean                 12.08 ms   (12.03 ms .. 12.14 ms)
    std dev              149.7 μs   (110.0 μs .. 199.1 μs)

    benchmarking splits3
    time                 4.110 ms   (4.025 ms .. 4.187 ms)
                         0.997 R²   (0.996 R² .. 0.999 R²)
    mean                 4.124 ms   (4.081 ms .. 4.177 ms)
    std dev              147.6 μs   (115.6 μs .. 238.5 μs)
    variance introduced by outliers: 17% (moderately inflated)

On 01-08-2019 04:45, Todd Wilson wrote:
> How do these functions compare in their actual time and space usage?
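The trick in splits3 is a difference list: the prefix is kept as a function `[a] -> [a]`, so extending it by one element is an O(1) composition, and the prefix is materialized only once per split by applying it to `[]`. A minimal standalone illustration of the representation (the names here are ours, not from the thread):

```haskell
-- A "difference list" represents a list as a function [a] -> [a].
type DList a = [a] -> [a]

emptyD :: DList a
emptyD = id

-- Appending one element at the END is O(1): compose on the right.
snocD :: DList a -> a -> DList a
snocD f x = f . (x :)

-- Materialize the accumulated list by applying it to [].
toListD :: DList a -> [a]
toListD f = f []
```

For example, `toListD (snocD (snocD (snocD emptyD 1) 2) 3)` is `[1,2,3]`: each snocD cost O(1), and the full list was built once at the end, which is exactly what splits3's `prefix . (x :)` does.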

If you fully evaluate the list produced by tails, then you're still spending O(n^2) time, because that is simply the total size of the produced lists. But constructing the result takes only O(n) time, and it occupies only O(n) memory, because the suffixes share their spines (https://wiki.haskell.org/Sharing).

On 01-08-2019 04:45, Todd Wilson wrote:
> It seems that, asymptotically, tails is O(n) while inits is O(n^2) in both time and space (when fully evaluated)
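Both effects can be seen from base alone: traversing every suffix touches a quadratic number of elements, even though GHC's tails allocates only one fresh cons cell per suffix because the suffixes share the input's spine. A small check of the quadratic total (`totalSize` is a name introduced here for illustration):

```haskell
import Data.List (tails)

-- Total number of elements across all suffixes of an n-element list:
-- n + (n-1) + ... + 0 = n*(n+1)/2, i.e. quadratic in n.
totalSize :: [a] -> Int
totalSize = sum . map length . tails
```

For example, `totalSize [1..1000]` is 500500, matching n(n+1)/2, even though the shared representation of `tails [1..1000]` only needs O(n) fresh memory.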

Replying to myself: you can actually write an evaluation function that forces all the values in the result of tails and inits in linear time:

    -- force the result of tails in linear time
    seqTails (x:xs) = x `deepseq` seqList xs
    seqTails []     = ()

    -- force all the values in a list to WHNF in linear time
    -- https://wiki.haskell.org/Weak_head_normal_form
    seqList (x:xs) = x `seq` seqList xs
    seqList []     = ()

    -- force the result of inits in linear time
    seqInits xs = last xs `deepseq` seqList xs

Try it in GHCi with :sprint (https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/ghci.html#gh...):

    > let x = tails [1..3::Int]
    > :sprint x
    x = _
    > seqTails x
    ()
    > :sprint x
    x = [[1,2,3],[2,3],[3],[]]

    > let y = inits [1..3::Int]
    > :sprint y
    y = _
    > seqInits y
    ()
    > :sprint y
    y = [[],[1],[1,2],[1,2,3]]

Using criterion you can see that it is actually linear time:

    main = defaultMain
      [ bgroup "inits"
        [ bench "1000"    $ whnf (seqInits . inits) [1..1000    :: Int]
        , bench "10000"   $ whnf (seqInits . inits) [1..10000   :: Int]
        , bench "100000"  $ whnf (seqInits . inits) [1..100000  :: Int]
        , bench "1000000" $ whnf (seqInits . inits) [1..1000000 :: Int]
        ]
      , bgroup "tails"
        [ bench "1000"    $ whnf (seqTails . tails) [1..1000    :: Int]
        , bench "10000"   $ whnf (seqTails . tails) [1..10000   :: Int]
        , bench "100000"  $ whnf (seqTails . tails) [1..100000  :: Int]
        , bench "1000000" $ whnf (seqTails . tails) [1..1000000 :: Int]
        ]
      ]

    benchmarking inits/1000
    time                 204.2 μs   (203.2 μs .. 205.4 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 203.4 μs   (202.8 μs .. 204.1 μs)
    std dev              2.163 μs   (1.755 μs .. 2.664 μs)

    benchmarking inits/10000
    time                 3.127 ms   (3.107 ms .. 3.148 ms)
                         1.000 R²   (0.999 R² .. 1.000 R²)
    mean                 3.105 ms   (3.088 ms .. 3.118 ms)
    std dev              45.73 μs   (32.97 μs .. 69.14 μs)

    benchmarking inits/100000
    time                 41.05 ms   (39.11 ms .. 42.87 ms)
                         0.993 R²   (0.988 R² .. 0.998 R²)
    mean                 41.52 ms   (40.62 ms .. 42.46 ms)
    std dev              1.912 ms   (1.330 ms .. 2.930 ms)
    variance introduced by outliers: 12% (moderately inflated)

    benchmarking inits/1000000
    time                 423.0 ms   (318.2 ms .. 535.5 ms)
                         0.991 R²   (0.969 R² .. 1.000 R²)
    mean                 459.1 ms   (428.8 ms .. 505.2 ms)
    std dev              44.05 ms   (10.06 ms .. 58.49 ms)
    variance introduced by outliers: 22% (moderately inflated)

    benchmarking tails/1000
    time                 8.811 μs   (8.768 μs .. 8.873 μs)
                         1.000 R²   (0.999 R² .. 1.000 R²)
    mean                 8.874 μs   (8.819 μs .. 8.963 μs)
    std dev              225.7 ns   (168.4 ns .. 325.2 ns)
    variance introduced by outliers: 28% (moderately inflated)

    benchmarking tails/10000
    time                 87.21 μs   (86.85 μs .. 87.79 μs)
                         1.000 R²   (0.999 R² .. 1.000 R²)
    mean                 87.42 μs   (87.01 μs .. 87.88 μs)
    std dev              1.481 μs   (1.132 μs .. 1.953 μs)
    variance introduced by outliers: 11% (moderately inflated)

    benchmarking tails/100000
    time                 886.9 μs   (882.9 μs .. 890.9 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 881.5 μs   (878.1 μs .. 885.7 μs)
    std dev              12.40 μs   (9.598 μs .. 18.97 μs)

    benchmarking tails/1000000
    time                 9.796 ms   (9.757 ms .. 9.840 ms)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 9.817 ms   (9.791 ms .. 9.845 ms)
    std dev              78.47 μs   (60.63 μs .. 99.31 μs)

On 01-08-2019 10:25, Jaro Reinders wrote:
> If you fully evaluate the list produced by tails, then you're still spending O(n^2) time, because that is just the size of the produced list. But constructing the list and the memory taken by the list is O(n), because most of the lists are shared (https://wiki.haskell.org/Sharing).
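As an aside, the seqList used here is just a right fold of seq over the list; an equivalent base-only one-liner (our formulation, not from the thread) is:

```haskell
-- Force the spine of a list, and each element to WHNF.
-- Equivalent to the explicit recursion:
--   seqList' (x:xs) = x `seq` seqList' xs
--   seqList' []     = ()
seqList' :: [a] -> ()
seqList' = foldr seq ()
```

`seqList' [1,2,3]` returns (), while `seqList' [undefined]` throws, matching the behaviour of the explicitly recursive version.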

I just realized that seqTails and seqInits are unnecessary; seqList alone is enough.

On 01-08-2019 11:04, Jaro Reinders wrote:
> Replying to myself, you can actually write an evaluation function that forces all values in the result of tails and inits in linear time:

In high-performance Haskell, you will often find yourself using sequential structures other than lists. For example, an unboxed vector implementation is over 100x faster than any of the proposed list implementations.

Code: https://gist.github.com/wyager/45946f9f1531351468e4366b7ba168fa
Benchmark result: https://gist.github.com/wyager/96e7876a4b170d83dca971dd152e475e

GHC is very powerful, and can often do a surprisingly good job of optimizing away list allocations and the like. However, in sharing-heavy applications like this (and when random indexing is helpful), vectors can be much more efficient.
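To make the sharing point concrete: slicing an unboxed vector is O(1) and shares the underlying buffer, so all n+1 splits can be produced without copying any elements. A minimal sketch assuming the vector package (this is not the code from the gist above):

```haskell
import qualified Data.Vector.Unboxed as V

-- Every (prefix, suffix) pair of a vector.
-- V.splitAt is O(1): both halves are slices of the same buffer.
splitsV :: V.Unbox a => V.Vector a -> [(V.Vector a, V.Vector a)]
splitsV v = [ V.splitAt i v | i <- [0 .. V.length v] ]
```

For example, `splitsV (V.fromList [1,2,3])` yields the same four splits as splits1 [1,2,3], but each half is a zero-copy view of the original vector.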
On Thu, Aug 1, 2019 at 5:25 PM Jaro Reinders wrote:
> I just realized seqTails and seqInits are unnecessary, seqList is enough.

To make the comparison fairer, you should use the seqList function instead of nf to evaluate the list functions. Then the list versions are actually faster than the unboxed vector version (the *sl entries are measured with my seqList function):

    benchmarking splits1
    time                 4.799 ms   (4.751 ms .. 4.857 ms)
                         0.999 R²   (0.998 R² .. 0.999 R²)
    mean                 4.854 ms   (4.821 ms .. 4.893 ms)
    std dev              111.0 μs   (90.31 μs .. 136.2 μs)

    benchmarking splits2
    time                 12.18 ms   (11.93 ms .. 12.53 ms)
                         0.995 R²   (0.990 R² .. 0.999 R²)
    mean                 12.82 ms   (12.61 ms .. 13.04 ms)
    std dev              568.8 μs   (514.4 μs .. 636.1 μs)
    variance introduced by outliers: 18% (moderately inflated)

    benchmarking splits3
    time                 4.118 ms   (4.067 ms .. 4.164 ms)
                         0.999 R²   (0.998 R² .. 0.999 R²)
    mean                 4.186 ms   (4.154 ms .. 4.229 ms)
    std dev              114.2 μs   (89.12 μs .. 144.2 μs)
    variance introduced by outliers: 11% (moderately inflated)

    benchmarking splits1sl
    time                 24.91 μs   (24.78 μs .. 25.04 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 24.78 μs   (24.67 μs .. 24.89 μs)
    std dev              378.3 ns   (317.1 ns .. 466.5 ns)
    variance introduced by outliers: 11% (moderately inflated)

    benchmarking splits2sl
    time                 8.903 ms   (8.835 ms .. 8.962 ms)
                         1.000 R²   (0.999 R² .. 1.000 R²)
    mean                 8.877 ms   (8.834 ms .. 8.924 ms)
    std dev              124.0 μs   (104.2 μs .. 147.1 μs)

    benchmarking splits3sl
    time                 11.54 μs   (11.49 μs .. 11.58 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 11.59 μs   (11.54 μs .. 11.64 μs)
    std dev              156.7 ns   (127.7 ns .. 207.3 ns)

    benchmarking splits4 (boxed)
    time                 1.903 ms   (1.894 ms .. 1.910 ms)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 1.908 ms   (1.901 ms .. 1.915 ms)
    std dev              25.61 μs   (20.81 μs .. 31.73 μs)

    benchmarking splits4 (unboxed)
    time                 33.63 μs   (33.51 μs .. 33.74 μs)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 33.60 μs   (33.46 μs .. 33.76 μs)
    std dev              493.7 ns   (386.8 ns .. 622.1 ns)
    variance introduced by outliers: 10% (moderately inflated)

On 01-08-2019 12:48, William Yager wrote:
> In high performance Haskell, you will often find yourself using sequential structures besides lists. For example, an unboxed vector implementation is over 100x faster than any of the proposed list implementations.
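The harness behind the *sl entries isn't shown; presumably each function is wrapped so that criterion's whnf forces the result through seqList rather than nf. A base-only sketch of such a wrapper (names ours; the bench line in the comment assumes the criterion setup from earlier in the thread):

```haskell
-- seqList as earlier in the thread: force spine and elements to WHNF
seqList :: [a] -> ()
seqList (x:xs) = x `seq` seqList xs
seqList []     = ()

-- Wrap a splits function so that forcing the wrapper's result to WHNF
-- forces the whole output spine, e.g. with criterion:
--   bench "splits1sl" $ whnf (forceWith splits1) input
forceWith :: ([a] -> [([a], [a])]) -> [a] -> ()
forceWith f = seqList . f
```

Note that seqList forces each (prefix, suffix) pair only to WHNF, so shared sublists are not re-traversed, which is exactly why the *sl numbers are so much lower than the nf ones.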

Todd Wilson wrote:
> For example, splits1 [1,2,3] is [([],[1,2,3]),([1],[2,3]),([1,2],[3]),([1,2,3],[])].
Note that `inits` (and hence `splits`) on lists is inherently quadratic, because the resulting lists [1], [1,2], [1,2,3], and so on, cannot share any parts of their spines. The only way around this is to change the result: either its type (as suggested elsewhere in this list), or the result itself. A cute option is to produce reversed lists, which can be achieved by

    rinits = scanl (flip (:)) []

(so rinits [1,2,3] = [[],[1],[2,1],[3,2,1]]). This works on plain lists, and uses linear time and memory. (One can view this as changing the result type to a list of snoc lists. This fact can be expressed using a newtype wrapper or a custom list type if desired.)

Cheers,

Bertram
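Bertram's rinits can be paired with tails to get a linear-time, linear-space splits variant whose prefixes come out reversed (`rsplits` is a name introduced here for illustration):

```haskell
import Data.List (tails)

-- linear-time inits with each prefix reversed (snoc-list view)
rinits :: [a] -> [[a]]
rinits = scanl (flip (:)) []

-- splits with reversed prefixes: O(n) construction, fully shared,
-- since each reversed prefix extends the previous one by one cons
rsplits :: [a] -> [([a], [a])]
rsplits xs = zip (rinits xs) (tails xs)
```

For example, rsplits [1,2] is [([],[1,2]),([1],[2]),([2,1],[])]; callers that need the prefix in original order can reverse it on demand.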
participants (4):

- Bertram Felgenhauer
- Jaro Reinders
- Todd Wilson
- William Yager