vector indexing time

Hi everyone! I have a question about time complexity of vector indexing. Obviously, it should be constant, but if I do the following naive tests it looks linear. What do I do wrong? Ivan $ for i in 10000000 20000000 40000000 80000000 ; do cat vec_in_"$i".hs; ghc -fforce-recomp -O2 --make vec_in_"$i".hs && time ./vec_in_"$i"; done import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 10000000 [1 of 1] Compiling Main ( vec_in_10000000.hs, vec_in_10000000.o ) Linking vec_in_10000000 ... 10000001 real 0m0.033s user 0m0.032s sys 0m0.000s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 20000000 [1 of 1] Compiling Main ( vec_in_20000000.hs, vec_in_20000000.o ) Linking vec_in_20000000 ... 20000001 real 0m0.062s user 0m0.060s sys 0m0.000s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 40000000 [1 of 1] Compiling Main ( vec_in_40000000.hs, vec_in_40000000.o ) Linking vec_in_40000000 ... 40000001 real 0m0.125s user 0m0.120s sys 0m0.004s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 80000000 [1 of 1] Compiling Main ( vec_in_80000000.hs, vec_in_80000000.o ) Linking vec_in_80000000 ... 80000001 real 0m0.244s user 0m0.240s sys 0m0.000s

On 2 August 2012 19:45, Ivan Vyalov
Hi everyone!
I have a question about time complexity of vector indexing. Obviously, it should be constant, but if I do the following naive tests it looks linear. What do I do wrong?
Ivan
$ for i in 10000000 20000000 40000000 80000000 ; do cat vec_in_"$i".hs; ghc -fforce-recomp -O2 --make vec_in_"$i".hs && time ./vec_in_"$i"; done import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 10000000
[1 of 1] Compiling Main ( vec_in_10000000.hs, vec_in_10000000.o ) Linking vec_in_10000000 ... 10000001
real 0m0.033s user 0m0.032s sys 0m0.000s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 20000000
[1 of 1] Compiling Main ( vec_in_20000000.hs, vec_in_20000000.o ) Linking vec_in_20000000 ... 20000001
real 0m0.062s user 0m0.060s sys 0m0.000s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 40000000
[1 of 1] Compiling Main ( vec_in_40000000.hs, vec_in_40000000.o ) Linking vec_in_40000000 ... 40000001
real 0m0.125s user 0m0.120s sys 0m0.004s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 80000000
[1 of 1] Compiling Main ( vec_in_80000000.hs, vec_in_80000000.o ) Linking vec_in_80000000 ... 80000001
real 0m0.244s user 0m0.240s sys 0m0.000s
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
I'm no expert, but I suspect it has something to do with stream fusion and the entire vector thus not being generated when you only ask for an early element. The time difference went away when I modified your code to also print the length of the vector. Also, when I compiled without optimization, there were only negligible differences between the different runs. Alex $ for i in 10000000 20000000 40000000 80000000 ; do cat vec_in_"$i".hs; ghc -fforce-recomp -O2 --make vec_in_"$i".hs && time ./vec_in_"$i"; done import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ Data.Vector.Unboxed.length a print $ a ! 10000000 [1 of 1] Compiling Main ( vec_in_10000000.hs, vec_in_10000000.o ) Linking vec_in_10000000 ... 100000001 10000001 real 0m0.323s user 0m0.100s sys 0m0.220s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ Data.Vector.Unboxed.length a print $ a ! 20000000 [1 of 1] Compiling Main ( vec_in_20000000.hs, vec_in_20000000.o ) Linking vec_in_20000000 ... 100000001 20000001 real 0m0.323s user 0m0.130s sys 0m0.190s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ Data.Vector.Unboxed.length a print $ a ! 40000000 [1 of 1] Compiling Main ( vec_in_40000000.hs, vec_in_40000000.o ) Linking vec_in_40000000 ... 100000001 40000001 real 0m0.317s user 0m0.117s sys 0m0.197s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ Data.Vector.Unboxed.length a print $ a ! 80000000 [1 of 1] Compiling Main ( vec_in_80000000.hs, vec_in_80000000.o ) Linking vec_in_80000000 ... 100000001 80000001 real 0m0.329s user 0m0.133s sys 0m0.193s

The vector indexing using (!) should run in constant time. However, as the Data.Vector.Unboxed haddock documentation states, enumFromN allocates and populates the vector in linear time using its stream code. I'm not familiar with the stream code, but it looks to be of similar nature to the basic list type. Nick On Friday, August 03, 2012 04:45:38 AM Ivan Vyalov wrote:
Hi everyone!
I have a question about time complexity of vector indexing. Obviously, it should be constant, but if I do the following naive tests it looks linear. What do I do wrong?
Ivan
$ for i in 10000000 20000000 40000000 80000000 ; do cat vec_in_"$i".hs; ghc -fforce-recomp -O2 --make vec_in_"$i".hs && time ./vec_in_"$i"; done import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 10000000
[1 of 1] Compiling Main ( vec_in_10000000.hs, vec_in_10000000.o ) Linking vec_in_10000000 ... 10000001
real 0m0.033s user 0m0.032s sys 0m0.000s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 20000000
[1 of 1] Compiling Main ( vec_in_20000000.hs, vec_in_20000000.o ) Linking vec_in_20000000 ... 20000001
real 0m0.062s user 0m0.060s sys 0m0.000s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 40000000
[1 of 1] Compiling Main ( vec_in_40000000.hs, vec_in_40000000.o ) Linking vec_in_40000000 ... 40000001
real 0m0.125s user 0m0.120s sys 0m0.004s import Data.Vector.Unboxed main = do let a = enumFromN 1 100000001 :: Vector Int print $ a ! 80000000
[1 of 1] Compiling Main ( vec_in_80000000.hs, vec_in_80000000.o ) Linking vec_in_80000000 ... 80000001
real 0m0.244s user 0m0.240s sys 0m0.000s
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

Ivan Vyalov wrote:
Hi everyone!
I have a question about time complexity of vector indexing. Obviously, it should be constant, but if I do the following naive tests it looks linear. What do I do wrong?
Creating the vector still takes time proportional to the length of the vector. In fact, it appears that in your example, the vector packages optimizes the creation time to create only up to the element that you actually demand. The linear time you're seeing is not the result of an inefficiency of vector indexing, but the result of an efficiency in vector creation. Best regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

On 08/03/2012 01:50 PM, Heinrich Apfelmus wrote:
Ivan Vyalov wrote:
Hi everyone!
I have a question about time complexity of vector indexing. Obviously, it should be constant, but if I do the following naive tests it looks linear. What do I do wrong?
Creating the vector still takes time proportional to the length of the vector. In fact, it appears that in your example, the vector packages optimizes the creation time to create only up to the element that you actually demand.
The linear time you're seeing is not the result of an inefficiency of vector indexing, but the result of an efficiency in vector creation.
Best regards, Heinrich Apfelmus
-- http://apfelmus.nfshost.com
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
Thank you, I've got it. Thanks to other guys who replied to! Ivan

Thank you for that insight. :)
On Fri, Aug 3, 2012 at 4:50 AM, Heinrich Apfelmus wrote: Creating the vector still takes time proportional to the length of the
vector. In fact, it appears that in your example, the vector packages
optimizes the creation time to create only up to the element that you
actually demand. The linear time you're seeing is not the result of an inefficiency of
vector indexing, but the result of an efficiency in vector creation. Best regards,
Heinrich Apfelmus --
--
Regards,
KC
participants (6)
-
Alexander Dunlap
-
Heinrich Apfelmus
-
Ivan Vyalov
-
Ivan Vyalov
-
KC
-
Nick Vanderweit