Difference of time execution times when measuring with time and profiling

HI all, I am messing around with bang patterns and noticed some huge differences between the total time as reported by the time tool and the .prof file. Below is the code used. Without bang patterns: module Main where import Data.List fastFibs = unfoldr nextFib (1, 1) where nextFib (x, y) = Just $ (x, (y, (x + y))) main = putStrLn $ (show n) ++ "th fib is: " ++ (show $ fastFibs !! (n - 1)) where n = 1000000 With bang patterns: {-# LANGUAGE BangPatterns #-} module Main where import Data.List fastFibs = unfoldr nextFib (1, 1) where nextFib (!x, !y) = Just $ (x, (y, (x + y))) main = putStrLn $ (show n) ++ "th fib is: " ++ (show $ fastFibs !! (n - 1)) where n = 1000000 when looking at the first through time and prof I get the following. Without: real 0m53.501s user 0m0.015s sys 0m0.328s Thu Oct 22 16:46 2015 Time and Allocation Profiling Report (Final) fast-fib.exe +RTS -p -RTS total time = 9.52 secs (9520 ticks @ 1000 us, 1 processor) total alloc = 43,500,223,152 bytes (excludes profiling overheads) Please note the huge difference 53 vs 9 seconds. With: real 0m10.095s user 0m0.031s sys 0m0.344s Thu Oct 22 16:50 2015 Time and Allocation Profiling Report (Final) fast-fib.exe +RTS -p -RTS total time = 8.97 secs (8971 ticks @ 1000 us, 1 processor) total alloc = 43,500,309,960 bytes (excludes profiling overheads) Here differences seem to be much smaller. I am using Windows 8.1 64 bit, GHC 7.8.3 and measuring with the following line: ghc Main.hs -o fast-fib.exe -O2 -prof && time ./fast-fib.exe +RTS -p && cat fast-fib.prof Could someone please explain where the big difference is coming from and how to change the measuring approach to get more consistent results? Best regards, Javier de Vega Ruiz.

Javier,
On 23 October 2015 at 03:00, Javier de Vega Ruiz
I am messing around with bang patterns and noticed some huge differences between the total time as reported by the time tool and the .prof file. Below is the code used. Without bang patterns: module Main where
import Data.List
fastFibs = unfoldr nextFib (1, 1) where nextFib (x, y) = Just $ (x, (y, (x + y)))
main = putStrLn $ (show n) ++ "th fib is: " ++ (show $ fastFibs !! (n - 1)) where n = 1000000
With bang patterns: {-# LANGUAGE BangPatterns #-}
module Main where
import Data.List
fastFibs = unfoldr nextFib (1, 1) where nextFib (!x, !y) = Just $ (x, (y, (x + y)))
main = putStrLn $ (show n) ++ "th fib is: " ++ (show $ fastFibs !! (n - 1)) where n = 1000000
when looking at the first through time and prof I get the following. Without: real 0m53.501s user 0m0.015s sys 0m0.328s Thu Oct 22 16:46 2015 Time and Allocation Profiling Report (Final)
fast-fib.exe +RTS -p -RTS
total time = 9.52 secs (9520 ticks @ 1000 us, 1 processor) total alloc = 43,500,223,152 bytes (excludes profiling overheads)
Please note the huge difference 53 vs 9 seconds.
With: real 0m10.095s user 0m0.031s sys 0m0.344s Thu Oct 22 16:50 2015 Time and Allocation Profiling Report (Final)
fast-fib.exe +RTS -p -RTS
total time = 8.97 secs (8971 ticks @ 1000 us, 1 processor) total alloc = 43,500,309,960 bytes (excludes profiling overheads)
Here differences seem to be much smaller.
I am using Windows 8.1 64 bit, GHC 7.8.3 and measuring with the following line: ghc Main.hs -o fast-fib.exe -O2 -prof && time ./fast-fib.exe +RTS -p && cat fast-fib.prof
Could someone please explain where the big difference is coming from and how to change the measuring approach to get more consistent results?
Before going into why the numbers are what they are, I think there is something wrong with your time tool on Windows. Your "user" time is suspiciously low in both measurements. When I ran your program on Linux, the report from the -s RTS option and the "real" and "user" numbers from the time tool were within milliseconds of each other (qualification: I used GHC 7.10.2). Apart from that, I am pretty sure the -p RTS option and the time tool are not measuring the same thing. I think the profiler samples "ticks" in the runtime, whereas the time tool and the -s option probably use CPU performance counters. Somebody more familiar with the GHC runtime should be able to give you more detail. -- Thomas Koster

Thanks Thomas,
I tried in Linux and the behavior of the time tool seems more reasonable
now:
real 0m46.743s
user 0m45.888s
sys 0m0.160s
After checking some of the RTS documentation, it turns out the GC's cost
centre is not included by default, in order to include I had to change the
-p to -pa which results in:
fast-fib +RTS -pa -RTS
total time = 47.43 secs (47433 ticks @ 1000 us, 1 processor)
total alloc = 44,128,957,088 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc ticks bytes
GC GC 86.0 0.0 40778 0
fastFibs.nextFib Main 13.1 99.6 6228 43952857104
This clearly states where that 86% difference was coming from, the GC,
which means the world makes sense again :)
Does someone know how to add the cost centre for GC without having to use
-pa?
Best regards,
Javier de Vega Ruiz.
On Fri, Oct 23, 2015 at 1:27 AM, Thomas Koster
Javier,
I am messing around with bang patterns and noticed some huge differences between the total time as reported by the time tool and the .prof file. Below is the code used. Without bang patterns: module Main where
import Data.List
fastFibs = unfoldr nextFib (1, 1) where nextFib (x, y) = Just $ (x, (y, (x + y)))
main = putStrLn $ (show n) ++ "th fib is: " ++ (show $ fastFibs !! (n - 1)) where n = 1000000
With bang patterns: {-# LANGUAGE BangPatterns #-}
module Main where
import Data.List
fastFibs = unfoldr nextFib (1, 1) where nextFib (!x, !y) = Just $ (x, (y, (x + y)))
main = putStrLn $ (show n) ++ "th fib is: " ++ (show $ fastFibs !! (n - 1)) where n = 1000000
when looking at the first through time and prof I get the following. Without: real 0m53.501s user 0m0.015s sys 0m0.328s Thu Oct 22 16:46 2015 Time and Allocation Profiling Report (Final)
fast-fib.exe +RTS -p -RTS
total time = 9.52 secs (9520 ticks @ 1000 us, 1
On 23 October 2015 at 03:00, Javier de Vega Ruiz
wrote: processor) total alloc = 43,500,223,152 bytes (excludes profiling
overheads)
Please note the huge difference 53 vs 9 seconds.
With: real 0m10.095s user 0m0.031s sys 0m0.344s Thu Oct 22 16:50 2015 Time and Allocation Profiling Report
(Final)
fast-fib.exe +RTS -p -RTS
total time = 8.97 secs (8971 ticks @ 1000 us, 1
processor)
total alloc = 43,500,309,960 bytes (excludes profiling
overheads)
Here differences seem to be much smaller.
I am using Windows 8.1 64 bit, GHC 7.8.3 and measuring with the following line: ghc Main.hs -o fast-fib.exe -O2 -prof && time ./fast-fib.exe +RTS -p &&
cat
fast-fib.prof
Could someone please explain where the big difference is coming from and how to change the measuring approach to get more consistent results?
Before going into why the numbers are what they are, I think there is something wrong with your time tool on Windows. Your "user" time is suspiciously low in both measurements. When I ran your program on Linux, the report from the -s RTS option and the "real" and "user" numbers from the time tool were within milliseconds of each other (qualification: I used GHC 7.10.2).
Apart from that, I am pretty sure the -p RTS option and the time tool are not measuring the same thing. I think the profiler samples "ticks" in the runtime, whereas the time tool and the -s option probably use CPU performance counters. Somebody more familiar with the GHC runtime should be able to give you more detail.
-- Thomas Koster _______________________________________________ Beginners mailing list Beginners@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
participants (2)
-
Javier de Vega Ruiz
-
Thomas Koster