Re: [Haskell-cafe] Performance question

Hi, thanks for your input.
You can get reasonable numeric performance out of GHC, but you need to work at it. There is some advice in the GHC manual at http://www.haskell.org/ghc/docs/latest/html/users_guide/faster.html
I am using -O2 and strictness already. Currently i can only imagine to define a data type in order to use unboxed Ints instead of the accumulator tuple. The thing is that i don't see in the profile output yet what to improve. There are some allocations going on in "main", but i don't know what causes it.
The first thing I would do is replace your isInCircle :: (Floating a, Ord a) => (a,a) -> Bool with isInCircle :: (Double, Double) -> Bool
Can you point me to why that matters?
Ben.
On 26/02/2009, at 8:53 PM, haskell@kudling.de wrote:
Hi,
i have compared a C++ implementation with a Haskell implementation of the Monte Carlo Pi approximation:
http://lennart.kudling.de/haskellPi/
The Haskell version is 100 times slower and i wonder whether i do something obvious wrong.
Profiling says that the majority of the time is spend in "main". But i have no idea where.
Can someone give me a hint?
Thanks, Lenny

On 26/02/2009, at 9:27 PM, haskell@kudling.de wrote:
Currently i can only imagine to define a data type in order to use unboxed Ints instead of the accumulator tuple.
That would probably help a lot. It would also help to use two separate Double# parameters instead of the tuple.
The thing is that i don't see in the profile output yet what to improve. There are some allocations going on in "main", but i don't know what causes it.
The first thing I would do is replace your isInCircle :: (Floating a, Ord a) => (a,a) -> Bool with isInCircle :: (Double, Double) -> Bool
Can you point me to why that matters?
At the machine level, GHC treats the (Floating a, Ord a) as an extra argument to the function. This argument holds function pointers that tell it how to perform multiplication and <= for the unknown type 'a'. If you use Double instead of 'a', then it's more likely to use the actual machine op. Ben.

Ben Lippmeier wrote:
The first thing I would do is replace your isInCircle :: (Floating a, Ord a) => (a,a) -> Bool with isInCircle :: (Double, Double) -> Bool
Can you point me to why that matters?
At the machine level, GHC treats the (Floating a, Ord a) as an extra argument to the function. This argument holds function pointers that tell it how to perform multiplication and <= for the unknown type 'a'. If you use Double instead of 'a', then it's more likely to use the actual machine op.
I'd recommend use of a SPECIALIZE pragma instead of rewriting the code itself: http://www.haskell.org/ghc/docs/latest/html/users_guide/pragmas.html (section 8.13.8) Ganesh =============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ===============================================================================

Here is a variant that uses mersenne-random-pure64 and works less than
2x slower than C++:
- You don't need to compute samples count an extra time
- You don't need to assemble double pairs from a list
- Notice the strictness in randomDoublePairs: it doubled performance
{-# LANGUAGE BangPatterns #-}
import System.Random.Mersenne.Pure64
import System( getArgs )
import Data.List( foldl' )
isInCircle :: (Double,Double) -> Bool
isInCircle (!x,!y) = sqrt (x*x + y*y) <= 1.0
accumulateHit :: Int -> (Double,Double) -> Int
accumulateHit !hits pair = if isInCircle pair then hits + 1 else hits
monteCarloPi :: Int -> [(Double,Double)] -> Double
monteCarloPi n xs = 4.0 * fromIntegral hits / fromIntegral n
where hits = foldl' accumulateHit 0 . take n $ xs
randomDoublePairs g = let
!(!x,!g') = randomDouble g
!(!y,!g'') = randomDouble g'
in (x,y):randomDoublePairs g''
main = do
samples <- (read . head) `fmap` getArgs
randomNumbers <- randomDoublePairs `fmap` newPureMT
putStrLn . show $ monteCarloPi samples randomNumbers
jkff@*****:~/montecarlo$ time ./mc-hs 10000000
3.1417088
real 0m1.141s
user 0m1.140s
sys 0m0.000s
jkff@*****:~/montecarlo$ time ./mc 10000000
10000000
3.14113
real 0m0.693s
user 0m0.690s
sys 0m0.000s
2009/2/26 Ben Lippmeier
On 26/02/2009, at 9:27 PM, haskell@kudling.de wrote:
Currently i can only imagine to define a data type in order to use unboxed Ints instead of the accumulator tuple.
That would probably help a lot. It would also help to use two separate Double# parameters instead of the tuple.
The thing is that i don't see in the profile output yet what to improve. There are some allocations going on in "main", but i don't know what causes it.
The first thing I would do is replace your isInCircle :: (Floating a, Ord a) => (a,a) -> Bool with isInCircle :: (Double, Double) -> Bool
Can you point me to why that matters?
At the machine level, GHC treats the (Floating a, Ord a) as an extra argument to the function. This argument holds function pointers that tell it how to perform multiplication and <= for the unknown type 'a'. If you use Double instead of 'a', then it's more likely to use the actual machine op.
Ben.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Eugene Kirpichov Web IR developer, market.yandex.ru

On Thu, Feb 26, 2009 at 7:56 AM, Eugene Kirpichov
Here is a variant that uses mersenne-random-pure64 and works less than 2x slower than C++:
And I would like to notice that rand() is incredibly dumber than the Mersenne twister, so probably if we took rand()'s code from glibc and rewrote it in Haskell, there would be another performance increase. -- Felipe.

C's rand() function is very bad and should never be used really.
On Thu, Feb 26, 2009 at 11:37 AM, Felipe Lessa
On Thu, Feb 26, 2009 at 7:56 AM, Eugene Kirpichov
wrote: Here is a variant that uses mersenne-random-pure64 and works less than 2x slower than C++:
And I would like to notice that rand() is incredibly dumber than the Mersenne twister, so probably if we took rand()'s code from glibc and rewrote it in Haskell, there would be another performance increase.
-- Felipe. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Lennart Augustsson
C's rand() function is very bad and should never be used really.
On Linux (really GNU libc, I suppose) it is the same as random(). But for portability, one is encouraged to spend the two extra letters. I don't understand the details, but I think the Mersenne twister is much better in any case. -k -- If I haven't seen further, it is by standing in the footprints of giants

The random() function is only marginally better than rand().
On Fri, Feb 27, 2009 at 6:50 AM, Ketil Malde
Lennart Augustsson
writes: C's rand() function is very bad and should never be used really.
On Linux (really GNU libc, I suppose) it is the same as random(). But for portability, one is encouraged to spend the two extra letters. I don't understand the details, but I think the Mersenne twister is much better in any case.
-k -- If I haven't seen further, it is by standing in the footprints of giants

On 2009 Feb 27, at 1:50, Ketil Malde wrote:
Lennart Augustsson
writes: C's rand() function is very bad and should never be used really.
On Linux (really GNU libc, I suppose) it is the same as random(). But for portability, one is encouraged to spend the two extra letters. I don't understand the details, but I think the Mersenne twister is much better in any case.
Yes, much better than any LC PRNG including rand(), random(), and the System V/POSIX lrand48() family. That said, Linux making rand() == random() breaks portability in the case where you want a repeatable stream for testing purposes; as usual, Linux chucks portability out the window at any opportunity. (Sadly, POSIX permits this because of POSIX implementations for non-Unix hosts, although it does include a fixed-behavior PRNG as documentation for this case.) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

How about an FFI call to rand() and then measure the performance
On Thu, Feb 26, 2009 at 3:37 AM, Felipe Lessa
On Thu, Feb 26, 2009 at 7:56 AM, Eugene Kirpichov
wrote: Here is a variant that uses mersenne-random-pure64 and works less than 2x slower than C++:
And I would like to notice that rand() is incredibly dumber than the Mersenne twister, so probably if we took rand()'s code from glibc and rewrote it in Haskell, there would be another performance increase.
-- Felipe. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Ben.Lippmeier:
On 26/02/2009, at 9:27 PM, haskell@kudling.de wrote:
Currently i can only imagine to define a data type in order to use unboxed Ints instead of the accumulator tuple.
That would probably help a lot. It would also help to use two separate Double# parameters instead of the tuple.
data T = T !Double !Double should be enough.

I looked at the core and the tuples were already unboxed IIRC.
2009/2/26 Don Stewart
Ben.Lippmeier:
On 26/02/2009, at 9:27 PM, haskell@kudling.de wrote:
Currently i can only imagine to define a data type in order to use unboxed Ints instead of the accumulator tuple.
That would probably help a lot. It would also help to use two separate Double# parameters instead of the tuple.
data T = T !Double !Double
should be enough. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Eugene Kirpichov Web IR developer, market.yandex.ru
participants (10)
-
Ben Lippmeier
-
Brandon S. Allbery KF8NH
-
David Leimbach
-
Don Stewart
-
Eugene Kirpichov
-
Felipe Lessa
-
haskell@kudling.de
-
Ketil Malde
-
Lennart Augustsson
-
Sittampalam, Ganesh