Foreign function performance: monadic vs pure - Haskell-Cafe - Haskell.org


Foreign function performance: monadic vs pure


Serguei Son

11 Apr 2011

12:09 p.m.

Consider two versions of sin wrapped:

    foreign import ccall "math.h sin" c_sin_m :: CDouble -> IO CDouble

and

    foreign import ccall "math.h sin" c_sin :: CDouble -> CDouble

One can invoke them so:

    mapM c_sin_m [1..n]
    mapM (return . c_sin) [1..n]

On my computer with n = 10^7 the first version never finishes, whereas the second one calculates the result within seconds.

To give you my context, I need to call a random variable generator multiple times, so that it must return IO a.

Any explanation for this behavior?


Maciej Marcin Piechotka

11 Apr 2011

1:14 p.m.

On Mon, 2011-04-11 at 12:09 +0000, Serguei Son wrote:

Consider two versions of sin wrapped:

    foreign import ccall "math.h sin" c_sin_m :: CDouble -> IO CDouble

and

    foreign import ccall "math.h sin" c_sin :: CDouble -> CDouble

One can invoke them so:

    mapM c_sin_m [1..n]
    mapM (return . c_sin) [1..n]

On my computer with n = 10^7 the first version never finishes, whereas the second one calculates the result within seconds.

To give you my context, I need to call a random variable generator multiple times, so that it must return IO a.

Any explanation for this behavior?

A simple (but possibly wrong) explanation: the first one is always evaluated, since it might have side effects, while the second one is left unevaluated, because return does not force its argument. Timings for n = 2^14:

    mapM c_sin_m [1..n]                       1.087 s
    mapM (return . c_sin) [1..n]              0.021 s
    mapM (\x -> return $! c_sin x) [1..n]     1.160 s
    return $ map c_sin [1..n]                 0.006 s
    mapM (const (return undefined)) [1..n]    0.011 s

I.e.:

- c_sin_m forces evaluation, so you save the Haskell context 10^7 times (the import is not marked as unsafe) and call the function each time.
- return . c_sin does not force evaluation, so you only wrap an unevaluated value in IO 10^7 times.

To compare:

foreign import ccall unsafe "math.h sin" c_sin_um :: CDouble -> IO CDouble

foreign import ccall unsafe "math.h sin" c_sin_u :: CDouble -> CDouble

    main = mapM c_sin_um [1..n]                     0.028 s
    main = mapM (\x -> return $! c_sin_u) [1..n]    0.012 s
    main = mapM (return . c_sin_u) [1..n]           0.023 s

I.e. the difference comes from Haskell's laziness and from the cost of making sure that the function may safely call back into Haskell (which sin does not do).

Regards
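The laziness half of this explanation can be checked without any FFI. The sketch below is only an illustration under assumptions: boom and forces are hypothetical names, and boom stands in for an expensive pure call by failing as soon as its result is evaluated. It shows that return . f leaves the result as an unevaluated thunk, while return $! f x evaluates it when the action runs.

```haskell
import Control.Exception (ErrorCall, try)

-- Hypothetical stand-in for an expensive pure call: it fails loudly
-- if anything ever evaluates its result.
boom :: Int -> Int
boom _ = error "boom was evaluated"

-- True if running the action forced at least one call to boom.
-- The list elements themselves are never inspected here.
forces :: IO [Int] -> IO Bool
forces act = do
  r <- try act :: IO (Either ErrorCall [Int])
  return (either (const True) (const False) r)

main :: IO ()
main = do
  -- return only wraps the thunk, so boom is never evaluated.
  lazy <- forces (mapM (return . boom) [1 .. 3])
  -- ($!) evaluates boom x before return, so the error fires.
  strict <- forces (mapM (\x -> return $! boom x) [1 .. 3])
  putStrLn ("return . boom forced the result:   " ++ show lazy)
  putStrLn ("return $! boom x forced the result: " ++ show strict)
```

This matches the timings above in spirit: the fast variants are the ones that never actually call the function while mapM runs.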


Felipe Almeida Lessa

1:21 p.m.

On Mon, Apr 11, 2011 at 10:14 AM, Maciej Marcin Piechotka wrote:

...
main = mapM (\x -> return $! c_sin_u) [1..n] 0.012 s

This should be

    main = mapM (\x -> return $! c_sin_u x) [1..n]

-- Felipe.


Serguei Son

1:55 p.m.

Felipe Almeida Lessa writes:

On Mon, Apr 11, 2011 at 10:14 AM, Maciej Marcin Piechotka wrote:

...
...
main = mapM (\x -> return $! c_sin_u) [1..n] 0.012 s

This should be

main = mapM (\x -> return $! c_sin_u x) [1..n]

So if I must use a safe function returning IO a, there is no way to improve its performance? To give you a benchmark, calling gsl_ran_ugaussian a million times in pure C takes only a second or two on my system.


Serguei Son

2:02 p.m.


Also, please note that I can force the evaluation of c_sin, e.g.

    mapM (return . c_sin) [1..n] >>= (print $ foldl' (+) 0)

And it will still execute reasonably fast.


Serguei Son

2:18 p.m.


Please disregard my previous post. I actually meant

    let lst = map c_sin [1..n]
    print $ foldl' (+) 0 lst

This executes in 0.2 s for n = 10^7. c_sin is safe, as is c_sin_m. The only difference is CDouble -> CDouble vs CDouble -> IO CDouble.


Gregory Collins

6:36 p.m.

On Mon, Apr 11, 2011 at 3:55 PM, Serguei Son wrote:

So if I must use a safe function returning IO a, there is no way to improve its performance? To give you a benchmark, calling gsl_ran_ugaussian a million times in pure C takes only a second or two on my system.

In the C version, are you also producing a linked list containing all of the values? Because that's what mapM does. Your test is mostly measuring the cost of allocating and filling ~3 million machine words on the heap. Try mapM_ instead.

G
-- Gregory Collins
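As a rough sketch of this suggestion: when the individual results are not needed as a list, the action can be run with mapM_, or folded into a strict accumulator with foldM. The names sample and sumSamples are hypothetical, and sample is a pure-Haskell stand-in for an effectful generator such as an FFI call, not anything from the original post.

```haskell
import Control.Monad (foldM)

-- Hypothetical stand-in for an effectful generator such as an FFI call.
sample :: Double -> IO Double
sample = return . sin

-- Run the action n times, keeping only a strict running sum
-- instead of a list of all results.
sumSamples :: (Double -> IO Double) -> Int -> IO Double
sumSamples f n = foldM step 0 [1 .. n]
  where
    step acc i = do
      y <- f (fromIntegral i)
      -- ($!) forces the new sum so no chain of thunks builds up.
      return $! acc + y

main :: IO ()
main = do
  -- mapM_ discards each result, so no result list is allocated.
  mapM_ sample [1 .. 1000000]
  total <- sumSamples sample 1000000
  print total
```

Whether this closes the gap to the C benchmark depends on the per-call FFI cost as well, which this sketch does not measure.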


Anthony Cowley

3:17 p.m.

On Mon, Apr 11, 2011 at 8:09 AM, Serguei Son wrote:

Consider two versions of sin wrapped: foreign import ccall "math.h sin" c_sin_m :: CDouble -> IO CDouble

Marking this call as unsafe (i.e. foreign import ccall unsafe "math.h sin") can improve performance dramatically. If the FFI call is quick, then I believe this is the recommended approach. If you really need the imported function to be thread safe, then perhaps you should move more of the calculation into C to decrease the granularity of FFI calls.

It is remarkably easy to get the meanings of safe and unsafe confused, and I can't even see the word "unsafe" in the current FFI user's guide!

http://www.haskell.org/ghc/docs/7.0.3/html/users_guide/ffi-ghc.html

Anthony
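To make the comparison concrete, here is a minimal sketch that imports sin both ways and times one million calls of each. The names c_sin_safe, c_sin_unsafe and timed are illustrative assumptions, the timing via getCPUTime is coarse, and no particular numbers are claimed; results will vary by machine and GHC version.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble)
import System.CPUTime (getCPUTime)

-- "safe" is the default when no annotation is given: the call is set up
-- so that the C function could call back into Haskell.
foreign import ccall safe "math.h sin" c_sin_safe :: CDouble -> IO CDouble

-- "unsafe" skips that setup; acceptable here because sin is quick
-- and never calls back into Haskell.
foreign import ccall unsafe "math.h sin" c_sin_unsafe :: CDouble -> IO CDouble

-- CPU seconds spent running an action (coarse, for illustration only).
timed :: IO () -> IO Double
timed act = do
  t0 <- getCPUTime
  act
  t1 <- getCPUTime
  -- getCPUTime reports picoseconds.
  return (fromIntegral (t1 - t0) / 1e12)

main :: IO ()
main = do
  let xs = [1 .. 1000000] :: [CDouble]
  -- mapM_ discards the results, so only the call overhead is measured.
  tSafe <- timed (mapM_ c_sin_safe xs)
  tUnsafe <- timed (mapM_ c_sin_unsafe xs)
  putStrLn ("safe:   " ++ show tSafe ++ " s")
  putStrLn ("unsafe: " ++ show tUnsafe ++ " s")
```

Note that the first timed run also pays for building the list xs, so a fairer measurement would force the list beforehand or repeat the runs.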

