Glad I could help.
https://github.com/wellposed/hblas/blob/master/src/Numerical/HBLAS/BLAS/Internal.hs#L146 is
an example of the "choose between the safe and the unsafe FFI call" trick.
In the case of BLAS / LAPACK routines, I can always estimate how long a compute job will take as a function of its inputs, and I use that estimate to decide which FFI strategy to use (i.e. I use the unsafe FFI for computations under 10 microseconds, so that the overhead of a safe call doesn't dominate the compute time on tiny inputs).
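
A minimal sketch of that trick, not the hblas code itself: it imports libc's memset twice, once as an unsafe and once as a safe foreign call, and picks one from a size-based cost estimate. The names unsafeCutoffBytes, chooseCall and fillBytes are made up for illustration, and the 4096-byte cutoff is an assumed placeholder rather than a measured value; a real wrapper would derive the estimate from the routine's operation count and calibrate the cutoff by benchmarking.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Data.Word (Word8)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr)

-- The same C function (libc's memset), imported once per calling convention.
foreign import ccall unsafe "string.h memset"
  c_memset_unsafe :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)

foreign import ccall safe "string.h memset"
  c_memset_safe :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)

-- Assumed cutoff (hypothetical): below this many bytes the call is treated
-- as cheap enough that the bookkeeping of a safe call would dominate.
unsafeCutoffBytes :: Int
unsafeCutoffBytes = 4096

data CallKind = UnsafeCall | SafeCall
  deriving (Eq, Show)

-- Cost estimate as a function of the input size; here simply the byte count.
chooseCall :: Int -> CallKind
chooseCall n
  | n < unsafeCutoffBytes = UnsafeCall
  | otherwise             = SafeCall

-- Smart wrapper: fill n bytes at p with v, choosing the FFI strategy from
-- the estimate. Returns which variant was used, for demonstration only.
fillBytes :: Ptr Word8 -> Word8 -> Int -> IO CallKind
fillBytes p v n = do
  let kind = chooseCall n
  _ <- case kind of
    UnsafeCall -> c_memset_unsafe p (fromIntegral v) (fromIntegral n)
    SafeCall   -> c_memset_safe   p (fromIntegral v) (fromIntegral n)
  return kind

main :: IO ()
main = do
  -- Tiny input: goes through the unsafe import.
  small <- allocaBytes 16 $ \p -> do
    kind <- fillBytes p 7 16
    bytes <- peekArray 16 p
    return (kind, bytes)
  print small
  -- Large input: goes through the safe import, so other Haskell threads
  -- can keep running while the C call is in progress.
  large <- allocaBytes (1024 * 1024) $ \p -> fillBytes p 0 (1024 * 1024)
  print large
```

Both imports must be given identical Haskell types so the wrapper can switch between them freely; the only difference is whether the runtime does the extra work needed to let other threads run during the call.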


On Thu, Aug 14, 2014 at 5:38 PM, Christian Höner zu Siederdissen <choener@tbi.univie.ac.at> wrote:
That's actually a great idea, especially since the safe variants of the
calls are already in place.

* Carter Schonwald <carter.schonwald@gmail.com> [14.08.2014 23:10]:
>    Have a smart wrapper around your FFI call, and if you think the FFI
>    call will take more than 1 microsecond, ALWAYS use the safe FFI call.
>    I do something like this in an FFI I wrote, and it works great.
>
>    On Thu, Aug 14, 2014 at 1:20 PM, Christian Höner zu Siederdissen
>    <choener@tbi.univie.ac.at> wrote:
>
>      Thanks,
>
>      I've played around some more and finally more than one capability is
>      active. And indeed, unsafe calls don't block everything. I /had/
>      actually read that, but when I saw the system spending basically only
>      100% CPU time, I thought I'd ask.
>
>      One problem with this program seems to be that the different tasks are
>      of vastly different sizes. Inputs range from ~ 7x10^1 to ~ 3x10^7
>      elements, which induces waits at the larger problem sizes.
>
>      We'll keep the program single-threaded for now as this also keeps memory
>      consumption at only 25 gbyte instead of the more impressive 70 gbyte in
>      multi-threaded mode ;-)
>
>      Viele Gruesse,
>      Christian
>
>      _______________________________________________
>      Glasgow-haskell-users mailing list
>      Glasgow-haskell-users@haskell.org
>      http://www.haskell.org/mailman/listinfo/glasgow-haskell-users