I actually don't know that it would be the bottleneck if I ported it to Haskell at this point, as lots of other things will change in speed and I may lose for other reasons: the memory layout is less efficient, the cache management overhead is higher, etc.
I do know that if I were to reintroduce a global lock on the BDD cache in the current C++ implementation (the one I'm looking at porting to Haskell), it definitely would be a throughput bottleneck, as that was precisely what I had to fix to make it fast in the first place. =) I originally had a single lock between me and the BDD cache and couldn't scale up nicely to multiple cores.
There, in order to get decent scaling, I had to carefully split my cache out Herlihy-style into a bunch of separate locks that I hash down to, and the resulting speedup was quite dramatic.
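To make the striping concrete, here is a rough Haskell sketch of the idea, not my actual C++ code; the StripedCache type and lookupOrInsert function are made-up names, and a real hash-cons table would key on node structure and hold weak references rather than plain values:

    import Control.Concurrent.MVar (MVar, newMVar, modifyMVar)
    import Data.Hashable (Hashable, hash)
    import qualified Data.HashMap.Strict as HM
    import qualified Data.Vector as V

    -- A cache split into a fixed number of stripes, each guarded by its own lock.
    newtype StripedCache k v = StripedCache (V.Vector (MVar (HM.HashMap k v)))

    newStripedCache :: Int -> IO (StripedCache k v)
    newStripedCache n = StripedCache <$> V.replicateM n (newMVar HM.empty)

    -- Hash the key down to one stripe; threads only contend when they land on
    -- the same stripe, instead of all fighting over a single global lock.
    lookupOrInsert :: (Hashable k, Eq k) => StripedCache k v -> k -> IO v -> IO v
    lookupOrInsert (StripedCache stripes) k make = do
      let stripe = stripes V.! (hash k `mod` V.length stripes)
      modifyMVar stripe $ \m -> case HM.lookup k m of
        Just v  -> return (m, v)
        Nothing -> do
          v <- make
          return (HM.insert k v m, v)

With, say, 64 or 256 stripes, the odds of two cores fighting over the same lock drop off sharply, which is where the dramatic speedup came from.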
But replicating that logic here does a lot less good, as I'll just wind up waiting on a different lock, the one for creating the weak reference, whenever I need to instantiate a new variable split in the BDD.
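For comparison, the weak-reference side in Haskell would look something like the sketch below; the BDD type, makeNode, and the finalizer body are hypothetical stand-ins, but the point is that the mkWeakPtr call has to happen on every hash-cons miss, regardless of how finely I stripe my own table:

    import System.Mem.Weak (Weak, mkWeakPtr)

    -- Hypothetical BDD node representation.
    data BDD = Leaf !Bool | Node !Int BDD BDD

    -- On a hash-cons miss we allocate the node and immediately register a weak
    -- pointer to it (the finalizer here is just a placeholder; a real one would
    -- evict the corresponding table entry).  It's this mkWeakPtr call that gets
    -- made for every brand-new variable split.
    makeNode :: Int -> BDD -> BDD -> IO (BDD, Weak BDD)
    makeNode v lo hi = do
      let node = Node v lo hi
      w <- mkWeakPtr node (Just (return ()))   -- placeholder finalizer
      return (node, w)

In practice something like makeNode would be the `make` action passed to lookupOrInsert above, so every miss still funnels through the weak-pointer machinery even though the table's own locks are striped.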
Now, that isn't entirely apples-to-apples. In theory, if I have a high enough hit rate on the hash-cons table for my BDDs, then I'm producing a lot fewer new weak references from scratch than I have hits against the BDD cache.
So if I get a 65% hit rate, only 35% of my operations would actually benefit from fully parallel-friendly weak reference creation; put the other way around, only 35% of them ever hit that serializing lock. But even a 35% regression from the scaling I have now back toward the scaling I had with the original global lock would be pretty bad.
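Back of the envelope, if you treat that 35% as a fully serialized fraction in the Amdahl's-law sense (which is pessimistic, since the lock isn't held for the whole operation), then on N cores

    speedup <= 1 / (0.35 + 0.65/N)  ->  1/0.35, i.e. under 3x as N grows

which is why even the "only 35%" case worries me.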
-Edward