
One thing that is hard to communicate to people in industry is what value laziness has. You can point people at the WhyFP paper, and talk about infinite data structures and decoupling generation from search until you're blue in the face, but it's so far beyond the day-to-day work of regular programmers that they don't really comprehend. But I was talking to Simon Marlow yesterday and he pointed out something that strikes me as a great example to give of "what laziness buys you". Lets talk about atomic changes. In particular, I want to examine the function atomicModifyIORef :: IORef a -> (a -> (a,b)) -> IO b This takes an IORef, reads the value out of it, calls some code that gets a new value and may return some derived data, writes the new value into the IORef, and returns the derived data. It has the invariant that no other change to that IORef can happen in between. Now, there are all sorts of issues like "what if 'f' has a side effect" that get sidestepped entirely by purity, but I don't want to talk about those; I think the benefits of purity are easier to argue in many contexts. Lets just look at a simple Haskell implementation:
atomicModifyIORef p f = do a <- readIORef p let r = f a writeIORef p (fst r) return (snd r)
This is a non-atomic implementation, but we can fix it assuming we have "compare-and-swap" available; any language with boxed types can do this:
-- low-level primitive that uses pointer equality behind the scenes compareAndSwap# :: IORef a -> a -> a -> IO Bool
atomicModifyIORef p f = do a <- readIORef p let r = f a let aNew = fst r ok <- compareAndSwap# p a aNew if ok then return (snd r) else atomicModifyIORef p f
This loops until compareAndSwap# succeeds and is a simple lock-free implementation. But what if f takes a long time to run? In a strict language, if there was contention on p, it's likely that atomicModifyIORef never finishes executing due to starvation! But since Haskell is lazy, each attempt to modify p is a constant-time operation! We just allocate thunks for "r" and "aNew" and do the compare-and-swap. Now, the responsibility for executing f is shifted to anyone who reads from the IORef; they get a thunk. It's likely that this thread will do so, as the value returned by atomicModifyIORef depends on f executing, but if this thread gets suspended or there is contention, the next reader can pick up the slack! In fact, at the low level we can improve this further; we can build the thunks for "r" and "aNew" with a hole to put the value we read out of the IORef; the inner loop then looks like this:
a <- readIORef p update thunk hole in "r" with "a" ok <- compareAndSwap# p a aNew
which compiles down to just a pointer read, assignment, and CAS instruction. This blew my mind; from the "imperative, strict" point of view, this sort of operation is impossible to do reasonably, and here in Haskell we can get a great "probability of success"; the "critical section" where we have to loop if someone else modified the value behind our back is just a few instructions long! -- ryan
participants (1)
-
Ryan Ingram