[Haskell-cafe] Right approach to profiling and optimizing a concurrent data structure?