I was going to suggest something like this. The usual approach in audio software is to have a main processing loop that never yields (or calls threadDelay, or performs blocking operations). This loop runs at a fixed rate and is responsible for scheduling, mixing, writing to the output, etc. You can have other threads working on other processes, but if they're time-sensitive you again wouldn't want them to be de-scheduled, unless you can schedule their work in advance and then wait.
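As a rough sketch of that shape (the blocking writeOutput callback here is a made-up stand-in for whatever audio back end you use; the pacing comes from the write blocking, not from threadDelay):

```haskell
-- Sketch of a fixed-rate audio loop: mix the active voices into a
-- block, hand the block to a blocking write, repeat. No threadDelay.
import Control.Monad (forM_)
import Data.IORef
import Data.List (transpose)

blockSize :: Int
blockSize = 256

-- Stand-in mixer: sum the active voices sample-by-sample.
mixBlock :: [[Double]] -> [Double]
mixBlock voices = map sum (transpose voices)

-- The loop itself; writeOutput is assumed to block until the device
-- accepts the buffer, which is what paces the loop.
processingLoop :: Int -> ([Double] -> IO ()) -> IO ()
processingLoop nBlocks writeOutput =
  forM_ [1 .. nBlocks] $ \_ -> do
    let block = mixBlock [ replicate blockSize 0.1
                         , replicate blockSize 0.2 ]
    writeOutput block

main :: IO ()
main = do
  -- For demonstration, "write" just counts samples instead of
  -- touching a real device.
  total <- newIORef (0 :: Int)
  processingLoop 4 (\blk -> modifyIORef' total (+ length blk))
  readIORef total >>= print   -- 4 blocks * 256 samples = 1024
```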
Currently it's hard to do this very reliably in Haskell because the RTS has no support for thread priorities or anything similar. Even if you use posix-realtime, you'd still need to be aware that the Haskell RTS can deschedule threads or migrate computations between capabilities. And a stop-the-world GC could, in principle, happen at any time.
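One of the few knobs you do get is forkOn, which pins a thread to a particular capability (with the threaded RTS) so the scheduler won't migrate it. It mitigates migration, but the thread can still be preempted and GC pauses still apply. A minimal sketch, with the capability number and workload chosen arbitrarily:

```haskell
-- Pinning a time-sensitive thread to one capability with forkOn.
-- Build with: ghc -threaded  and run with: +RTS -N2
-- This prevents migration between capabilities, but does NOT give
-- you priorities, and a stop-the-world GC can still pause it.
import Control.Concurrent

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- forkOn 0 $ do                      -- pin to capability 0
    let x = sum [1 .. 1000 :: Int]        -- stand-in for real work
    putMVar done x
  takeMVar done >>= print                 -- 500500
```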
That said, I've had decent success doing real-time work so long as the CPUs don't get completely saturated.
An alternative approach is to hand the actual audio processing off to an external engine such as csound, or to a synth controlled over OSC. At least with csound I know it's possible to pipe raw audio data via its API, and I imagine OSC-based systems have a facility for this as well.
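The OSC side of that handoff is cheap: the wire format is simple enough to encode by hand (in practice you'd more likely reach for a library such as hosc). As a sketch, here's an encoder for a single OSC message with one float argument; the "/freq" address and the value are made up for illustration:

```haskell
-- Hand-rolled OSC message encoder (OSC 1.0 wire format): a message is
-- the NUL-padded address, a NUL-padded type-tag string, then the
-- big-endian arguments, each field padded to a multiple of 4 bytes.
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Builder

-- Pad a string with NULs to a multiple of 4 bytes (incl. terminator).
padded :: String -> Builder
padded s =
  let bs  = C.pack s
      n   = B.length bs + 1              -- include terminating NUL
      pad = (4 - n `mod` 4) `mod` 4
  in byteString bs <> byteString (B.replicate (1 + pad) 0)

-- Encode an address plus one float argument (type tag ",f").
oscMessage :: String -> Float -> B.ByteString
oscMessage addr x =
  L.toStrict . toLazyByteString $
    padded addr <> padded ",f" <> floatBE x

main :: IO ()
main = print (B.length (oscMessage "/freq" 440.0))
  -- "/freq" pads to 8 bytes, ",f" to 4, plus a 4-byte float = 16
```

The resulting bytes are what you'd send in a UDP datagram to the synth's listening port.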
John L