
I personally find doing higher order functions with IO extremely difficult, and the resulting compiler errors are often downright scary. But this is probably a direct consequence of my rather limited understanding of the monadic operators that the do notation hides. It seems that one has to be /very/ good at Haskell before one can do IO effectively. Hmm.
Well, as good as a necromancer has to be :) But seriously, I think the most mind-boggling point about IO actions is that they are higher order, too. I myself have taken the view that values of type (IO a) are just "plans" that get executed once the main Haskell program has done its work. The program's task is to assemble the "main plan" by sequencing smaller "plans" with (>>=).
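For instance, a tiny toy sketch of my own (nothing more than an illustration) of assembling a main plan from two smaller ones:

    -- Two small "plans"; nothing happens when they are merely defined.
    askName :: IO String
    askName = putStrLn "Your name?" >> getLine

    greet :: String -> IO ()
    greet name = putStrLn ("Hello, " ++ name)

    -- main is the "main plan", glued together with (>>=).
    main :: IO ()
    main = askName >>= greet

Here askName and greet are just descriptions of effects; only the plan bound to main ever gets run.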
But back to the point: using Haskell lists to represent a largish buffer of, say, 16-bit samples that is constantly refreshed from hard disk and fed to the sound system for playback would probably be inefficient beyond imagination.
Lists are indeed not suited for massive byte crunching. If they fit into a black box, you can of course outsource things to C. If you were writing a track scheduler à la iTunes, the black box would most likely be a single function (playSoundFile) that handles the data on its way from disk to the sound card.

Actual sound processing with Haskell functions is more involved. As already said, specifying loops over the samples as higher order functions will save you a lot of headaches. The point is that one cannot simply start writing Haskell code and hope that it runs as fast as a tight loop in C. Instead, one should do aggressive optimizations at a few critical points only, and these are exactly the higher order loops. I have to admit that (accumM) is not very fast, because it can change data at arbitrary indexes, which therefore have to be kept around. Most often you want to process each index exactly once, which is better expressed as a (map) or a (fold). As Bulat and Duncan already said, Data.ByteString does exactly this for arrays of bytes. Using it, or the generalization promised by Duncan, is likely the best way to go.

On the implementation level, lazy evaluation gets in the way when crunching bytes. So be sure to tell GHC to make things strict and to unbox and inline everything it can get its hands on by giving appropriate command line options. As others already pointed out, the details are on
http://haskell.org/haskellwiki/Performance
As a first test, you may want to compile your original code with -O3 -optc-O3 -funfolding-use-threshold=16 and explore what happens. GHC does a good job with strictness analysis, and it's of no use to drown your code in strictness annotations. Of course, a few well placed ones may point GHC to things it overlooked.

To mention yet another approach, the image synthesis library Pan
http://conal.net/pan/
pretends that the programmer writes ordinary Haskell functions, but under the hood it's a tiny programming language that gets compiled to C++. Of course, the main point is that the image synthesis combinators are strictly less powerful than full Haskell. But as the full power is not needed in that context, this doesn't hurt.

There are audio related projects,
http://haskell.org/haskellwiki/Libraries_and_tools/Music_and_sound
but I don't know whether they focus on speed.

Regards,
apfelmus
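PS: To make the "higher order loop" idea concrete, here is a rough sketch using Data.ByteString. Take the file names and the 8-bit sample format as placeholder assumptions, not a recommendation; for real 16-bit samples you would have to combine pairs of bytes yourself, or wait for Duncan's generalization to other element types.

    import qualified Data.ByteString as B

    -- Halve the amplitude of every (8-bit) sample in the buffer.
    -- B.map runs as a tight loop over an unboxed byte array;
    -- no intermediate list is ever built.
    halveGain :: B.ByteString -> B.ByteString
    halveGain = B.map (`div` 2)

    main :: IO ()
    main = do
        samples <- B.readFile "input.raw"    -- placeholder file name
        B.writeFile "output.raw" (halveGain samples)

The whole per-sample loop is expressed as a single (B.map); that one call is exactly the kind of critical point where the library's optimizations pay off.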