Others have already discussed this in terms of GHC's model of IO, but as Tom Ellis indicates, that model is a bit screwy, and not really the best way to think about it. I think it is much more useful to think of it in terms of a "free monad". That is, think about the `IO` type as a *data structure*: an `IO a` value is a sort of recipe for producing a value of type `a`. Conceptually, something like this:

    data IO :: * -> * where
      ReturnIO :: a -> IO a
      BindIO :: IO a -> (a -> IO b) -> IO b
      HPutStr :: Handle -> String -> IO ()
      HGetStr :: Handle -> IO String
      ....

And then think about the runtime system as an interpreter whose job is to run the programs represented by these IO values.
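
To make the "interpreter" idea concrete, here is a minimal sketch of such a data type together with a tiny interpreter for it. This is only an illustration, not how GHC actually implements `IO`. The names `Program`, `Return`, `Bind`, `HGetLine`, `run`, and `greet` are made up for this example: the type is called `Program` so it does not clash with the real `IO`, and I use a line-reading constructor in place of `HGetStr` so it maps directly onto `hGetLine`. The real `IO` plays the role of the "runtime system" here.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

import System.IO (Handle, hGetLine, hPutStr, stdin, stdout)

-- Hypothetical stand-in for the conceptual IO type: a program is just data.
data Program a where
  Return   :: a -> Program a
  Bind     :: Program a -> (a -> Program b) -> Program b
  HPutStr  :: Handle -> String -> Program ()
  HGetLine :: Handle -> Program String

-- The "runtime system": walks the data structure and performs each step.
-- Here the real IO monad does the actual work.
run :: Program a -> IO a
run (Return x)    = return x
run (Bind m k)    = run m >>= run . k
run (HPutStr h s) = hPutStr h s
run (HGetLine h)  = hGetLine h

-- Building this value performs no effects; it only describes them.
greet :: Program ()
greet =
  HPutStr stdout "Name? " `Bind` \_ ->
  HGetLine stdin `Bind` \name ->
  HPutStr stdout ("Hello, " ++ name ++ "\n")

main :: IO ()
main = run greet
```

Nothing happens when `greet` is defined or passed around; effects occur only when `run` interprets it. That is the sense in which an `IO a` value is a recipe rather than an action already performed.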