
Hello,

I was looking at cleaning up and refactoring a core loop of template rendering, to go from a loop with many parameters

    loop :: RenderConfig -> BlockMap -> InputBucket m -> Builder -> [Pieces] -> ExceptT StrapError m Builder

to a loop running in a state monad transformer

    loop :: [Pieces] -> RenderT m Builder

    newtype RenderT m a = RenderT
      { runRenderT :: ExceptT StrapError (StateT (RenderState m) m) a
      } deriving (Functor, Applicative, Monad, MonadIO)

    data RenderState m = RenderState
      { position     :: SourcePos
      , renderConfig :: RenderConfig
      , blocks       :: BlockMap
      , bucket       :: InputBucket m
      }

However, there is a big slowdown (about 6-10x) when using StateT. I think it might have something to do with laziness, but I am not sure where to begin in tracking it down. Swapping the lazy State for a strict State helps a little (only a 5x slowdown).

You can find some of the processing code here, with my old loop commented out:

https://github.com/hansonkd/StrappedTemplates/blob/321a88168d54943fc217553c8...

It's messy right now since I am just trying a number of different approaches. I did some more work factoring out the lifts and trying different iterations of foldlM and so on, but that didn't have much of an effect on performance.

After profiling, I see that the report for the StateT version shows a lot more CAFs and garbage collection.

Here is the profiling report from my original version without StateT:
http://lpaste.net/108995

Slow version with StateT:
http://lpaste.net/108997

Here is the "makeBucket" function that is referenced (it is the same in both the state and non-state versions):

https://github.com/hansonkd/StrappedTemplates/blob/321a88168d54943fc217553c8...

Looking at Stack Overflow and the official docs, I have gotten some idea of what is going on. The heap profiles generated for the two versions tell me that a lot more memory is being allocated to lists in the StateT version. These heap profiles were generated by running my render function against a template with nested loops and a list of elements:
http://imgur.com/a/2jOIf

I am hoping that maybe someone could give me a hint at what to look at next. I've played around with strictness and refactoring loops to no avail, and am now kind of stuck.

Any help would be appreciated.

--
Kyle Hanson
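
P.S. In case it helps anyone reproduce the shape of the problem without pulling the whole repo, here is a minimal, self-contained sketch of the kind of stack described above. It is not the real StrappedTemplates code and not a diagnosis: Piece, Boom, runRender and the Int position are hypothetical stand-ins, a String accumulator replaces Builder, and a type synonym replaces the RenderT newtype. It only shows the variant I have been experimenting with, assuming the strict StateT from transformers, a strict record field, and modify' for state updates.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, get, modify')
import Data.Functor.Identity (runIdentity)

-- Hypothetical stand-in for the template pieces.
data Piece = Static String | Boom
  deriving Show

-- Hypothetical stand-in for the real error type.
newtype StrapError = StrapError String
  deriving (Show, Eq)

-- The strict field means an updated position is evaluated whenever a
-- new RenderState is forced, instead of being stored as a thunk.
data RenderState = RenderState
  { position :: !Int
  }

-- Same layering as in the post: ExceptT on top of (strict) StateT.
type RenderT m = ExceptT StrapError (StateT RenderState m)

-- Walk the pieces, bumping the position once per piece.
-- modify' forces the new state to weak head normal form on every step.
loop :: Monad m => [Piece] -> RenderT m String
loop = go []
  where
    go acc [] = return (concat (reverse acc))
    go acc (p : ps) = do
      lift (modify' (\s -> s { position = position s + 1 }))
      case p of
        Static txt -> go (txt : acc) ps
        Boom -> do
          pos <- position <$> lift get
          throwE (StrapError ("failed at piece " ++ show pos))

-- Run the whole stack over Identity, starting at position 0.
runRender :: [Piece] -> Either StrapError String
runRender ps =
  runIdentity (evalStateT (runExceptT (loop ps)) (RenderState 0))

main :: IO ()
main = do
  print (runRender [Static "Hello, ", Static "world"])
  print (runRender [Static "a", Boom])
```

Swapping the import to Control.Monad.Trans.State.Lazy (which does not export modify', so modify would be needed) gives the lazy variant for comparison. Whether any of this accounts for the slowdown in the real code is exactly what I am unsure about.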