Hello,

I am cleaning up my template renderer by refactoring its core loop, going from a loop with many parameters

loop :: RenderConfig -> BlockMap -> InputBucket m -> Builder -> [Pieces] -> ExceptT StrapError m Builder

to a loop that runs in a state monad transformer

loop :: [Pieces] -> RenderT m Builder

newtype RenderT m a = RenderT 
  { runRenderT :: ExceptT StrapError (StateT (RenderState m) m) a 
  } deriving ( Functor, Applicative, Monad, MonadIO )

data RenderState m = RenderState
  { position     :: SourcePos
  , renderConfig :: RenderConfig
  , blocks       :: BlockMap
  , bucket       :: InputBucket m
  }

However, there is a big slowdown (about 6-10x) when using StateT. I think it might have something to do with laziness, but I am not sure where to begin tracking it down. Swapping the lazy StateT for the strict StateT helps a little (it is still about a 5x slowdown).
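
To make the strictness question concrete, here is a minimal, self-contained sketch of the kind of change I mean. It is not the code from the repository: Counter, step and runSteps are made-up names, and the state is a toy stand-in for RenderState. It assumes the transformers package and combines Control.Monad.Trans.State.Strict with strict record fields and modify', so that each update is forced as it happens instead of piling up as thunks inside the state.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, get, modify')
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical stand-in for RenderState. The bangs make both fields
-- strict, so forcing a Counter also forces its contents.
data Counter = Counter
  { pieces :: !Int
  , chars  :: !Int
  }

-- Same layering as RenderT: ExceptT on top of a (strict) StateT.
type Render = ExceptT String (StateT Counter Identity)

-- Process one piece: fail on an empty piece, otherwise update the
-- state with modify', which forces the new state to WHNF.
step :: String -> Render ()
step piece
  | null piece = throwE "empty piece"
  | otherwise  = lift (modify' bump)
  where
    bump (Counter p c) = Counter (p + 1) (c + length piece)

-- Run all steps and read back the final counters.
runSteps :: [String] -> Either String (Int, Int)
runSteps ps = runIdentity (evalStateT (runExceptT go) (Counter 0 0))
  where
    go = do
      mapM_ step ps
      st <- lift get
      return (pieces st, chars st)
```

With these definitions, runSteps ["ab", "cde"] counts two pieces and five characters, and an empty piece anywhere in the list yields a Left. Whether the same treatment applies to the real RenderState depends on which of its fields are actually updated inside the loop.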

You can find some of the processing code here (my old loop is commented out):

https://github.com/hansonkd/StrappedTemplates/blob/321a88168d54943fc217553c873f188797c0d4f5/src/Text/Strapped/Render.hs#L189

It's messy right now since I am just trying a number of different approaches. I did some more work factoring out the lifts and trying different variations of foldlM and so on, but that didn't have much of an effect on performance.

After profiling, I see that the report for the StateT version shows a lot more CAFs and a lot more garbage collection.

Here is the profiling report from my original version without StateT:
http://lpaste.net/108995

And here is the slow version with StateT:
http://lpaste.net/108997
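
If it helps anyone narrow things down further, manual cost centres can split the loop into finer entries in a report like these. The following is only a sketch of the mechanism, not how the reports above were produced: renderPiece and totalLength are made-up stand-ins, and the SCC pragmas only show up in a report when the program is compiled with -prof and run with +RTS -p.

```haskell
import Data.List (foldl')

-- Made-up stand-in for rendering one piece of a template.
-- The SCC pragma gives this expression its own cost centre.
renderPiece :: Int -> String
renderPiece n = {-# SCC "renderPiece" #-} (show n)

-- Made-up stand-in for the accumulation step of the loop.
-- foldl' keeps the running total evaluated at each step.
totalLength :: [Int] -> Int
totalLength = foldl' step 0
  where
    step acc n = {-# SCC "accumulate" #-} (acc + length (renderPiece n))
```

Without profiling enabled the pragmas are ignored, so they can stay in the code while experimenting.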

Here is the "makeBucket" function that is referenced in the reports (it is the same in both the StateT and non-StateT versions):

https://github.com/hansonkd/StrappedTemplates/blob/321a88168d54943fc217553c873f188797c0d4f5/examples/big_example.hs#L24

From looking at Stack Overflow and the official docs, I have gotten some idea of what is going on. Comparing the heap profiles of the two versions tells me that a lot more memory is being allocated to lists. These heap profiles were generated by running my render function against a template with nested loops and a list of elements.

http://imgur.com/a/2jOIf

I am hoping someone can give me a hint about what to look at next. I've played around with strictness and with refactoring the loops to no avail, and am now kind of stuck. Any help would be appreciated.

--
Kyle Hanson