
Hi Haskell Café!

I'm writing a Perl/Python-like string templating system which I plan to release soon:

darcs get http://darcs.johantibell.com/template

The goal is to provide simple string templating: no inline code, etc. It is meant as an alternative to printf and ++. Example usage:
import qualified Data.ByteString.Char8 as B  -- Char8: pack/unpack work on String
import Text.Template
helloTemplate = "Hello, $name! Would you like some ${fruit}s?"
helloContext = [("name", "Johan"), ("fruit", "banana")]
test1 = B.putStrLn $ substitute (B.pack helloTemplate) helloContext
I want to make it perform well, especially when creating a template once and then rendering it multiple times. "Compiling" the template is a separate step from rendering in this use case:
compiledTemplate = template $ B.pack helloTemplate
test2 = B.putStrLn $ render compiledTemplate helloContext
A template is represented by a list of template fragments. Each fragment is either a ByteString literal or a variable which is looked up in the "context" when the template is rendered.
data Frag = Lit ByteString | Var ByteString
newtype Template = Template [Frag]
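To make the representation concrete, here is a minimal sketch of how a template string could be split into fragments. It is a hand-written scanner, not the regex-based code in the repository, and the helper names (go, litFrag, variable, isIdent) are made up for illustration. It does no validation: a lone "$" simply yields an empty variable name.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.ByteString.Char8 (ByteString)
import Data.Char (isAlphaNum)

data Frag = Lit ByteString | Var ByteString
  deriving (Eq, Show)

newtype Template = Template [Frag]
  deriving (Eq, Show)

-- Hypothetical hand-written scanner (the real implementation uses regexes).
-- Recognises $name and ${name}; everything else is kept as a literal.
template :: ByteString -> Template
template = Template . go
  where
    go s
      | B.null s  = []
      | otherwise =
          case B.break (== '$') s of
            (lit, rest)
              | B.null rest -> [Lit lit]
              -- rest starts with '$', so drop it and read the variable name
              | otherwise   -> litFrag lit (variable (B.tail rest))

    -- Skip empty literals so the fragment list stays small.
    litFrag lit frags
      | B.null lit = frags
      | otherwise  = Lit lit : frags

    variable s = case B.uncons s of
      Just ('{', s') ->
        let (name, rest) = B.break (== '}') s'
        in Var name : go (B.drop 1 rest)   -- drop the closing '}'
      _ ->
        let (name, rest) = B.span isIdent s
        in Var name : go rest

    isIdent c = isAlphaNum c || c == '_'
```

With this sketch, the hello template above becomes five fragments: three literals with the two variables name and fruit between them.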
This leads me to my first question. Would a lazy ByteString be better or worse here? The templates are of limited length. I would say the length is usually between one paragraph and a whole HTML page. The Template data type already acts a bit like a lazy ByteString since it consists of several chunks (although the chunk size is not adjusted to the CPU cache size as it is with the lazy ByteString).

Currently the context in which a template is rendered is represented by a type class.
class Context c where
    lookup :: ByteString -> c -> Maybe ByteString

instance Context (Map String String) where
    lookup k c = liftM B.pack (Map.lookup (B.unpack k) c)

instance Context (Map ByteString ByteString) where
    lookup = Map.lookup
-- More instances, for [(String, String)], etc.
I added this as a convenience for the user, mainly to work around the problem of not having ByteString literals. A typical usage would have the keys in the context being literals and the values some variables:
someContext = Map.fromList [("name", name), ("fruit", fruit)]
I'm not sure if this was a good decision. With this I'm halfway to the (in)famous Stringable class, and it seems like many smarter people than me have avoided introducing such a class. How will this affect performance? Take for example the rendering function:
render :: Context c => Template -> c -> ByteString
render (Template frags) ctx = B.concat $ map (renderFrag ctx) frags

renderFrag :: Context c => c -> Frag -> ByteString
renderFrag ctx (Lit s) = s
renderFrag ctx (Var x) = case Text.Template.lookup x ctx of
    Just v  -> v
    Nothing -> error $ "Key not found: " ++ (B.unpack x)
How will the type dictionary 'c' hurt performance here? Would specializing the function directly in render help?
render (Template frags) ctx = B.concat $ map (renderFrag f) frags
    where f = flip Text.Template.lookup ctx
renderFrag f (Var x) = case f x of
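For comparison, here is a sketch of another way to approach the dictionary question: keep render class-polymorphic and ask GHC for a dictionary-free copy at one concrete type with a SPECIALIZE pragma. The module layout and the choice of Map ByteString ByteString as the specialised type are assumptions for illustration, and whether this actually helps would have to be measured.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import Prelude hiding (lookup)
import qualified Data.ByteString.Char8 as B
import Data.ByteString.Char8 (ByteString)
import qualified Data.Map as Map
import Data.Map (Map)

data Frag = Lit ByteString | Var ByteString
newtype Template = Template [Frag]

class Context c where
  lookup :: ByteString -> c -> Maybe ByteString

instance Context (Map ByteString ByteString) where
  lookup = Map.lookup

-- The class method is applied to the context once, so the per-fragment
-- work only sees a plain function of type ByteString -> Maybe ByteString.
render :: Context c => Template -> c -> ByteString
render (Template frags) ctx = B.concat (map renderFrag frags)
  where
    look = flip lookup ctx
    renderFrag (Lit s) = s
    renderFrag (Var x) = case look x of
      Just v  -> v
      Nothing -> error ("Key not found: " ++ B.unpack x)

-- Ask GHC to also compile a copy of render at this concrete type
-- (assumed here to be the common case), with no dictionary argument.
{-# SPECIALIZE render :: Template -> Map ByteString ByteString -> ByteString #-}
```

Looking at the generated Core (for example with -ddump-simpl) is one way to check whether the dictionary is still being passed around.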
I can see the implementation taking one of the following routes:

- Go full Stringable, including for the Template.
- Revert to Context = Map ByteString ByteString, which was the original implementation.
- Some middle road, without MPTC, for example:
class Context c where
    lookup :: ByteString -> c ByteString ByteString -> Maybe ByteString

This would allow the user to supply some more efficient data type for lookup but not change the string type.

Having a type class would also allow me to provide things like the possibility to create a Context from a record, where each record accessor function would serve as a key. Something like:
data Person = Person { personName :: String, personAge :: Int }

would get converted (using Data?) to:

personContext = [("personName", show $ personName aPerson),
                 ("personAge", show $ personAge aPerson)]

though without actually building a Map; the record itself would serve as the context.
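As a rough illustration of the record idea, here is a hand-written instance for the single-parameter Context class shown earlier. A generic version (via Data or similar) is not attempted; this only shows that the record can answer lookups directly without an intermediate Map. Note that the name is packed as-is rather than passed through show, since show on a String would add quotes.

```haskell
import Prelude hiding (lookup)
import qualified Data.ByteString.Char8 as B
import Data.ByteString.Char8 (ByteString)

data Person = Person { personName :: String, personAge :: Int }

class Context c where
  lookup :: ByteString -> c -> Maybe ByteString

-- Hand-written for illustration: each accessor name acts as a key.
instance Context Person where
  lookup key p = case B.unpack key of
    "personName" -> Just (B.pack (personName p))
    "personAge"  -> Just (B.pack (show (personAge p)))
    _            -> Nothing
```

A template containing $personName and $personAge could then be rendered directly against a Person value.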
I guess my more general question is: how do I reason about the performance of my code, or any code like this? Are there any other performance improvements that could be made?

Also, I would be grateful if someone could provide some feedback on the implementation, anything goes! I still have some known TODOs:

- Improve error messages for invalid uses of "$".
- Improve the regex usage overall.
- Add some more functions; the plan is to add those functions which can only be expressed inefficiently with the current interface. An example is something like renderAndWrite: when writing, doing a B.concat first is unnecessary.

Cheers,
Johan Tibell