
I played with another approach without any parser library, just plain pattern matching. The idea was to create a function that matches all the different cases of codes. Since I already had most of the code, it was quite easy to do. The core function consists of cases like these:
parse ('\ESC':'[':'1':';':'4':'0':'m':rest) =
    modifyAndPrint (\x -> x { bgcol = light black }) >> parse rest
parse ('\ESC':'[':'1':';':'4':'1':'m':rest) =
    modifyAndPrint (\x -> x { bgcol = light red }) >> parse rest
parse ('\ESC':'[':'1':';':'4':'2':'m':rest) =
    modifyAndPrint (\x -> x { bgcol = light green }) >> parse rest
parse ('\ESC':'[':'1':';':'4':'3':'m':rest) =
    modifyAndPrint (\x -> x { bgcol = light yellow }) >> parse rest

If you have read the old code you should recognize some parts of it here. This version should run in roughly constant memory. To my surprise, it consumed almost exactly the same amount of memory as the previous program. It turns out the problematic line was this:
hPutStrLn stderr $ printf "File %s processed. It took %s. File size was %d characters." fname (show $ diffUTCTime t2 t1) (length src)
It computed the length of the input file. Because "src" was the whole input that had already been parsed, keeping a reference to it for this call kept the entire file in memory. With that reference to src removed, both programs (the one that parses the input per line and the most recent one) run in constant memory (2 MB). This doesn't apply to the first program, which has to read the whole file before producing any output.

One last note: the new program is also 2x faster, perhaps because its very simple structure is easy to optimize. It also makes sense now to use mapMPar, as it reduces run time by 30%. The full code is in the attachments.

Best regards
Christopher Skrzętnicki
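
P.S. For readers without the attachments, here is a minimal sketch of one way to keep the character count without holding on to src. It is not the attached code: process, its counter, and the sample input are hypothetical, and it simply skips every "\ESC[...m" sequence instead of updating colours via modifyAndPrint. The input is traversed once and each character is counted as it is consumed, so nothing needs to refer to src after the output is done.

```haskell
import System.IO (hPutStrLn, stderr)
import Text.Printf (printf)

-- Hypothetical stand-in for the real parse function: prints ordinary
-- characters, skips "\ESC[...m" sequences, and returns the number of
-- input characters it consumed.
process :: String -> IO Int
process = go 0
  where
    -- Force the counter on every step so no chain of thunks builds up.
    go :: Int -> String -> IO Int
    go n s = n `seq` step n s

    step :: Int -> String -> IO Int
    step n ('\ESC':'[':rest) =
      let (code, after) = break (== 'm') rest
          -- "\ESC[" plus the parameters plus the final 'm' (if present)
          consumed      = 2 + length code + length (take 1 after)
      in go (n + consumed) (drop 1 after)
    step n (c:rest) = putChar c >> go (n + 1) rest
    step n []       = return n

main :: IO ()
main = do
  -- Sample input (an assumption); in the real program src would come
  -- from a lazy read such as readFile.
  let src = "\ESC[1;41mhello\ESC[0m world\n"
  n <- process src            -- last use of src
  hPutStrLn stderr $ printf "File size was %d characters." n
```

Because the count comes back from the single traversal, the final report no longer needs (length src), and with a lazily read input the already-consumed part can be garbage collected as the program goes.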