
Hello ajb, Friday, July 13, 2007, 12:10:54 PM, you wrote:
>> it seems that you either misunderstood PPM or made too broad assumptions.
> More likely I'm not explaining myself well.
>> in classical fixed-order PPM, each char's probability is found in the context of the previous N chars, i.e. when encoding 'abcdef..' with order 3, 'd' is encoded in the context of 'abc', 'e' in the context of 'bcd', and so on
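(to make "fixed order" concrete, here is a tiny Haskell sketch; it is only illustrative, the name 'contexts' is mine and not from any real coder:)

  -- for a fixed order n, the context of a symbol is exactly the n
  -- preceding symbols; the first n symbols are skipped here (a real
  -- PPM coder escapes down to lower orders for them)
  contexts :: Int -> String -> [(String, Char)]
  contexts n s = [ (take n (drop (i - n) s), s !! i) | i <- [n .. length s - 1] ]

  main :: IO ()
  main = print (contexts 3 "abcdef")
  -- prints [("abc",'d'),("bcd",'e'),("cde",'f')]

note that the context window slides but never grows: order 3 means exactly 3 chars of context for every symbol.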
> Let's suppose that "abcdef" is a context in PPM, and we're starting from the "start" state (which usually isn't the case, of course). Then to encode it, you encode the probability of finding an "a" in the start state, followed by the probability of finding a "b" in the "a" state, followed by the probability of finding a "c" in the "ab" state...
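(for comparison, an illustrative sketch of the scheme you describe, where the context is the whole prefix seen so far rather than a fixed-size window; 'growingContexts' is again just my name for it:)

  -- each symbol conditioned on the entire prefix, starting from the
  -- empty "start" state; the context grows without bound
  growingContexts :: String -> [(String, Char)]
  growingContexts s = [ (take i s, s !! i) | i <- [0 .. length s - 1] ]

  main :: IO ()
  main = print (growingContexts "abcdef")
  -- prints [("",'a'),("a",'b'),("ab",'c'),("abc",'d'),("abcd",'e'),("abcde",'f')]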
sorry, but you really misunderstand how PPM works and why it's so efficient. it looks up the probability of each symbol in the context of the previous N chars, where N is fixed. LZW can be directly compared with the PPM modification you described here, but that modification isn't PPM itself.

also, because PPM encodes each symbol separately, making its probabilities flat effectively disables compression. the key to why flat probabilities still work in LZW is that it encodes a whole word with that flat model.

--
Best regards,
 Bulat                            mailto:Bulat.Ziganshin@gmail.com
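P.S. a back-of-the-envelope sketch of the per-symbol vs. per-phrase point; all numbers here are invented purely for illustration:

  -- with a flat model an arithmetic coder spends log2(choices) bits per
  -- coding event; PPM has one event per SYMBOL, LZW one per PHRASE
  bitsPerEvent :: Double -> Double
  bitsPerEvent = logBase 2

  main :: IO ()
  main = do
    let ppmBits = bitsPerEvent 256        -- flat over a 256-char alphabet
        lzwBits = bitsPerEvent 65536 / 5  -- 64K dictionary, phrases of ~5 chars
    putStrLn ("flat per-symbol (PPM-style): " ++ show ppmBits ++ " bits/symbol")
    putStrLn ("flat per-phrase (LZW-style): " ++ show lzwBits ++ " bits/symbol")

the per-symbol coder is stuck at 8 bits/symbol, no compression at all, while the per-phrase coder amortizes its 16 bits over the whole phrase.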