
G'day.
Quoting Bulat Ziganshin
> well, ppm differs from a simple order-0 coder in that it uses previous symbols to calculate probabilities. lzw differs from an order-0 coder with flat static probabilities in that it encodes a whole word each time.
LZW doesn't have the concept of a "word" at all. Both PPM and LZW build up contexts as they go: each starts with one context per symbol of the input alphabet and extends contexts by a single symbol as new ones are found. (Conceptually, anyway; the algorithms differ in detail, but they do the same job.) The difference is almost entirely in the coding. LZW uses a degenerate model to decide what comes next: it assumes that all "seen" contexts are equally likely, and hence encodes them with a flat binary code. (A "binary" code here means representing the numbers 0 to k-1 with k codewords, each ceil(log2 k) bits long.) PPM uses a different coding scheme, using the observed frequency of each context to drive an arithmetic coder.
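
To make that concrete, here is a rough sketch (in Python) of both halves of that claim: the dictionary starts with one context per byte, grows by one symbol per emitted phrase, and every index goes out as a flat binary code. The bit-string output and code-width handling are illustrative only; a real codec has to keep the decoder's notion of the width in lockstep with the encoder, which this glosses over.

from math import ceil, log2

def lzw_encode_bits(data: bytes) -> str:
    # Dictionary starts with one entry per byte value: the order-0 contexts.
    dictionary = {bytes([i]): i for i in range(256)}
    out = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            current = candidate                      # keep extending the match
        else:
            # Flat binary code: every "seen" context is treated as equally
            # likely, so an index costs ceil(log2(dictionary size)) bits.
            width = ceil(log2(len(dictionary)))
            out.append(format(dictionary[current], "0%db" % width))
            dictionary[candidate] = len(dictionary)  # grow the context by one symbol
            current = bytes([value])
    if current:
        width = ceil(log2(len(dictionary)))
        out.append(format(dictionary[current], "0%db" % width))
    return "".join(out)

print(len(lzw_encode_bits(b"abracadabra abracadabra")), "bits")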
> there is nothing in common between ppm and lzw except this basic order-0 model, so i don't see any reason to call lzw a sort of ppm.
There are a lot of differences between PPM and LZW, but they're based on the same principle: build contexts from what you've seen so far, and use them to predict what the next symbol should be.
> it would be fairer to say that lzw is an order-0 coder with static flat probabilities, do you agree?
That's the _coding_ phase, sure. All known text compression algorithms basically work by building a model of the text to predict the next symbol, and then using that model to drive a coder, which turns what was actually found into some bit representation. PPM and LZW use very different coders, but their models are quite similar.
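
If it helps, here's that model/coder split spelled out for the model side, again as a Python sketch: an adaptive order-0 byte model makes a prediction, and in place of a real arithmetic coder we just charge the ideal -log2(p) bits such a coder would approach. The Laplace (+1) smoothing is an illustrative choice, not anything PPM-specific; real PPM layers higher-order contexts and escape probabilities on top of this.

from math import log2
from collections import Counter

def order0_ideal_bits(data: bytes) -> float:
    counts = Counter()
    seen = 0
    bits = 0.0
    for value in data:
        # Model: predict this byte from the counts seen so far (+1 smoothing
        # so unseen bytes still get a nonzero probability).
        p = (counts[value] + 1) / (seen + 256)
        # Coder: an arithmetic coder would spend about -log2(p) bits on it.
        bits += -log2(p)
        counts[value] += 1
        seen += 1
    return bits

text = b"abracadabra abracadabra"
print("%.1f ideal bits vs %d raw bits" % (order0_ideal_bits(text), 8 * len(text)))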
> you mentioned an algorithm that efficiently encodes english words using huffman codes. does such an algorithm really exist?
Ask Google about "word-based huffman".

Cheers,
Andrew Bromage
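
For anyone who would rather see the shape of the idea before searching, here is a rough Python sketch of word-based Huffman coding. The tokenizer (alternating word/non-word runs) and the tie-breaking are illustrative assumptions, not any particular published scheme.

import heapq
import re
from collections import Counter

def word_huffman_codes(text: str) -> dict:
    # Tokenize into whole words and the separators between them, so the
    # Huffman code is built over tokens rather than single characters.
    tokens = re.findall(r"\w+|\W+", text)
    freq = Counter(tokens)
    # Heap entries: (weight, tie-breaker, {token: code-so-far}).
    heap = [(weight, i, {tok: ""}) for i, (tok, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing 0/1 onto their codes.
        merged = {tok: "0" + code for tok, code in left.items()}
        merged.update({tok: "1" + code for tok, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

codes = word_huffman_codes("the cat sat on the mat and the cat sat still")
for token, code in sorted(codes.items(), key=lambda kv: len(kv[1])):
    print(repr(token), code)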