Rich Pasc
Delphi Developer 
Wed, 18 Jun 1902 08:00:00 GMT
Re:Arithmetic data compression
QuoteFrederic wrote: > With the help of an article I figured out how to compress a short string > (10 characters) using floating point numbers and calculations, according > to the arithmetic compression algorithm.
Funny you should ask. My 1976 Stanford Ph.D. dissertation "Source coding algorithms for fast data compression" was some of the fundamental work on arithmetic compression. Quote> I used the extended data type > to store the result of the compression as it provides the greatest > precision. > How many consecutive characters can I compress at a time when using > extended, considering that only the fractional part of the number is > relevant?
There is no hardandfast answer to that. It depends on the probabilities assigned to the characters, since the length of the codestring grows as their surprise (log(probability)). You can encode many characters which are not surprising (high probability) or few which are suprising (low probability). Quote> It should be more than 10 characters, as an extended variable > is already 10 bytes long.
Not all of the bytes in an "extended" character are devoted to the "fractional" part. But this is true if the characters are highly probable. Quote> I guess that after a certain number of > characters, the results of the calculations become to small to store in > a variable of type extended and a data loss would evidently occur.
True! Quote> Is there a better way to implement arithmetic compression?
Yes. Use multiprecision arithmetic, where the words get extended to arbitrary length as required. This would normally cause run time to grow as the square of the length of the string to be encoded, but there are ways to avoid it by controlling carryover. See, for example, R.B. Arps, T.K. Truong, D.J. Lu, R.C. Pasco, and T.D. Friedman, "A multipurpose VLSI chip for adaptive data compression of bilevel images," IBM Journal of Research & Development, Vol.32, No.6, November, 1988, pp. 775795. Actually, that entire issue was devoted to the Qcoder, a type of arithmetic coder which uses fixedpoint arithmetic and various other techniques. Quote> How does it > stack up against other compression algorithms, both in terms of speed > and compression ratio?
Depends on the data to be compressed, and how well you can model its expected probabilities! Arithmetic coding is not the fastest, but gets mathematically optimum compression, given the right model.  Rich
