Arithmetic Coding

Yoni Elisha ッ
Image Compression
Published in
1 min readSep 18, 2020

We get a group of symbols and probabilities for each of the symbols to show up in our code and want to calculate the encoding of a word, For example:
We want to encode the word “hi” with the symbols:
{a=0.2, h=0.5, i=0.05, t=0.25}.

We divide the symbols in a range from 0 to 1:
- a : [0,0.2)
- h: [0.2,0.7)
- i: [0.7,0.75)
- t: [0.75,1)

From now on we want to use the ranges of the chosen symbols in our word and the following formula in order to calculate the ranges of our next symbols:

R_symbol = [lower_bound, lower_bound+(gap_between_lower_and_higher_bound*P_symbol))

The first symbol is ‘h’ so we will take its range: [0.2,0.7)

The next symbol is ‘i’, so now using the given formula from above, we will calculate the new range for chosing this symbol.

In order to get to the ‘i’ symbol, we first need to calculate R_a, then use it to calculate R_h and then R_i:

  • gap_between_lower_and_higher_bound = 0.7-0.2 = 0.5.
  • R_a = [0.2, 0.2+0.5*0.2) = [0.2, 0.3)
  • R_h = [0.3, 0.3+0.5*0.5) = [0.3, 0.55)
  • R_i = [0.55, 0.55+0.5*0.05) = [0.55, 0.575) -> that’s the final range.

After we got the range of the encoding of our word,
We need to return a single number within this range as an encoding.
Most of the time is will be (lower_bound+higher_bound)/2.
In this case, we get 0.5625.

--

--