Huffman encoding

Huffman coding is a lossless source coding method used for text and image compression. It exploits redundancy in the data, specifically the uneven frequency distribution of the symbols it contains. In a text, for example, certain letters occur more often than others; in English or German text the letter "e" is the most frequent. In an image, colors are likewise unevenly distributed: green, for example, may occur more often than yellow.

In Huffman coding, named after its inventor David Huffman (1925-1999), frequently occurring symbols are replaced by short code words, while rarely occurring symbols are replaced by longer ones. If characters are normally stored with 8 bits each, the most frequent characters might be transmitted with only 3 bits, which saves 5 bits per character. Some rarely occurring characters, in turn, are transmitted with more than eight bits.
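The effect of variable-length code words can be illustrated with a minimal sketch. The code table below is hand-picked for illustration only (it is not derived from real letter statistics), and the names `CODE_TABLE`, `encode`, and `decode` are assumptions of this example. No code word is the beginning of another one, so the bit stream can be decoded without separators.

```python
# Hypothetical prefix code: frequent symbols get short code words.
CODE_TABLE = {"e": "0", "t": "10", "a": "110", "x": "111"}

def encode(text, table=CODE_TABLE):
    """Replace every symbol by its code word and join the bits."""
    return "".join(table[ch] for ch in text)

def decode(bits, table=CODE_TABLE):
    """Read bits until they match a code word, emit the symbol, repeat."""
    inverse = {code: sym for sym, code in table.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            out.append(inverse[buffer])
            buffer = ""
    return "".join(out)

message = "teeeatex"
bits = encode(message)
print(len(bits), "bits instead of", 8 * len(message))
```

With a fixed 8-bit representation the same message would need 64 bits; the sketch needs 14 because the frequent "e" costs only one bit.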

Example of Huffman coding

The length of a code word is thus variable and is determined by the frequency of the symbol it represents. First, the frequency of each letter or pixel color is determined. The two symbols with the lowest frequencies are then combined into a group whose frequency is the sum of its members. This step is repeated with the remaining symbols and groups until a single tree structure remains, and the path from the root of the tree to a symbol yields its code word. Depending on the composition of the signals, this can save 50 percent or more of the transmission time.
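The grouping procedure can be sketched in a few lines of Python. This is one possible way to write it, not a reference implementation: the function name `huffman_codes`, the use of a heap, and the convention of labeling left branches with 0 and right branches with 1 are assumptions of this example. Because ties between equal frequencies can be resolved differently, other valid Huffman codes for the same input exist.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each symbol of `text` to a bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}

    # Heap entries: (weight, tie_breaker, tree).
    # A tree is either a symbol or a (left, right) tuple.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)

    # Repeatedly merge the two lightest entries into one group.
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1

    # Walk the finished tree: left edge = "0", right edge = "1".
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

# Arbitrary sample input; the most frequent letter gets the shortest code.
for symbol, code in sorted(huffman_codes("abracadabra").items()):
    print(symbol, code)
```

The unique tie-breaker in each heap entry keeps Python from ever comparing a symbol with a tuple when two weights are equal.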

Text compression with Huffman coding using the example: ERDBESTATTER
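The size of the encoded example word can be checked with a short sketch. It relies on a property of Huffman trees: the total number of encoded bits equals the sum of the weights of all groups formed while merging. The function name `huffman_total_bits` is an assumption of this example, and the comparison with 8 bits per character assumes a plain one-byte-per-letter encoding.

```python
import heapq
from collections import Counter

def huffman_total_bits(text):
    """Total Huffman-encoded length of `text` in bits (code table not included)."""
    weights = list(Counter(text).values())
    heapq.heapify(weights)
    total = 0
    # Merge the two smallest weights until one group remains;
    # every merge adds one bit to each symbol inside the new group.
    while len(weights) > 1:
        merged = heapq.heappop(weights) + heapq.heappop(weights)
        total += merged
        heapq.heappush(weights, merged)
    return total

word = "ERDBESTATTER"
print(Counter(word))              # E and T occur 3 times, R twice, D, B, S, A once
print(len(word) * 8)              # 96 bits at 8 bits per letter
print(huffman_total_bits(word))   # 32 bits with Huffman coding
```

For this word the encoded text shrinks from 96 to 32 bits, not counting the code table that the receiver also needs.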

Huffman coding is used in H.320 videoconferencing and various video codecs, in Group 3 fax, in the Microcom Networking Protocol (MNP5), and in JPEG.

English:	Huffman encoding
Updated at:	14.12.2012