Generalized Shannon Code Minimizes the Maximal Redundancy

Get BibTex-formatted data

Download

PDF

Author

Michael Drmota and Wojciech Szpankowski

Tech report number

CERIAS TR 2002-12

Entry type

inproceedings

Abstract

Source coding, also known as data compression, is an area of information theory that deals with the design and performance evaluation of optimal codes for data compression. In 1952 Huffman constructed his optimal code that minimizes the average code length among all prefix codes for known sources. Actually, Huffman codes minimize the average redundancy defined as the difference between the code length and the entropy of the source. Interestingly enough, no optimal code is known for other popular optimization criterion such asthemaximal redundancy defined as the maximum of the point-wise redundancy over all source sequences. We first prove that a generalized Shannon code minimizes the maximal redundancy among all prefix codes, and present an efficient implementation of the optimal code. Then we compute precisely its redundancy for memory less sources. Finally, we study universal codes for unknown source distributions. We adopt the minimax approach and search for the best code for the worst source. We establish that such redundancy is a sum of the likelihood estimator and the redundancy of the generalize code computed for the maximum likelihood distribution. This replaces Shtarkov\'s bound by an exact formula. We also compute precisely the maximal minimax for a class of memory less sources. The main findings of this paper are established by techniques that belong to the toolkit of the

Download

PDF

Date

2002

URL

https://www.cerias.purdue.edu/papers/archive/2002-12.ps

Booktitle

Proc. LATIN'02

Institution

CERIAS, Purdue University

Key alpha

Drmota

Publication Date

1900-01-01

BibTex-formatted data

To refer to this entry, you may select and copy the text below and paste it into your BibTex document. Note that the text may not contain all macros that BibTex supports.

@Inproceedings{ Drmota,
	title = "Generalized Shannon Code Minimizes the Maximal Redundancy",
	author = "Michael Drmota and Wojciech Szpankowski",
	year = "2002",
	booktitle = "Proc. LATIN'02",
	institution = "CERIAS, Purdue University",
	abstract = "Source coding, also known as data compression, is an area of information theory that deals with the design and performance evaluation of optimal codes for data compression. In 1952 Huffman constructed his optimal code that minimizes the average code length among all prefix codes for known sources. Actually, Huffman codes minimize the average redundancy defined as the difference between the code length and the entropy of the source. Interestingly enough, no optimal code is known for other popular optimization criterion such asthemaximal redundancy defined as the maximum of the point-wise redundancy over all source sequences. We first prove that a generalized Shannon code minimizes the maximal redundancy among all prefix codes, and present an efficient implementation of the optimal code. Then we compute precisely its redundancy for memory less sources. Finally, we study universal codes for unknown source distributions. We adopt the minimax approach and search for the best code for the worst source. We establish that such redundancy is a sum of the likelihood estimator and the redundancy of the generalize code computed for the maximum likelihood distribution. This replaces Shtarkov\'s bound by an exact formula. We also compute precisely the maximal minimax for a class of memory less sources. The main findings of this paper are established by techniques that belong to the toolkit of the ",
	url = "https://www.cerias.purdue.edu/papers/archive/2002-12.ps",
}