Formally, a q-ary [n, M, d] error-correcting code is a set of M vectors over the finite field $ \mathbb {F}$qn with the property that any two vectors in the code are separated by a Hamming distance of no less than d. The Hamming distance between two vectors is simply the number of positions in which the vectors differ. For instance, the vectors 1011 and 1002 differ in their two last positions, and are therefore separated by a Hamming distance of 2.

It is often useful to restrict our attention to vectors which form a linear subspace of $ \mathbb {F}$qn. Thus, a q-ary (n, k, d ) linear code is a subspace of $ \mathbb {F}$qn of dimension k with the property that any two vectors in the subspace are separated by a Hamming distance no less than d. Figure 1.1 depicts a ternary non-linear code and a binary linear code.

Figure 1.1: A sample [n = 6, M = 3, d = 4] ternary non-linear and a (n = 6, k = 2, d = 4) binary linear code.
\begin{figure}\centering\subfigure[]%%[Vectors of a ternary non-linear $[n=6,M=3...
...& 1\\
1& 1& 1& 1& 0& 0\\
1& 1& 0& 0& 1& 1\\ \hline

We will typically specify a linear code by its set of basis vectors (also known as generators), which form the rows of a generator matrix G. A parity-check matrix for a linear code is a matrix H with the property that GHT = 0 for some generator matrix G of the code. For example, the linear code in Figure 1.1 has parity-check matrix

H = $\displaystyle \left[\vphantom{ \begin{array}{cccccc} 1&1&0&0&0&0\\  0&0&1&1&0&0\\  1&0&1&0&1&0\\  1&0&1&0&0&1 \end{array} }\right.$$\displaystyle \begin{array}{cccccc} 1&1&0&0&0&0\\  0&0&1&1&0&0\\  1&0&1&0&1&0\\  1&0&1&0&0&1 \end{array}$$\displaystyle \left.\vphantom{ \begin{array}{cccccc} 1&1&0&0&0&0\\  0&0&1&1&0&0\\  1&0&1&0&1&0\\  1&0&1&0&0&1 \end{array} }\right]$ (9)

as can be verified explicitly.

Linear codes impose a natural coset-structure on their ambient vector spaces. Specifically, given a linear code $\ensuremath{\mathbb{C}}\subseteq \mathbb{F}_q^n$, we can say that two vectors v1, v2$ \mbox{$\,\inn\,$}$$ \mathbb {F}$qn are in the same coset if and only if $v_1 - v_2 \mbox{$\,\inn\,$}\ensuremath{\mathbb{C}}$. Moreover, we can represent each coset by a coset leader which is arbitrarily chosen among the lowest weight vectors in a coset. For example, some cosets of the linear code in Figure 1.1 are:

C0 = {000000, 001111, 111100, 110011}  
C1 = {000001, 001110, 111101, 110010}  
C2 = {000011, 001100, 111111, 110000}  

The coset leaders are shown in bold face. In general, for a linear code of dimension k, there will be 2n-k coset leaders. These cosets can be used in decoding, because e is the coset leader of a corrupted vector c + e whenever c is a codeword and the Hamming weight of e is below the correctable error rate of the code. More importantly, however, they will be used to generate and study lexicodes in Chapter 2.

To use an (n, k, d ) binary linear code over a communications channel, a user would split his input into blocks of k bits each and encode each block b by multiplying it by the generator matrix G. The resulting n-bit vector bG is sent over the channel giving a transmission rate of k/n. The receiver then decodes the received error-corrupted vector bG + e to the nearest vector v in the code. The vector v will be equal to bG as long as the error e contains no more than $\ensuremath {\left\lfloor \frac{d-1}{2} \right\rfloor}$ bits. Figure 1.2 depicts this transmission model. The type of decoding we have described is also known as maximum-likelihood decoding because it decodes to the most likely transmitted input. Several descriptions of this communications model may be found in [42], [7], and [4].

Figure 1.2: The transmission model: an input message 1011 is split into blocks of 2 bits each, and then encoded by the generator matrix of the linear code in Figure 1.1. It is transmitted over a noisy channel (errors shown in bold face) and then decoded to the nearest codeword and back into the message word 1011.
\fbox{10 \quad 11} $ \longrightarrow$ \fbox{
\begin{bmatrix}1& 0\\ \end{bmatrix}\...
1& 1& 0& 0& 1& 1\\

$ \;\stackrel{{\mbox{noise}}}{{\-\-\-\-\-\-\-\longrightarrow}}\;$ \fbox{
\mathbf{1}& 0& 1&...
1& 1& 0& \mathbf{1}& 1& 1\\
\end{bmatrix}\end{align*}\end{minipage}} $ \longrightarrow$ \fbox{10 \quad 11}

Often when we talk about a family of linear error-correcting codes, we shall be interested in the asymptotic tendency of the code. Specifically, we would like that, as the length n increases, the error-correction capability of the code likewise increases. It has been proven that there exist families of linear codes which achieve the Gilbert-Varshamov bound [42], namely

$\displaystyle {\frac{{k}}{{n}}}$ $\displaystyle \geq$ 1 - H2$\displaystyle \left(\vphantom{ \frac{d}{n} }\right.$$\displaystyle {\frac{{d}}{{n}}}$$\displaystyle \left.\vphantom{ \frac{d}{n} }\right)$. (9)

Here H2( . ) refers to the binary entropy function

H2(p) = - p log(p) - (1 - p)log(1 - p).

On the other hand, the McEliece, Rodemich, Rumsey, and Welch [4] provide a well known upper bound on the parameters of linear codes

$\displaystyle {\frac{{k}}{{n}}}$ $\displaystyle \lesssim$ H2$\displaystyle \left(\vphantom{ \frac{1}{2} - \sqrt{\frac{d}{n}\left( 1 - \frac{d}{n} \right)} }\right.$$\displaystyle {\frac{{1}}{{2}}}$ - $\displaystyle \sqrt{{\frac{d}{n}\left( 1 - \frac{d}{n} \right)}}$$\displaystyle \left.\vphantom{ \frac{1}{2} - \sqrt{\frac{d}{n}\left( 1 - \frac{d}{n} \right)} }\right)$. (9)

Thus, the asymptotically optimal codes are somewhere in the region between these two bounds, as depicted in Figure 1.3. For our purposes, we shall consider a linear code asymptotically good if it asymptotically bounds the relative minimum distance d /n away from 0, for information rates k/n < 1 and length n going to infinity. Our definition of ``asymptotically good'' is deliberately weak, yet we shall show that certain families of codes cannot even achieve this weak definition.

Figure 1.3: A plot of the McEliece-Rodemich-Rumsey-Welch bound (upper) and the Gilbert-Varshamov bound (lower) on the parameters of linear codes. The information rate R = k/n is plotted versus the relative minimum distance $ \delta$ = d /n, for asymptotically large n. The gray area marks the gap between the known upper and lower bounds.