We start with a simple theorem, which gives a tight upper bound on the minimum distance of high-rate cycle-free linear codes.
Proof. Let H be the r×n cycle-free parity-check matrix for . We assume without loss of generality that H has full row-rank and that r = n - k, since otherwise we can remove the linearly dependent rows of H while preserving the cycle-free property. Let denote the number of columns of weight i in H. If 0 then d = 1, and we are done. Otherwise, we have at least
Theorem 3.3 implies that the (n, n-1, 2) code n, whose parity-check matrix consists of a single all-1 vector, is, in a sense, the optimal cycle-free code of rate 0.5. Since all such codes have distance d2 and n has the highest rate. The cycle-free Tanner graph for n is depicted in Figure 3.2a.
To show that the bound of Theorem 3.3 is tight for all n and k, with n/2kn - 1, we may start with the single-parity-check code k+1 and repeat any symbol (or symbols) in k+1 until a code of length n is obtained. The following lemma shows that this always produces an (n, k, 2) cycle-free code for kn/2.
The length and dimension of
are obvious. To see that
is cycle-free, observe that a Tanner graph T' for
obtained from the cycle-free Tanner graph T for
two new vertices x' and y' and two new edges: (x', y') and
(xi, y'). Clearly this procedure does not create
width4pt depth2pt height6pt
Let be a cycle-free code of length n, and let be the code of length n + obtained from by iteratively applying times the procedure of Lemma 3.4, while possibly choosing a different value of i at different iterations. We then say that is a code obtained by repeating symbols in . To make our terminology precise, we further extend the notion of codes obtained by ``repeating symbols in '' to also include the codes obtained from by appending all-zero coordinates. The following proposition shows that every low-rate cycle-free linear code has this structure.
Let H be the
r×n cycle-free parity-check matrix for
with r = n - k. Then by Lemma 3.2 and (3.2), we have
where the second inequality follows from the fact that
This implies that H contains at least one row of
weight 2. If this row is of weight one, then the corresponding
, say the n-th coordinate, is all-zero. Otherwise,
assume without loss of generality that this row is of the form
h = (0, 0,..., 0,*,*).
Then, up to scaling the last two columns of H by constants in
we may further assume that
h = (0, 0,..., 0, 1, - 1).
This would mean that the n-th symbol in
is a repetition
of the preceding symbol. In both cases, we can puncture-out the n-th
, and recursively repeat the argument until a cycle-free
code of rate > 0.5 is obtained.
width4pt depth2pt height6pt
Loosely speaking, Proposition4 implies that every cycle-free code of rate 0.5 can be represented by a Tanner graph whose structure is shown in Figure 3.2b. The dashed line in Figure 3.2b encloses a cycle-free Tanner graph for a code of rate > 0.5 and distance 2. It follows that to establish a bound on the minimum distance of low-rate cycle-free codes, we need to determine an optimal choice for in Figure 3.2b and an optimal sequence of symbol repetitions. This problem is considered in detail in the next two sections.
Specifically, we will show in Section 3.4 and Section 3.5 that the single-parity-check code constitutes an optimal choice for , and every symbol should be repeated equally often.