Results and Conclusions
Two approaches were taken to implementing the BCH decoder. In the first approach, a look-up table was created that stored all the powers of the primitive element α. In the second approach, the required powers of α were computed at run time. All coding was done in the C programming language on a Linux platform.
Results of the two approaches:
The graphs below show the performance of both approaches for various parameters.
The graph above shows the variation of the reconciliation time with the number of differences between the sets. The dotted line is a linear fit to the values obtained with the implementation that used look-up tables. The axes are colored to match the plots they refer to.
The graph above shows the variation of the reconciliation time with respect to the length of the codewords.
Comparison between the performance of the Characteristic Polynomial Interpolation (CPI) scheme and the Error-Correcting-Code (BCH code) scheme:
The table below lists the time taken, in milliseconds, to synchronize two sets with 16-bit elements for various numbers of differences between them.
The graph below shows a plot of the data from the table:
Comparison of the BCH scheme and the CPI scheme for 16-bit vectors. Note: All data were collected by running the code on a Pentium III with 196 MB of RAM, running Linux (Red Hat 7.0).
Conclusions drawn from the BCH Error-Correcting-Code Set-Reconciliation scheme:
From Graph 1, it appears that the reconciliation time for the BCH scheme, both with and without look-up tables, grows linearly as the number of differences increases. The amount of memory required by the BCH scheme when implemented with a look-up table is, however, exponential in the length of the code (Θ(2^n)). As is evident from Graph 1, there is a significant difference in the time taken by the two approaches: almost 69 seconds for 130 differences between the sets. From Graph 3, it is also clear that for small numbers of differences the CPI scheme takes less time, but as the number of differences grows, the running time of the BCH scheme increases roughly linearly while that of the CPI scheme increases quadratically. The main drawback of the BCH scheme is that the constants involved in the running time are relatively large; even between the two BCH approaches, the look-up-table implementation has a much smaller constant than the run-time computation.
From Graph 2, it is clear that the running time of the BCH scheme, in both approaches, is exponential in the length of the codeword. This determines the constant term in the running time for a given codeword length.
The maximum size of the look-up table is bounded by the amount of memory available and by the maximum value an integer variable can take, since the array index must be an integer.
The large constants involved in the scheme without look-up tables make it impractical for set reconciliation. Likewise, the amount of memory required by the scheme with look-up tables is impractical when synchronizing large databases whose contents may require representation with up to 512 bits.
A simple analysis of the BCH decoding scheme reveals that most of the time (more than 99% of the total running time) is taken by the final stage of the decoding process, the Chien search. Each bit of the characteristic vector has to be examined to check whether it is in error. If the data elements are represented by 16 bits, the characteristic vector has length 2^16, and each of its bits must be checked in the Chien search. This suggests using the scheme in scenarios where either only deletions or only insertions take place. If we knew beforehand that a particular server will make only deletions in its data set, we would only need to examine each bit of the client's characteristic vector that is "set" to see whether it is in error. Conversely, if we know beforehand that the server will make only insertions, we can model the insertions as deletions and check only the "unset" bits of the client's characteristic vector. The running time of such an operation depends only on the number of entries in the set. This optimization is useful only if the server's data set has a small number of initial entries on which insertions are performed, or a very large number of entries on which deletions are performed.
Implementation issues and conclusions for Tornado codes:
Tornado codes are difficult to implement, as there is not much documentation available on them, and designing good graphs for the codes involves complex differential equations. The code used in the project had a very low rate (less than 1/2). The basic idea behind using error-correcting codes for the set reconciliation problem is to use the redundant bits of the code to synchronize two hosts. In this implementation the number of redundant bits exceeded the number of message bits, so it was not really practical to apply it to the set reconciliation problem.
Implementation issues and conclusions for LDPC codes:
An implementation of LDPC codes developed by Radford Neal [2] and David MacKay [3] was used in the study. The software is capable of simulating memoryless channels, including binary symmetric channels and additive white Gaussian noise channels.
The code was not able to decode some arbitrary random vectors. When synchronizing databases that contain critical information, it is absolutely necessary that the two hosts end up reflecting the same data. The fact that the LDPC codes cannot correct all possible error patterns therefore makes them a poor choice for the set reconciliation problem.
Although the results of this project clearly indicate that error-correcting codes can be used for the set reconciliation problem, most of the codes available at present are not well suited to this purpose.