Introduction

The Reconciliation Problem:

Some of the common applications where synchronization plays a vital role are Domain Name Servers (DNS) that constantly have to map new IP addresses to new web pages and also remove records of web pages that are no longer in service. Different DNS servers in a network have to constantly synchronize with each other to maintain consistent data. Another common application that employs synchronization is databases that maintain resources available in a network. Resources are continually added to and taken off from a network. At any given time, any person who has logged into the network needs to know all the available resources and this list should be complete and up to date.  Yet another common area where synchronization plays a key role is in the maintaining of bank accounts over distributed databases across a large country. Different servers maintain the same records of the databases across geographically separated locations for faster access of records and these records need to be perfectly in synchronization. Nowadays PDA’s are very common.  PDA’s need to synchronize between various devices to update their records. Some of the common methods employed by PDA’s for synchronization are:

 

Slow Sync, where the PDA and a PC or laptop can transfer data back and forth. The record of one device is transferred in its entirety to the other device regardless of whether the other device has a part of the record that is being transferred.
 

Hot Sync, where the two devices can synchronize their data and make updates only on those records that have been updated since the last synchronization. The drawback here is that the same two devices need to be always involved with the synchronization.
 

The overall problem of synchronization can be visualized as, a group of devices, each maintaining a database and each device should keep track of the changes taking place in each of the other devices and present the most up to date information to any application or user.

 

For the sake of analysis, the problem can be formalized as follows:

Given a pair of devices or nodes in a network, A and B, each with a set of length-b bit strings (denoted as Sa and Sb), and no a- priori knowledge of the other host set, each host has to determine the mutual differences between the two sets with a minimal amount of communication. Minimality is measured in terms of

number of bits transmitted across the link,
number of times each host communicates with the other, and
computational complexity of the entire reconciliation.
 

This is known as the Set Reconciliation problem.

 

The data represented in each host or device is considered as sets whose elements are chosen from a finite universal set U.

 

Every set S whose elements are taken from U may be associated uniquely with a characteristic vector v(S) of length |U|. The components of the characteristic vector are either a 0 or a 1.

 

The i-th component of the characteristic vector of set S is 1 if and only if the i-th element of the universal set U occurs in the set.

 

For example, the characteristic vector of the set :

S={3,5,9}   taken over the universal set U=Z24 would be:

 

V(S) = 0    0    0    1    0    1    0    0    0    1    0    0    0    0    0    0

 

Reconciling two sets involves determining the mutual difference between two such characteristic vectors.

 
Synchronization Issues :


Each type of networked device that must synchronize with another, usually is capable of only a select few methods of synchronization depending on the application running on each device. In some cases the bandwidth available to certain devices limits the speed and methods of synchronization. For devices like PDA’s the computational power of the processor is minimal and cannot be expected to perform complex procedures to limit the amount of communication bits or the number of rounds of communication. An optimal synchronization protocol should be capable of synchronizing any two heterogeneous or homogeneous devices with minimum amount of communication bits, minimum computation complexity and minimum number of rounds of communication between the devices.