Wrap Filters - A new non-interactive method to determine sizes of deviation between similar data sets - 02/28/2002
Sachin Agarwal
In many problems related to computer networks and database consistency in lazy replication of distributed databases, it is important to gauge the extent of dissimiliarity between two datasets in order to make intelligent decisions about when to initiate synchronization procedures and to estimate the level of deviation in databases that are only weakly consistent. We propose, describe and analyze a new algorithm based on "Wrap Filters" to estimate this information. Our scheme has low communication complexity and requires only one round of communication in order for two hosts to determine an estimate of the size of the difference between them. Moreover, the information exchanged may also be used as a normal bloom filter to estimate what set elements the local set has that the remote host's set does not. We are thus able to extend the functionality of the bloom filter and make it more informative, at a modest cost in terms of the computation and communication.