Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/64296
Type: Conference paper
Title: Improved Decaying Bloom Filter for duplicate detection in data streams over sliding windows
Author: Wang, X.
Shen, H.
Citation: Proceedings - 2010 3rd IEEE International Conference on Computer Science and Information Technology, ICCSIT 2010 / Yi Hang, Wen Desheng, P. S. Sandhu (eds.): vol. 4, pp. 348-353
Publisher: IEEE
Publisher Place: USA
Issue Date: 2010
Series/Report no.: International Conference on Computer Science and Information Technology
ISBN: 9781424455379
ISSN: 2381-3458
Conference Name: IEEE International Conference on Computer Science and Information Technology (3rd : 2010 : Chengdu, China)
Editor: Hang, Y.
Desheng, W.
Sandhu, P.S.
Statement of Responsibility: Xiujun Wang, Hong Shen
Abstract: Approximate duplicate detection based on the Decaying Bloom Filter (DBF) for data streams over sliding windows (DDMDBF) is an effective technique, but it may have a large false positive rate, because it simply treats a queried element as a duplicate when the counters that the element hashes to are non-zero, while neglecting the actual values of those counters. In this paper, we propose a new data structure, the Flag Decaying Bloom Filter (FDBF), which maintains duplicate information more accurately by extending DBF with one additional flag bit for each integer counter. We then propose an efficient approximate duplicate detection method (DDMFDBF) based on FDBF that reduces the false positive rate (FPR) p (0 < p < 1) of DDMDBF by a factor of p^(1-√2) for approximately the same bit space. Experimental results on synthetic data validate the analytical results on the efficiency and accuracy of our method.
Keywords: Counting Bloom Filter
Decay Bloom Filter
Flag Decaying Bloom Filter
False Positive
Duplicate Detection
Description: Conference also known as: ICCSIT 2010
Rights: ©2010 IEEE
DOI: 10.1109/ICCSIT.2010.5564586
Published version: http://dx.doi.org/10.1109/iccsit.2010.5564586
Appears in Collections: Aurora harvest
Computer Science publications
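
Illustrative sketch: the abstract describes the baseline DBF check as reporting a duplicate whenever every counter the element hashes to is non-zero. The Python below is a minimal sketch of that baseline rule only, not the authors' implementation. The class name, method names, parameters, the salted SHA-256 hashing, and the check/decay/refresh ordering are all assumptions made for illustration. The flag-bit logic of FDBF is not specified in the abstract, so it is not attempted here.

```python
import hashlib


class DecayingBloomFilter:
    """Hypothetical baseline DBF over a count-based sliding window."""

    def __init__(self, num_counters: int, num_hashes: int, window: int):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes
        self.window = window

    def _positions(self, item: str):
        # Assumption: k positions derived from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.counters)

    def decay(self) -> None:
        # One arrival passes: every non-zero counter loses one unit.
        self.counters = [c - 1 if c > 0 else 0 for c in self.counters]

    def seen_then_insert(self, item: str) -> bool:
        positions = list(self._positions(item))
        # Baseline rule: duplicate if all hashed counters are non-zero;
        # the actual counter values are ignored.
        duplicate = all(self.counters[p] > 0 for p in positions)
        self.decay()
        # Refresh so the element stays visible for `window` later arrivals.
        for p in positions:
            self.counters[p] = self.window
        return duplicate
```

In this sketch, an element inserted with `window=2` is still reported as a duplicate after one intervening arrival and is forgotten after two. Because the check looks only at whether counters are non-zero, counters refreshed by other elements can make an unseen element look like a duplicate, which is the source of false positives the abstract refers to. Decaying every counter on each arrival costs time linear in the number of counters; that choice is for clarity only.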
