Bin Bloom Filter Using Heuristic Optimization Techniques for Spam Detection
Authors: N. Arulanand, K. Premalatha
Abstract:
A Bloom filter is a probabilistic, memory-efficient data structure designed to answer rapidly whether an element is present in a set. A negative answer means the element is definitely not in the set, whereas a positive answer means it is present only with a certain probability. The trade-off in using a Bloom filter is a configurable risk of false positives. The probability of a false positive can be made very low if the number of hash functions is chosen appropriately. For spam detection, a weight is attached to each set of elements. The spam weight of a word is a measure used to rate the e-mail. Each word is assigned to a Bloom filter based on its weight. The proposed work introduces an enhanced Bloom filter concept called the Bin Bloom Filter (BBF). The performance of the BBF relative to the conventional Bloom filter is evaluated under various optimization techniques. A real-time data set and synthetic data sets are used for experimental analysis, and results are reported for bin sizes 4, 5, 6 and 7. The results show that the BBF using heuristic techniques performs better than the traditional Bloom filter in spam detection.
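As an illustrative sketch only (not the authors' implementation), the membership test and the weight-based assignment of words to filters described in the abstract might look like the following in Python. The class names `BloomFilter` and `BinBloomFilter`, the double-hashing scheme built on SHA-256, the bin upper bounds and the example words and weights are all assumptions made for illustration; the paper's actual hash functions, bin boundaries and optimization procedure are not stated here. The helper `false_positive_rate` uses the standard approximation (1 - e^(-kn/m))^k for a filter of m bits holding n elements with k hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit: simple, not compact

    def _positions(self, item):
        # Derive k positions from two 64-bit hashes (double hashing).
        # This hashing scheme is an assumption, not taken from the paper.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[p] for p in self._positions(item))


def false_positive_rate(m, n, k):
    """Standard approximation of the false positive probability."""
    return (1.0 - math.exp(-k * n / m)) ** k


class BinBloomFilter:
    """One Bloom filter per weight bin (hypothetical layout)."""

    def __init__(self, upper_bounds, m, k):
        # upper_bounds: ascending upper weight limit of each bin (assumed).
        self.upper_bounds = list(upper_bounds)
        self.filters = [BloomFilter(m, k) for _ in self.upper_bounds]

    def _bin_index(self, weight):
        for i, bound in enumerate(self.upper_bounds):
            if weight <= bound:
                return i
        return len(self.upper_bounds) - 1  # clamp weights above the last bound

    def add(self, word, weight):
        self.filters[self._bin_index(weight)].add(word)

    def lookup(self, word):
        """Return the index of the first bin reporting the word, else None."""
        for i, bloom in enumerate(self.filters):
            if word in bloom:
                return i
        return None


# Usage with made-up words and spam weights in [0, 1].
bbf = BinBloomFilter(upper_bounds=[0.25, 0.5, 0.75, 1.0], m=1024, k=3)
bbf.add("meeting", 0.1)
bbf.add("lottery", 0.9)
print(bbf.lookup("meeting"), bbf.lookup("lottery"))
print(false_positive_rate(m=1024, n=100, k=3))
```

Giving each bin its own filter means the bins could, in principle, be sized differently; the sketch uses a single size for all bins for brevity. Under the approximation used here, the false positive rate for fixed m and n is lowest near k = (m/n) ln 2 and rises again when k grows well beyond that.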
Keywords: Cuckoo search algorithm, Lévy flight, metaheuristic, optimal weight.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1095923