A Hybridization of Constructive Beam Search with Local Search for Far From Most Strings Problem
Authors: Sayyed R Mousavi
Abstract:
The Far From Most Strings Problem (FFMSP) is to obtain a string which is far from as many as possible of a given set of strings. All the input and the output strings are of the same length, and two strings are said to be far if their hamming distance is greater than or equal to a given positive integer. FFMSP belongs to the class of sequences consensus problems which have applications in molecular biology. The problem is NP-hard; it does not admit a constant-ratio approximation either, unless P = NP. Therefore, in addition to exact and approximate algorithms, (meta)heuristic algorithms have been proposed for the problem in recent years. On the other hand, in the recent years, hybrid algorithms have been proposed and successfully used for many hard problems in a variety of domains. In this paper, a new metaheuristic algorithm, called Constructive Beam and Local Search (CBLS), is investigated for the problem, which is a hybridization of constructive beam search and local search algorithms. More specifically, the proposed algorithm consists of two phases, the first phase is to obtain several candidate solutions via the constructive beam search and the second phase is to apply local search to the candidate solutions obtained by the first phase. The best solution found is returned as the final solution to the problem. The proposed algorithm is also similar to memetic algorithms in the sense that both use local search to further improve individual solutions. The CBLS algorithm is compared with the most recent published algorithm for the problem, GRASP, with significantly positive results; the improvement is by order of magnitudes in most cases.
Keywords: Bioinformatics, Far From Most Strings Problem, Hybrid metaheuristics, Matheuristics, Sequences consensus problems.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1078319
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1753References:
[1] J. K. Lanctot, M. Li, B. Ma, S. Wang, and L. Zhang, "Distinguishing string selection problems," in SODA -99: Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 1999, pp. 633-642.
[2] J. K. Lanctot, "Some string problems in computational biology," Ph.D. dissertation, Univesity of Waterloo, 2000.
[3] J. K. Lanctot, M. Li, B. Ma, S. Wang, and L. Zhang, "Distinguishing string selection problems," Inf. Comput., vol. 185, no. 1, pp. 41-55, 2003.
[4] C. N. Meneses, C. A. S. Oliveira, and P. M. Pardalos, "Optimization techniques for string selection and comparison problems in genomics," Engineering in Medicine and Biology Magazine, IEEE, vol. 24, no. 3, pp. 81-87, May-June 2005.
[5] P. Festa, "On some optimization problems in molecular biology," Mathematical Biosciences, vol. 207, no. 2, pp. 219 - 234, 2007, bIOCOMP2005 Special Issue. (Online). Available: http://www.sciencedirect.com/science/article/B6VHX-4N0X60P- 2/2/3a9aa7c9fde1693bef0402f190eb1a36
[6] G. D. Cohen, M. G. Karpovsky, H. F. M. Jr., and J. R. Schatz, "Covering radius - survey and recent results," IEEE Transactions on Information Theory, vol. 31, no. 3, pp. 328-343, 1985.
[7] A. J. L. Macario and E. C. de Macario, Gene probes for bacteria. Academic Press, 1990.
[8] M. Li, B. Ma, and L. Wang, "Finding similar regions in many strings," in STOC -99: Proceedings of the thirty-first annual ACM symposium on Theory of computing. New York, NY, USA: ACM, 1999, pp. 473-482.
[9] J. Gramm, F. Hffner, and R. Niedermeier, "Closest strings, primer design, and motif search," in Sixth Annual International Conference on Computational Molecular Biology, April 2002, pp. 74-75.
[10] S. T. Crooke and B. Lebleu, Antisense Research and Applications. CRC Press, 1993.
[11] M. Li, B. Ma, and L. Wang, "On the closest string and substring problem," Journal of the ACM, vol. 49, no. 2, pp. 157-171, March 2002.
[12] C. N. de Meneses, Z. Lu, C. A. S. Oliveira, and P. M. Pardalos, "Optimal solutions for the closest-string problem via integer programming," INFORMS Journal on Computing, vol. 16, no. 4, pp. 419-429, 2004.
[13] J.-C. Chen, "Iterative rounding for the closest string problem," CoRR, vol. abs/0705.0561, 2007.
[14] S. C. Sahinalp, S. Muthukrishnan, and U. Dogrus¨oz, Eds., Combinatorial Pattern Matching, 15th Annual Symposium, CPM 2004, Istanbul,Turkey, July 5-7, 2004, Proceedings, ser. Lecture Notes in Computer Science, vol. 3109. Springer, 2004.
[15] B. Ma and X. Sun, "More efficient algorithms for closest string and substring problems," in Research in Computational Molecular Biology: 12th Annual International Conference RECOMB 2008, April 2008, pp. 396-409.
[16] J. Wang, J. Chen, and J. M. Huang, "An improved lower bound on approximation algorithms for the closest substring problem," Information Processing Letters, vol. 107, no. 1, pp. 24-28, June 2008.
[17] J. Gramm, "Closest substring," in Encyclopedia of Algorithms, 2008.
[18] F. C. Gomes, C. N. de Meneses, P. M. Pardalos, and G. V. R. Viana, "A parallel multistart algorithm for the closest string problem," Computers & OR, vol. 35, no. 11, pp. 3636-3643, 2008.
[19] C. H. Cheng, C. C. Huang, S. Y. Hu, and K. Chao, "Efficient algorithms for some variants of the farthest string problem," in 21st Workshop on Combinatorial Mathematics and Computation Theory, may 2004, pp. 266-273.
[20] C. N. Meneses, P. M. Pardalos, M. G. C. Resende, and A. Vazacopoulos, "Modeling and solving string selection problems," in Second International Symposium on Mathematical and Computational Biology, December 2005, pp. 54-64.
[21] J. Gramm, J. Guo, and R. Niedermeier, "On exact and approximation algorithms for distinguishing substring selection," in FCT, 2003, pp. 195-209.
[22] J. Gramm, R. Niedermeier, and P. Rossmanith, "Fixed-parameter algorithms for closest string and related problems," Algorithmica, vol. 37, no. 1, pp. 25-42, 2003.
[23] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
[24] T. Feo and M. Resende, "A probabilistic heuristic for a computationally difficult set covering problem," Operations Research Letters, vol. 8, pp. 67-71, 1989.
[25] T. A. Feo and M. G. C. Resende, "Greedy randomized adaptive search procedures," Journal of Global Optimization, vol. 6, no. 2, pp. 109-133, 1995.
[26] P. Festa and M. Resende, "Grasp: An annotated bibliography," in Essays and Surveys on Metaheuristics. Kluwer Academic Publishers, 2002, pp. 325-367.
[27] P. Festa and M. G. C. Resende, "An annotated bibliography of GRASP; part I: Algorithms," International Transactions in Operational Research, vol. 16, no. 1, pp. 1-24, 2009.
[28] M. Babaei and S. R. Mousavi, "A memetic algorithm for closest string problem and farthest string problem," in ICEE -10: Proceedings of the 18th Iranian Conference on Electrical and Electronics Engineering, to appear. IEEE, 2010.
[29] S. R. Mousavi, M. Babaei, and M. Montazerian, "An improved heuristic for the far from most strings problem," submitted for publication.
[30] H. A. Peelle, "Euclid, fibonacci, and pascal recursed," International Journal of Mathematical Education in Science and Technology, vol. 6, no. 4, pp. 395-405, November 1975.
[31] A. W. F. Edwards, Pascal-s Arithmetical Triangle: The Story of a Mathematical Idea. Johns Hopkins University Press, 2002.