Leveraging Quality Metrics in Voting Model Based Thread Retrieval

Atefeh Heydari; Mohammadali Tavakoli; Zuriati Ismail; Naomie Salim

Abstract:

Online forums have become popular in recent years as places to seek and share knowledge. Although they are valuable sources of information, their messages come from a wide variety of sources, so retrieving reliable threads with high-quality content remains difficult. The majority of existing information retrieval systems ignore the quality of retrieved documents, particularly in the field of thread retrieval. In this research, we present an approach that employs various quality features to assess the quality of retrieved threads. These features capture different aspects of content quality, including completeness, comprehensiveness, and politeness, which makes it possible to find threads within a forum that are not only textually but also conceptually relevant to a user query. To analyse the influence of the features, we used an adapted version of voting model thread search as the retrieval system. Over multiple runs, we equipped it with each feature individually and then with various combinations of features. The results show that incorporating the quality features significantly enhances the effectiveness of the utilised retrieval system.

Keywords: Content quality, Forum search, Thread retrieval, Voting techniques.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1338656
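
The abstract describes equipping a voting model thread search with quality features. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a CombSUM-style voting technique (each retrieved post adds its relevance score to its parent thread) and a simple multiplicative quality boost. The function name `rank_threads`, the weight `alpha`, the combination formula, and all example scores are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def rank_threads(post_scores, post_to_thread, thread_quality, alpha=0.5):
    """Rank threads by summed post votes, boosted by a quality score.

    post_scores:    post id -> relevance score of the post for the query
    post_to_thread: post id -> id of the thread that contains the post
    thread_quality: thread id -> quality score in [0, 1] (hypothetical feature)
    alpha:          weight of the quality boost (assumed; 0 disables it)
    """
    # CombSUM-style voting: every retrieved post votes for its thread
    # with a strength equal to its own relevance score.
    votes = defaultdict(float)
    for post_id, score in post_scores.items():
        votes[post_to_thread[post_id]] += score

    # Combine the vote total with the query-independent quality score.
    ranked = []
    for thread_id, vote in votes.items():
        quality = thread_quality.get(thread_id, 0.0)
        ranked.append((thread_id, vote * (1.0 + alpha * quality)))

    # Highest combined score first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Illustrative data only: three retrieved posts from two threads.
post_scores = {"p1": 0.9, "p2": 0.4, "p3": 0.8}
post_to_thread = {"p1": "t1", "p2": "t1", "p3": "t2"}
thread_quality = {"t1": 0.2, "t2": 1.0}

baseline = rank_threads(post_scores, post_to_thread, thread_quality, alpha=0.0)
boosted = rank_threads(post_scores, post_to_thread, thread_quality, alpha=1.0)
print([thread_id for thread_id, _ in baseline])
print([thread_id for thread_id, _ in boosted])
```

With `alpha=0.0` the ranking depends on votes alone, so thread `t1` (two matching posts) is ranked first. With `alpha=1.0` the higher-quality thread `t2` overtakes it, which is the kind of effect a quality-aware retrieval run is meant to expose.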

