Automated Fact-Checking By Incorporating Contextual Knowledge and Multi-Faceted Search
Authors: Wenbo Wang, Yi-fang Brook Wu
Abstract:
The spread of misinformation and disinformation has become a major concern, particularly with the rise of social media as a primary source of information for many people. Automated fact-checking has emerged as a safeguard against this phenomenon. Existing fact-checking approaches aim to determine whether a news claim is true or false, and they have achieved decent veracity prediction accuracy. However, state-of-the-art methods rely on manually verified external information to assist the checking model in making judgments, which requires significant human resources. This study presents a framework, SAC, which focuses on 1) augmenting the representation of a claim by incorporating additional context from general-purpose, comprehensive, and authoritative data; 2) developing a search function to automatically select relevant, new, and credible references; and 3) attending to the parts of the representations of a claim and its references that are most relevant to the fact-checking task. The experimental results demonstrate that 1) augmenting the representations of claims and references through the use of a knowledge base, combined with the multi-head attention technique, improves fact-checking performance; and 2) SAC with automatically selected references outperforms existing fact-checking approaches that use manually selected references. Future directions of this study include I) exploring the knowledge graph in Wikidata to dynamically augment the representations of claims and references without introducing too much noise; and II) exploring semantic relations in claims and references to further enhance fact-checking.
Keywords: fact checking, claim verification, deep learning, natural language processing.