Toward A Multilingual and Multimodal Data Repository for COVID-19 Disinformation

Yichuan Li, Bohan Jiang, Kai Shu, Huan Liu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

14 Scopus citations

Abstract

The COVID-19 epidemic is considered as the global health crisis of the whole society and the greatest challenge mankind faced since World War Two. Unfortunately, the fake news about COVID-19 is spreading as fast as the virus itself. The incorrect health measurements, anxiety, and hate speeches will have bad consequences on people's physical health, as well as their mental health in the whole world. To help better combat the COVID-19 fake news, we propose a new fake news detection dataset MM-COVID1 (Multilingual and Multidimensional COVID-19 Fake News Data Repository). This dataset provides the multilingual fake news and the relevant social context. We collect 3981 pieces of fake news content and 7192 trustworthy information from English, Spanish, Portuguese, Hindi, French and Italian, 6 different languages. We present a detailed and exploratory analysis of MM-COVID from different perspectives.

Original languageEnglish (US)
Title of host publicationProceedings - 2020 IEEE International Conference on Big Data, Big Data 2020
EditorsXintao Wu, Chris Jermaine, Li Xiong, Xiaohua Tony Hu, Olivera Kotevska, Siyuan Lu, Weijia Xu, Srinivas Aluru, Chengxiang Zhai, Eyhab Al-Masri, Zhiyuan Chen, Jeff Saltz
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages4325-4330
Number of pages6
ISBN (Electronic)9781728162515
DOIs
StatePublished - Dec 10 2020
Event8th IEEE International Conference on Big Data, Big Data 2020 - Virtual, Atlanta, United States
Duration: Dec 10 2020Dec 13 2020

Publication series

NameProceedings - 2020 IEEE International Conference on Big Data, Big Data 2020

Conference

Conference8th IEEE International Conference on Big Data, Big Data 2020
Country/TerritoryUnited States
CityVirtual, Atlanta
Period12/10/2012/13/20

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality

Fingerprint

Dive into the research topics of 'Toward A Multilingual and Multimodal Data Repository for COVID-19 Disinformation'. Together they form a unique fingerprint.

Cite this