Not all Fake News is Written: A Dataset and Analysis of Misleading Video Headlines

Yoo Yeon Sung, Jordan Boyd-Graber, and Naeemul Hassan

January 2023


Abstract

Polarization and the marketplace for impressions have conspired to make navigating information online difficult for users. While there has been significant effort to detect false or misleading text, multimodal datasets have received considerably less attention. To complement existing resources, we present the multimodal Video Misleading Headline (VMH) dataset, which pairs videos with annotators’ judgments of whether the headline is representative of the video’s contents. After collecting and annotating this dataset, we analyze multimodal baselines for detecting misleading headlines. Our annotation process also focuses on why annotators view a video as misleading, allowing us to better understand the interplay between annotators’ backgrounds and the content of the videos.

Type

Preprint

Publication

Empirical Methods in Natural Language Processing (EMNLP 2023). Main

Yoo Yeon Sung (성유연)

Applied Scientist