Multimodal Sarcasm Detection Dataset (MUStARD)

Multi-Modal Learning, English, Benchmark

Multimodal Sarcasm Detection Dataset (MUStARD) is an English-language benchmark dataset for multi-modal learning, released by Castro et al. It contains 6,365 records in JSON format.

This dataset is used as an LLM benchmark.

About Multimodal Sarcasm Detection Dataset (MUStARD)

The dataset, a multimodal video corpus, consists of audiovisual utterances annotated with sarcasm labels. Each utterance is accompanied by its context, which provides additional information on the scenario where the utterance occurs.
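
As a rough illustration of how utterance records with context and sarcasm labels might be read from JSON, here is a minimal Python sketch. The field names ("utterance", "context", "sarcasm"), the ID-keyed layout, and the sample records are assumptions made for this example, not the confirmed schema of the released file; check the actual download before relying on them.

```python
import json
from collections import Counter

# Hypothetical sample that mimics a JSON object keyed by utterance ID.
# The keys and values below are illustrative assumptions, not real MUStARD data.
SAMPLE_JSON = """
{
  "utt_001": {
    "utterance": "Oh great, another meeting.",
    "context": ["We have a meeting at 9.", "And another one at 10."],
    "sarcasm": true
  },
  "utt_002": {
    "utterance": "The meeting starts at 9.",
    "context": ["When does it start?"],
    "sarcasm": false
  }
}
"""


def load_records(raw):
    """Parse the JSON text and flatten it into a list of dicts with an 'id' field."""
    data = json.loads(raw)
    return [{"id": key, **value} for key, value in data.items()]


def label_counts(records):
    """Count sarcastic (True) versus non-sarcastic (False) utterances."""
    return Counter(bool(record["sarcasm"]) for record in records)


records = load_records(SAMPLE_JSON)
counts = label_counts(records)

for record in records:
    # Each utterance carries the preceding dialogue turns as its context.
    print(record["id"], len(record["context"]), record["sarcasm"])
```

To read a real file instead of the inline sample, the same load_records function could be given the text of the downloaded JSON file, with the field names adjusted to match it.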

Details

Task: Multi-Modal Learning
Language: English
Format: JSON
Rows / instances: 6,365
Creator: Castro et al.
Year: 2019