COVID-19 Open Research Dataset (CORD-19)

Text Corpora, English, Benchmark

Created by the Allen Institute in 2020, the COVID-19 Open Research Dataset (CORD-19) is an English text corpus benchmark dataset containing 44,000 records in JSON format.

This dataset is used as an LLM benchmark.

About COVID-19 Open Research Dataset (CORD-19)

The dataset contains 44,000 scholarly articles about COVID-19 and the coronavirus family of viruses, including over 29,000 with full text, for use by the global research community.

Details

Task: Text Corpora
Language: English
Format: JSON
Rows / instances: 44,000
Creator: Allen Institute
Year: 2020