
Trec CAR Dataset

Information Retrieval, English

Trec CAR Dataset is an information retrieval dataset in English from Dietz et al., with ~285,000 records in CBOR format.

About Trec CAR Dataset

The dataset contains topics, outlines, and paragraphs extracted from English Wikipedia (2016 XML dump). Each Wikipedia article is split into an outline of its sections and the paragraphs those sections contain.
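
The split into an outline of sections and their paragraphs can be pictured with a small in-memory model. This is only a sketch: the class names (`Page`, `Section`, `Paragraph`), the field names, and the `/`-joined heading paths are assumptions made for illustration, not the dataset's actual CBOR schema or the API of any official tooling.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Paragraph:
    para_id: str  # hypothetical identifier, not the dataset's real ID scheme
    text: str


@dataclass
class Section:
    heading: str
    paragraphs: List[Paragraph] = field(default_factory=list)
    children: List["Section"] = field(default_factory=list)  # nested subsections


@dataclass
class Page:
    title: str
    sections: List[Section] = field(default_factory=list)


def outline(page: Page) -> List[str]:
    """Return every heading path of the page in document order."""
    paths: List[str] = []

    def walk(section: Section, prefix: str) -> None:
        path = f"{prefix}/{section.heading}"
        paths.append(path)
        for child in section.children:
            walk(child, path)

    for section in page.sections:
        walk(section, page.title)
    return paths


def paragraphs(page: Page) -> Iterator[Tuple[str, Paragraph]]:
    """Yield (heading path, paragraph) pairs in document order."""

    def walk(section: Section, prefix: str) -> Iterator[Tuple[str, Paragraph]]:
        path = f"{prefix}/{section.heading}"
        for para in section.paragraphs:
            yield path, para
        for child in section.children:
            yield from walk(child, path)

    for section in page.sections:
        yield from walk(section, page.title)
```

In this model, `outline(page)` plays the role of the section outline, and `paragraphs(page)` pairs each paragraph with the heading path it sits under, which is the kind of pairing a retrieval task over this data would need.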

Details

Task: Information Retrieval
Language: English
Format: CBOR
Rows / instances: ~285,000
Creator: Dietz et al.
Year: 2019