dell-research-harvard/newswire
Text Classification, Text Generation, Text Retrieval, Summarization, Question Answering, EN, Benchmark, cc-by-4.0
dell-research-harvard/newswire is an English-language text classification benchmark dataset, published by dell-research-harvard in 2024 and distributed in Parquet format. It has 410 downloads and 91 likes, falls in the 1M<n<10M size category, and is released under the cc-by-4.0 license.
This dataset is used as an LLM benchmark.
About dell-research-harvard/newswire
Dataset Card for NewsWire
Dataset Summary
NewsWire contains 2.7 million unique public domain U.S. news wire articles, written between 1878 and 1977. Locations in these articles are georeferenced, topics are tagged using customized ne...
Details
- Task: Text Classification, Text Generation, Text Retrieval, Summarization, Question Answering
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: 1M<n<10M
- Creator: dell-research-harvard
- Year: 2024
- License: cc-by-4.0
- Downloads: 410
- Likes: 91
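
Since the dataset is published as Parquet under the repository id above, one way to inspect a few records without downloading everything is the Hugging Face `datasets` library in streaming mode. The following is a minimal sketch, not an official loading recipe: the helper name `peek_records` is hypothetical, and split names, configuration names, and column names are not given on this page, so the code takes whichever split is returned first and assumes nothing about the columns. The repository may also require a configuration name that is not listed here.

```python
from itertools import islice

REPO_ID = "dell-research-harvard/newswire"  # repository id from this page


def peek_records(n=3, repo_id=REPO_ID):
    """Return the first n records from the first available split.

    Hypothetical helper for illustration only; it is not part of the dataset.
    """
    # Imported lazily so this module loads without the dependency.
    # Calling the function needs `pip install datasets` and network access.
    from datasets import load_dataset

    # streaming=True iterates over the remote files instead of
    # downloading the whole 1M<n<10M-row dataset up front.
    splits = load_dataset(repo_id, streaming=True)

    # Split names are not listed on this page (assumption: at least one
    # split exists), so take whichever split comes first.
    first_split = next(iter(splits.values()))

    # Each record is a plain dict mapping column name to value.
    return list(islice(first_split, n))
```

Calling `peek_records(3)` and printing `sorted(record.keys())` for each returned record is a quick way to discover the actual column names before writing any task-specific code.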