General NLP Datasets

Our directory lists 200 general NLP datasets, 8 of which are benchmarks. Each entry links to its source, paper, and download. Browse the full list below or filter by language.

General NLP is one of the machine-learning tasks covered in our directory.

Updated June 2026

What languages do general NLP datasets cover?

Frequently asked questions