allenai/paloma

General NLP, English

Created by allenai in 2023, allenai/paloma is an English General NLP dataset distributed in Parquet format. It has 2.5K downloads and 44 likes, and it falls in the 100K<n<1M size category.

About allenai/paloma

Dataset Card for Paloma

Evaluations of language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains: varying distributions of language. We introduce Perpl...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Size: 100K<n<1M
Creator: allenai
Year: 2023
Downloads: 2508
Likes: 44