google-research-datasets/conceptual_captions
Image To Text, EN, Benchmark
google-research-datasets/conceptual_captions is an English image-to-text benchmark dataset that provides 8,675,436 labeled examples in Parquet format. Its license is listed as "other", it falls in the 1M<n<10M size category, and it has been downloaded about 15.5K times.
This dataset is used as an LLM benchmark.
About google-research-datasets/conceptual_captions
Dataset Card for Conceptual Captions
Dataset Summary
Conceptual Captions is a dataset consisting of ~3.3M images annotated with captions. In contrast with the curated style of other image caption annotations, Conceptual Caption image...
Details
- Task: Image To Text
- Language: EN
- Format: Parquet
- Rows / instances: 8,675,436
- Size: 1M<n<10M
- Creator: google-research-datasets
- Year: 2022
- License: other
- Downloads: 15,489
- Likes: 107
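As a rough illustration of how a dataset published under this repository name might be read, here is a minimal sketch. It assumes the third-party Hugging Face `datasets` package is installed and that network access is available; neither is stated on this page. The helper names `load_captions` and `preview` are hypothetical, and the sketch makes no assumption about configuration or column names, which should be checked against the dataset card.

```python
from itertools import islice

# Repository name as given at the top of this page.
REPO_ID = "google-research-datasets/conceptual_captions"


def load_captions(name=None, split="train", streaming=True):
    """Open one split of the dataset (hypothetical helper).

    Assumption: the `datasets` package is installed. It is imported lazily
    so that defining this function does not require the package.
    `name` selects a configuration; if the repository defines several and
    has no default, the library will ask for one explicitly.
    """
    from datasets import load_dataset

    # streaming=True iterates over records without downloading every row first.
    return load_dataset(REPO_ID, name=name, split=split, streaming=streaming)


def preview(examples, n=3):
    """Return the first n records from any iterable of examples."""
    return list(islice(examples, n))
```

A call such as `preview(load_captions(), n=3)` would then fetch only the first three records, which is usually enough to inspect the available fields before committing to a full download of several million rows.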