data-archetype/cc12_imagenet21k_recap_hq_bucketed
Text To Image, EN, Benchmark
Created by data-archetype in 2026, data-archetype/cc12_imagenet21k_recap_hq_bucketed is an English (EN) text-to-image benchmark dataset in Parquet format. It has 12.4K downloads and 0 likes. Its license is listed as "other", and it falls in the 10M<n<100M size category.
About data-archetype/cc12_imagenet21k_recap_hq_bucketed
Title: cc12_imagenet21k_recap_hq_bucketed
Description: This ~18M-row dataset is a re-upload of https://huggingface.co/datasets/gmongaras/CC12M_and_Imagenet21K_Recap_Highqual where the images have been pre bu...
Details
- Task: Text To Image
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: data-archetype
- Year: 2026
- License: other
- Downloads: 12423
- Likes: 0
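
Since the dataset is listed in Parquet format with roughly 18M rows, one way to inspect it without downloading everything is to stream a few rows. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the repository exposes a split named `train`. The split name and the helper `peek_rows` are assumptions made for illustration; the listing above does not state the split or the column names.

```python
from itertools import islice

# Repository id as given in the listing above.
REPO_ID = "data-archetype/cc12_imagenet21k_recap_hq_bucketed"


def peek_rows(repo_id=REPO_ID, n=3, split="train"):
    """Stream the first n rows instead of downloading the whole dataset.

    `split="train"` is an assumption; the listing does not name the splits.
    """
    # Imported lazily so that defining this helper does not need the library.
    from datasets import load_dataset

    # streaming=True returns an iterable that fetches rows on demand.
    stream = load_dataset(repo_id, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek_rows()` needs network access and returns a list of dictionaries, one per row. Printing `rows[0].keys()` is a quick way to discover the actual column names, which are not given in the listing.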