
data-archetype/cc12_imagenet21k_recap_hq_bucketed

Text To Image | EN | Benchmark

Created by data-archetype in 2026, data-archetype/cc12_imagenet21k_recap_hq_bucketed is an English-language text-to-image benchmark dataset in Parquet format. It has 12,423 downloads and 0 likes. Its license is listed as "other", and it falls in the 10M<n<100M size category.


About data-archetype/cc12_imagenet21k_recap_hq_bucketed

Title: cc12_imagenet21k_recap_hq_bucketed
Description: This ~18M-row dataset is a re-upload of https://huggingface.co/datasets/gmongaras/CC12M_and_Imagenet21K_Recap_Highqual where the images have been pre bu...

Details

Task: Text To Image
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 10M<n<100M
Creator: data-archetype
Year: 2026
License: other
Downloads: 12423
Likes: 0
