tomg-group-umd/pixelprose
Image To Text | Text To Image | Visual Question Answering | EN
tomg-group-umd/pixelprose is an English (EN) image-to-text dataset from tomg-group-umd, distributed in Parquet format.
About tomg-group-umd/pixelprose
From Pixels to Prose: A Large Dataset of Dense Image Captions
[ arXiv paper ] | [ 🌮 image tars ]
PixelProse is a comprehensive dataset of over 16M (million) synthetically generated captions,
leveraging cutting-edge vision-language models (Gemi...
Details
- Task: Image To Text, Text To Image, Visual Question Answering
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: tomg-group-umd
- Year: 2024