AudioVisual-Caption/ASID-1M
Image Text To Text | EN | cc-by-2.0
AudioVisual-Caption/ASID-1M is an English image-text-to-text dataset distributed in Parquet format under the cc-by-2.0 license. It falls in the 100K<n<1M size category and has been downloaded 969 times.
About AudioVisual-Caption/ASID-1M
ASID-1M: Attribute-Structured and Quality-Verified Audiovisual Instructions
Introduction
We introduce ASID-1M, a large-scale audiovisual instruction dataset built to support...
Details
- Task: Image Text To Text
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: 100K<n<1M
- Creator: AudioVisual-Caption
- Year: 2026
- License: cc-by-2.0
- Downloads: 969
- Likes: 85