AudioVisual-Caption/ASID-1M

Image Text To Text · EN · cc-by-2.0

AudioVisual-Caption/ASID-1M is an English image-text-to-text dataset distributed in Parquet format under the cc-by-2.0 license. It falls in the 100K<n<1M size category and has been downloaded 969 times.
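
Because the data ships as Parquet, a few rows can be inspected without downloading every shard. The following is a minimal sketch, not an official loading recipe. It assumes the dataset is hosted on the Hugging Face Hub under the repository name shown on this page, that the `datasets` library is installed, and that the repository loads without an explicit configuration name. The helper names `first_rows`, `stream_splits`, and `preview` are hypothetical, and the column names are not listed here, so the sketch only prints whatever keys each row happens to contain.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List

# Repository name as shown on this page; the Hub hosting is an assumption.
REPO_ID = "AudioVisual-Caption/ASID-1M"


def first_rows(rows: Iterable[Dict[str, Any]], n: int = 3) -> List[Dict[str, Any]]:
    """Return at most n rows from any iterable of row dicts."""
    return list(islice(rows, n))


def stream_splits(repo_id: str = REPO_ID):
    """Open every split in streaming mode (hypothetical helper)."""
    # Imported lazily so first_rows works even without the library installed.
    from datasets import load_dataset

    # streaming=True reads Parquet shards on demand instead of downloading all of them.
    # Assumption: no config name is required for this repository.
    return load_dataset(repo_id, streaming=True)


def preview(repo_id: str = REPO_ID, n: int = 2) -> None:
    """Print the column names of the first n rows of each split."""
    for split_name, split in stream_splits(repo_id).items():
        for row in first_rows(split, n):
            # Column names are not documented on this page, so only keys are shown.
            print(split_name, sorted(row.keys()))
```

Calling `preview()` needs network access. If the repository defines several configurations, `load_dataset` will ask for one by name, and that name would have to come from the dataset's own documentation.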

About AudioVisual-Caption/ASID-1M

ASID-1M: Attribute-Structured and Quality-Verified Audiovisual Instructions

Introduction

We introduce ASID-1M, a large-scale audiovisual instruction dataset built to support...

Details

Task: Image Text To Text
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 100K<n<1M
Creator: AudioVisual-Caption
Year: 2026
License: cc-by-2.0
Downloads: 969
Likes: 85