AudioSet
Speech Recognition, Visual, Multi-Lingual
AudioSet is a multilingual dataset cataloged for speech recognition and distributed in CSV and TFRecord (TFR) formats.
About AudioSet
The dataset consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos.
Details
- Task: Speech Recognition, Visual
- Language: Multi-Lingual
- Format: CSV, TFR
- Rows / instances: n/a
- Creator: (not listed)
- Year: 2017
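
As a rough illustration of working with the CSV distribution, the sketch below parses segment rows into Python dictionaries. It assumes the layout used by the publicly released AudioSet segment files: comment lines starting with `#`, followed by rows of `YTID, start_seconds, end_seconds, positive_labels`, where the labels are a quoted, comma-separated list of ontology IDs. That layout is not stated on this page, so treat it as an assumption. The function name `parse_segments` and the sample row (including its video ID) are hypothetical placeholders.

```python
import csv
import io


def parse_segments(text):
    """Parse AudioSet-style segment rows into a list of dictionaries.

    Assumed layout (not confirmed by this page):
    YTID, start_seconds, end_seconds, "label_id,label_id,..."
    """
    # Drop blank lines and comment/header lines, assumed to start with '#'.
    lines = [
        ln for ln in io.StringIO(text)
        if ln.strip() and not ln.startswith("#")
    ]
    # skipinitialspace lets the reader handle ", " separators and still
    # recognize the quoted label field that follows the space.
    reader = csv.reader(lines, skipinitialspace=True)
    segments = []
    for ytid, start, end, labels in reader:
        segments.append({
            "ytid": ytid,
            "start": float(start),
            "end": float(end),
            "labels": labels.split(","),
        })
    return segments


# Hypothetical sample row; the video ID and label IDs are placeholders.
sample = '''# YTID, start_seconds, end_seconds, positive_labels
abc123XYZ_0, 30.000, 40.000, "/m/09x0r,/t/dd00088"
'''

for seg in parse_segments(sample):
    print(seg["ytid"], seg["end"] - seg["start"], seg["labels"])
```

Each parsed segment should span 10 seconds, matching the clip length given in the description above. A real file could be read by passing `open(path).read()` to the same function, provided it follows the assumed layout.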