Microsoft Speech Language Translation Corpus (MSLT)

Speech Recognition, Machine Translation, Multi-Lingual

Microsoft Speech Language Translation Corpus (MSLT) is a multilingual speech recognition and machine translation dataset from Federmann et al., distributed in WAV format.

About Microsoft Speech Language Translation Corpus (MSLT)

The dataset contains conversational, bilingual speech test and tuning data for English, Chinese, and Japanese. It includes audio data, transcripts, and translations, and allows end-to-end testing of spoken language translation systems on real-world data.
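Because the audio is provided as WAV files, a quick sanity check on a recording can be done with Python's standard library alone. The following is a minimal sketch, not part of the corpus tooling: the helper name `wav_info` and any file path passed to it are hypothetical, and it assumes the files are uncompressed PCM WAV, which is what the `wave` module reads.

```python
import wave


def wav_info(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration is the frame count divided by frames per second.
            "duration_s": frames / rate if rate else 0.0,
        }
```

Calling `wav_info` on one audio file from the download returns its channel count, sample width, sample rate, and duration, which is a simple way to confirm the audio loaded correctly before pairing it with its transcript and translation.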

Details

Task: Speech Recognition, Machine Translation
Language: Multi-Lingual
Format: WAV
Rows / instances: n/a
Creator: Federmann et al.
Year: 2017