
Parallel Arabic DIalectal Corpus (PADIC)

Text Corpora, Arabic

Parallel Arabic DIalectal Corpus (PADIC) is an Arabic text corpus that provides 6,000+ labeled examples, distributed in HTML format.

About Parallel Arabic DIalectal Corpus (PADIC)

PADIC is a multi-dialectal corpus: it contains six Arabic dialects in addition to Modern Standard Arabic (MSA), in Buckwalter transliteration format.
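Because the text is in Buckwalter transliteration, it may help to convert it back to Arabic script before reading or processing it. The following is a minimal sketch, assuming the corpus uses the standard Buckwalter symbol set (this page does not say which variant is used). The function name `buckwalter_to_arabic` is hypothetical and not part of any PADIC tooling.

```python
# Standard Buckwalter symbols, listed in the order of the Unicode code points
# they stand for. Assumption: PADIC uses this standard table, not a variant.
_SYMBOLS_0621_063A = "'|>&<}AbptvjHxd*rzs$SDTZEg"  # U+0621..U+063A
_SYMBOLS_0641_0652 = "fqklmnhwYyFNKaui~o"          # U+0641..U+0652

BUCKWALTER_TO_ARABIC = {}
for offset, symbol in enumerate(_SYMBOLS_0621_063A):
    BUCKWALTER_TO_ARABIC[symbol] = chr(0x0621 + offset)
for offset, symbol in enumerate(_SYMBOLS_0641_0652):
    BUCKWALTER_TO_ARABIC[symbol] = chr(0x0641 + offset)
BUCKWALTER_TO_ARABIC["_"] = "\u0640"  # tatweel
BUCKWALTER_TO_ARABIC["`"] = "\u0670"  # dagger alif
BUCKWALTER_TO_ARABIC["{"] = "\u0671"  # alif wasla


def buckwalter_to_arabic(text: str) -> str:
    """Convert Buckwalter-transliterated text to Arabic script.

    Characters outside the table (spaces, digits, punctuation) pass through unchanged.
    """
    return "".join(BUCKWALTER_TO_ARABIC.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # "ktAb" is the Buckwalter spelling of the Arabic word for "book".
    print(buckwalter_to_arabic("ktAb"))
```

A lookup table is enough here because Buckwalter is a one-to-one, character-level mapping, so no context or tokenization is needed.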

Details

Task: Text Corpora
Language: Arabic
Format: HTML
Rows / instances: 6,000+
Creator: Abbas et al.
Year: 2013
