UIT-SPC
Text Corpora, Vietnamese
The UIT-SPC dataset is a Vietnamese text corpus from Thin et al. (2017) comprising 1,565 examples.
About UIT-SPC
The dataset contains 1,565 papers from top NLP/CL conferences such as ACL, CoNLL, EACL, NAACL, and EMNLP. The papers were pre-processed to remove unnecessary information (e.g., formulas and tables), then formatted as .xml files that preserve each paper's title, sections, and sub-sections according to its structure. [Obtaining the corpus requires contacting the author.]
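Because the corpus is only available on request, its actual XML schema is not documented here. The sketch below is a minimal illustration of how a paper stored as title, sections, and sub-sections might be read in Python. The tag names (`paper`, `title`, `section`, `subsection`), the `name` attribute, and the sample document are all hypothetical and should be adjusted to the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real UIT-SPC schema is not published here,
# so these tag and attribute names are assumptions for illustration.
SAMPLE = """<paper>
  <title>Example Title</title>
  <section name="Introduction">
    <text>Intro text.</text>
    <subsection name="Motivation">
      <text>Why it matters.</text>
    </subsection>
  </section>
  <section name="Conclusion">
    <text>Closing text.</text>
  </section>
</paper>"""


def outline(xml_string):
    """Return the paper title and a list of (section, [sub-sections])."""
    root = ET.fromstring(xml_string)
    title = root.findtext("title", default="").strip()
    sections = []
    for sec in root.findall("section"):
        # Collect the names of the sub-sections directly under this section.
        subs = [sub.get("name", "") for sub in sec.findall("subsection")]
        sections.append((sec.get("name", ""), subs))
    return title, sections


title, sections = outline(SAMPLE)
print(title)
for name, subs in sections:
    print(name, subs)
```

For the real corpus, the same function could be applied to each file's contents once the tag names have been matched to the actual schema.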
Details
- Task: Text Corpora
- Language: Vietnamese
- Format: n/a
- Rows / instances: 1,565
- Creator: Thin et al.
- Year: 2017