NCBI Disease Corpus
The NCBI Disease Corpus is an English information extraction dataset from Dogan et al. with 6,892 records (annotated disease mentions) in text format.
About NCBI Disease Corpus
The dataset contains 6,892 disease mentions, which are mapped to 790 unique disease concepts. Of these concepts, 88% link to a MeSH identifier, while the rest link to an OMIM identifier.
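The MeSH/OMIM split described above can be checked with a few lines of Python. This is a minimal sketch, not the corpus's own tooling: it assumes OMIM identifiers carry an "OMIM" prefix and that everything else is a MeSH identifier, and the sample identifiers are made-up placeholders rather than values taken from the corpus.

```python
from collections import Counter


def id_source(concept_id: str) -> str:
    """Classify a concept identifier by prefix.

    Assumption: OMIM identifiers start with "OMIM"; anything else is
    treated as a MeSH identifier.
    """
    return "OMIM" if concept_id.upper().startswith("OMIM") else "MeSH"


def source_shares(concept_ids):
    """Return the fraction of unique concept identifiers per vocabulary."""
    unique = set(concept_ids)  # count each concept once, not each mention
    counts = Counter(id_source(c) for c in unique)
    total = len(unique)
    return {source: n / total for source, n in counts.items()}


# Hypothetical placeholder identifiers, not taken from the corpus.
sample_ids = ["D000001", "D000002", "D000003", "OMIM:000001", "D000001"]
shares = source_shares(sample_ids)
print(shares)
```

Applied to the figures above, 88% of 790 concepts comes to roughly 695 concepts with a MeSH identifier, leaving about 95 with an OMIM identifier.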
Details
- Task: Information Extraction, Named Entity Recognition (NER)
- Language: English
- Format: Text
- Rows / instances: 6,892
- Creator: Dogan et al.
- Year: 2014