Genia
Part of Speech (POS), Constituency, Coreference, Event, Relation, English
Genia is an English dataset focused on part-of-speech (POS) tagging that provides 1,999 labeled examples, distributed in text and XML formats.
About Genia
The dataset contains 1,999 Medline abstracts, selected using a PubMed query for the three MeSH terms "human", "blood cells", and "transcription factors". The corpus has been annotated for part-of-speech tags, constituency syntax, terms, events, relations, and coreference.
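The selection step can be illustrated with a minimal sketch that combines the three MeSH terms into one PubMed query string. The exact query used by the corpus creators is not given here, so this is an assumption: the helper name `build_mesh_query` is hypothetical, and the terms are joined with AND using PubMed's `[MeSH Terms]` search field tag.

```python
def build_mesh_query(terms):
    """Join MeSH terms into one PubMed query string (hypothetical helper).

    Each term is quoted and tagged with PubMed's [MeSH Terms] field tag,
    then the tagged terms are combined with AND.
    """
    return " AND ".join(f'"{term}"[MeSH Terms]' for term in terms)


# The three MeSH terms named in the description above.
GENIA_TERMS = ["human", "blood cells", "transcription factors"]

# Assumed reconstruction of the selection query, not the creators' exact string.
query = build_mesh_query(GENIA_TERMS)
print(query)
```

The resulting string could be pasted into the PubMed search box. It will likely return far more than 1,999 abstracts today, since the corpus reflects the state of Medline when it was built.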
Details
- Task: Part of Speech (POS), Constituency, Coreference, Event, Relation
- Language: English
- Format: Text, XML
- Rows / instances: 1,999
- Creator: Kim et al.
- Year: 2003