
Genia

Part of Speech (POS), Constituency, Coreference, Event, Relation, English

Genia is an English dataset focused on part-of-speech (POS) tagging. It provides 1,999 labeled examples, distributed in text and XML formats.

About Genia

The dataset contains 1,999 Medline abstracts, selected using a PubMed query for the three MeSH terms "human", "blood cells", and "transcription factors". The corpus has been annotated for part-of-speech tags, constituency syntax, terms, events, relations, and coreference.

Details

Task: Part of Speech (POS), Constituency, Coreference, Event, Relation
Language: English
Format: Text, XML
Rows / instances: 1,999
Creator: Kim et al.
Year: 2003

FAQ