LEDGAR

Classification, English

The LEDGAR dataset is an English classification resource published by Tuggener et al. in 2020, comprising 1.8M examples.

About LEDGAR

LEDGAR is a multilabel corpus of legal provisions in contracts, suited for text classification in the legal domain (legaltech). It features more than 1.8M provisions and a set of more than 180K labels. A smaller, cleaned version of the corpus is also available.

Details

Task: Classification
Language: English
Format: JSON
Rows / instances: 1.8M
Creator: Tuggener et al.
Year: 2020
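
Because the corpus is multilabel and distributed as JSON, a first step is usually to parse the records and count how often each label occurs. The following is a minimal sketch, not the authors' loader. It assumes one JSON object per line and hypothetical field names `provision` (the clause text) and `label` (a list of label strings); the sample records are invented for illustration, and the real schema should be checked against the downloaded files.

```python
import json
from collections import Counter

# Invented sample records. The field names "provision" and "label" are
# assumptions for illustration and are not confirmed by this page.
SAMPLE_JSONL = """\
{"provision": "This Agreement shall be governed by the laws of the State of New York.", "label": ["governing laws"]}
{"provision": "Any notice shall be in writing and this Agreement may be amended only in writing.", "label": ["notices", "amendments"]}
{"provision": "This Agreement is governed by Delaware law.", "label": ["governing laws"]}
"""


def iter_records(lines):
    """Yield one parsed record per non-empty line (JSON Lines layout assumed)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def label_counts(records):
    """Count label occurrences; a multilabel record contributes to several labels."""
    counts = Counter()
    for record in records:
        counts.update(record["label"])
    return counts


records = list(iter_records(SAMPLE_JSONL.splitlines()))
counts = label_counts(records)

for label, n in counts.most_common():
    print(f"{label}: {n}")
```

For the full corpus, the same functions could be fed from an open file handle instead of the in-memory string, so that the 1.8M rows are streamed rather than loaded at once.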