CodeSearchNet Corpus
The CodeSearchNet Corpus is an English-language text corpus released by Husain et al. in 2019, comprising about 6M examples.
This dataset is used as an LLM benchmark.
About CodeSearchNet Corpus
The dataset contains functions with their associated documentation, written in Go, Java, JavaScript, PHP, Python, and Ruby and collected from open source projects on GitHub.
Details
- Task: Text Corpora
- Language: English
- Format: JSON
- Rows / instances: 6M
- Creator: Husain et al.
- Year: 2019
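Since the data is distributed as JSON and pairs each function with its documentation, a record can be read with a standard JSON parser. The following is a minimal sketch, not the dataset's official loader: it assumes one JSON object per line, and the field names (`language`, `func_name`, `docstring`, `code`) and the sample records are hypothetical stand-ins, so check them against the actual files before relying on them.

```python
import json
from collections import Counter

# Hypothetical sample records. The field names used here are assumptions
# made for illustration and may not match the real files exactly.
SAMPLE_LINES = [
    '{"language": "python", "func_name": "add", '
    '"docstring": "Add two numbers.", "code": "def add(a, b): return a + b"}',
    '{"language": "go", "func_name": "Add", '
    '"docstring": "Add returns the sum of a and b.", '
    '"code": "func Add(a, b int) int { return a + b }"}',
    '{"language": "go", "func_name": "Neg", '
    '"docstring": "Neg returns the negation of a.", '
    '"code": "func Neg(a int) int { return -a }"}',
]


def parse_records(lines):
    """Parse an iterable of JSON strings, one object per line, skipping blanks."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        records.append(json.loads(line))
    return records


def count_by_language(records):
    """Count records per programming language (assumed 'language' field)."""
    return Counter(r.get("language", "unknown") for r in records)


records = parse_records(SAMPLE_LINES)
counts = count_by_language(records)

# Print each function name next to its documentation string.
for r in records:
    print(f'{r["language"]}: {r["func_name"]} -> {r["docstring"]}')
```

To read a real file instead of the inline sample, pass an open file handle to `parse_records`, since iterating over a text file yields one line at a time.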