Question 1

What is the princeton-nlp/SWE-bench_Verified dataset?

Accepted Answer

Dataset Summary
SWE-bench Verified is a subset of 500 samples from the SWE-bench test set that have been human-validated for quality. SWE-bench is a dataset that tests systems’ ability to resolve GitHub issues automatically. See this post for more...

Question 2

Is princeton-nlp/SWE-bench_Verified a benchmark?

Accepted Answer

Yes. princeton-nlp/SWE-bench_Verified is used as an LLM benchmark. See model leaderboards in the Benchmarks section.

Question 3

Where can I download princeton-nlp/SWE-bench_Verified?

Accepted Answer

princeton-nlp/SWE-bench_Verified is available at its source: https://huggingface.co/datasets/princeton-nlp/SWE-bench_Verified.
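For programmatic access, the following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the dataset exposes a `test` split whose rows carry a `repo` field. The helper names `load_verified` and `count_by_repo` are hypothetical and used only for illustration.

```python
from collections import Counter

# Dataset identifier taken from the URL above.
DATASET_ID = "princeton-nlp/SWE-bench_Verified"


def load_verified(split="test"):
    """Download the dataset from the Hugging Face Hub (needs network access).

    Assumption: the samples live in a split named "test".
    """
    # Imported lazily so count_by_repo works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


def count_by_repo(rows):
    """Count instances per source repository.

    Assumption: each row is a mapping with a "repo" key.
    """
    return Counter(row["repo"] for row in rows)
```

Calling `ds = load_verified()` and then `count_by_repo(ds).most_common(5)` would list the repositories that contribute the most instances, and `len(ds)` can be compared against the sample count given in the dataset summary.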
