google-research-datasets/tydiqa
Question Answering | AR, BN, EN | Benchmark | apache-2.0
google-research-datasets/tydiqa is a question answering benchmark dataset from google-research-datasets, tagged with the languages AR, BN, and EN. It contains 240,544 records in Parquet format, is distributed under the apache-2.0 license, falls in the 100K<n<1M size category, and has been downloaded about 3K times.
This dataset is used as an LLM benchmark.
About google-research-datasets/tydiqa
Dataset Card for "tydiqa"
Dataset Summary
TyDi QA is a question answering dataset covering 11 typologically diverse languages with 204K question-answer pairs.
The languages of TyDi QA are diverse with regard to their typology -- the ...
Details
- Task: Question Answering
- Language: AR, BN, EN
- Format: Parquet
- Rows / instances: 240,544
- Size: 100K<n<1M
- Creator: google-research-datasets
- Year: 2022
- License: apache-2.0
- Downloads: 2,967
- Likes: 38
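
The repository name and Parquet format listed above suggest the data can be loaded with the Hugging Face `datasets` library. The following is a minimal sketch, not an official recipe. The configuration names (`primary_task`, `secondary_task`), the default `train` split, and the helper `load_tydiqa` are assumptions that do not appear on this page, so check them against the dataset card before relying on them.

```python
# Minimal sketch for loading TyDi QA with the Hugging Face `datasets` library.
# Assumptions (not stated on this page): the configuration names below,
# the "train" split, and the helper name `load_tydiqa`.

REPO_ID = "google-research-datasets/tydiqa"

# Assumed configuration names; verify against the dataset card.
KNOWN_CONFIGS = ("primary_task", "secondary_task")


def load_tydiqa(config="secondary_task", split="train"):
    """Load one split of one TyDi QA configuration.

    Raises ValueError for a configuration name outside KNOWN_CONFIGS,
    before any network access is attempted.
    """
    if config not in KNOWN_CONFIGS:
        raise ValueError(
            f"Unknown config {config!r}; expected one of {KNOWN_CONFIGS}"
        )
    # Imported lazily so that validation above works without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


# Example usage (requires `pip install datasets` and network access):
#   ds = load_tydiqa("secondary_task", split="train")
#   print(ds.num_rows)
#   print(ds[0])
```

Validating the configuration name before the import keeps typos from triggering a download attempt. The row count returned for a single configuration and split will be smaller than the 240,544 total listed above, since that figure presumably covers all configurations and splits.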