ClusterlabAi/101_billion_arabic_words_dataset
Text Generation | AR | apache-2.0
ClusterlabAi/101_billion_arabic_words_dataset is an Arabic (AR) text generation dataset with 33,059,988 rows, distributed in Parquet format under the apache-2.0 license. It falls in the 10M<n<100M size category and has been downloaded about 1.1K times.
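With roughly 33 million rows, it is usually more practical to stream a few records than to download every Parquet file first. The following is a minimal sketch using the Hugging Face `datasets` library; the split name `"train"` is an assumption (this page does not list splits), and the helper `peek` is a hypothetical name introduced here for illustration.

```python
from itertools import islice

REPO_ID = "ClusterlabAi/101_billion_arabic_words_dataset"


def peek(n=3, split="train"):
    """Return the first n rows without downloading the whole dataset.

    The split name "train" is an assumption; check the dataset page
    for the splits that actually exist.
    """
    # Imported lazily so this module loads even where `datasets` is not installed.
    from datasets import load_dataset

    # streaming=True yields rows lazily instead of materializing all Parquet files.
    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek(2)` needs network access and returns a list of row dictionaries; inspect their keys before relying on any column name, since the schema is not given on this page.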
About ClusterlabAi/101_billion_arabic_words_dataset
101 Billion Arabic Words Dataset
Updates
Maintenance Status: Actively Maintained
Update Frequency: Weekly updates to refine data quality and expand coverage.
Upcoming Version
Cleaner Version: A cleaner version...
Details
- Task: Text Generation
- Language: AR
- Format: Parquet
- Rows / instances: 33,059,988
- Size: 10M<n<100M
- Creator: ClusterlabAi
- Year: 2024
- License: apache-2.0
- Downloads: 1,133
- Likes: 72
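As a quick consistency check on the figures above, the sketch below maps a row count to a size bucket written in the same style as the Size field, and estimates the average words per row. The function `size_category` and its boundary handling (an exact power of ten goes into the upper bucket) are assumptions made for illustration, and the 101 billion word total is taken from the dataset title as a nominal figure, not a measured count.

```python
# Upper bounds (exclusive) and labels, in the style of the Size field above.
_BUCKETS = [
    (10**3, "n<1K"),
    (10**4, "1K<n<10K"),
    (10**5, "10K<n<100K"),
    (10**6, "100K<n<1M"),
    (10**7, "1M<n<10M"),
    (10**8, "10M<n<100M"),
    (10**9, "100M<n<1B"),
    (10**10, "1B<n<10B"),
    (10**11, "10B<n<100B"),
    (10**12, "100B<n<1T"),
]


def size_category(n_rows: int) -> str:
    """Return the size bucket label for a given number of rows."""
    for upper, label in _BUCKETS:
        if n_rows < upper:
            return label
    return "n>1T"


ROWS = 33_059_988            # "Rows / instances" from the details list
NOMINAL_WORDS = 101 * 10**9  # assumption: nominal total taken from the title

category = size_category(ROWS)
avg_words_per_row = NOMINAL_WORDS / ROWS

print(category)
print(round(avg_words_per_row))
```

Under these assumptions the row count lands in the 10M<n<100M bucket listed above, and each row averages on the order of a few thousand words, which suggests rows are document-length texts instead of single sentences.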