CohereLabs/aya_collection_language_split
General NLP | ACE, AFR, AMH | apache-2.0
The CohereLabs/aya_collection_language_split dataset is a General NLP resource published by CohereLabs in 2024 and tagged with the languages ACE, AFR, and AMH. It comprises 513,758,189 examples, which places it in the 100M<n<1B size category, and it is released under the apache-2.0 license. It has 18.9K downloads and 119 likes.
About CohereLabs/aya_collection_language_split
This is a re-upload of the aya_collection that differs only in how the upload is structured. While the original aya_collection is organized into folders by dataset name, this dataset is split by language. We recommend you use this versio...
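Because the data is split by language, one language can be read without fetching the rest. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that each language is exposed as a named configuration of the repository. The helper names `list_languages` and `preview_language` are hypothetical, and the `"train"` split name is an assumption not confirmed by this page.

```python
from itertools import islice

# Repository ID as given on this page.
REPO_ID = "CohereLabs/aya_collection_language_split"


def list_languages():
    """Return the configuration names the repository exposes.

    Assumption: each configuration corresponds to one language.
    """
    # Imported lazily so the module can be loaded without `datasets` installed.
    from datasets import get_dataset_config_names

    return get_dataset_config_names(REPO_ID)


def preview_language(config_name, split="train", n=3):
    """Stream the first `n` rows of one language configuration.

    `split="train"` is an assumed default; adjust it if the repository
    uses different split names.
    """
    from datasets import load_dataset

    # streaming=True reads rows lazily instead of downloading the whole
    # configuration, which matters for a collection of this size.
    stream = load_dataset(REPO_ID, name=config_name, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `list_languages()` first shows which configuration names actually exist; one of those names can then be passed to `preview_language` to inspect a few rows. Both calls need network access.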
Details
- Task: General NLP
- Language: ACE, AFR, AMH
- Format: Parquet
- Rows / instances: 513,758,189
- Size: 100M<n<1B
- Creator: CohereLabs
- Year: 2024
- License: apache-2.0
- Downloads: 18,915
- Likes: 119