
a686d380/h-corpus-raw

General NLP · ZH

The a686d380/h-corpus-raw dataset is a Chinese-language (ZH) General NLP resource published by a686d380 in 2023. It has 88 downloads and 48 likes.

About a686d380/h-corpus-raw

Uncleaned Chinese H-novels (adult fiction).

Dataset         Articles  Uncompressed size  Source          Quality  Notes
jjsw            73,432    4.0 GB             禁忌书屋        high     -
pixiv-selected  2,935     174.3 MB           pixiv rankings  high     -
shubao          6,776     1.6 GB             web             low      -
sis-long        4,555     3.5 GB             sis             medium   -
sis-short       111,237   4.1 GB             sis             medium   -
xbookcn         39,798    1.0 GB             xbookcn         high     -
...

Details

Task: General NLP
Language: ZH
Format: Parquet
Rows / instances: N/A
Creator: a686d380
Year: 2023
Downloads: 88
Likes: 48
