proj-persona/PersonaHub
Text Generation · Text Classification · Token Classification · Fill Mask · Table Question Answering · EN, ZH
proj-persona/PersonaHub is a text generation dataset in English and Chinese (EN, ZH), published by proj-persona in Parquet format.
About proj-persona/PersonaHub
Scaling Synthetic Data Creation with 1,000,000,000 Personas
This repo releases data introduced in our paper Scaling Synthetic Data Creation with 1,000,000,000 Personas:
We propose a novel persona-driven data synthesis methodology that leverages...
Details
- Task: Text Generation, Text Classification, Token Classification, Fill Mask, Table Question Answering
- Language: EN, ZH
- Format: Parquet
- Rows / instances: N/A
- Creator: proj-persona
- Year: 2024