Semantic Textual Similarity Datasets
Semantic Textual Similarity is a machine-learning task covered in our directory, which catalogs 5 datasets for it. Each entry links to its source, paper, and download; browse the full list below or filter by language.
Updated June 2026
- BIOSSES: Semantic Textual Similarity, English
- ParaBank: Semantic Textual Similarity, English
- A Novel Approach to a Semantically-Aware Representation of Items (NASARI): Semantic Textual Similarity, Multi-Lingual
- Quora Question Pairs: Semantic Textual Similarity, English
- Semantic Textual Similarity Benchmark: Semantic Textual Similarity, English