TxT360: Trillion Extracted Text • Create a large-scale deduplicated text dataset for LLM training
bunkalab/Phi-3-mini-128k-instruct-LinearBunkaScore-4.6k-DPO Text Generation • 4B • Updated May 30, 2024 • 6 • 2
OrdalieTech/Solon-embeddings-large-0.1 Feature Extraction • 0.6B • Updated Mar 26, 2024 • 12.4k • 52
MoritzLaurer/deberta-v3-base-zeroshot-v1 Zero-Shot Classification • 0.2B • Updated Nov 29, 2023 • 4.28k • 40