RegMix: Data Mixture as Regression for Language Model Pre-training Paper • 2407.01492 • Published Jul 1, 2024 • 37
Contra (Bottleneck T5) Collection Text autoencoders capable of embedding and generating text in a fixed-size latent space, useful for embeddings and latent space text editing. • 4 items • Updated Oct 3, 2023 • 28
Trained Models Collection They may be small, but they're training like giants! • 8 items • Updated Dec 3, 2024 • 20