OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models Paper • 2503.08686 • Published 9 days ago • 18
Foundation Text-Generation Models Below 360M Parameters Collection • Great candidates for fine-tuning targeting Wllama and Transformers.js for mobile devices, ordered by number of parameters • 35 items • Updated 5 days ago • 30
SpaceByte: Towards Deleting Tokenization from Large Language Modeling Paper • 2404.14408 • Published Apr 22, 2024 • 8
Message in a Bottle -- An Update to the Golden Record Paper • 2306.01765 • Published May 27, 2023 • 1
Impressions: Understanding Visual Semiotics and Aesthetic Impact Paper • 2310.17887 • Published Oct 27, 2023 • 2
Normalization of Lithuanian Text Using Regular Expressions Paper • 2312.17660 • Published Dec 29, 2023 • 1
SHARE: Shared Memory-Aware Open-Domain Long-Term Dialogue Dataset Constructed from Movie Script Paper • 2410.20682 • Published Oct 28, 2024 • 1
EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models Paper • 2502.04424 • Published Feb 6 • 1
Advancing Multi-Party Dialogue Systems with Speaker-ware Contrastive Learning Paper • 2501.11292 • Published Jan 20 • 1
SS-MPC: A Sequence-Structured Multi-Party Conversation System Paper • 2502.16920 • Published 24 days ago • 2
DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation Paper • 1911.00536 • Published Nov 1, 2019 • 1
An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry Paper • 2303.02552 • Published Mar 5, 2023 • 1
ScandEval: A Benchmark for Scandinavian Natural Language Processing Paper • 2304.00906 • Published Apr 3, 2023 • 5