Together

company

Verified

https://together.ai

togethercompute

togethercomputer

Inference Provider

3,277,620 monthly requests

AI & ML interests

Foundation Models, Decentralized Computing, Open Source AI.

Papers

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

Articles

Fine-tune Any LLM from the Hugging Face Hub with Together AI

authored a paper 7 months ago

Cartridges: Lightweight and general-purpose long context representations via self-study

Paper • 2506.06266 • Published Jun 6, 2025 • 7

posted an update 7 months ago

🚀 Full-Quality Wan2.2 Video Generation on a single 24GB GPU — Powered by DFloat11

We just released the DFloat11 compressed Wan2.2 models. Now you can run full-quality Wan2.2 video generation on a single 24GB GPU, thanks to DFloat11 compression and CPU offloading.

🔗 Image-to-Video: DFloat11/Wan2.2-I2V-A14B-DF11
🔗 Text-to-Video: DFloat11/Wan2.2-T2V-A14B-DF11
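As a rough illustration of why the compressed weights matter on a 24GB card, the sketch below estimates weight storage only. It assumes about 14 billion parameters resident at a time (read off the "A14B" in the model names, not stated in the post) and the roughly 70% size ratio taken from the DFloat11 paper title listed further down this page. Activations, the text encoder, and the VAE are not counted, which is presumably where CPU offloading comes in.

```python
# Back-of-the-envelope weight-memory estimate. All constants are assumptions
# made for illustration, not figures published in the post.

def weight_gib(num_params: float, bits_per_weight: float) -> float:
    """Weight storage in GiB for a parameter count and a bit width."""
    return num_params * bits_per_weight / 8 / 2**30

PARAMS = 14e9             # assumed from the "A14B" suffix in the model name
BF16_BITS = 16            # uncompressed BFloat16 weights
COMPRESSION_RATIO = 0.70  # "70% Size" from the DFloat11 paper title
GPU_GIB = 24              # treats the advertised 24GB as 24 GiB

bf16_gib = weight_gib(PARAMS, BF16_BITS)
df11_gib = bf16_gib * COMPRESSION_RATIO

print(f"BF16 weights:     {bf16_gib:.1f} GiB (fits: {bf16_gib < GPU_GIB})")
print(f"DFloat11 weights: {df11_gib:.1f} GiB (fits: {df11_gib < GPU_GIB})")
```

Under these assumptions the uncompressed weights alone exceed the card's memory, while the compressed weights leave a few GiB of headroom for everything else.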

authored 11 papers 8 months ago

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

Paper • 2303.06865 • Published Mar 13, 2023 • 1

Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning

Paper • 2306.00088 • Published May 31, 2023 • 1

Holistic Evaluation of Language Models

Paper • 2211.09110 • Published Nov 16, 2022 • 1

Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads

Paper • 2410.01805 • Published Oct 2, 2024

Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild

Paper • 2410.05357 • Published Oct 7, 2024

Zero-Indexing Internet Search Augmented Generation for Large Language Models

Paper • 2411.19478 • Published Nov 29, 2024

HEXGEN-TEXT2SQL: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL Workflow

Paper • 2505.05286 • Published May 8, 2025 • 1

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Paper • 2505.24298 • Published May 30, 2025 • 34

Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning

Paper • 2506.07227 • Published Jun 8, 2025

Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification

Paper • 2506.07235 • Published Jun 8, 2025 • 3

Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny

Paper • 2507.16331 • Published Jul 22, 2025 • 22

authored a paper 11 months ago

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float

Paper • 2504.11651 • Published Apr 15, 2025 • 31

authored a paper over 1 year ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 57

posted an update over 1 year ago

https://huggingface.co/organizations/nerdyface/share/xvWxWxYmYpCLqZlvNJEZbJHFsDITAicJAT

posted an update over 1 year ago

hi florent and livestream!

authored a paper over 1 year ago

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Paper • 2206.15076 • Published Jun 30, 2022 • 5

authored 2 papers over 1 year ago

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Paper • 2306.11698 • Published Jun 20, 2023 • 13

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT

Paper • 2402.07440 • Published Feb 12, 2024 • 1