Abstract
CoDA, a 1.7B-parameter diffusion coder, matches or surpasses much larger diffusion models through confidence-guided sampling and is released with fully open-source training and evaluation tools.
Diffusion language models promise bidirectional context and infilling capabilities that autoregressive coders lack, yet practical systems remain heavyweight. We introduce CoDA, a 1.7B-parameter diffusion coder trained on TPUs with a fully open-source training pipeline. CoDA pairs large-scale diffusion pre-training with code-centric mid-training and instruction tuning, enabling confidence-guided sampling that keeps inference latency competitive. On HumanEval, MBPP, and EvalPlus, CoDA-1.7B-Instruct matches or surpasses diffusion models up to 7B parameters. Our release includes model checkpoints, evaluation harnesses, and TPU training pipelines to accelerate research on lightweight diffusion-based coding assistants.
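To make the confidence-guided sampling idea concrete, here is a minimal PyTorch sketch of how such a decoder typically works for a masked diffusion LM: at each denoising step the model predicts all masked positions in parallel and only the most confident predictions are committed. All names here (`mask_id`, the `.logits` output field, the step budget) are illustrative assumptions, not CoDA's actual API.

```python
import torch

def confidence_guided_sample(model, prompt_ids, gen_len=64, num_steps=16, mask_id=0):
    """Iteratively unmask the highest-confidence positions each step (sketch)."""
    device = prompt_ids.device
    # Start from the prompt followed by a fully masked completion.
    x = torch.cat(
        [prompt_ids, torch.full((1, gen_len), mask_id, device=device)], dim=1
    )
    tokens_per_step = max(1, gen_len // num_steps)

    for _ in range(num_steps):
        masked = (x == mask_id)
        if not masked.any():
            break
        logits = model(x).logits                 # assumed shape: (1, seq_len, vocab)
        probs = logits.softmax(dim=-1)
        conf, pred = probs.max(dim=-1)           # per-position confidence and argmax token
        conf = conf.masked_fill(~masked, -1.0)   # only consider still-masked slots
        # Commit the k most confident predictions in parallel this step.
        k = min(tokens_per_step, int(masked.sum()))
        top = conf.topk(k, dim=-1).indices
        x[0, top[0]] = pred[0, top[0]]
    return x
```

Because several tokens are filled per forward pass, the number of model calls is bounded by the step budget rather than the sequence length, which is where the latency advantage over token-by-token autoregressive decoding comes from.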
Community
CoDA-1.7B is built for code editing ✍️ tasks, with overall coding performance on par with 7B models. The cool part: it does parallel decoding, so it's blazingly fast ⚡️ during inference!
The models, the pre-/mid-/post-training code, and the frameworks have all been open-sourced:
→ 🤗 𝗛𝘂𝗴𝗴𝗶𝗻𝗴 𝗙𝗮𝗰𝗲: https://huggingface.co/Salesforce/CoDA-v0-Instruct
→ 🤖 𝗚𝗶𝘁𝗛𝘂𝗯: https://github.com/SalesforceAIResearch/CoDA/
→ 📑 𝗧𝗲𝗰𝗵 𝗥𝗲𝗽𝗼𝗿𝘁: https://www.arxiv.org/abs/2510.03270
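For anyone who wants to try the checkpoint, a minimal loading sketch with 🤗 Transformers is below. The `trust_remote_code=True` flag is an assumption (diffusion LMs usually ship custom modeling code on the Hub); the actual sampling entry point is model-specific, so check the GitHub repo for the supported generation interface.

```python
from transformers import AutoModel, AutoTokenizer

model_id = "Salesforce/CoDA-v0-Instruct"

# Assumes the Hub repo provides custom modeling code; verify before use.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
# Generation for diffusion LMs is not the standard `generate()` loop;
# the repo's README documents the actual sampling call.
```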
This is an automated message from Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Dream-Coder 7B: An Open Diffusion Language Model for Code (2025)
- Dream 7B: Diffusion Large Language Models (2025)
- Fast-dLLM v2: Efficient Block-Diffusion LLM (2025)
- LLaDA-MoE: A Sparse MoE Diffusion Language Model (2025)
- Sequential Diffusion Language Models (2025)
- dParallel: Learnable Parallel Decoding for dLLMs (2025)
- Set Block Decoding is a Language Model Inference Accelerator (2025)