mboehle committed on
Commit bc2cd8c · verified · 1 Parent(s): 5903417

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -14,7 +14,7 @@ base_model:
14
 
15
  ### Model Description
16
 
17
- MoshiVis is a perceptually augmented version of Moshi, giving it the ability to freely discuss images whilst maintaining its natural conversation style and low latency.
18
  To achieve this, Moshi has been extended with a visual backbone and a cross-attention mechanism to infuse the visual information into the language model.
19
  To train MoshiVis, we add a few parameters (~200M) on top of a frozen Moshi backbone (for the text/speech modeling aspect, ~7B params)
20
  and a PaliGemma2 vision encoder (for the image encoding part, ~400M parameters).
@@ -31,6 +31,8 @@ We provide the same model weights for other backends and quantization formats in
31
 
32
  ### Model Sources
33
 
 
 
34
  - **Repository:** [Github kyutai-labs/moshivis](https://github.com/kyutai-labs/moshivis)
35
  - **Demo:** [Talk to Moshi](http://vis.moshi.chat)
36
 
 
14
 
15
  ### Model Description
16
 
17
+ **MoshiVis** ([Project Page](https://kyutai.org/moshivis) | [arXiv](https://arxiv.org/abs/2503.15633)) is a perceptually augmented version of Moshi, giving it the ability to freely discuss images whilst maintaining its natural conversation style and low latency.
18
  To achieve this, Moshi has been extended with a visual backbone and a cross-attention mechanism to infuse the visual information into the language model.
19
  To train MoshiVis, we add a few parameters (~200M) on top of a frozen Moshi backbone (for the text/speech modeling aspect, ~7B params)
20
  and a PaliGemma2 vision encoder (for the image encoding part, ~400M parameters).
 
31
 
32
  ### Model Sources
33
 
34
+ - **Project Page:** [kyutai.org/moshivis](https://kyutai.org/moshivis)
35
+ - **Preprint:** [arXiv:2503.15633](https://arxiv.org/abs/2503.15633)
36
  - **Repository:** [Github kyutai-labs/moshivis](https://github.com/kyutai-labs/moshivis)
37
  - **Demo:** [Talk to Moshi](http://vis.moshi.chat)
38