Can you post the patch?
I don't have Discord, so it would be nice to see what it is.
I believe this part is sufficient to run it:
diff --git a/exllamav2/architecture.py b/exllamav2/architecture.py
index b2a6280..67db6ef 100644
--- a/exllamav2/architecture.py
+++ b/exllamav2/architecture.py
@@ -496,7 +496,7 @@ class ExLlamaV2ArchParams:
# Cohere
- if arch_string == "CohereForCausalLM":
+ if arch_string in ("CohereForCausalLM", "Cohere2ForCausalLM"):
arch_recognized = True
self.lm.layer_keys += \
layer_keys_cohere_norms + \
I took a look at the code, and I'm pretty sure that patch isn't actually doing anything functional: the changes appear to be overwritten later in the file by the dedicated if arch_string == "Cohere2ForCausalLM" branch.
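A minimal sketch of why the patch ends up being a no-op, assuming the file is structured as a sequence of independent if-branches (the class and attribute names here are simplified and hypothetical, not the real ones from exllamav2/architecture.py): when arch_string is "Cohere2ForCausalLM", both the patched Cohere branch and the later dedicated Cohere2 branch match, and the later one reassigns the same attributes, discarding whatever the patch set.

```python
# Hypothetical simplified version of the branch structure in architecture.py.
class ArchParams:
    def __init__(self, arch_string):
        self.layer_keys = []

        # Patched Cohere branch: now also matches Cohere2ForCausalLM
        if arch_string in ("CohereForCausalLM", "Cohere2ForCausalLM"):
            self.layer_keys = ["cohere_norms", "cohere_mlp"]

        # Pre-existing dedicated Cohere2 branch further down the file:
        # it matches too, and overwrites what the branch above just set.
        if arch_string == "Cohere2ForCausalLM":
            self.layer_keys = ["cohere2_norms"]

params = ArchParams("Cohere2ForCausalLM")
print(params.layer_keys)  # only the second branch's settings survive
```

If the real code follows this shape, the patch changes which branch fires first but not the final configuration, which would explain why the quants behave the same with or without it.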
I also tested the quants, and they sadly seem to be broken (repetition issues). I think exllamav2 needs more explicit handling for this model.
Ok, I have removed it from the model card. Hopefully the quants will still be fine when exllamav2 fixes command-a support.
Yeah, let's hope that turboderp can take a look, but he seems to be busy developing exllamav3. I was planning to create more quantized models, but held off because the measurement pass looked problematic, and I was concerned the resulting quants would be suboptimal even if the support gets fixed.