aifeifei798/Llama-3.1-Nemotron-Nano-8B-v1-bnb-4bit

aifeifei798 committed on Mar 24

Commit 20b9178 · verified · 1 Parent(s): 035246c

Update bit4-chat.py

Files changed (1)

bit4-chat.py +1 -1

@@ -10,7 +10,7 @@ quantization_config = BitsAndBytesConfig(
 )
 # Define the model name and path for the quantized model
-model_name = "nvidia/Llama-3.1-Nemotron-Nano-8B-v1-bnb-4bit"
+model_name = "./Llama-3.1-Nemotron-Nano-8B-v1-bnb-4bit"
 # Load the quantized model with the specified configuration
 model = AutoModelForCausalLM.from_pretrained(
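
The change above points `model_name` at a local directory instead of a Hub repository ID. `from_pretrained` accepts either form, so a script can fall back to the Hub when the local copy is missing. The following is a minimal sketch of that idea, not code from `bit4-chat.py`: the helper `resolve_model_source` is hypothetical, and the fallback ID is assumed to be this repository's own name.

```python
from pathlib import Path


def resolve_model_source(local_dir: str, hub_id: str) -> str:
    """Return local_dir if it looks like a saved model directory, else hub_id.

    Hypothetical helper for illustration; it is not part of bit4-chat.py.
    """
    path = Path(local_dir)
    # A directory written by save_pretrained() contains a config.json file.
    if path.is_dir() and (path / "config.json").is_file():
        return local_dir
    return hub_id


# Assumption: the Hub fallback is this repository's own ID.
model_name = resolve_model_source(
    "./Llama-3.1-Nemotron-Nano-8B-v1-bnb-4bit",
    "aifeifei798/Llama-3.1-Nemotron-Nano-8B-v1-bnb-4bit",
)
```

The resulting `model_name` can then be passed to `AutoModelForCausalLM.from_pretrained` exactly as in the script.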