# MythoMax-L2-13b-FP8-dynamic
Quantized (FP8 dynamic) version of [Gryphe/MythoMax-L2-13b](https://huggingface.co/Gryphe/MythoMax-L2-13b).
## Creation
This model was created with [llm-compressor](https://github.com/vllm-project/llm-compressor) (installable with `pip install llmcompressor`) by running the code snippet below.
```python
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.transformers import oneshot
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model
model_stub = "Gryphe/MythoMax-L2-13b"
model_name = model_stub.split("/")[-1]

model = AutoModelForCausalLM.from_pretrained(
    model_stub,
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_stub)

# Configure the quantization algorithm and scheme:
# FP8 weights with per-token dynamic FP8 activations,
# applied to all Linear layers except the output head.
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",
    ignore=["lm_head"],
)

# Apply quantization
oneshot(
    model=model,
    recipe=recipe,
)

# Save to disk in compressed-tensors format
save_path = model_name + "-FP8-dynamic"
model.generation_config.do_sample = True  # keep sampling enabled in the saved generation config
model.save_pretrained(save_path)
tokenizer.save_pretrained(save_path)
print(f"Model and tokenizer saved to: {save_path}")
```