---
license: mit
---

# Gryphe/MythoMax-L2-13b

Quantized version of [Gryphe/MythoMax-L2-13b](https://huggingface.co/Gryphe/MythoMax-L2-13b). Linear layers are quantized to FP8 with dynamic per-token activation scales, leaving `lm_head` in its original precision.

## Creation

This model was created with [llm-compressor](https://github.com/vllm-project/llm-compressor) by running the code snippet below.

```python
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.transformers import oneshot
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer
model_stub = "Gryphe/MythoMax-L2-13b"
model_name = model_stub.split("/")[-1]

model = AutoModelForCausalLM.from_pretrained(
    model_stub,
    torch_dtype="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_stub)

# Configure the quantization algorithm and scheme:
# FP8 weights plus dynamic per-token FP8 activations, skipping lm_head
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",
    ignore=["lm_head"],
)

# Apply quantization (FP8_DYNAMIC requires no calibration data)
oneshot(
    model=model,
    recipe=recipe,
)

# Save to disk in compressed-tensors format
save_path = model_name + "-FP8-dynamic"
model.generation_config.do_sample = True
model.save_pretrained(save_path)
tokenizer.save_pretrained(save_path)
print(f"Model and tokenizer saved to: {save_path}")
```
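
## Use with vLLM

Checkpoints in compressed-tensors format are designed to be served with [vLLM](https://github.com/vllm-project/vllm), which reads the quantization scheme from the saved config. The sketch below is illustrative rather than part of the original recipe; it assumes the local `save_path` directory produced by the snippet above (`MythoMax-L2-13b-FP8-dynamic`) and vLLM's offline `LLM` API.

```python
from vllm import LLM, SamplingParams

# Load the locally saved compressed-tensors checkpoint; vLLM picks up
# the FP8 quantization scheme from the model's config automatically.
llm = LLM(model="MythoMax-L2-13b-FP8-dynamic")

sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)

outputs = llm.generate(["Tell me a short story about a gryphon."], sampling_params)
print(outputs[0].outputs[0].text)
```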