---
license: apache-2.0
base_model:
- nvidia/Mistral-NeMo-Minitron-8B-Instruct
tags:
- OpenArc
- OpenVINO
- Intel
---
My project OpenArc, an inference engine for OpenVINO, now supports this model and serves inference over OpenAI-compatible endpoints for text-to-text and text-with-vision!
We have a growing Discord community of people interested in using Intel hardware for AI/ML.
- Find documentation on the Optimum-CLI export process here
- Use my HF space Echo9Zulu/Optimum-CLI-Tool_tool to build conversion commands and run them locally
This repo contains OpenVINO quantizations of nvidia/Mistral-NeMo-Minitron-8B-Instruct.
I recommend starting with Mistral-NeMo-Minitron-8B-Instruct-int4_asym-awq-se-ov.
To download individual models from this repo, use the provided snippet:

```python
from huggingface_hub import snapshot_download

repo_id = "Echo9Zulu/Mistral-NeMo-Minitron-8B-Instruct-OpenVINO"

# Choose the weights you want
repo_directory = "Mistral-NeMo-Minitron-8B-Instruct-int4_asym-awq-se-ov"

# Where you want to save it
local_dir = "./Echo9Zulu_Mistral-NeMo-Minitron-8B-Instruct/Mistral-NeMo-Minitron-8B-Instruct-int4_asym-awq-se-ov"

snapshot_download(
    repo_id=repo_id,
    allow_patterns=[f"{repo_directory}/*"],
    local_dir=local_dir,
    local_dir_use_symlinks=True,
)

print("Download complete!")
```
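Once downloaded, the OpenVINO weights can be loaded for local inference with `optimum-intel`. A minimal sketch, assuming `optimum-intel` (with the OpenVINO extra) and `transformers` are installed; the path reuses the `local_dir` from the download snippet above, and the existence check simply skips loading if the weights have not been downloaded yet:

```python
from pathlib import Path

# Same path as local_dir in the download snippet above
model_dir = Path(
    "./Echo9Zulu_Mistral-NeMo-Minitron-8B-Instruct/"
    "Mistral-NeMo-Minitron-8B-Instruct-int4_asym-awq-se-ov"
)

if model_dir.exists():
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    # device can be "CPU", "GPU", or another OpenVINO device string
    model = OVModelForCausalLM.from_pretrained(model_dir, device="CPU")
    tokenizer = AutoTokenizer.from_pretrained(model_dir)

    inputs = tokenizer("Hello, how are you?", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
else:
    print(f"Model not found at {model_dir}; run the download snippet first.")
```

For serving over OpenAI-compatible endpoints instead of scripting directly, see OpenArc as described above.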