A comprehensive system for evaluating local language models using standardized prompts and generating detailed markdown reports.
inference.py - Simple inference function: text in, text out
eval_prompts.json - A set of prompts used to evaluate models
run_eval.py - Uses inference.py and eval_prompts.json to run the evaluation on all local models and save responses to markdown
requirements.txt - Python dependencies
myenv/ - Python virtual environment

Activate the virtual environment:
source myenv/bin/activate
Install dependencies (if not already installed):
pip install -r requirements.txt
Run evaluation on all models:
python run_eval.py
Test with a single model:
python inference.py
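The structure of eval_prompts.json is not shown here; assuming it is a flat JSON list of prompt strings (an assumption, not a confirmed schema), you can inspect it before running the full evaluation:

import json

# Load the evaluation prompts.
# Assumption: eval_prompts.json is a flat list of prompt strings;
# adjust if the actual file nests prompts under categories.
with open("eval_prompts.json", "r", encoding="utf-8") as f:
    prompts = json.load(f)

print(f"Loaded {len(prompts)} prompts")
for prompt in prompts[:3]:
    print("-", prompt)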
Models should be in Hugging Face format with these files:
config.json
model.safetensors
tokenizer.json
vocab.json

Example usage from Python:

from inference import ModelInference, get_local_models
# Find all models
models = get_local_models()
print(f"Found {len(models)} models")
# Quick inference
from inference import simple_inference
result = simple_inference(models[0], "What is AI?", max_length=256)
print(result)
# Advanced usage
inference = ModelInference(models[0])
if inference.load_model():
    response = inference.generate_text("Explain Python", max_length=512, temperature=0.7)
    print(response)
    inference.unload_model()
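For reference, a minimal sketch of the evaluation loop that run_eval.py performs, built only from the API shown above and assuming eval_prompts.json is a flat list of prompt strings (the actual script may differ in report layout and generation settings):

import json
from inference import ModelInference, get_local_models

# Assumption: flat list of prompt strings.
with open("eval_prompts.json", "r", encoding="utf-8") as f:
    prompts = json.load(f)

lines = ["# Evaluation Results", ""]
for model_path in get_local_models():
    inference = ModelInference(model_path)
    if not inference.load_model():
        continue  # skip models that fail to load
    lines.append(f"## {model_path}")
    for prompt in prompts:
        response = inference.generate_text(prompt, max_length=512, temperature=0.7)
        lines.extend(["", f"**Prompt:** {prompt}", "", response])
    inference.unload_model()

# Write the markdown report.
with open("evaluation_results.md", "w", encoding="utf-8") as f:
    f.write("\n".join(lines))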
The evaluation generates a markdown report (evaluation_results.md) with: