feel-fl/feel-feedback
Viewer
•
Updated
•
1.22k
•
285
•
2
Human Feedback and LLMs
llama-server -hf unsloth/Devstral-Small-2505-GGUF:Q4_K_M
.continue/models/llama-max.yaml
file in your project to tell Continue how to use the local Ollama model.name: Llama.cpp model
version: 0.0.1
schema: v1
models:
- provider: llama.cpp
model: unsloth/Devstral-Small-2505-GGUF
apiBase: http://localhost:8080
defaultCompletionOptions:
contextLength: 8192
# Adjust based on the model
name: Llama.cpp Devstral-Small
roles:
- chat
- edit
.continue/mcpServers/playwright-mcp.yaml
file to integrate a tool, like the Playwright browser automation tool, with your assistant.name: Playwright mcpServer
version: 0.0.1
schema: v1
mcpServers:
- name: Browser search
command: npx
args:
- "@playwright/mcp@latest"
git+https://github.com/huggingface/transformers@main
git+https://github.com/huggingface/trl.git@main
bitsandbytes
peft
--no-deps
git+https://github.com/unslothai/unsloth-zoo.git@nightly
git+https://github.com/unslothai/unsloth.git@nightly
from trl import GRPOConfig
training_args = GRPOConfig(
learning_rate = 5e-6,
adam_beta1 = 0.9,
adam_beta2 = 0.99,
weight_decay = 0.1,
warmup_ratio = 0.1,
lr_scheduler_type = "cosine",
optim = "adamw_8bit",
logging_steps = 1,
per_device_train_batch_size = 2,
gradient_accumulation_steps = 1,
num_generations = 2,
max_prompt_length = 256,
max_completion_length = 1024 - 256,
num_train_epochs = 1,
max_steps = 250,
save_steps = 250,
max_grad_norm = 0.1,
report_to = "none",
)
from transformers import AutoModelForImageTextToText
model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it)