|
--- |
|
license: apache-2.0 |
|
base_model: Qwen/Qwen3-8B |
|
tags: |
|
- lora |
|
- qwen3 |
|
- devops |
|
- kubernetes |
|
- docker |
|
- sre |
|
- infrastructure |
|
- peft |
|
- ci-cd |
|
- automation |
|
- troubleshooting |
|
- github-actions |
|
- production-ready |
|
library_name: peft |
|
pipeline_tag: text-generation |
|
language: |
|
- en |
|
datasets: |
|
- devops |
|
- stackoverflow |
|
- kubernetes |
|
- docker |
|
model-index:
  - name: qwen-devops-foundation-lora
    results:
      - task:
          type: text-generation
          name: DevOps Question Answering
        dataset:
          type: devops-evaluation
          name: DevOps Expert Evaluation
        metrics:
          - type: accuracy
            value: 0.60
            name: Overall DevOps Accuracy
          - type: speed
            value: 40.4
            name: Average Response Time (seconds)
          - type: specialization
            value: 6.0
            name: DevOps Relevance Score (0-10)
|
--- |
|
|
|
# Qwen DevOps Foundation Model - LoRA Adapter |
|
|
|
This is a LoRA (Low-Rank Adaptation) adapter for the Qwen3-8B model, fine-tuned on DevOps-related datasets. The model excels at CI/CD pipeline guidance, Docker security practices, and DevOps troubleshooting with **26% faster inference** than the base model. |
|
|
|
## **Performance Highlights**
|
|
|
- **Overall Score**: 0.60/1.00 (GOOD); ready for production DevOps assistance
- **Speed**: 26% faster than base Qwen3-8B (40.4s vs. 55.1s average response time)
- **Specialization**: Focused DevOps expertise with practical, actionable guidance
- **Compatibility**: Optimized for local deployment (requires ~21GB RAM)
|
|
|
## Model Details
|
|
|
- **Base Model**: `Qwen/Qwen3-8B` |
|
- **Training Method**: LoRA fine-tuning |
|
- **Hardware**: 4x NVIDIA L40S GPUs |
|
- **Training Checkpoint**: 400 |
|
- **Training Date**: 2025-08-07 |
|
- **Training Duration**: ~3 hours |
|
|
|
## Quick Start
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "AMaslovskyi/qwen-devops-foundation-lora")

# Use the model
prompt = "How do I deploy a Kubernetes cluster?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
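Qwen3-8B is an instruction-tuned chat model, so responses are generally better when the prompt is wrapped in the tokenizer's chat template rather than passed as a raw string. A minimal sketch (the generation settings here are illustrative):

```python
# Build the prompt with the model's chat template instead of a raw string
messages = [
    {"role": "system", "content": "You are a DevOps expert. Provide practical, actionable advice."},
    {"role": "user", "content": "How do I deploy a Kubernetes cluster?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn header
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=200, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens so the prompt is not echoed back
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```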
|
|
|
## **Comprehensive Evaluation Results**
|
|
|
### **DevOps Expertise Breakdown**
|
|
|
| **Category**               | **Score** | **Rating** | **Comments**                                             |
| -------------------------- | --------- | ---------- | -------------------------------------------------------- |
| **CI/CD Pipelines**        | 1.00      | **Perfect** | Complete GitHub Actions mastery, build automation        |
| **Docker Security**        | 0.75      | **Strong**  | Production security practices, container optimization    |
| **Troubleshooting**        | 0.75      | **Strong**  | Systematic debugging, log analysis, event investigation  |
| **Kubernetes Deployment**  | 0.25      | Needs Work  | Limited deployment strategies, service configuration     |
| **Infrastructure as Code** | 0.25      | Needs Work  | Basic IaC concepts; needs deeper Terraform/Ansible coverage |
|
|
|
### **Performance vs Base Qwen3-8B**
|
|
|
| **Metric**           | **Fine-tuned Model** | **Base Qwen3-8B** | **Improvement**    |
| -------------------- | -------------------- | ----------------- | ------------------ |
| **Response Time**    | 40.4s                | 55.1s             | **+26% faster**    |
| **DevOps Relevance** | 6.0/10               | 6.8/10            | Specialized focus  |
| **Specialization**   | High                 | General           | **DevOps-focused** |
|
|
### **System Requirements**
|
|
|
- **Minimum RAM**: 21GB (base model + LoRA adapter + working memory) |
|
- **Recommended**: 48GB+ for optimal performance |
|
- **Storage**: 182MB (LoRA adapter only) + 16GB (base model) |
|
- **GPU**: Optional; runs on CPU (optimized for Apple Silicon and x86), though a GPU speeds up inference
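If 21GB of RAM is out of reach, 4-bit quantization can substantially shrink the footprint. A sketch assuming a CUDA GPU and the `bitsandbytes` package (neither is required by the recipes in this card, and the quality impact of quantization was not part of the evaluation):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load the base model in 4-bit to cut memory use (requires a CUDA GPU + bitsandbytes)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = PeftModel.from_pretrained(base_model, "AMaslovskyi/qwen-devops-foundation-lora")
```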
|
|
|
### **Strengths & Use Cases**
|
|
|
**Excellent Performance:**
|
- CI/CD pipeline setup and optimization |
|
- GitHub Actions workflow development |
|
- Build automation and deployment strategies |
|
|
|
**Strong Performance:**
|
- Docker production security practices |
|
- Container vulnerability management |
|
- Kubernetes troubleshooting and debugging |
|
- DevOps incident response procedures |
|
|
|
**Ideal For:**
|
- DevOps team assistance and mentoring |
|
- CI/CD pipeline guidance and automation |
|
- Docker security consultations |
|
- Infrastructure troubleshooting support |
|
- Developer training and knowledge sharing |
|
|
|
### **Areas for Enhancement**
|
|
|
- **Kubernetes Deployments**: Consider supplementing with official K8s documentation |
|
- **Infrastructure as Code**: Best paired with Terraform/Ansible resources |
|
- **Complex Multi-cloud**: May need additional context for advanced scenarios |
|
|
|
## Training Data
|
|
|
This model was trained on DevOps-related datasets including: |
|
- Stack Overflow DevOps questions and answers |
|
- Docker commands and configurations |
|
- Kubernetes deployment guides |
|
- Infrastructure as Code examples |
|
- SRE incident response procedures |
|
- CI/CD pipeline configurations |
|
|
|
## Model Architecture
|
|
|
- **LoRA Rank**: 16 |
|
- **LoRA Alpha**: 32 |
|
- **Target Modules**: All linear layers |
|
- **Trainable Parameters**: ~43M (0.53% of base model) |
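The shipped `adapter_config.json` is authoritative, but a PEFT configuration with these hyperparameters would look roughly like the sketch below (`lora_dropout` is an assumption; it is not stated in this card):

```python
from peft import LoraConfig

# Hypothetical reconstruction of the training-time adapter configuration;
# consult the shipped adapter_config.json for the exact values.
lora_config = LoraConfig(
    r=16,                         # LoRA rank
    lora_alpha=32,                # scaling factor
    target_modules="all-linear",  # adapt all linear layers
    lora_dropout=0.05,            # assumption: not stated in this card
    task_type="CAUSAL_LM",
)
```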
|
|
|
## **Production Deployment**
|
|
|
### **Local Deployment (Recommended)**
|
|
|
Perfect for personal use or small teams with sufficient hardware: |
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Optimized for local deployment
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    torch_dtype=torch.float16,
    device_map="cpu",  # Use "auto" if you have a GPU
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = PeftModel.from_pretrained(base_model, "AMaslovskyi/qwen-devops-foundation-lora")

# DevOps-optimized generation
def ask_devops_expert(question):
    # Build a ChatML-style prompt (Qwen's chat format)
    prompt = (
        "<|im_start|>system\nYou are a DevOps expert. Provide practical, actionable advice.<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    # Decode only the newly generated tokens so the prompt is not echoed back
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

# Example usage
print(ask_devops_expert("How do I set up a CI/CD pipeline with GitHub Actions?"))
```
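For sustained serving it can also be worth merging the adapter into the base weights so inference runs without the PEFT indirection. A short sketch using PEFT's `merge_and_unload` (the output directory name is illustrative):

```python
# Fold the LoRA weights into the base model and save an adapter-free copy
merged_model = model.merge_and_unload()
merged_model.save_pretrained("qwen-devops-merged")  # output path is illustrative
tokenizer.save_pretrained("qwen-devops-merged")
```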
|
|
|
### **Cloud Deployment Options**
|
|
|
**Docker Container:** |
|
```dockerfile
FROM python:3.11-slim
RUN pip install torch transformers peft
# Copy your inference script into the image
COPY inference_server.py .
CMD ["python", "inference_server.py"]
```
|
|
|
**API Server:** |
|
- FastAPI-based inference server included in evaluation suite |
|
- Kubernetes deployment manifests available |
|
- Auto-scaling and load balancing support |
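The server itself ships with the evaluation suite; purely as an illustration, a minimal FastAPI endpoint wrapping the `ask_devops_expert` helper from the local-deployment example might look like this (the route names and request shape are assumptions, not the suite's actual API):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Question(BaseModel):
    question: str

@app.post("/ask")  # route name is an assumption
def ask(q: Question):
    # ask_devops_expert is the helper from the local-deployment example above
    return {"answer": ask_devops_expert(q.question)}

@app.get("/health")  # simple liveness probe for health monitoring
def health():
    return {"status": "ok"}
```

Run with, e.g., `uvicorn inference_server:app --host 0.0.0.0 --port 8000`.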
|
|
|
### **Production Readiness: Nearly Ready**
|
|
|
**Ready For:**
|
- Internal DevOps team assistance |
|
- CI/CD pipeline guidance |
|
- Docker security consultations |
|
- Developer training and mentoring |
|
|
|
**Monitor For:**
|
- Complex Kubernetes deployments |
|
- Advanced Infrastructure as Code |
|
- Multi-cloud architecture decisions |
|
|
|
## Files Included
|
|
|
- `adapter_model.safetensors`: LoRA adapter weights (main model file) |
|
- `adapter_config.json`: LoRA configuration parameters |
|
- `tokenizer.json`: Fast tokenizer configuration |
|
- `tokenizer_config.json`: Tokenizer settings and parameters |
|
- `special_tokens_map.json`: Special token mappings |
|
- `vocab.json`: Vocabulary mapping |
|
- `merges.txt`: BPE merge rules |
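To inspect the adapter's configuration without pulling the 16GB base model, PEFT can read `adapter_config.json` directly; a small sketch:

```python
from peft import PeftConfig

# Downloads only the adapter metadata, not the base model weights
config = PeftConfig.from_pretrained("AMaslovskyi/qwen-devops-foundation-lora")
print(config.base_model_name_or_path, config.r, config.lora_alpha)
```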
|
|
|
## License
|
|
|
Apache 2.0 |
|
|
|
## **Evaluation & Testing**
|
|
|
This model has been comprehensively evaluated across 21 DevOps scenarios with: |
|
- **5-question quick assessment**: Fast performance validation |
|
- **Comprehensive evaluation suite**: 7 DevOps categories tested |
|
- **Comparative analysis**: Side-by-side testing with base Qwen3-8B |
|
- **System compatibility testing**: Hardware requirement analysis |
|
- **Production readiness assessment**: Deployment recommendations |
|
|
|
**Evaluation Tools Available:** |
|
- Automated testing scripts |
|
- Performance benchmarking suite |
|
- Interactive chat interface |
|
- API server with health monitoring |
|
|
|
## **Example Conversations**
|
|
|
**CI/CD Pipeline Setup:** |
|
``` |
|
User: How do I set up a CI/CD pipeline with GitHub Actions? |
|
Model: I'll help you set up a complete CI/CD pipeline with GitHub Actions... |
|
[Provides step-by-step workflow configuration, testing stages, deployment automation] |
|
``` |
|
|
|
**Docker Security:** |
|
``` |
|
User: What are Docker security best practices for production? |
|
Model: Here are the essential Docker security practices for production environments... |
|
[Covers non-root users, image scanning, minimal base images, secrets management] |
|
``` |
|
|
|
**Troubleshooting:** |
|
``` |
|
User: My Kubernetes pod is stuck in Pending state. How do I troubleshoot? |
|
Model: Let's systematically troubleshoot your pod scheduling issue... |
|
[Provides kubectl commands, event analysis, resource checking steps] |
|
``` |
|
|
|
## **Related Resources**
|
|
|
- **Training Space**: [HuggingFace Space](https://huggingface.co/spaces/AMaslovskyi/qwen-devops-training)
- **Evaluation Suite**: Comprehensive testing tools and results
- **Deployment Scripts**: Ready-to-use inference servers and Docker configs
- **Documentation**: Detailed usage guides and best practices
|
|
|
## Acknowledgments
|
|
|
- Base model: [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) by Alibaba Cloud |
|
- Training infrastructure: HuggingFace Spaces (4x L40S GPUs) |
|
- Training framework: Transformers + PEFT |
|
- Evaluation: Comprehensive DevOps testing suite (21+ scenarios) |
|
|