---
base_model:
- meta-llama/Llama-3.2-3B-Instruct
- codellama/CodeLlama-7b-Instruct-hf
- codellama/CodeLlama-13b-Instruct-hf
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- transformers
- configuration-management
- secrets-management
- devops
- multi-cloud
- gguf
- anysecret
license: mit
language:
- en
---

# AnySecret Assistant - Multi-Model Collection

A specialized AI assistant collection for AnySecret configuration management, available in multiple sizes and formats optimized for different use cases and deployment scenarios.

## 🚀 Available Models

| Model | Base Model | Parameters | Format | Best For | Memory |
|-------|------------|------------|--------|----------|--------|
| **3B** | Llama-3.2-3B-Instruct | 3B | PyTorch/GGUF | Fast responses, edge deployment | 4-6GB |
| **7B** | CodeLlama-7B-Instruct | 7B | PyTorch/GGUF | Balanced performance, code focus | 8-12GB |
| **13B** | CodeLlama-13B-Instruct | 13B | PyTorch/GGUF | Highest quality, complex queries | 16-24GB |

### Model Variants

#### PyTorch Models (LoRA Adapters)
- `anysecret-io/anysecret-assistant/3B/` - Llama-3.2-3B base
- `anysecret-io/anysecret-assistant/7B/` - CodeLlama-7B base  
- `anysecret-io/anysecret-assistant/13B/` - CodeLlama-13B base

#### GGUF Models (Quantized)
- `anysecret-io/anysecret-assistant/3B-GGUF/` - Q4_K_M, Q8_0 formats
- `anysecret-io/anysecret-assistant/7B-GGUF/` - Q4_K_M, Q8_0 formats
- `anysecret-io/anysecret-assistant/13B-GGUF/` - Q4_K_M, Q8_0 formats

## 🎯 Model Description

These models are fine-tuned specifically to assist with AnySecret configuration management across AWS, GCP, Azure, and Kubernetes environments. Each model can help with CLI commands, configuration setup, CI/CD integration, and Python SDK usage.

- **Developed by:** anysecret-io
- **Model type:** Causal Language Model (LoRA Adapters + GGUF)
- **Language(s):** English
- **License:** MIT
- **Specialized for:** Multi-cloud secrets and configuration management

## 📦 Quick Start

### Option 1: Using Ollama (Recommended for GGUF)

```bash
# 7B model (balanced performance)
ollama pull anysecret-io/anysecret-assistant/7B-GGUF
ollama run anysecret-io/anysecret-assistant/7B-GGUF

# 13B model (best quality)
ollama pull anysecret-io/anysecret-assistant/13B-GGUF
ollama run anysecret-io/anysecret-assistant/13B-GGUF
```

### Option 2: Using Transformers (PyTorch)

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Choose your model size (3B/7B/13B)
model_size = "7B"  # or "3B", "13B"
base_models = {
    "3B": "meta-llama/Llama-3.2-3B-Instruct",
    "7B": "codellama/CodeLlama-7b-Instruct-hf",
    "13B": "codellama/CodeLlama-13b-Instruct-hf"
}

base_model_name = base_models[model_size]
adapter_path = f"anysecret-io/anysecret-assistant/{model_size}"

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_path)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Generate response
def ask_anysecret(question):
    prompt = f"### Instruction:\n{question}\n\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1)
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("### Response:\n")[-1].strip()

# Example usage
print(ask_anysecret("How do I configure AnySecret for AWS?"))
```

### Option 3: Using llama.cpp (GGUF)

```bash
# Download GGUF model
wget https://huggingface.co/anysecret-io/anysecret-assistant/resolve/main/7B-GGUF/anysecret-7b-q4_k_m.gguf

# Run with llama.cpp
./llama-server -m anysecret-7b-q4_k_m.gguf --port 8080
```

## 🎯 Use Cases

### Direct Use

All models are designed to provide expert assistance with:

- **AnySecret CLI** - Commands, usage patterns, troubleshooting
- **Multi-cloud Configuration** - AWS Secrets Manager, GCP Secret Manager, Azure Key Vault
- **Kubernetes Integration** - Secrets, ConfigMaps, operators
- **CI/CD Pipelines** - GitHub Actions, Jenkins, GitLab CI
- **Python SDK** - Implementation guidance, best practices
- **Security Patterns** - Secret rotation, access controls, compliance

### Example Queries

```
"How do I set up AnySecret with AWS Secrets Manager?"
"Show me how to use anysecret in a GitHub Actions workflow"
"How do I rotate secrets across multiple cloud providers?"
"What's the difference between storing secrets vs parameters?"
"How do I configure AnySecret for a Kubernetes deployment?"
```

## 🏗️ Training Details

### Training Data

Models were trained on **150+ curated examples** across 7 categories:
- **CLI Commands** (25 examples) - Command usage and patterns
- **AWS Configuration** (25 examples) - Secrets Manager integration
- **GCP Configuration** (25 examples) - Secret Manager setup
- **Azure Configuration** (25 examples) - Key Vault integration  
- **Kubernetes** (25 examples) - Secrets and ConfigMaps
- **CI/CD Integration** (15 examples) - Pipeline workflows
- **Python Integration** (10 examples) - SDK usage patterns

### Training Configuration

#### Hyperparameters
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Learning Rate:** 2e-4
- **Batch Size:** 1 (with gradient accumulation)
- **Epochs:** 2-3
- **Precision:** fp16 mixed precision with 4-bit quantization

#### Target Modules
- **Llama-3.2 (3B):** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **CodeLlama (7B/13B):** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

## 🔧 Model Selection Guide

### Choose 3B if you need:
- ✅ Fast inference (< 1 second)
- ✅ Low memory usage (4-6GB)
- ✅ Edge deployment
- ✅ Basic AnySecret queries

### Choose 7B if you need:
- ✅ Balanced performance/speed
- ✅ Better code understanding
- ✅ Moderate memory (8-12GB)
- ✅ Complex configuration queries

### Choose 13B if you need:
- ✅ Highest quality responses
- ✅ Complex multi-step guidance
- ✅ Advanced troubleshooting
- ✅ Production deployments

## 🚀 Deployment Options

### Local Development
- **GGUF + Ollama:** Easiest setup, good performance
- **PyTorch + GPU:** Best quality, requires CUDA

### Production Deployment
- **Docker + llama.cpp:** Scalable, CPU/GPU support
- **Kubernetes:** Auto-scaling, load balancing
- **Cloud APIs:** Serverless, pay-per-use

### Memory Requirements

| Model | GGUF Q4_K_M | GGUF Q8_0 | PyTorch FP16 |
|-------|-------------|-----------|--------------|
| 3B    | 2.3GB       | 3.2GB     | 6GB          |
| 7B    | 4.1GB       | 7.2GB     | 14GB         |
| 13B   | 7.8GB       | 13.8GB    | 26GB         |

## 📚 Model Sources

- **Repository:** https://github.com/anysecret-io/anysecret-lib
- **Documentation:** https://docs.anysecret.io
- **Training Code:** https://github.com/anysecret-io/anysecret-llm
- **Website:** https://anysecret.io

## 🔍 Framework Versions

- **PEFT:** 0.17.1+
- **Transformers:** 4.35.0+
- **PyTorch:** 2.0.0+
- **llama.cpp:** Latest
- **Ollama:** 0.1.0+

## 📊 Performance Benchmarks

| Model | Tokens/sec | Quality Score | Memory (GGUF Q4) |
|-------|------------|---------------|------------------|
| 3B    | ~45        | 7.2/10        | 2.3GB           |
| 7B    | ~25        | 8.5/10        | 4.1GB           |
| 13B   | ~15        | 9.1/10        | 7.8GB           |

*Benchmarks run on RTX 3090 with GGUF Q4_K_M quantization*

## ⚖️ License

MIT License - See individual model folders for specific license details.

---

For support, visit our [GitHub Issues](https://github.com/anysecret-io/anysecret-lib/issues) or [Documentation](https://docs.anysecret.io).