---
base_model:
- meta-llama/Llama-3.2-3B-Instruct
- codellama/CodeLlama-7b-Instruct-hf
- codellama/CodeLlama-13b-Instruct-hf
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- transformers
- configuration-management
- secrets-management
- devops
- multi-cloud
- gguf
- anysecret
license: mit
language:
- en
---
# AnySecret Assistant - Multi-Model Collection
A specialized AI assistant collection for AnySecret configuration management, available in multiple sizes and formats optimized for different use cases and deployment scenarios.
## πŸš€ Available Models
| Model | Base Model | Parameters | Format | Best For | Memory |
|-------|------------|------------|--------|----------|--------|
| **3B** | Llama-3.2-3B-Instruct | 3B | PyTorch/GGUF | Fast responses, edge deployment | 4-6GB |
| **7B** | CodeLlama-7B-Instruct | 7B | PyTorch/GGUF | Balanced performance, code focus | 8-12GB |
| **13B** | CodeLlama-13B-Instruct | 13B | PyTorch/GGUF | Highest quality, complex queries | 16-24GB |
### Model Variants
#### PyTorch Models (LoRA Adapters)
- `anysecret-io/anysecret-assistant/3B/` - Llama-3.2-3B base
- `anysecret-io/anysecret-assistant/7B/` - CodeLlama-7B base
- `anysecret-io/anysecret-assistant/13B/` - CodeLlama-13B base
#### GGUF Models (Quantized)
- `anysecret-io/anysecret-assistant/3B-GGUF/` - Q4_K_M, Q8_0 formats
- `anysecret-io/anysecret-assistant/7B-GGUF/` - Q4_K_M, Q8_0 formats
- `anysecret-io/anysecret-assistant/13B-GGUF/` - Q4_K_M, Q8_0 formats
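For the GGUF variants, a single quantized file can be fetched with `huggingface_hub` instead of cloning the whole repository. A minimal sketch, assuming the `7B-GGUF` folder contains the `anysecret-7b-q4_k_m.gguf` file referenced in the llama.cpp example below:

```python
from huggingface_hub import hf_hub_download

# Download one quantized file from the multi-model repo.
# The filename assumes the 7B-GGUF folder layout shown above.
gguf_path = hf_hub_download(
    repo_id="anysecret-io/anysecret-assistant",
    filename="7B-GGUF/anysecret-7b-q4_k_m.gguf",
)
print(gguf_path)  # local cache path, ready for llama.cpp or Ollama
```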
## 🎯 Model Description
These models are fine-tuned specifically to assist with AnySecret configuration management across AWS, GCP, Azure, and Kubernetes environments. Each model can help with CLI commands, configuration setup, CI/CD integration, and Python SDK usage.
- **Developed by:** anysecret-io
- **Model type:** Causal Language Model (LoRA Adapters + GGUF)
- **Language(s):** English
- **License:** MIT
- **Specialized for:** Multi-cloud secrets and configuration management
## πŸ“¦ Quick Start
### Option 1: Using Ollama (Recommended for GGUF)
```bash
# 7B model (balanced performance)
ollama pull anysecret-io/anysecret-assistant/7B-GGUF
ollama run anysecret-io/anysecret-assistant/7B-GGUF

# 13B model (best quality)
ollama pull anysecret-io/anysecret-assistant/13B-GGUF
ollama run anysecret-io/anysecret-assistant/13B-GGUF
```
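Once the model is running under Ollama, it can also be queried programmatically through Ollama's local REST API (port 11434 by default). A minimal sketch; the `model` value must match whatever tag `ollama pull` registered locally:

```python
import requests

# Ollama's default local endpoint; "stream": False returns one JSON object
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "anysecret-io/anysecret-assistant/7B-GGUF",
        "prompt": "How do I set up AnySecret with AWS Secrets Manager?",
        "stream": False,
    },
)
print(resp.json()["response"])
```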
### Option 2: Using Transformers (PyTorch)
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Choose your model size (3B/7B/13B)
model_size = "7B"  # or "3B", "13B"
base_models = {
    "3B": "meta-llama/Llama-3.2-3B-Instruct",
    "7B": "codellama/CodeLlama-7b-Instruct-hf",
    "13B": "codellama/CodeLlama-13b-Instruct-hf",
}
base_model_name = base_models[model_size]
adapter_path = f"anysecret-io/anysecret-assistant/{model_size}"

# Load the base model and apply the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_path)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Generate a response
def ask_anysecret(question):
    prompt = f"### Instruction:\n{question}\n\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.1,
    )
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("### Response:\n")[-1].strip()

# Example usage
print(ask_anysecret("How do I configure AnySecret for AWS?"))
```
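If the FP16 footprint in the memory table below is too large, the same adapters can be loaded on top of a 4-bit quantized base model via `bitsandbytes` (the training itself used 4-bit quantization, per the hyperparameters below). A sketch of the quantized load; the rest of the snippet above stays unchanged:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# NF4 4-bit quantization roughly quarters the base model's memory use
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,  # from the snippet above
    quantization_config=bnb_config,
    device_map="auto",
)
```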
### Option 3: Using llama.cpp (GGUF)
```bash
# Download GGUF model
wget https://huggingface.co/anysecret-io/anysecret-assistant/resolve/main/7B-GGUF/anysecret-7b-q4_k_m.gguf

# Run with llama.cpp
./llama-server -m anysecret-7b-q4_k_m.gguf --port 8080
```
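Recent `llama-server` builds expose an OpenAI-compatible endpoint, so the running server can be queried over plain HTTP. A minimal sketch against the port used above:

```python
import requests

# llama-server serves an OpenAI-compatible chat completions endpoint
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "How do I configure AnySecret for AWS?"}
        ],
        "max_tokens": 256,
        "temperature": 0.1,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```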
## 🎯 Use Cases
### Direct Use
All models are designed to provide expert assistance with:
- **AnySecret CLI** - Commands, usage patterns, troubleshooting
- **Multi-cloud Configuration** - AWS Secrets Manager, GCP Secret Manager, Azure Key Vault
- **Kubernetes Integration** - Secrets, ConfigMaps, operators
- **CI/CD Pipelines** - GitHub Actions, Jenkins, GitLab CI
- **Python SDK** - Implementation guidance, best practices
- **Security Patterns** - Secret rotation, access controls, compliance
### Example Queries
```
"How do I set up AnySecret with AWS Secrets Manager?"
"Show me how to use anysecret in a GitHub Actions workflow"
"How do I rotate secrets across multiple cloud providers?"
"What's the difference between storing secrets vs parameters?"
"How do I configure AnySecret for a Kubernetes deployment?"
```
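Each of these can be fed straight to the `ask_anysecret` helper defined in Option 2 above, e.g.:

```python
queries = [
    "How do I set up AnySecret with AWS Secrets Manager?",
    "Show me how to use anysecret in a GitHub Actions workflow",
    "How do I rotate secrets across multiple cloud providers?",
]
for q in queries:
    print(f"Q: {q}\nA: {ask_anysecret(q)}\n")
```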
## πŸ—οΈ Training Details
### Training Data
Models were trained on **150+ curated examples** across 7 categories:
- **CLI Commands** (25 examples) - Command usage and patterns
- **AWS Configuration** (25 examples) - Secrets Manager integration
- **GCP Configuration** (25 examples) - Secret Manager setup
- **Azure Configuration** (25 examples) - Key Vault integration
- **Kubernetes** (25 examples) - Secrets and ConfigMaps
- **CI/CD Integration** (15 examples) - Pipeline workflows
- **Python Integration** (10 examples) - SDK usage patterns
### Training Configuration
#### Hyperparameters
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Learning Rate:** 2e-4
- **Batch Size:** 1 (with gradient accumulation)
- **Epochs:** 2-3
- **Precision:** fp16 mixed precision with 4-bit quantization
#### Target Modules
- **Llama-3.2 (3B):** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **CodeLlama (7B/13B):** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
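Expressed as a PEFT configuration, the settings above correspond to roughly the following (a sketch; values the card does not state, such as dropout, are omitted):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,            # LoRA rank
    lora_alpha=32,   # LoRA alpha
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```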
## πŸ”§ Model Selection Guide
### Choose 3B if you need:
- βœ… Fast inference (< 1 second)
- βœ… Low memory usage (4-6GB)
- βœ… Edge deployment
- βœ… Basic AnySecret queries
### Choose 7B if you need:
- βœ… Balanced performance/speed
- βœ… Better code understanding
- βœ… Moderate memory (8-12GB)
- βœ… Complex configuration queries
### Choose 13B if you need:
- βœ… Highest quality responses
- βœ… Complex multi-step guidance
- βœ… Advanced troubleshooting
- βœ… Production deployments
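As a rule of thumb, the choice mostly reduces to available memory. A hypothetical helper encoding the guide above (thresholds taken from the memory column of the models table):

```python
def pick_model_size(available_gb: float) -> str:
    """Map available memory (GB) to a model size per the guide above."""
    if available_gb >= 16:
        return "13B"  # highest quality, 16-24GB
    if available_gb >= 8:
        return "7B"   # balanced, 8-12GB
    return "3B"       # fast and light, 4-6GB

print(pick_model_size(12))  # -> "7B"
```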
## πŸš€ Deployment Options
### Local Development
- **GGUF + Ollama:** Easiest setup, good performance
- **PyTorch + GPU:** Best quality, requires CUDA
### Production Deployment
- **Docker + llama.cpp:** Scalable, CPU/GPU support
- **Kubernetes:** Auto-scaling, load balancing
- **Cloud APIs:** Serverless, pay-per-use
### Memory Requirements
| Model | GGUF Q4_K_M | GGUF Q8_0 | PyTorch FP16 |
|-------|-------------|-----------|--------------|
| 3B | 2.3GB | 3.2GB | 6GB |
| 7B | 4.1GB | 7.2GB | 14GB |
| 13B | 7.8GB | 13.8GB | 26GB |
## πŸ“š Model Sources
- **Repository:** https://github.com/anysecret-io/anysecret-lib
- **Documentation:** https://docs.anysecret.io
- **Training Code:** https://github.com/anysecret-io/anysecret-llm
- **Website:** https://anysecret.io
## πŸ” Framework Versions
- **PEFT:** 0.17.1+
- **Transformers:** 4.35.0+
- **PyTorch:** 2.0.0+
- **llama.cpp:** Latest
- **Ollama:** 0.1.0+
## πŸ“Š Performance Benchmarks
| Model | Tokens/sec | Quality Score | Memory (GGUF Q4) |
|-------|------------|---------------|------------------|
| 3B | ~45 | 7.2/10 | 2.3GB |
| 7B | ~25 | 8.5/10 | 4.1GB |
| 13B | ~15 | 9.1/10 | 7.8GB |
*Benchmarks run on RTX 3090 with GGUF Q4_K_M quantization*
## βš–οΈ License
MIT License - See individual model folders for specific license details.
---
For support, visit our [GitHub Issues](https://github.com/anysecret-io/anysecret-lib/issues) or [Documentation](https://docs.anysecret.io).