--- base_model: - meta-llama/Llama-3.2-3B-Instruct - codellama/CodeLlama-7b-Instruct-hf - codellama/CodeLlama-13b-Instruct-hf library_name: peft pipeline_tag: text-generation tags: - lora - transformers - configuration-management - secrets-management - devops - multi-cloud - gguf - anysecret license: mit language: - en --- # AnySecret Assistant - Multi-Model Collection A specialized AI assistant collection for AnySecret configuration management, available in multiple sizes and formats optimized for different use cases and deployment scenarios. ## 🚀 Available Models | Model | Base Model | Parameters | Format | Best For | Memory | |-------|------------|------------|--------|----------|--------| | **3B** | Llama-3.2-3B-Instruct | 3B | PyTorch/GGUF | Fast responses, edge deployment | 4-6GB | | **7B** | CodeLlama-7B-Instruct | 7B | PyTorch/GGUF | Balanced performance, code focus | 8-12GB | | **13B** | CodeLlama-13B-Instruct | 13B | PyTorch/GGUF | Highest quality, complex queries | 16-24GB | ### Model Variants #### PyTorch Models (LoRA Adapters) - `anysecret-io/anysecret-assistant/3B/` - Llama-3.2-3B base - `anysecret-io/anysecret-assistant/7B/` - CodeLlama-7B base - `anysecret-io/anysecret-assistant/13B/` - CodeLlama-13B base #### GGUF Models (Quantized) - `anysecret-io/anysecret-assistant/3B-GGUF/` - Q4_K_M, Q8_0 formats - `anysecret-io/anysecret-assistant/7B-GGUF/` - Q4_K_M, Q8_0 formats - `anysecret-io/anysecret-assistant/13B-GGUF/` - Q4_K_M, Q8_0 formats ## 🎯 Model Description These models are fine-tuned specifically to assist with AnySecret configuration management across AWS, GCP, Azure, and Kubernetes environments. Each model can help with CLI commands, configuration setup, CI/CD integration, and Python SDK usage. - **Developed by:** anysecret-io - **Model type:** Causal Language Model (LoRA Adapters + GGUF) - **Language(s):** English - **License:** MIT - **Specialized for:** Multi-cloud secrets and configuration management ## 📦 Quick Start ### Option 1: Using Ollama (Recommended for GGUF) ```bash # 7B model (balanced performance) ollama pull anysecret-io/anysecret-assistant/7B-GGUF ollama run anysecret-io/anysecret-assistant/7B-GGUF # 13B model (best quality) ollama pull anysecret-io/anysecret-assistant/13B-GGUF ollama run anysecret-io/anysecret-assistant/13B-GGUF ``` ### Option 2: Using Transformers (PyTorch) ```python from peft import PeftModel from transformers import AutoModelForCausalLM, AutoTokenizer import torch # Choose your model size (3B/7B/13B) model_size = "7B" # or "3B", "13B" base_models = { "3B": "meta-llama/Llama-3.2-3B-Instruct", "7B": "codellama/CodeLlama-7b-Instruct-hf", "13B": "codellama/CodeLlama-13b-Instruct-hf" } base_model_name = base_models[model_size] adapter_path = f"anysecret-io/anysecret-assistant/{model_size}" # Load model base_model = AutoModelForCausalLM.from_pretrained( base_model_name, torch_dtype=torch.float16, device_map="auto" ) model = PeftModel.from_pretrained(base_model, adapter_path) tokenizer = AutoTokenizer.from_pretrained(base_model_name) # Generate response def ask_anysecret(question): prompt = f"### Instruction:\n{question}\n\n### Response:\n" inputs = tokenizer(prompt, return_tensors="pt") outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1) response = tokenizer.decode(outputs[0], skip_special_tokens=True) return response.split("### Response:\n")[-1].strip() # Example usage print(ask_anysecret("How do I configure AnySecret for AWS?")) ``` ### Option 3: Using llama.cpp (GGUF) ```bash # Download GGUF model wget https://huggingface.co/anysecret-io/anysecret-assistant/resolve/main/7B-GGUF/anysecret-7b-q4_k_m.gguf # Run with llama.cpp ./llama-server -m anysecret-7b-q4_k_m.gguf --port 8080 ``` ## 🎯 Use Cases ### Direct Use All models are designed to provide expert assistance with: - **AnySecret CLI** - Commands, usage patterns, troubleshooting - **Multi-cloud Configuration** - AWS Secrets Manager, GCP Secret Manager, Azure Key Vault - **Kubernetes Integration** - Secrets, ConfigMaps, operators - **CI/CD Pipelines** - GitHub Actions, Jenkins, GitLab CI - **Python SDK** - Implementation guidance, best practices - **Security Patterns** - Secret rotation, access controls, compliance ### Example Queries ``` "How do I set up AnySecret with AWS Secrets Manager?" "Show me how to use anysecret in a GitHub Actions workflow" "How do I rotate secrets across multiple cloud providers?" "What's the difference between storing secrets vs parameters?" "How do I configure AnySecret for a Kubernetes deployment?" ``` ## 🏗️ Training Details ### Training Data Models were trained on **150+ curated examples** across 7 categories: - **CLI Commands** (25 examples) - Command usage and patterns - **AWS Configuration** (25 examples) - Secrets Manager integration - **GCP Configuration** (25 examples) - Secret Manager setup - **Azure Configuration** (25 examples) - Key Vault integration - **Kubernetes** (25 examples) - Secrets and ConfigMaps - **CI/CD Integration** (15 examples) - Pipeline workflows - **Python Integration** (10 examples) - SDK usage patterns ### Training Configuration #### Hyperparameters - **LoRA Rank:** 16 - **LoRA Alpha:** 32 - **Learning Rate:** 2e-4 - **Batch Size:** 1 (with gradient accumulation) - **Epochs:** 2-3 - **Precision:** fp16 mixed precision with 4-bit quantization #### Target Modules - **Llama-3.2 (3B):** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj - **CodeLlama (7B/13B):** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj ## 🔧 Model Selection Guide ### Choose 3B if you need: - ✅ Fast inference (< 1 second) - ✅ Low memory usage (4-6GB) - ✅ Edge deployment - ✅ Basic AnySecret queries ### Choose 7B if you need: - ✅ Balanced performance/speed - ✅ Better code understanding - ✅ Moderate memory (8-12GB) - ✅ Complex configuration queries ### Choose 13B if you need: - ✅ Highest quality responses - ✅ Complex multi-step guidance - ✅ Advanced troubleshooting - ✅ Production deployments ## 🚀 Deployment Options ### Local Development - **GGUF + Ollama:** Easiest setup, good performance - **PyTorch + GPU:** Best quality, requires CUDA ### Production Deployment - **Docker + llama.cpp:** Scalable, CPU/GPU support - **Kubernetes:** Auto-scaling, load balancing - **Cloud APIs:** Serverless, pay-per-use ### Memory Requirements | Model | GGUF Q4_K_M | GGUF Q8_0 | PyTorch FP16 | |-------|-------------|-----------|--------------| | 3B | 2.3GB | 3.2GB | 6GB | | 7B | 4.1GB | 7.2GB | 14GB | | 13B | 7.8GB | 13.8GB | 26GB | ## 📚 Model Sources - **Repository:** https://github.com/anysecret-io/anysecret-lib - **Documentation:** https://docs.anysecret.io - **Training Code:** https://github.com/anysecret-io/anysecret-llm - **Website:** https://anysecret.io ## 🔍 Framework Versions - **PEFT:** 0.17.1+ - **Transformers:** 4.35.0+ - **PyTorch:** 2.0.0+ - **llama.cpp:** Latest - **Ollama:** 0.1.0+ ## 📊 Performance Benchmarks | Model | Tokens/sec | Quality Score | Memory (GGUF Q4) | |-------|------------|---------------|------------------| | 3B | ~45 | 7.2/10 | 2.3GB | | 7B | ~25 | 8.5/10 | 4.1GB | | 13B | ~15 | 9.1/10 | 7.8GB | *Benchmarks run on RTX 3090 with GGUF Q4_K_M quantization* ## ⚖️ License MIT License - See individual model folders for specific license details. --- For support, visit our [GitHub Issues](https://github.com/anysecret-io/anysecret-lib/issues) or [Documentation](https://docs.anysecret.io).