Model Card for SicMundus

Model Details

Model Description

This model, SicMundus, is a fine-tuned version of unsloth/Llama-3.2-1B-Instruct, trained with Parameter-Efficient Fine-Tuning (PEFT) using LoRA (Low-Rank Adaptation). It was trained on the Open-Platypus dataset with a structured Alpaca-style prompt format. The primary goal is to enhance instruction-following capabilities while keeping training efficient through 4-bit quantization.

  • Developed by: Ragul
  • Funded by: Self-funded
  • Organization: Pinnacle Organization
  • Shared by: Ragul
  • Model type: Instruction-tuned Language Model
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Finetuned from model: unsloth/Llama-3.2-1B-Instruct

Model Sources

  • Repository: https://huggingface.co/ragul2607/SicMundus

Uses

Direct Use

  • General-purpose instruction-following tasks
  • Text generation
  • Code generation assistance
  • Conversational AI applications

Downstream Use

  • Further fine-tuning on domain-specific datasets (see the adapter-loading sketch after this list)
  • Deployment in chatbot applications
  • Text summarization or document completion
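
Because the LoRA adapters are stored separately (see Speeds, Sizes, Times below), they can be re-attached to the base model for continued training with peft. A minimal sketch, where "path/to/SicMundus-adapters" is a placeholder for the actual adapter location:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model the adapters were trained against.
base = AutoModelForCausalLM.from_pretrained("unsloth/Llama-3.2-1B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("unsloth/Llama-3.2-1B-Instruct")

# Attach the LoRA adapters; is_trainable=True keeps the adapter weights
# updatable so training can continue on a new dataset.
model = PeftModel.from_pretrained(base, "path/to/SicMundus-adapters", is_trainable=True)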

Out-of-Scope Use

  • Not intended for high-stakes or safety-critical applications (e.g., medical or legal advice)
  • May not be suitable for handling highly sensitive data

Bias, Risks, and Limitations

While the model is designed to be a general-purpose assistant, it inherits biases from the pre-trained Llama model and the Open-Platypus dataset. Users should be aware of potential biases in generated responses, particularly regarding sensitive topics.

Recommendations

  • Use in conjunction with human oversight.
  • Avoid deploying in high-stakes scenarios without additional testing.

How to Get Started with the Model

To use the fine-tuned model, follow these steps:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model; device_map="auto" places the weights
# on the available GPU(s) automatically.
model_path = "path/to/SicMundus"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

def generate_response(prompt):
    # Move inputs to whichever device the model was placed on,
    # rather than assuming "cuda" is available.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(output[0], skip_special_tokens=True)

prompt = "Explain the concept of reinforcement learning."
print(generate_response(prompt))
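
The snippet above loads the model in float16. Since the model targets efficiency through 4-bit quantization, it can also be loaded in 4-bit via bitsandbytes. The exact quantization settings used during training are not recorded in this card, so the values below are assumptions:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Assumed NF4 settings; adjust to match your hardware and memory budget.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_4bit = AutoModelForCausalLM.from_pretrained(
    "path/to/SicMundus",
    quantization_config=bnb_config,
    device_map="auto",
)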

Training Details

Training Data

  • Dataset: garage-bAInd/Open-Platypus
  • Preprocessing: The dataset was formatted into Alpaca-style prompts with instruction, input, and output fields (a formatting sketch follows below).
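
The exact template used in training is not reproduced in this card; the sketch below shows the standard Alpaca-style formatting for records with instruction, input, and output fields, which is one common way to prepare Open-Platypus:

def format_alpaca(example):
    # Records with an input field get the three-part template;
    # records without one get the shorter two-part template.
    if example.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )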

Training Procedure

  • Training Framework: Hugging Face transformers + trl (PEFT + LoRA)
  • Precision: Mixed precision (FP16/BF16 based on hardware support)
  • Batch size: 2 per device with gradient accumulation
  • Learning rate: 2e-4
  • Max Steps: 100
  • Optimizer: AdamW 8-bit
  • LoRA Config: Applied to the key attention projection layers (q_proj, k_proj, v_proj, etc.); a configuration sketch follows below.
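
A training sketch using trl and peft that mirrors the hyperparameters listed above. The LoRA rank, alpha, and gradient-accumulation values are assumptions, since the card does not record them; format_alpaca is the function sketched under Training Data:

from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

model = AutoModelForCausalLM.from_pretrained("unsloth/Llama-3.2-1B-Instruct")

# Alpaca-format each record into a single "text" field.
dataset = load_dataset("garage-bAInd/Open-Platypus", split="train")
dataset = dataset.map(lambda ex: {"text": format_alpaca(ex)})

# Assumed rank/alpha; the target modules follow the list above.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,  # assumed; the card only says "with gradient accumulation"
    learning_rate=2e-4,
    max_steps=100,
    optim="adamw_8bit",
    dataset_text_field="text",
    output_dir="outputs",
)

trainer = SFTTrainer(model=model, train_dataset=dataset, peft_config=peft_config, args=args)
trainer.train()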

Speeds, Sizes, Times

  • Checkpoint Size: ~2GB (LoRA adapters stored separately)
  • Fine-tuning Time: ~1 hour on an A100 GPU

Evaluation

Testing Data, Factors & Metrics

  • Testing Data: A subset of Open-Platypus
  • Factors: Performance on general instruction-following tasks
  • Metrics:
    • Perplexity (PPL), computed as sketched below
    • Response Coherence
    • Instruction-following accuracy
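
For reference, perplexity can be computed from the mean token-level cross-entropy on held-out text. A minimal sketch, assuming texts is a list of held-out strings and model/tokenizer are loaded as in the getting-started example:

import math
import torch

@torch.no_grad()
def perplexity(model, tokenizer, texts):
    losses, tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt").to(model.device)
        # With labels == input_ids, the model returns the mean
        # next-token cross-entropy over the sequence.
        out = model(**enc, labels=enc["input_ids"])
        n = enc["input_ids"].numel()
        losses += out.loss.item() * n
        tokens += n
    return math.exp(losses / tokens)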

Results

  • Perplexity: TBD
  • Response Quality: Qualitatively improved over the base model on test prompts

Model Examination

  • Interpretability: Standard transformer-based behavior with LoRA fine-tuning.
  • Explainability: Outputs can be analyzed with attention visualization tools (see the sketch below).
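
For example, attention weights can be pulled from a forward pass for inspection, reusing the model and tokenizer from the getting-started example. Depending on the installed transformers version, this may require loading the model with attn_implementation="eager":

import torch

inputs = tokenizer("Explain reinforcement learning.", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions holds one tensor per layer,
# each shaped (batch, num_heads, seq_len, seq_len).
print(len(out.attentions), out.attentions[0].shape)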

Environmental Impact

  • Hardware Type: A100 GPU
  • Hours used: ~1 hour
  • Cloud Provider: Local GPU or AWS
  • Carbon Emitted: Estimated using the Machine Learning Impact calculator

Technical Specifications

Model Architecture and Objective

  • Transformer-based architecture (Llama-3.2-1B)
  • Instruction-following optimization with PEFT-LoRA

Compute Infrastructure

  • Hardware: NVIDIA A100 GPU
  • Software: Python, PyTorch, transformers, unsloth, peft

Citation

If you use this model, please cite:

@misc{SicMundus,
  author = {Ragul},
  title = {SicMundus: Fine-Tuned Llama-3.2-1B-Instruct},
  year = {2025},
  url = {https://huggingface.co/ragul2607/SicMundus}
}

Model Card Authors

  • Ragul