---
license: apache-2.0
datasets:
- Josephgflowers/Finance-Instruct-500k
language:
- en
base_model:
- tarun7r/Finance-Llama-8B
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
- finance
- economics
---

# Model Card for Finance-Llama-8B-q4_k_m-GGUF

This model is a quantized version of `tarun7r/Finance-Llama-8B`, which was fine-tuned on the `Josephgflowers/Finance-Instruct-500k` dataset. It is designed for financial tasks, reasoning, and multi-turn conversations.

## Key Features

* **Extensive Coverage:** Trained on over 500,000 entries spanning financial QA, reasoning, sentiment analysis, topic classification, multilingual NER, and conversational AI. 📚
* **Multi-Turn Conversations:** Capable of rich dialogues that emphasize contextual understanding and reasoning.
* **Diverse Data Sources:** Includes entries from Cinder, Sujet-Finance-Instruct-177k, Phinance Dataset, BAAI/IndustryInstruction_Finance-Economics, Josephgflowers/Financial-NER-NLP, and many other high-quality datasets.
* **Financial Specialization:** Tailored for financial reasoning, question answering, entity recognition, sentiment analysis, and more.

## Dataset Details 💾

### Finance-Instruct-500k Dataset

**Overview**

Finance-Instruct-500k is a comprehensive, meticulously curated dataset designed to train advanced language models for financial tasks, reasoning, and multi-turn conversations. Combining data from numerous high-quality financial datasets, this corpus provides over 500,000 entries, offering unparalleled depth and versatility for finance-related instruction tuning and fine-tuning.

The dataset includes content tailored for financial reasoning, question answering, entity recognition, sentiment analysis, address parsing, and multilingual natural language processing (NLP). Its diverse, deduplicated entries make it suitable for a wide range of financial AI applications, including domain-specific assistants, conversational agents, and information extraction systems.

**Key Features of the Dataset**

* **Extensive Coverage:** Over 500,000 entries spanning financial QA, reasoning, sentiment analysis, topic classification, multilingual NER, and conversational AI. 🌍
* **Multi-Turn Conversations:** Rich dialogues emphasizing contextual understanding and reasoning. 🗣️
* **Diverse Data Sources:** Includes entries from Cinder, Sujet-Finance-Instruct-177k, Phinance Dataset, BAAI/IndustryInstruction_Finance-Economics, Josephgflowers/Financial-NER-NLP, and many other high-quality datasets. 📖

## Ollama

You can also use this model with Ollama. Pre-built GGUF versions (FP16 and Q4_K_M) are available at [ollama.com/martain7r/finance-llama-8b](https://ollama.com/martain7r/finance-llama-8b).

To run the FP16 version:

```bash
ollama run martain7r/finance-llama-8b:fp16
```

To run the Q4_K_M quantized version (smaller and faster, with a slight trade-off in quality):

```bash
ollama run martain7r/finance-llama-8b:q4_k_m
```
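Once the model has been pulled, you can also query it programmatically through Ollama's local REST API. The snippet below is a minimal sketch, assuming a default Ollama installation listening on `localhost:11434` and the `requests` package installed; the multi-turn conversation content is purely illustrative.

```python
# Minimal sketch: multi-turn chat against a locally running Ollama server.
# Assumes the model was pulled with `ollama run martain7r/finance-llama-8b:q4_k_m`
# and that the server is listening on the default port 11434.
import requests

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "martain7r/finance-llama-8b:q4_k_m",
        "messages": [
            {"role": "user", "content": "What is dollar-cost averaging?"},
            {"role": "assistant", "content": "Dollar-cost averaging means investing a fixed amount at regular intervals, regardless of price."},
            {"role": "user", "content": "When does it underperform lump-sum investing?"},  # follow-up turn
        ],
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
print(response.json()["message"]["content"])
```

Because the full `messages` history is sent on every call, each request carries the context needed for the model's multi-turn reasoning.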
## Running the GGUF with llama-cpp-python

To run the Finance-Llama-8B-q4_k_m-GGUF quantized model locally, use llama.cpp via the `llama-cpp-python` library instead of Hugging Face Transformers, following the steps below.

1. Install the required libraries:

```bash
pip install llama-cpp-python huggingface-hub
```

2. Download the GGUF model file from the Hugging Face Hub:

```python
from huggingface_hub import hf_hub_download

model_name = "tarun7r/Finance-Llama-8B-q4_k_m-GGUF"  # Check for the correct repository
model_file = "Finance-Llama-8B-GGUF-q4_K_M.gguf"     # Exact GGUF filename

model_path = hf_hub_download(
    repo_id=model_name,
    filename=model_file,
    local_dir="./models"
)
```

3. Run the quantized model:

```python
from llama_cpp import Llama

# Initialize the model
llm = Llama(
    model_path=model_path,
    n_ctx=8192,        # Context window size
    n_threads=8,       # CPU threads for inference
    n_gpu_layers=-1,   # Offload all layers to the GPU
    verbose=False      # Disable verbose logging
)

# Define the prompt template
finance_prompt_template = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

# Format the prompt
system_message = "You are a highly knowledgeable finance chatbot. Your purpose is to provide accurate, insightful, and actionable financial advice."
user_question = "What strategies can an individual investor use to diversify their portfolio effectively in a volatile market?"

prompt = finance_prompt_template.format(
    instruction=system_message,
    input=user_question
)

# Generate a response
output = llm(
    prompt,
    max_tokens=2500,   # Limit response length
    temperature=0.7,   # Creativity control
    top_p=0.9,         # Nucleus sampling
    echo=False,        # Return only the completion (not the prompt)
    stop=["###"]       # Stop at "###" to avoid extra text
)

# Extract and print the response
response = output["choices"][0]["text"].strip()
print("\n--- Response ---")
print(response)
```

**Citation 📌**

```bibtex
@misc{finance-llama-8b,
  author       = {tarun7r},
  title        = {tarun7r/Finance-Llama-8B: A Llama 3.1 8B Model Fine-tuned on Josephgflowers/Finance-Instruct-500k},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/tarun7r/Finance-Llama-8B}}
}
```
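**Multi-Turn Chat Sketch 💬**

Since the model is trained for multi-turn dialogue, `llama-cpp-python` can also drive it through its higher-level chat API instead of the raw prompt template shown above. The following is a non-authoritative sketch: it assumes the GGUF file embeds a usable chat template (if it does not, a `chat_format` such as `"llama-3"` can be passed when constructing `Llama`), and the conversation content is hypothetical.

```python
from llama_cpp import Llama

# Reuse the downloaded GGUF; chat_format is a fallback in case the file
# lacks an embedded chat template (assumption -- verify for this repo).
llm = Llama(
    model_path="./models/Finance-Llama-8B-GGUF-q4_K_M.gguf",
    n_ctx=8192,
    chat_format="llama-3",
)

messages = [
    {"role": "system", "content": "You are a highly knowledgeable finance chatbot."},
    {"role": "user", "content": "What is the difference between an ETF and a mutual fund?"},
]

# First turn
reply = llm.create_chat_completion(messages=messages, max_tokens=512, temperature=0.7)
answer = reply["choices"][0]["message"]["content"]
print(answer)

# Second turn: append the assistant's answer, then ask a follow-up
messages.append({"role": "assistant", "content": answer})
messages.append({"role": "user", "content": "Which is more tax-efficient, and why?"})
reply = llm.create_chat_completion(messages=messages, max_tokens=512, temperature=0.7)
print(reply["choices"][0]["message"]["content"])
```

As with the Ollama API, the full message history is passed on each call, which is what lets the model ground its follow-up answers in earlier turns.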