Phinance-Phi-3.5-mini-instruct-finance-v0.3

image/png

Overview

Phinance-Phi-3.5-mini-instruct-finance-v0.3 is a fine-tuned mini language model built specifically for financial tasks, reasoning, and multi-turn conversations. This version improves upon v0.2 by leveraging additional curated datasets and incorporating enhancements to better align with real-world Retrieval-Augmented Generation (RAG) workflows. It offers superior instruction-following capabilities and financial expertise while maintaining a lightweight architecture.

Key Updates in v0.3:

  • Updated RAG Formatting: Retrieved context is now included at the start of the user field, aligning with widely used practices in RAG workflows.
  • Expanded Dataset: Trained on the updated Finance-Instruct-500k dataset, incorporating broader multilingual and financial tagging examples.
  • Improved Instruction Tuning: Enhanced handling of multi-turn conversations and context retention for financial reasoning tasks.
  • Structured Output in JSON Format: Most NER and parsing tasks prompt the model to return structured JSON output, enabling seamless extraction of structured data from unstructured input.

Key Features

  • Finance-Focused Reasoning: Handles tasks like portfolio analysis, market trends, and financial question answering.
  • Instruction Following: Tailored for fine-grained instruction-based tasks within the financial domain.
  • Multi-Turn Conversations: Optimized for context-aware dialogue, supporting long interactions on financial topics.
  • RAG-Compatible: Prepares retrieved context at the beginning of the user field, improving integration with RAG systems.
  • Lightweight Architecture: Efficient performance on resource-constrained systems while maintaining robust output quality.
  • JSON Structured Output: Excels in returning structured JSON data for parsing and NER tasks.

Training Data

The model was fine-tuned on the Finance-Instruct-500k dataset, a diverse and meticulously curated financial corpus. The dataset features multi-turn conversations and instruction-tuning examples formatted for modern RAG workflows.

Dataset Highlights

  • Topics: Market trends, investment strategies, financial analysis, and more.
  • Format: Conversations structured as system, user, assistant, with retrieved context prepended to the user field for RAG use cases.
  • Filtering: High-quality financial content curated through advanced methods.
  • NER and Parsing Tasks: Prompts often structured to encourage JSON-formatted outputs, aiding structured data extraction.

Supported Tasks

  1. Financial Question Answering: Address complex queries about markets, terminology, and strategies.
  2. Multi-Turn Conversations: Engage in coherent, context-rich dialogues.
  3. Instruction Following: Execute finance-specific prompts with precision.
  4. RAG Applications: Seamlessly integrate external data for enhanced responses.
  5. NER and Parsing: Extract structured JSON data from unstructured financial inputs.
  6. Lightweight Financial Assistant: Serve as an efficient domain expert for finance-related tasks.

Usage

This model is ideal for:

  • Financial advisory tools and assistants
  • Chatbots for customer interactions
  • Financial QA systems
  • Lightweight, domain-specific applications

Example Code

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Josephgflowers/Phinance-Phi-3.5-mini-instruct-finance-v0.3"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage
inputs = tokenizer("System: You are a financial assistant.\nUser: What is the difference between stocks and bonds?", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Limitations

  • Niche Knowledge: Best suited for financial topics; may underperform on general-purpose tasks.
  • Bias: Data filtering could introduce biases toward specific financial sectors.
  • Validation Needed: Outputs should be verified for critical use cases.

Model Details

  • Base Model: phi-3.5-mini
  • Fine-Tuned Dataset: Finance-Instruct-500k
  • Version: v0.3
  • Parameters: Mini-sized architecture for efficient performance
  • Training Framework: Hugging Face Transformers

License

This model is released under the Apache 2.0 license.


Citation

If you use this model, please cite:

@model{josephgflowers2025phinance,
  title={Phinance-Phi-3.5-mini-instruct-finance-v0.3},
  author={Joseph G. Flowers},
  year={2025},
  url={https://huggingface.co/Josephgflowers/Phinance-Phi-3.5-mini-instruct-finance-v0.3}
}
Downloads last month
46
Safetensors
Model size
3.82B params
Tensor type
FP16
·
Inference API
Unable to determine this model's library. Check the docs .

Model tree for Josephgflowers/Phinance-Phi-3.5-mini-instruct-finance-v0.3

Datasets used to train Josephgflowers/Phinance-Phi-3.5-mini-instruct-finance-v0.3