Qwen2.5-Coder-32B-Glaive-ToolCall

Model Description

This model is a fine-tuned version of Qwen/Qwen2.5-Coder-32B-Instruct, enhanced specifically for tool calling. It was trained on the Glaive Function Calling v2 dataset (glaiveai/glaive-function-calling-v2) to improve its ability to understand and generate function calls across a range of programming and automation contexts.

Model Details

  • Base Model: Qwen/Qwen2.5-Coder-32B-Instruct
  • Model Type: Large Language Model (LLM) with enhanced tool calling capabilities
  • Architecture: Transformer-based decoder model
  • Parameters: 32 billion parameters
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Dataset: glaive-function-calling-v2
  • Language Support: Multilingual

Training Configuration

  • Fine-tuning Type: LoRA with rank 8, alpha 16 (an equivalent configuration sketch follows this list)
  • Training Epochs: 3.0
  • Learning Rate: 5e-5 with cosine scheduler
  • Batch Size: 2 per device with 8 gradient accumulation steps
  • Context Length: 2048 tokens (training sequence length; the base model supports longer contexts at inference)
  • Optimizer: AdamW
  • Precision: BF16
  • Max Samples: 100,000
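
For reference, the hyperparameters above correspond roughly to the following PEFT / transformers setup. This is an illustrative reconstruction rather than the actual LLaMA-Factory configuration used for training; the target modules and dropout value in particular are assumptions.

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter matching the card's rank/alpha; target_modules and
# lora_dropout are assumed, not taken from the original run
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,  # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

# Optimization settings as listed above
training_args = TrainingArguments(
    output_dir="qwen25-coder-glaive-toolcall",
    num_train_epochs=3.0,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    bf16=True,
    optim="adamw_torch",
)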

Enhanced Capabilities

Tool Calling Improvements

This model demonstrates significant improvements in:

  1. Function Schema Understanding: Enhanced ability to parse and understand complex function signatures and parameter requirements
  2. Context-Aware Tool Selection: Improved decision-making for selecting appropriate tools based on user queries
  3. Parameter Extraction: Better extraction and formatting of function parameters from natural language inputs
  4. Multi-step Tool Orchestration: Enhanced capability to chain multiple tool calls for complex tasks
  5. Error Handling: Improved error detection and recovery in tool calling scenarios

Key Features

  • Robust JSON Generation: Produces well-formatted JSON for function calls with proper schema adherence (see the example after this list)
  • Natural Language Integration: Seamlessly integrates tool calls within conversational responses
  • Code Generation with Tools: Enhanced ability to generate code that incorporates external tool usage
  • API Integration: Improved understanding of REST APIs, GraphQL, and other web service interfaces
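
For example, given an OpenAI-style function schema such as the following (this get_weather schema is purely illustrative, not a built-in tool):

{
  "name": "get_weather",
  "description": "Get the current weather for a location",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {"type": "string", "description": "City name"},
      "units": {"type": "string", "enum": ["metric", "imperial"], "default": "metric"}
    },
    "required": ["location"]
  }
}

the model is expected to emit a call of the form:

{"name": "get_weather", "arguments": {"location": "New York City", "units": "metric"}}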

Use Cases

This model is particularly well-suited for:

  • AI Assistants: Building conversational AI that can interact with external systems
  • Automation Workflows: Creating intelligent automation scripts with dynamic tool usage
  • Code Generation: Generating code that integrates with APIs and external services
  • Data Processing: Automating data analysis and processing tasks with appropriate tools
  • System Integration: Building bridges between different software systems and services

Usage Example

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "RekklesAI/Qwen2.5-Coder-32B-Glaive-ToolCall"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example prompt for tool calling: the system message declares the available
# tool, and the user question should trigger a call to it
messages = [
    {
        "role": "system",
        "content": (
            "You have access to a weather API. Help the user get the current weather.\n\n"
            "Available tools:\n"
            '- get_weather(location: str, units: str = "metric") -> dict'
        ),
    },
    {"role": "user", "content": "What's the weather like in New York City?"},
]

# Apply the Qwen2.5 chat template and move the inputs to the model's device
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate response
with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)
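
Depending on the prompt, the generated text may contain the tool call either as a bare JSON object or wrapped in a <functioncall> tag as in the Glaive training data. Below is a best-effort extraction sketch; the tag handling and parsing fallbacks are assumptions about the output format, not a guaranteed contract:

import ast
import json
import re

def extract_tool_call(text: str):
    """Best-effort extraction of a tool call from generated text."""
    candidates = []
    # Prefer an explicitly tagged call (the Glaive convention), if present
    tagged = re.search(r"<functioncall>(.*?)(?:</functioncall>|$)", text, re.DOTALL)
    if tagged:
        candidates.append(tagged.group(1).strip())
    # Fall back to the widest brace-delimited span in the text
    braced = re.search(r"\{.*\}", text, re.DOTALL)
    if braced:
        candidates.append(braced.group(0))
    for candidate in candidates:
        # Glaive-style calls sometimes wrap "arguments" in single quotes,
        # which parses as a Python literal but not as JSON
        for parse in (json.loads, ast.literal_eval):
            try:
                parsed = parse(candidate)
            except (ValueError, SyntaxError):
                continue
            if isinstance(parsed, dict):
                return parsed
    return None

call = extract_tool_call(response)
if call is not None:
    print("Tool:", call.get("name"), "| Arguments:", call.get("arguments"))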

Performance Metrics

Formal benchmark scores are not reported for this fine-tune. Qualitatively, the training targets improvements in the following areas:

  • Function Call Accuracy: Enhanced precision in generating syntactically correct function calls
  • Parameter Extraction: Improved accuracy in extracting relevant parameters from user queries
  • Tool Selection: Better performance in selecting appropriate tools for given tasks
  • JSON Formatting: Reduced errors in JSON structure and formatting

Training Loss

The following chart shows the training loss progression during the fine-tuning process:

[Figure: training loss curve]

Training loss curve demonstrating stable convergence over 3 epochs with the Glaive Function Calling v2 dataset.

Limitations

  • The model's tool calling capabilities are primarily trained on the patterns present in the Glaive Function Calling v2 dataset
  • Performance may vary for highly specialized or domain-specific tools not represented in the training data
  • Like all LLMs, the model may occasionally generate plausible-sounding but incorrect tool calls
  • The model requires careful prompt engineering for optimal tool calling performance

Ethical Considerations

  • Tool Safety: Users should implement proper validation and sandboxing when allowing the model to execute actual tool calls (a validation sketch follows this list)
  • Access Control: Implement appropriate access controls and permissions for tools accessible to the model
  • Data Privacy: Be mindful of sensitive data that might be passed through tool calls
  • Monitoring: Implement logging and monitoring for tool usage in production environments
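
As a starting point for the Tool Safety point above, here is a minimal allowlist-plus-schema validation sketch; the jsonschema dependency, the allowlist contents, and the stub implementation are all illustrative assumptions:

import json
from jsonschema import ValidationError, validate  # pip install jsonschema

# Allowlist mapping tool names to a parameter schema and an implementation;
# the get_weather entry is a stub for illustration only
ALLOWED_TOOLS = {
    "get_weather": {
        "schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "units": {"type": "string", "enum": ["metric", "imperial"]},
            },
            "required": ["location"],
            "additionalProperties": False,
        },
        "fn": lambda location, units="metric": {"location": location, "temp": 21},
    },
}

def safe_execute(call: dict):
    """Validate a model-proposed tool call before running anything."""
    name = call.get("name")
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"Tool {name!r} is not on the allowlist")
    args = call.get("arguments", {})
    if isinstance(args, str):  # Glaive-style string-encoded arguments
        args = json.loads(args)
    try:
        validate(instance=args, schema=ALLOWED_TOOLS[name]["schema"])
    except ValidationError as exc:
        raise ValueError(f"Arguments rejected: {exc.message}") from exc
    return ALLOWED_TOOLS[name]["fn"](**args)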

Training Data

The model was fine-tuned using the Glaive Function Calling v2 dataset (glaiveai/glaive-function-calling-v2), a comprehensive and high-quality dataset specifically designed for training language models in function calling capabilities.

Dataset Overview

  • Dataset Size: 113,000 training examples
  • Format: JSON with structured conversations
  • Language: English
  • License: Apache 2.0
  • Source: Glaive AI

Dataset Characteristics

The Glaive Function Calling v2 dataset is meticulously curated to provide diverse and realistic function calling scenarios:

Conversation Structure

  • System Messages: Define the assistant's role and available functions with detailed schemas
  • Multi-turn Dialogues: Natural conversations between users and AI assistants
  • Function Calls: Properly formatted JSON function invocations (illustrated in the example after this list)
  • Function Responses: Realistic API responses and result handling
  • Error Scenarios: Examples of graceful error handling and capability limitations
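
An abridged example in the style of the dataset's raw format (reconstructed for illustration: the function, values, and elided schema are invented, not taken from the dataset):

SYSTEM: You are a helpful assistant with access to the following functions. Use them if required -
{"name": "get_stock_price", "description": "Get the current stock price", "parameters": {...}}
USER: Can you tell me the current price of AAPL?
ASSISTANT: <functioncall> {"name": "get_stock_price", "arguments": '{"symbol": "AAPL"}'}
FUNCTION RESPONSE: {"price": 178.25}
ASSISTANT: Apple (AAPL) is currently trading at $178.25.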

Function Diversity

The dataset covers a wide range of function types and use cases:

  • Utility Functions: Email sending, calendar management, password generation
  • Data Retrieval: News headlines, stock prices, weather information
  • Computational Tasks: Mathematical calculations, unit conversions, data analysis
  • Search Operations: Movie searches, book lookups, general information retrieval
  • Communication Tools: Contact management, messaging systems
  • Financial Services: Exchange rates, loan calculations, investment data
  • Content Creation: Text generation, formatting, summarization

Quality Features

  1. Realistic Scenarios: Conversations mirror real-world user interactions with AI assistants
  2. Proper Error Handling: Examples of polite refusals when functions are unavailable
  3. Parameter Validation: Correct handling of required and optional function parameters
  4. Context Awareness: Functions are called appropriately based on conversation context
  5. Natural Language Integration: Seamless integration of function results into conversational responses

Training Examples Include:

  • Single Function Calls: Simple, direct function invocations
  • Multi-step Workflows: Complex scenarios requiring multiple function calls
  • Parameter Extraction: Converting natural language requests into structured function parameters
  • Response Formatting: Presenting function results in user-friendly formats
  • Capability Boundaries: Clear communication of system limitations

Dataset Impact on Model Performance

This carefully curated dataset enables the model to:

  • Understand Function Schemas: Parse and comprehend complex function definitions
  • Extract Parameters: Accurately identify and format required function arguments from user queries
  • Generate Valid JSON: Produce syntactically correct function calls
  • Handle Edge Cases: Manage scenarios where requested functions are unavailable
  • Maintain Conversational Flow: Integrate function calling seamlessly into natural dialogue
  • Provide Helpful Responses: Transform function results into meaningful user communications

Technical Implementation

The dataset follows industry-standard formats for function calling:

  • OpenAI-compatible function schemas (see the sketch after this list)
  • Structured JSON for function definitions and calls
  • Clear separation between system instructions, user queries, and function responses
  • Consistent formatting across all examples
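
Because the schemas are OpenAI-compatible, they can also be supplied through the tools argument of tokenizer.apply_chat_template, which recent transformers releases and the Qwen2.5 chat template support. Whether this fine-tune responds better to that native format or to Glaive-style system prompts has not been verified here; a sketch with a hypothetical get_weather definition:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("RekklesAI/Qwen2.5-Coder-32B-Glaive-ToolCall")

# Hypothetical OpenAI-compatible tool definition
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "units": {"type": "string", "enum": ["metric", "imperial"]},
                },
                "required": ["location"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What's the weather like in New York City?"}]

# Render a prompt with the tool definitions embedded by the chat template
prompt_text = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
print(prompt_text)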

This training data prepares the model to handle real-world function calling scenarios, making it well suited to AI assistant applications, automation workflows, and API integration tasks.

Technical Specifications

  • Framework: Built using LLaMA-Factory
  • Hardware Requirements: Recommended 80GB+ VRAM for inference
  • Quantization: Compatible with various quantization methods (GPTQ, AWQ, bitsandbytes, etc.); a 4-bit loading sketch follows this list
  • Deployment: Suitable for both cloud and on-premise deployment
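
For GPUs below the recommended 80 GB, a quantized load is one option. A minimal 4-bit bitsandbytes sketch follows; the settings are illustrative, and tool-calling quality after quantization has not been measured for this model:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit NF4 quantization: cuts the ~65 GB BF16 weight
# footprint to roughly 20 GB, at some cost in output quality
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "RekklesAI/Qwen2.5-Coder-32B-Glaive-ToolCall",
    quantization_config=bnb_config,
    device_map="auto",
)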

Citation

If you use this model in your research or applications, please cite:

@misc{qwen25-coder-glaive-toolcall,
  title={Qwen2.5-Coder-32B-Glaive-ToolCall},
  author={RekklesAI},
  year={2025},
  howpublished={\url{https://huggingface.co/RekklesAI/Qwen2.5-Coder-32B-Glaive-ToolCall}},
  note={Fine-tuned version of Qwen2.5-Coder-32B-Instruct with enhanced tool calling capabilities using the Glaive Function Calling v2 dataset}
}

License

This model is released under the Apache 2.0 license.

Acknowledgments

  • Qwen Team: For the excellent base model Qwen2.5-Coder-32B-Instruct
  • Glaive: For providing the high-quality tool calling dataset
  • LLaMA-Factory: For the efficient fine-tuning framework

This model card follows the guidelines for responsible AI model documentation and transparency.
