
aashish1904/LFM2-1.2B-Tool-GGUF

This is a quantized version of LiquidAI/LFM2-1.2B-Tool, created using llama.cpp.

Original Model Card

Liquid AI

LFM2-1.2B-Tool

Based on LFM2-1.2B, LFM2-1.2B-Tool is designed for concise and precise tool calling. The key challenge was designing a non-thinking model that outperforms similarly sized thinking models for tool use.

Use cases:

  • Mobile and edge devices requiring instant API calls, database queries, or system integrations without cloud dependency.
  • Real-time assistants in cars, IoT devices, or customer support, where response latency is critical.
  • Resource-constrained environments like embedded systems or battery-powered devices needing efficient tool execution.

You can find more information about other task-specific models in this blog post.

📄 Model details

Generation parameters: We recommend greedy decoding (temperature=0).
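To make the recommendation concrete, here is a minimal sketch of what greedy decoding does at each generation step: it always picks the highest-scoring token instead of sampling. The `greedy_pick` helper and the toy logits are hypothetical illustrations, not part of any LFM2 or llama.cpp API.

```python
def greedy_pick(logits: list[float]) -> int:
    """Return the index of the highest logit (ties go to the lowest index)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Made-up scores for a 4-token vocabulary, for illustration only.
toy_logits = [0.1, 2.5, -1.0, 2.4]
chosen = greedy_pick(toy_logits)
```

Because nothing is sampled, the same prompt yields the same tool call on every run, which is usually what you want when the output is parsed and executed.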

System prompt: The system prompt must list all available tools.

Supported languages: English, Arabic, Chinese, French, German, Japanese, Korean, Portuguese, and Spanish.


Tool use: It consists of four main steps:

  1. Function definition: LFM2 takes JSON function definitions as input (JSON objects between <|tool_list_start|> and <|tool_list_end|> special tokens), usually in the system prompt.
  2. Function call: LFM2 writes Pythonic function calls (a Python list between <|tool_call_start|> and <|tool_call_end|> special tokens) as the assistant answer.
  3. Function execution: The function call is executed and the result is returned (a string between <|tool_response_start|> and <|tool_response_end|> special tokens) in a message with the "tool" role.
  4. Final answer: LFM2 interprets the outcome of the function call to address the original user prompt in plain text.
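Step 1 can be sketched in a few lines of Python. This is a minimal sketch, not an official helper: `build_system_prompt` is a hypothetical name, and the `List of tools:` prefix simply follows the example conversation in this card.

```python
import json

TOOL_LIST_START = "<|tool_list_start|>"
TOOL_LIST_END = "<|tool_list_end|>"


def build_system_prompt(tools: list[dict]) -> str:
    """Wrap JSON tool definitions in the tool-list special tokens."""
    # Assumption: the "List of tools:" prefix mirrors the card's example.
    return f"List of tools: {TOOL_LIST_START}{json.dumps(tools)}{TOOL_LIST_END}"


tools = [
    {
        "name": "get_candidate_status",
        "description": "Retrieves the current status of a candidate in the recruitment process",
        "parameters": {
            "type": "object",
            "properties": {
                "candidate_id": {
                    "type": "string",
                    "description": "Unique identifier for the candidate",
                }
            },
            "required": ["candidate_id"],
        },
    }
]

system_prompt = build_system_prompt(tools)
```

Keeping the definitions as Python dictionaries and serializing them with `json.dumps` avoids hand-written JSON errors in the prompt.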

Here is a simple example of a conversation using tool use:

<|startoftext|><|im_start|>system
List of tools: <|tool_list_start|>[{"name": "get_candidate_status", "description": "Retrieves the current status of a candidate in the recruitment process", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "Unique identifier for the candidate"}}, "required": ["candidate_id"]}}]<|tool_list_end|><|im_end|>
<|im_start|>user
What is the current status of candidate ID 12345?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>Checking the current status of candidate ID 12345.<|im_end|>
<|im_start|>tool
<|tool_response_start|>{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}<|tool_response_end|><|im_end|>
<|im_start|>assistant
The candidate with ID 12345 is currently in the "Interview Scheduled" stage for the position of Clinical Research Associate, with an interview date set for 2023-11-20.<|im_end|>
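Steps 2 and 3 of this exchange could be handled on the application side roughly as follows. This is a sketch under stated assumptions: `parse_tool_calls`, `tool_messages`, `REGISTRY`, and the stub `get_candidate_status` are hypothetical names, only keyword arguments with literal values are supported, and sending one "tool" message per call is an assumption rather than documented behavior.

```python
import ast
import json
import re

CALL_RE = re.compile(r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL)


def parse_tool_calls(assistant_text: str) -> list[tuple[str, dict]]:
    """Extract (function_name, kwargs) pairs from the Pythonic call list."""
    match = CALL_RE.search(assistant_text)
    if match is None:
        return []
    tree = ast.parse(match.group(1).strip(), mode="eval")
    if not isinstance(tree.body, ast.List):
        raise ValueError("expected a Python list of calls")
    calls = []
    for node in tree.body.elts:
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            raise ValueError("expected plain function calls")
        if node.args or any(kw.arg is None for kw in node.keywords):
            raise ValueError("only keyword arguments are supported in this sketch")
        # literal_eval accepts literals only, so no model-written code is executed.
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append((node.func.id, kwargs))
    return calls


def get_candidate_status(candidate_id: str) -> dict:
    # Stub standing in for a real backend lookup (illustrative data only).
    return {"candidate_id": candidate_id, "status": "Interview Scheduled"}


# Only functions listed here can be invoked by the model.
REGISTRY = {"get_candidate_status": get_candidate_status}


def tool_messages(assistant_text: str) -> list[dict]:
    """Run each parsed call and wrap its JSON result in the tool-response tokens."""
    messages = []
    for name, kwargs in parse_tool_calls(assistant_text):
        result = REGISTRY[name](**kwargs)
        content = f"<|tool_response_start|>{json.dumps(result)}<|tool_response_end|>"
        messages.append({"role": "tool", "content": content})
    return messages


reply = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>'
    "Checking the current status of candidate ID 12345."
)
calls = parse_tool_calls(reply)
messages = tool_messages(reply)
```

Parsing with `ast` instead of `eval` and dispatching through an explicit registry keeps the model from triggering anything other than the tools you chose to expose.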

⚠️ The model supports both single-turn and multi-turn conversations.
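For multi-turn use, the conversation layout shown in the example can be reproduced from a list of messages. The sketch below only mirrors that layout; `render_conversation` is a hypothetical helper, and the chat template bundled with the model remains the authoritative source for the exact format.

```python
def render_conversation(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render role/content messages in the layout used by the example above."""
    parts = ["<|startoftext|>"]
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the next reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


history = [
    {"role": "system", "content": "List of tools: <|tool_list_start|>[]<|tool_list_end|>"},
    {"role": "user", "content": "What is the current status of candidate ID 12345?"},
]
prompt = render_conversation(history)
```

Each new user, assistant, or tool turn is appended to `history` and the whole list is rendered again, so the model always sees the full conversation.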

📈 Performance

For edge inference, latency is a crucial factor in delivering a seamless user experience. Test-time compute can improve accuracy, but it degrades that experience because users wait longer for each function call.

Therefore, the goal was to develop a tool calling model that is competitive with thinking models, yet operates without any internal chain-of-thought process.


We evaluated each model on a proprietary benchmark that was specifically designed to prevent data contamination. The benchmark ensures that performance metrics reflect genuine tool-calling capabilities rather than memorized patterns from training data.

πŸƒ How to run

📬 Contact

If you are interested in custom solutions with edge deployment, please contact our sales team.

GGUF details: model size 1B params, architecture lfm2. Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit.

