zuxin-llm committed
Commit f5b09e0 · verified · 1 Parent(s): ce98854

Update README.md

Files changed (1)
  1. README.md +19 -32
README.md CHANGED
@@ -31,14 +31,13 @@ library_name: transformers
  # Welcome to the xLAM-2 Model Family!

- ## Model Summary
-
  This repo provides the GGUF format for the Llama-xLAM-2-8b-fc-r model. Here's a link to the original model: [Llama-xLAM-2-8b-fc-r](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r).

  [Large Action Models (LAMs)](https://blog.salesforceairesearch.com/large-action-models/) are advanced language models designed to enhance decision-making by translating user intentions into executable actions. As the **brains of AI agents**, LAMs autonomously plan and execute tasks to achieve specific goals, making them invaluable for automating workflows across diverse domains.
  **This model release is for research purposes only.**

- The new **xLAM-2** series, built on our most advanced data synthesis, processing, and training pipelines, marks a significant leap in **multi-turn conversation** and **tool usage**. Trained using our novel APIGen-MT framework, which generates high-quality training data through simulated agent-human interactions. Our models achieve state-of-the-art performance on **BFCL** and **τ-bench** benchmarks, outperforming frontier models like GPT-4o and Claude 3.5. Notably, even our smaller models demonstrate superior capabilities in multi-turn scenarios while maintaining exceptional consistency across trials.

  We've also refined the **chat template** and **vLLM integration**, making it easier to build advanced AI agents. Compared to previous xLAM models, xLAM-2 offers superior performance and seamless deployment across applications.

@@ -55,36 +54,26 @@ We've also refined the **chat template** and **vLLM integration**, making it eas
  - [Benchmark Results](#benchmark-results)
  - [Citation](#citation)

- ---
  ## Model Series

- [xLAM](https://huggingface.co/collections/Salesforce/xlam-models-65f00e2a0a63bbcd1c2dade4) series are significant better at many things including general tasks and function calling.
  For the same number of parameters, the models have been fine-tuned across a wide range of agent tasks and scenarios, all while preserving the capabilities of the original model.

-
- | Model | # Total Params | Context Length | Release Date | Category | Download Model | Download GGUF files |
- |------------------------|----------------|------------|-------------|-------|----------------|----------|
- | Llama-xLAM-2-70b-fc-r | 70B | 128k | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-70b-fc-r) | NA |
- | Llama-xLAM-2-8b-fc-r | 8B | 128k | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r-gguf) |
- | xLAM-2-32b-fc-r | 32B | 32k (max 128k)* | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-32b-fc-r) | NA |
- | xLAM-2-3b-fc-r | 3B | 32k (max 128k)* | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-3b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-3b-fc-r-gguf) |
- | xLAM-2-1b-fc-r | 1B | 32k (max 128k)* | Mar. 26, 2025 | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-1b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-1b-fc-r-gguf) |
- | xLAM-7b-r | 7.24B | 32k | Sep. 5, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-r) | -- |
- | xLAM-8x7b-r | 46.7B | 32k | Sep. 5, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-8x7b-r) | -- |
- | xLAM-8x22b-r | 141B | 64k | Sep. 5, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-8x22b-r) | -- |
- | xLAM-1b-fc-r | 1.35B | 16k | July 17, 2024 | Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-1b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-1b-fc-r-gguf) |
- | xLAM-7b-fc-r | 6.91B | 4k | July 17, 2024 | Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-fc-r-gguf) |
- | xLAM-v0.1-r | 46.7B | 32k | Mar. 18, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-v0.1-r) | -- |

  ***Note:** The default context length for Qwen-2.5-based models is 32k, but you can use techniques like YaRN (Yet another RoPE extensioN method) to extend the context length to a maximum of 128k. Please refer to [here](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct#processing-long-texts) for more details.

- ### 📦 Model Naming Conventions
- - `xLAM-7b-r`: A general-purpose v1.0 or v2.0 release of the **Large Action Model**, fine-tuned for broad agentic capabilities. The `-r` suffix indicates it is a **research** release.
- - `xLAM-7b-fc-r`: A specialized variant where `-fc` denotes fine-tuning for **function calling** tasks, also marked for **research** use.
- - ✅ All models are fully compatible with VLLM, FastChat, and Transformers-based inference frameworks.

- ---

  ## Using GGUF Files

@@ -206,7 +195,7 @@ print(output['choices'][0]['message'])
  <p align="center">
  <img width="80%" alt="BFCL Results" src="https://github.com/apigen-mt/apigen-mt.github.io/blob/main/img/bfcl-result.png?raw=true">
  <br>
- <small><i>Performance comparison of different models on BFCL leaderboard. The rank is based on the overall accuracy, which is a weighted average of different evaluation categories. "FC" stands for function-calling mode in contrast to using a customized "prompt" to extract the function calls.</i></small>
  </p>

  ### τ-bench Benchmark
@@ -237,14 +226,16 @@ For all Llama relevant models, please also follow corresponding Llama license an
  If you use our model or dataset in your work, please cite our paper:

  ```bibtex
- @article{prabhakar2025apigen,
- title={APIGen-MT: Agentic PIpeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay},
- author={Prabhakar, Akshara and Liu, Zuxin and Zhu, Ming and Zhang, Jianguo and Awalgaonkar, Tulika and Wang, Shiyu and Liu, Zhiwei and Chen, Haolin and Hoang, Thai and others},
  journal={arXiv preprint arXiv:2504.03601},
  year={2025}
  }
  ```

  ```bibtex
  @article{zhang2025actionstudio,
  title={ActionStudio: A Lightweight Framework for Data and Training of Action Models},
@@ -261,10 +252,7 @@ If you use our model or dataset in your work, please cite our paper:
  journal={arXiv preprint arXiv:2409.03215},
  year={2024}
  }
-
  ```
- Additionally, please check our other related works regarding xLAM and consider citing them as well:
-

  ```bibtex
  @article{liu2024apigen,
@@ -286,4 +274,3 @@ Additionally, please check our other related works regarding xLAM and consider c
  }
  ```

-
 
@@ -31,14 +31,13 @@ library_name: transformers
  # Welcome to the xLAM-2 Model Family!

  This repo provides the GGUF format for the Llama-xLAM-2-8b-fc-r model. Here's a link to the original model: [Llama-xLAM-2-8b-fc-r](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r).

+
  [Large Action Models (LAMs)](https://blog.salesforceairesearch.com/large-action-models/) are advanced language models designed to enhance decision-making by translating user intentions into executable actions. As the **brains of AI agents**, LAMs autonomously plan and execute tasks to achieve specific goals, making them invaluable for automating workflows across diverse domains.
  **This model release is for research purposes only.**

+ The new **xLAM-2** series, built on our most advanced data synthesis, processing, and training pipelines, marks a significant leap in **multi-turn conversation** and **tool usage**. The models are trained using our novel APIGen-MT framework, which generates high-quality training data through simulated agent-human interactions. Our models achieve state-of-the-art performance on the [**BFCL**](https://gorilla.cs.berkeley.edu/leaderboard.html) and **τ-bench** benchmarks, outperforming frontier models like GPT-4o and Claude 3.5. Notably, even our smaller models demonstrate superior capabilities in multi-turn scenarios while maintaining exceptional consistency across trials.

  We've also refined the **chat template** and **vLLM integration**, making it easier to build advanced AI agents. Compared to previous xLAM models, xLAM-2 offers superior performance and seamless deployment across applications.

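
As a quick illustration of the chat template, here is a minimal sketch (not the original model card's official usage snippet) of building a tool-calling prompt with 🤗 Transformers. The `get_weather` tool schema is a made-up example.

```python
# Minimal sketch: build a function-calling prompt with the model's chat template.
# The `get_weather` tool below is a made-up example schema.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Salesforce/Llama-xLAM-2-8b-fc-r")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]
messages = [{"role": "user", "content": "What's the weather in San Francisco?"}]

# The chat template injects the tool definitions so the model can emit a structured function call.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
print(prompt)
```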
 
 
@@ -55,36 +54,26 @@ We've also refined the **chat template** and **vLLM integration**, making it eas
  - [Benchmark Results](#benchmark-results)
  - [Citation](#citation)

  ## Model Series

+ The [xLAM](https://huggingface.co/collections/Salesforce/xlam-models-65f00e2a0a63bbcd1c2dade4) series is significantly better at many tasks, including general tasks and function calling.
  For the same number of parameters, the models have been fine-tuned across a wide range of agent tasks and scenarios, all while preserving the capabilities of the original model.

+ | Model | # Total Params | Context Length | Category | Download Model | Download GGUF files |
+ |------------------------|----------------|------------|-------|----------------|----------|
+ | Llama-xLAM-2-70b-fc-r | 70B | 128k | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-70b-fc-r) | NA |
+ | Llama-xLAM-2-8b-fc-r | 8B | 128k | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r-gguf) |
+ | xLAM-2-32b-fc-r | 32B | 32k (max 128k)* | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-32b-fc-r) | NA |
+ | xLAM-2-3b-fc-r | 3B | 32k (max 128k)* | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-3b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-3b-fc-r-gguf) |
+ | xLAM-2-1b-fc-r | 1B | 32k (max 128k)* | Multi-turn Conversation, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-1b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-2-1b-fc-r-gguf) |

  ***Note:** The default context length for Qwen-2.5-based models is 32k, but you can use techniques like YaRN (Yet another RoPE extensioN method) to extend the context length to a maximum of 128k. Please refer to [here](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct#processing-long-texts) for more details.
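
For illustration, here is a minimal sketch of enabling YaRN along the lines of the linked Qwen2.5 guide; the model id and scaling factor below are assumptions, so adjust them to your checkpoint and target length.

```python
# Illustrative only: enable YaRN rope scaling for a Qwen-2.5-based xLAM-2 checkpoint,
# extending the 32k default context toward ~128k. Requires a transformers version
# with YaRN support for Qwen2-style models.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/xLAM-2-3b-fc-r",          # any of the Qwen-2.5-based xLAM-2 models
    rope_scaling={
        "type": "yarn",
        "factor": 4.0,                     # 32k * 4 = ~128k tokens
        "original_max_position_embeddings": 32768,
    },
    max_position_embeddings=131072,
)
```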

+ You can also explore our previous xLAM series [here](https://huggingface.co/collections/Salesforce/xlam-models-65f00e2a0a63bbcd1c2dade4).

+ The `-fc` suffix indicates that the models are fine-tuned for **function calling** tasks, while the `-r` suffix signifies a **research** release.

+ ✅ All models are fully compatible with vLLM and Transformers-based inference frameworks.
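
For example, a bare-bones offline-inference sketch with vLLM (assuming a recent vLLM release; this is not the card's official example, and tool-calling serving flags are omitted here):

```python
# Rough sketch of offline inference with vLLM's chat interface; not the card's
# official snippet, and tool-calling serving options are not shown.
from vllm import LLM, SamplingParams

llm = LLM(model="Salesforce/Llama-xLAM-2-8b-fc-r", max_model_len=32768)
messages = [{"role": "user", "content": "Briefly explain what a Large Action Model is."}]
outputs = llm.chat(messages, SamplingParams(temperature=0.0, max_tokens=256))
print(outputs[0].outputs[0].text)
```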

  ## Using GGUF Files
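
A minimal sketch of loading one of the GGUF files in this repo with llama-cpp-python; the quantization filename below is an assumption, so substitute the `.gguf` file you actually download.

```python
# Minimal sketch (not the card's official snippet): run a GGUF quantization of
# Llama-xLAM-2-8b-fc-r with llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Salesforce/Llama-xLAM-2-8b-fc-r-gguf",
    filename="*Q4_K_M.gguf",   # assumed quantization name; adjust to the file you use
    n_ctx=4096,
)
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What can the xLAM-2 models do?"}]
)
print(output['choices'][0]['message'])
```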
@@ -206,7 +195,7 @@ print(output['choices'][0]['message'])
  <p align="center">
  <img width="80%" alt="BFCL Results" src="https://github.com/apigen-mt/apigen-mt.github.io/blob/main/img/bfcl-result.png?raw=true">
  <br>
+ <small><i>Performance comparison of different models on the [BFCL leaderboard](https://gorilla.cs.berkeley.edu/leaderboard.html). The rank is based on the overall accuracy, which is a weighted average of different evaluation categories. "FC" stands for function-calling mode, in contrast to using a customized "prompt" to extract the function calls.</i></small>
  </p>

  ### τ-bench Benchmark

@@ -237,14 +226,16 @@ For all Llama relevant models, please also follow corresponding Llama license an
  If you use our model or dataset in your work, please cite our paper:

  ```bibtex
+ @article{prabhakar2025apigenmt,
+ title={APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay},
+ author={Prabhakar, Akshara and Liu, Zuxin and Yao, Weiran and Zhang, Jianguo and Zhu, Ming and Wang, Shiyu and Liu, Zhiwei and Awalgaonkar, Tulika and Chen, Haolin and Hoang, Thai and Niebles, Juan Carlos and Heinecke, Shelby and Wang, Huan and Savarese, Silvio and Xiong, Caiming},
  journal={arXiv preprint arXiv:2504.03601},
  year={2025}
  }
  ```

+ Additionally, please check out our other work on the xLAM series and consider citing it as well:
+
  ```bibtex
  @article{zhang2025actionstudio,
  title={ActionStudio: A Lightweight Framework for Data and Training of Action Models},
@@ -261,10 +252,7 @@ If you use our model or dataset in your work, please cite our paper:
  journal={arXiv preprint arXiv:2409.03215},
  year={2024}
  }
  ```


  ```bibtex
  @article{liu2024apigen,
@@ -286,4 +274,3 @@ Additionally, please check our other related works regarding xLAM and consider c
  }
  ```