---
base_model: bigcode/starcoder2-3b
datasets:
  - bigcode/the-stack-v2-train
library_name: transformers
license: bigcode-openrail-m
pipeline_tag: text-generation
tags:
  - code
  - llama-cpp
  - matrixportal
inference: true
widget:
  - text: 'def print_hello_world():'
    example_title: Hello world
    group: Python
model-index:
  - name: starcoder2-3b
    results:
      - task:
          type: text-generation
        dataset:
          name: CruxEval-I
          type: cruxeval-i
        metrics:
          - type: pass@1
            value: 32.7
      - task:
          type: text-generation
        dataset:
          name: DS-1000
          type: ds-1000
        metrics:
          - type: pass@1
            value: 25
      - task:
          type: text-generation
        dataset:
          name: GSM8K (PAL)
          type: gsm8k-pal
        metrics:
          - type: accuracy
            value: 27.7
      - task:
          type: text-generation
        dataset:
          name: HumanEval+
          type: humanevalplus
        metrics:
          - type: pass@1
            value: 27.4
      - task:
          type: text-generation
        dataset:
          name: HumanEval
          type: humaneval
        metrics:
          - type: pass@1
            value: 31.7
      - task:
          type: text-generation
        dataset:
          name: RepoBench-v1.1
          type: repobench-v1.1
        metrics:
          - type: edit-similarity
            value: 71.19
---

# ysn-rfd/starcoder2-3b-GGUF

This model was converted to GGUF format from [bigcode/starcoder2-3b](https://huggingface.co/bigcode/starcoder2-3b) using llama.cpp via ggml.ai's all-gguf-same-where space. Refer to the original model card for more details on the model.
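A quick way to try the files in this repo is llama-cpp-python, which can pull a GGUF straight from the Hub. The snippet below is a minimal sketch, not official usage; the quantization filename is an assumption, so check this repository's file list for the actual name.

```python
# Minimal sketch: run the GGUF model with llama-cpp-python
# (pip install llama-cpp-python huggingface-hub).
# The filename below is an assumption -- verify it against the repo's file list.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ysn-rfd/starcoder2-3b-GGUF",
    filename="starcoder2-3b-q4_k_m.gguf",  # assumed name of the Q4_K_M file
    n_ctx=4096,                            # context window
)

# StarCoder2 is a base code model, so plain code completion works best.
out = llm("def print_hello_world():", max_tokens=64, temperature=0.2)
print(out["choices"][0]["text"])
```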

βœ… Quantized Models Download List

πŸ” Recommended Quantizations

- ✨ **General CPU use:** Q4_K_M (best balance of speed and quality)
- 📱 **ARM devices:** Q4_0 (optimized for ARM CPUs)
- 🏆 **Maximum quality:** Q8_0 (near-original quality)

πŸ“¦ Full Quantization Options

πŸš€ Download πŸ”’ Type πŸ“ Notes
Download Q2_K Basic quantization
Download Q3_K_S Small size
Download Q3_K_M Balanced quality
Download Q3_K_L Better quality
Download Q4_0 Fast on ARM
Download Q4_K_S Fast, recommended
Download Q4_K_M ⭐ Best balance
Download Q5_0 Good quality
Download Q5_K_S Balanced
Download Q5_K_M High quality
Download Q6_K πŸ† Very good quality
Download Q8_0 ⚑ Fast, best quality
Download F16 Maximum accuracy

πŸ’‘ Tip: Use F16 for maximum precision when quality is critical


πŸš€ Applications and Tools for Locally Quantized LLMs

πŸ–₯️ Desktop Applications

Application Description Download Link
Llama.cpp A fast and efficient inference engine for GGUF models. GitHub Repository
Ollama A streamlined solution for running LLMs locally. Website
AnythingLLM An AI-powered knowledge management tool. GitHub Repository
Open WebUI A user-friendly web interface for running local LLMs. GitHub Repository
GPT4All A user-friendly desktop application supporting various LLMs, compatible with GGUF models. GitHub Repository
LM Studio A desktop application designed to run and manage local LLMs, supporting GGUF format. Website
GPT4All Chat A chat application compatible with GGUF models for local, offline interactions. GitHub Repository
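Several of the tools above expose an HTTP API once a GGUF file is loaded. The sketch below assumes a local llama.cpp `llama-server` instance already serving one of the quantized files on its default port 8080; host, port, and sampling settings are assumptions to adapt to your setup.

```python
# Sketch: query a locally running llama-server (llama.cpp) through its
# OpenAI-compatible /v1/completions endpoint. Host and port are assumptions
# based on llama-server's defaults -- adjust to your own configuration.
import requests

resp = requests.post(
    "http://localhost:8080/v1/completions",
    json={
        "prompt": "def print_hello_world():",
        "max_tokens": 64,
        "temperature": 0.2,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```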

πŸ“± Mobile Applications

Application Description Download Link
ChatterUI A simple and lightweight LLM app for mobile devices. GitHub Repository
Maid Mobile Artificial Intelligence Distribution for running AI models on mobile devices. GitHub Repository
PocketPal AI A mobile AI assistant powered by local models. GitHub Repository
Layla A flexible platform for running various AI models on mobile devices. Website

### 🎨 Image Generation Applications

| Application | Description | Download Link |
|---|---|---|
| Stable Diffusion | An open-source AI model for generating images from text. | GitHub Repository |
| Stable Diffusion WebUI | A web application providing access to Stable Diffusion models via a browser interface. | GitHub Repository |
| Local Dream | Android Stable Diffusion with Snapdragon NPU acceleration; also supports CPU inference. | GitHub Repository |
| Stable-Diffusion-Android (SDAI) | An open-source AI art application for Android devices, enabling digital art creation. | GitHub Repository |