metadata
license: apache-2.0
language:
  - en
base_model:
  - Menlo/Jan-nano
pipeline_tag: text-generation
library_name: transformers
tags:
  - text-generation-inference
  - MCP

Jan-nano-GGUF

Jan-Nano is a compact 4-billion-parameter language model designed and trained for deep research tasks. It has been optimized to work with Model Context Protocol (MCP) servers, enabling efficient integration with various research tools and data sources.

Model Files

| File Name | Size | Format | Description |
|---|---|---|---|
| Jan-nano.F32.gguf | 16.1 GB | F32 | Full precision 32-bit floating point |
| Jan-nano.F16.gguf | 8.05 GB | F16 | Half precision 16-bit floating point |
| Jan-nano.BF16.gguf | 8.05 GB | BF16 | Brain floating point 16-bit |
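The file sizes above follow roughly from the parameter count multiplied by the bytes stored per weight. The sketch below illustrates that arithmetic; the nominal count of 4.0 billion parameters and the use of decimal gigabytes are assumptions, and real GGUF files also carry metadata and tokenizer data, so the listed sizes are slightly larger than these estimates.

```python
# Rough size estimate: parameters x bytes per weight.
# NOMINAL_PARAMS is an assumption taken from the "4-billion parameter"
# description, not an exact count read from the model files.
NOMINAL_PARAMS = 4.0e9

# Storage width of each unquantized format, in bytes per weight.
BYTES_PER_WEIGHT = {"F32": 4, "F16": 2, "BF16": 2}


def estimate_size_gb(n_params: float, fmt: str) -> float:
    """Return the approximate weight size in decimal gigabytes."""
    return n_params * BYTES_PER_WEIGHT[fmt] / 1e9


if __name__ == "__main__":
    for fmt in BYTES_PER_WEIGHT:
        print(f"{fmt}: ~{estimate_size_gb(NOMINAL_PARAMS, fmt):.1f} GB")
```

This is why the F16 and BF16 files are the same size and about half the size of the F32 file.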

Usage

These GGUF files are intended for use with llama.cpp and compatible inference engines. Choose a precision level based on your available memory and quality requirements:

  • F32: Highest quality, requires the most memory
  • F16/BF16: Good balance of quality and memory efficiency
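One way to apply this guidance is to compare the file sizes with the memory you have available. The helper below is a minimal sketch, not part of any library: `pick_file` and the `headroom` factor of 1.2 are hypothetical, and actual memory use also depends on context length, the KV cache, and the inference engine.

```python
from typing import Optional

# File sizes in GB as listed in the Model Files section, largest first.
FILES = [
    ("Jan-nano.F32.gguf", 16.1),
    ("Jan-nano.F16.gguf", 8.05),
    ("Jan-nano.BF16.gguf", 8.05),
]


def pick_file(available_gb: float, headroom: float = 1.2) -> Optional[str]:
    """Return the first (largest) file that fits in memory, or None.

    `headroom` is an assumed safety factor for runtime overhead;
    tune it for your own setup.
    """
    for name, size_gb in FILES:
        if size_gb * headroom <= available_gb:
            return name
    return None


if __name__ == "__main__":
    for budget in (24.0, 12.0, 4.0):
        print(f"{budget} GB available -> {pick_file(budget)}")
```

F16 is listed before BF16 only because the two are the same size; whether BF16 is the better choice depends on hardware and backend support.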

Configuration

The model configuration is available in config.json.

Quants Usage

(Sorted by size, not necessarily by quality. IQ quants are often preferable to similarly sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[Figure: graph by ikawrakow comparing lower-quality quant types]