metadata

license: apache-2.0
language:
  - en
base_model:
  - Menlo/Jan-nano
pipeline_tag: text-generation
library_name: transformers
tags:
  - text-generation-inference
  - MCP

Jan-nano-GGUF

Jan-Nano is a compact 4-billion-parameter language model designed and trained for deep research tasks. It is optimized to work with Model Context Protocol (MCP) servers, which lets it integrate with a range of research tools and data sources.

Model Files

File Name	Size	Format	Description
Jan-nano.F32.gguf	16.1 GB	F32	Full precision 32-bit floating point
Jan-nano.F16.gguf	8.05 GB	F16	Half precision 16-bit floating point
Jan-nano.BF16.gguf	8.05 GB	BF16	Brain floating point 16-bit
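
The sizes above follow from the parameter count times the bytes stored per weight. The sketch below reproduces that arithmetic. It assumes exactly 4e9 parameters (the card only says "4-billion") and ignores GGUF metadata, so it yields about 16 GB and 8 GB, slightly under the listed 16.1 GB and 8.05 GB. The names `PARAMS`, `BYTES_PER_WEIGHT`, and `estimated_size_gb` are illustrative only.

```python
# Rough size estimate: parameters x bytes per weight.
# ASSUMPTION: exactly 4e9 parameters (the card only says "4-billion");
# GGUF metadata and any other non-weight overhead are ignored.
PARAMS = 4_000_000_000

# Storage width of one weight in each listed format.
BYTES_PER_WEIGHT = {"F32": 4, "F16": 2, "BF16": 2}


def estimated_size_gb(fmt: str, params: int = PARAMS) -> float:
    """Return the approximate file size in GB (10^9 bytes) for a format."""
    return params * BYTES_PER_WEIGHT[fmt] / 1e9


if __name__ == "__main__":
    for fmt in BYTES_PER_WEIGHT:
        print(f"{fmt}: ~{estimated_size_gb(fmt):.1f} GB")
```

F16 and BF16 both use 2 bytes per weight, which is why their files are the same size and half the size of F32.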

Usage

These GGUF files are intended for use with llama.cpp and compatible inference engines. Choose a precision level based on your available memory and quality requirements:

F32: Highest quality, requires most memory
F16/BF16: Good balance of quality and memory efficiency
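
As a rough illustration of that choice, the sketch below picks the largest listed file that fits a given memory budget. The file sizes are copied from the Model Files table. The `pick_file` helper and its `headroom_gb` allowance (memory set aside for context and runtime buffers) are hypothetical and should be tuned for your setup.

```python
# File names and sizes in GB, copied from the Model Files table.
FILES = [
    ("Jan-nano.F32.gguf", 16.1),
    ("Jan-nano.F16.gguf", 8.05),
    ("Jan-nano.BF16.gguf", 8.05),
]


def pick_file(available_gb, headroom_gb=2.0):
    """Return the largest listed file that fits in memory, or None.

    headroom_gb is a hypothetical allowance for context and runtime
    buffers; it is not a figure from the model card.
    """
    budget = available_gb - headroom_gb
    fitting = [(name, size) for name, size in FILES if size <= budget]
    if not fitting:
        return None
    # max() keeps the first entry on ties, so F16 is preferred over BF16
    # only because it is listed first, not because it is better.
    return max(fitting, key=lambda item: item[1])[0]


if __name__ == "__main__":
    for gb in (24, 12, 8):
        print(gb, "GB available ->", pick_file(gb))
```

The returned file name can then be passed to llama.cpp, for example through the `-m` option of `llama-cli`. Whether F16 or BF16 is the better 16-bit choice depends on what your hardware and backend support.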

Configuration

The model configuration is available in config.json.

Quants Usage

(Sorted by size, not necessarily by quality. IQ-quants are often preferable to similar-sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):