BetaCeti-Beta-4B-Prime1

BetaCeti-Beta-4B-Prime1 is a compact, coding-optimized language model built on the Qwen3-4B architecture, tailored for high-accuracy code generation, debugging, and technical reasoning. With 4 billion parameters, it balances performance and efficiency, making it an ideal assistant for developers, educators, and engineers who work in constrained environments or need fast inference.

GGUF: https://huggingface.co/prithivMLmods/BetaCeti-Beta-4B-Prime1-GGUF
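
To run the GGUF build locally, the sketch below uses llama-cpp-python; the quantization filename pattern is an assumption, so match it against an actual file in the GGUF repository.

from llama_cpp import Llama

# Pull a quantized file from the GGUF repo; the glob below is an assumed
# Q4_K_M pattern, so check the repository's file list for the real name.
llm = Llama.from_pretrained(
    repo_id="prithivMLmods/BetaCeti-Beta-4B-Prime1-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,  # context window to allocate
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to check if a number is prime."},
    ],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])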


Key Features

  1. Qwen3-4B Architecture Core: Built on the robust and scalable Qwen3 transformer backbone, offering strong performance on both single-turn and multi-step code workflows.

  2. Code-First Training Focus: Fine-tuned primarily on coding datasets across Python, JavaScript, C++, and Bash, with additional coverage of software documentation, APIs, and debugging tasks.

  3. Multi-Step Reasoning in Code: Capable of breaking down complex programming problems, explaining logic, and correcting bugs, making it well suited to students, engineers, and software instructors.

  4. Structured Format Proficiency: Outputs syntactically correct code blocks, JSON, YAML, and Markdown, streamlining integration into tools, notebooks, and docs (see the sketch after this list).

  5. Lightweight Yet Powerful: At 4B parameters, it provides strong results without the heavy resource demands of larger models, and it is deployable on most modern GPUs or powerful CPUs.

  6. Cross-Language Coding Support: Generates and interprets code in 10+ languages, with emphasis on real-world application, scripting, and algorithmic problem-solving.
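
As a quick illustration of feature 4, the sketch below asks for strict JSON and parses the reply. It assumes a recent transformers version whose text-generation pipeline accepts chat message lists, and the prompt wording is only an example.

import json
from transformers import pipeline

# Chat-style pipeline; recent transformers releases accept message lists directly.
pipe = pipeline(
    "text-generation",
    model="prithivMLmods/BetaCeti-Beta-4B-Prime1",
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Reply with valid JSON only, no prose."},
    {"role": "user", "content": "Describe Python's list.sort method as JSON with keys name, args, and returns."},
]

out = pipe(messages, max_new_tokens=256)
reply = out[0]["generated_text"][-1]["content"]  # last turn is the assistant reply
data = json.loads(reply)  # raises if the model strays from pure JSON
print(data)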


Quickstart with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/BetaCeti-Beta-4B-Prime1"

# Load the weights and tokenizer; device_map="auto" places layers on the available GPU(s) or CPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a Python function to check if a number is prime."

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": prompt}
]

# Render the chat messages into the model's expected prompt string.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated completion remains.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
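
For less deterministic completions, sampling arguments can be passed to generate(). The values below are illustrative starting points, not tuned recommendations from the model authors.

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=True,    # enable sampling instead of greedy decoding
    temperature=0.6,   # assumed example value
    top_p=0.95,        # assumed example value
    top_k=20           # assumed example value
)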

Intended Use

  • Code generation, translation, and refactoring
  • Teaching and tutoring in programming concepts
  • Technical documentation generation and API auto-fill
  • Debugging assistant with error analysis and fixes (see the example after this list)
  • Lightweight deployment in IDEs, coding platforms, and offline environments
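
For the debugging use case, a prompt can pair the failing code with its traceback. The snippet below is a hypothetical example that only builds the messages list; feed it through the Quickstart code above to get a fix.

# Hypothetical debugging prompt; run `messages` through the Quickstart pipeline.
buggy_code = """\
def mean(xs):
    return sum(xs) / len(xs)

print(mean([]))
"""
traceback_text = "ZeroDivisionError: division by zero"

messages = [
    {"role": "system", "content": "You are a debugging assistant. Explain the error, then give a corrected version."},
    {"role": "user", "content": f"This code fails:\n\n{buggy_code}\nError:\n{traceback_text}\n\nWhat is wrong and how do I fix it?"},
]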

Limitations

  • Shorter context window than larger coding models (e.g., 7B+ variants)
  • May require prompt engineering for deeply nested or obscure code patterns
  • Limited fluency in non-programming natural language dialogue
  • Not optimized for purely creative writing or storytelling tasks
