SmolGRPO-135M

This model was trained as a practical exercise for the reasoning models course on Hugging Face.

Usage

from transformers import pipeline

prompt = """
# A detailed overview of the Solar System

The Solar System is the gravitationally bound system of the Sun and the objects that orbit it, 
either directly or indirectly. It formed approximately 4.6 billion years ago from the gravitational 
collapse of a giant interstellar molecular cloud. The vast majority of the system's mass is in 
the Sun, with the majority of the remaining mass contained in Jupiter. The four smaller inner 
planets—Mercury, Venus, Earth, and Mars—are terrestrial planets, being primarily composed of 
rock and metal. The four outer planets are giant planets, being substantially more massive 
than the terrestrials. The two largest, Jupiter and Saturn, are gas giants, being composed 
mainly of hydrogen and helium; the two outermost planets, Uranus and Neptune, are ice giants, 
being composed mostly of substances with relatively high melting points compared with 
hydrogen and helium, called volatiles, such as water, ammonia, and methane. All eight planets 
have almost circular orbits that lie within a nearly flat disc called the ecliptic. The Solar 
System also contains smaller objects, including dwarf planets like Pluto, moons, asteroids, 
and comets.
"""

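# Chat-format version of the same prompt. It is not used by the call below;
# pass `messages` instead of `prompt` to the generator to use the chat format.
messages = [
    {"role": "user", "content": prompt},
]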

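# Load the fine-tuned model into a text-generation pipeline
generator = pipeline("text-generation", model="ezzaldeen/SmolGRPO-135M")

# Sample a continuation of the prompt
generated_text = generator(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.5,
    min_p=0.1,
)

# The pipeline returns a list of dicts; the text is under "generated_text"
print(generated_text[0]["generated_text"])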
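
The usage example samples with `temperature=0.5` and `min_p=0.1`. As a rough illustration of what these two settings do to a next-token distribution, here is a minimal pure-Python sketch. It is not the library's implementation: the helper names `softmax` and `min_p_filter` and the toy logits are hypothetical, and it assumes temperature scaling is applied before the min-p filter.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p):
    """Drop tokens whose probability is below min_p times the top probability,
    then renormalize the remaining probabilities."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


# Hypothetical logits for a four-token vocabulary (illustration only)
toy_logits = [4.0, 3.0, 1.0, -2.0]

# A temperature below 1 sharpens the distribution toward the top token
probs = softmax(toy_logits, temperature=0.5)

# min_p=0.1 keeps only tokens at least 10% as likely as the top token
filtered = min_p_filter(probs, min_p=0.1)

print([round(p, 4) for p in probs])
print([round(p, 4) for p in filtered])
```

With these toy values the two least likely tokens fall below the threshold and are removed, so sampling only ever picks from the first two. Lower temperatures and higher `min_p` values both make the output more conservative.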

Model size: 135M params
Tensor type: BF16 (Safetensors)