Model Card for HacksHaven/sos-qwen3-4b-lora

This is a LoRA fine-tuned version of Qwen/Qwen3-4B, adapted for educational and scientific question-answering. The model has been fine-tuned on the Science On a Sphere (SOS) QA Dataset, which includes thousands of prompt/completion pairs derived from NOAA’s Science On a Sphere support content and dataset catalog. The model is designed to support Earth science education and enable AI-powered SOS content experiences.

Model Details

  • Base Model: Qwen/Qwen3-4B
  • Fine-Tuned by: Eric Hackathorn (NOAA)
  • Architecture: Transformer decoder-only (Qwen)
  • Finetuning Type: Parameter-efficient fine-tuning using LoRA
  • Language(s): English
  • License: MIT

Model Description

Model Status: Work in Progress

This model is currently under active development. Please note:

  • The “More Information” URLs are provisional — they currently overemphasize support pages rather than high-level "What is..." resources.

  • The links will be refined in upcoming updates to better align with the model's purpose and intended audience.

  • Feedback is welcome to help improve this aspect and others.

This model is a LoRA fine-tuned version of Qwen/Qwen3-4B, optimized for question answering over content related to NOAA’s Science On a Sphere (SOS) initiative, including Earth science metadata, dataset descriptions, support documentation, and educational guidance. It is designed to be integrated into museum kiosks, classroom assistants, educational chatbots, and SOS Explorer environments to make complex environmental data more accessible and engaging.

Uses

  1. Educational Chatbots

    Use: Plug into an LLM-powered assistant (like ChatGPT or a custom app) in a science museum, classroom, or mobile app.

    Example:
    Student: “What causes a tsunami?”
    Model: “Tsunamis are typically caused by underwater earthquakes, often at subduction zones. More information: https://sos.noaa.gov/catalog/datasets/tsunami-locations-2000-bce-2014/”

  2. Interactive Museum Kiosks

    Use: Replace static displays with conversational kiosks powered by your model.

    Example: A touchscreen exhibit next to an SOS globe where users ask, “What does this animation show?” and the model responds with a summary of that dataset.

  3. SOS Explorer Integration

    Use: Embed QA inside SOS Explorer or a future AI-powered version to describe datasets, provide learning guidance, or support exploratory interactions.

    Example: When a user clicks on a dataset, a bot could summarize it, suggest classroom activities, or quiz the user.

  4. Curriculum and Lesson Plan Support

    Use: Teachers ask the model for summaries, concepts, or classroom activities based on a specific dataset.

    Example: “Describe a classroom activity using the dataset about ocean acidification.”

  5. Research Assistant for Outreach Teams

    Use: Internal NOAA outreach and comms teams use the model to quickly surface descriptions, summaries, related content, or activity suggestions.

  6. Voice-activated Assistants

    Use: Deploy in AR/VR environments or installations with voice input, e.g., “Tell me about sea surface temperature datasets.”

Direct Use

This model is optimized for:

  • Question-answering on Earth science content
  • SOS educational kiosk applications
  • Embedding into chatbots or classroom tools for informal STEM education

Downstream Use

It can be further fine-tuned for:

  • Domain-specific science outreach bots
  • Custom SOS Explorer content recommendation engines
  • Multimodal extensions (e.g., image+QA)

Out-of-Scope Use

  • Real-time decision-making or scientific analysis requiring exact precision
  • High-stakes classroom assessment without human verification
  • Non-English QA without additional fine-tuning

Bias, Risks, and Limitations

  • Some responses may oversimplify complex topics
  • Answers are based on generated content, not human-authored explanations
  • May reflect biases from the underlying LLM or training set structure

Recommendations

  • Use model outputs with educator supervision in formal settings
  • Cross-check completions against authoritative SOS materials
  • Avoid deployment in mission-critical scenarios without further vetting

How to Get Started with the Model

This is a merged and quantization-ready version of Qwen3-4B fine-tuned on the Science On a Sphere (SOS) instruction dataset using LoRA + PEFT. You can load it using:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4"
)

model = AutoModelForCausalLM.from_pretrained(
    "HacksHaven/sos-qwen3-4b-lora",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("HacksHaven/sos-qwen3-4b-lora", trust_remote_code=True)

Use the code below to chat with the model.

from transformers import pipeline

qa = pipeline("text-generation", model=model, tokenizer=tokenizer)
qa("What is NOAA's Science On a Sphere?")

Training Details

Training Data

The model was fine-tuned on the Science On a Sphere (SOS) QA Dataset, which contains thousands of prompt/completion pairs derived from NOAA’s Science On a Sphere support content and dataset catalog.

Preprocessing

Prompts and completions were embedded in a Qwen-style conversational format using <|im_start|> and <|im_end|> tokens.

<|im_start|>user
[Prompt text]
<|im_end|>
<|im_start|>assistant
[Completion text]
<|im_end|>
  • Tokenization used padding="longest" and max_length=8192.
  • Labels were copied directly from input IDs for causal language modeling.
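As a sketch of this preprocessing step, each pair can be rendered into the Qwen format and tokenized as follows; the prompt and completion field names are assumptions about the dataset schema, not confirmed by this card:

def preprocess(example, tokenizer):
    # Render the pair in the Qwen conversational format shown above.
    # NOTE: 'prompt' and 'completion' are assumed field names.
    text = (
        f"<|im_start|>user\n{example['prompt']}\n<|im_end|>\n"
        f"<|im_start|>assistant\n{example['completion']}\n<|im_end|>"
    )
    tokens = tokenizer(text, padding="longest", truncation=True, max_length=8192)
    # Causal LM objective: labels are a direct copy of the input IDs.
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens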

Training Hyperparameters

  • Base model: Qwen/Qwen3-4B
  • Finetuning method: LoRA (Low-Rank Adaptation)
  • LoRA rank (r): 8
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Gradient checkpointing: Enabled
  • Max sequence length: 8192
  • Precision: bfloat16
  • Quantization: 4-bit NF4 via BitsAndBytes
  • Optimizer: paged_adamw_8bit
  • Learning rate: 2e-4
  • Epochs: 3
  • Batch size: 1 (with gradient accumulation = 4)
  • Logging & eval strategy: Every 10 steps
  • Evaluation metric: bertscore_f1 (maximized)
  • Load best model at end: Yes
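These settings map onto a PEFT configuration roughly like the one below. This is a sketch reconstructed from the table, not the exact training script; the output directory name is illustrative:

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

model = prepare_model_for_kbit_training(model)  # required before 4-bit training
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="sos-qwen3-4b-lora",  # illustrative
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    bf16=True,
    optim="paged_adamw_8bit",
    logging_steps=10,
    eval_strategy="steps",
    eval_steps=10,
    save_strategy="steps",
    save_steps=10,
    load_best_model_at_end=True,
    metric_for_best_model="bertscore_f1",  # reported by a custom compute_metrics
    greater_is_better=True,
)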

Evaluation

Testing Data, Factors & Metrics

Testing Data

Evaluated on a 10% held-out split of the training dataset (stratified).
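A split of this kind can be reproduced with the datasets library. A sketch, with a placeholder dataset identifier (the actual SOS QA dataset repo is not named here) and stratification left as a comment since the stratification column is not documented:

from datasets import load_dataset

# Placeholder identifier; substitute the actual SOS QA dataset on the Hub.
dataset = load_dataset("your-org/sos-qa-dataset", split="train")

# Hold out 10% for evaluation. True stratification would additionally need a
# ClassLabel column (e.g., a topic field) passed via stratify_by_column.
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]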

Factors

This model was fine-tuned to support instructional content for NOAA's Science On a Sphere (SOS) exhibits, which span a diverse set of topics and audiences. Relevant factors that may affect model performance include:

  • Scientific Domain: The model has seen examples across atmospheric science, oceanography, climate change, space weather, and Earth system interactions. Responses may vary depending on the domain depth in the fine-tuning set.

  • Instruction Type: Prompts vary in style, including explanations of scientific processes, definitions, causal reasoning, and narrative-style descriptions for public displays.

  • Intended Audience: While many prompts are written at a general public or middle school level, the model may perform differently for early learners, specialists, or multilingual audiences.

  • Input Format: The model is trained with structured instruction format tags (e.g., <|im_start|>user). Results may vary if these are not used consistently.

  • Data Origin: The training set draws from curated NOAA science narratives, educational materials, and exhibit scripts. Domains or tones not represented in these sources may yield less accurate responses.

Future evaluations could assess performance across these axes to better understand model reliability in SOS-like deployment environments.

Metrics

  • ROUGE-1, ROUGE-2, ROUGE-L: N-gram overlap
  • BLEU: Token-based overlap precision
  • BERTScore F1: Semantic similarity of completions
  • Perplexity: Derived from evaluation loss, when available
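For reference, ROUGE, BLEU, and BERTScore can be computed with Hugging Face's evaluate library; a minimal sketch over toy predictions and references:

import evaluate

predictions = ["Tsunamis are typically caused by underwater earthquakes."]
references = ["Tsunamis are usually triggered by undersea earthquakes."]

rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
bert = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="en"
)

print(rouge["rouge1"], rouge["rouge2"], rouge["rougeL"])
print(bleu["bleu"])
print(sum(bert["f1"]) / len(bert["f1"]))  # BERTScore F1, averaged over examples

# Perplexity follows from the evaluation loss when the Trainer reports one:
# perplexity = math.exp(eval_loss)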

Results

Evaluation was performed using ROUGE, BLEU, BERTScore, and perplexity on a held-out 10% test set, with BERTScore F1 used to select the best checkpoint during training. Unfortunately, the full evaluation run proved too much for the available GPU, so quantitative results will be added in a future update.

Summary

Summary will be added when quantitative evaluation is complete.

Citation

BibTeX:

@misc{hackathorn_2025_sosqwen,
  title = {Science On a Sphere QA Model (Qwen3-4B, LoRA)},
  author = {Hackathorn, Eric},
  year = {2025},
  url = {https://huggingface.co/HacksHaven/sos-qwen3-4b-lora}
}

APA:

Hackathorn, E. (2025). Science On a Sphere QA Model (Qwen3-4B, LoRA). Hugging Face. https://huggingface.co/HacksHaven/sos-qwen3-4b-lora

Model Card Contact

  • Author: Eric Hackathorn
  • Email: [email protected]
  • Affiliation: NOAA Global Systems Laboratory
