Model Card for ML-GT/Llama-3.3-70B-Instruct-Edubuddy

EduBuddy is a Socratic AI Teaching Assistant designed for Georgia Tech’s CS 4641/7641 Machine Learning course.
It combines fine-tuning with retrieval-augmented generation (RAG) to replicate TA-style pedagogical communication, guiding students through their reasoning rather than providing direct answers.

Model Details

Model Description

EduBuddy fine-tunes Llama-3.3-70B-Instruct to emulate the Socratic teaching style observed in 8,197 authentic student–TA conversations from five years of Georgia Tech ML coursework.
It uses a dual-RAG architecture to retrieve:

Course-specific lecture and homework materials.
Structured learning subgoals for scaffolding conceptual understanding.

EduBuddy’s responses are contextually grounded, Socratic in nature, and aligned with the structure of the ML course’s assignments and lectures.
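The dual-retrieval idea described above can be sketched roughly as follows. This is a minimal illustration, not the EduBuddy implementation: the toy corpora, the bag-of-words cosine scorer, the prompt wording, and the names `retrieve` and `build_prompt` are all assumptions made for the example. The actual system's retriever, indexes, and prompt template are not described in this card.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity (a stand-in for a real embedding model)."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, corpus, k=1):
    """Return the k documents most similar to the query."""
    return sorted(corpus, key=lambda doc: cosine(query, doc), reverse=True)[:k]


# Hypothetical stand-ins for the two retrieval sources named in the card.
COURSE_MATERIALS = [
    "Lecture notes: gradient descent updates weights along the negative gradient of the loss.",
    "Homework 2: implement k-means clustering and report the within-cluster distance.",
]
SUBGOALS = [
    "Subgoal: identify which quantity the gradient is taken with respect to.",
    "Subgoal: explain how cluster centers are recomputed after each assignment step.",
]


def build_prompt(question):
    """Query both stores independently and merge the results into one prompt."""
    materials = retrieve(question, COURSE_MATERIALS)
    subgoals = retrieve(question, SUBGOALS)
    return (
        "You are a Socratic teaching assistant. Guide the student with questions; "
        "do not give the final answer.\n"
        "Course context:\n" + "\n".join(materials) + "\n"
        "Learning subgoals:\n" + "\n".join(subgoals) + "\n"
        "Student question: " + question
    )


prompt = build_prompt("Why does gradient descent move along the negative gradient?")
print(prompt)
```

Keeping the two stores separate means the course context and the scaffolding subgoals are ranked independently, so a long lecture passage cannot crowd a short subgoal out of the prompt.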

Developed by: Georgia Tech ML TA Team
Model type: Instruction-tuned large language model (LLM) fine-tuned for pedagogical dialogue
Language(s): English
License: No license (research-only use)
Finetuned from: meta-llama/Llama-3.3-70B-Instruct
Intended users: Instructors, researchers, and educational technologists exploring Socratic AI tutoring
Primary domain: University-level Machine Learning education (Georgia Tech CS 4641/7641)

Model Sources

Paper: EduBuddy: A Socratic AI Teaching Assistant combining Fine-Tuning and Retrieval-Augmented Generation for ML Education
Repository: [Not released publicly]
Demo: None (research prototype only)

Uses

Direct Use

To assist ML students by asking guiding questions and offering step-by-step reasoning on conceptual and coding queries.
To emulate TA responses for course-related Q&A forums or tutoring systems.

Downstream Use

Research on Socratic dialogue systems, AI pedagogy, or educational LLM design.
Fine-tuning base models for similar instructional contexts.

Out-of-Scope Use

Providing direct answers or code solutions.
Use outside the intended course-specific research setting (e.g., general-purpose tutoring, grading automation, or production deployment).

Bias, Risks, and Limitations

Course specificity: Designed for Georgia Tech ML course materials; may perform poorly on unrelated topics.
Data artifacts: Training data includes informal language, emojis, and stylistic variation from real forums.
Scale dependency: Smaller models (e.g., 8B) underperform significantly; pedagogical reasoning emerges reliably only at large scale (70B+).
RAG noise sensitivity: Retrieved documents can introduce irrelevant context when they are poorly aligned with the query.
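To make the retrieval-noise limitation concrete: a plain top-k retriever always returns k passages, even when none of them matches the query. The sketch below shows one common guard, a minimum-relevance threshold. This is an assumption for illustration only; the card does not say whether EduBuddy applies any such filter. The Jaccard scorer, the `min_score` value, and the sample passages are hypothetical.

```python
def jaccard(a, b):
    """Word-level Jaccard overlap between two strings (toy relevance score)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def filter_retrieved(query, docs, min_score=0.2):
    """Keep only retrieved passages whose overlap with the query meets the threshold."""
    return [d for d in docs if jaccard(query, d) >= min_score]


query = "how does k-means update cluster centers"
retrieved = [
    "k-means recomputes cluster centers after each assignment",
    "office hours are held on tuesday afternoon",  # off-topic passage
]
kept = filter_retrieved(query, retrieved)
print(kept)
```

With these sample strings, the off-topic passage shares no words with the query and is dropped, while the k-means passage is kept. In practice the threshold would need tuning against the actual retriever's score distribution.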

Recommendations

Users should treat EduBuddy as a pedagogical research prototype, not a general-purpose tutor.
Before adaptation to new domains, retraining on course-specific dialogues and materials is recommended.
