Model Card for ML-GT/Llama-3.3-70B-Instruct-Edubuddy

EduBuddy is a Socratic AI Teaching Assistant designed for Georgia Tech’s CS 4641/7641 Machine Learning course.
It combines fine-tuning with retrieval-augmented generation (RAG) to replicate TA-style pedagogical communication, guiding students through reasoning rather than providing direct answers.


Model Details

Model Description

EduBuddy fine-tunes Llama-3.3-70B-Instruct to emulate the Socratic teaching style observed in 8,197 authentic student–TA conversations from five years of Georgia Tech ML coursework.
It uses a dual-RAG architecture to retrieve:

  1. Course-specific lecture and homework materials.
  2. Structured learning subgoals for scaffolding conceptual understanding.

EduBuddy’s responses are contextually grounded, Socratic in nature, and aligned with the structure of the ML course’s assignments and lectures.
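
The dual-RAG idea described above can be illustrated with a small sketch. This is not EduBuddy's implementation, which is not published here: the word-overlap scorer, the `Passage` type, the prompt layout, and the sample corpora are all hypothetical stand-ins chosen so the example runs without external dependencies. It only shows the general shape of retrieving from two separate stores (course materials and learning subgoals) and merging both into one prompt.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # "material" or "subgoal" (hypothetical labels)
    text: str


def overlap_score(query: str, text: str) -> int:
    """Toy relevance score: number of distinct query words found in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str, corpus: list[str], source: str, k: int = 1) -> list[Passage]:
    """Return the k highest-scoring texts from one corpus."""
    ranked = sorted(corpus, key=lambda t: overlap_score(query, t), reverse=True)
    return [Passage(source, t) for t in ranked[:k]]


def build_prompt(query: str, materials: list[str], subgoals: list[str], k: int = 1) -> str:
    """Query both stores independently, then merge the results into one prompt."""
    mats = retrieve(query, materials, "material", k)
    subs = retrieve(query, subgoals, "subgoal", k)
    lines = ["Course materials:"]
    lines += [f"- {p.text}" for p in mats]
    lines.append("Learning subgoals:")
    lines += [f"- {p.text}" for p in subs]
    lines.append(f"Student question: {query}")
    lines.append("Respond with a guiding question rather than a direct answer.")
    return "\n".join(lines)


# Made-up sample data, for illustration only.
MATERIALS = [
    "Lecture 5: gradient descent updates weights using the learning rate",
    "Homework 2: implement k-means clustering with numpy",
]
SUBGOALS = [
    "Identify how the learning rate affects gradient descent convergence",
    "Explain how k-means assigns points to centroids",
]

prompt = build_prompt(
    "why does gradient descent diverge with a large learning rate",
    MATERIALS,
    SUBGOALS,
)
print(prompt)
```

In a real system the overlap scorer would likely be replaced by an embedding-based retriever; keeping the two stores separate lets the number of passages drawn from each be tuned independently.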

  • Developed by: Georgia Tech ML TA Team
  • Model type: Instruction-tuned large language model (LLM) fine-tuned for pedagogical dialogue
  • Language(s): English
  • License: No license (research-only use)
  • Finetuned from: meta-llama/Llama-3.3-70B-Instruct
  • Intended users: Instructors, researchers, and educational technologists exploring Socratic AI tutoring
  • Primary domain: University-level Machine Learning education (Georgia Tech CS 4641/7641)

Model Sources

  • Paper: EduBuddy: A Socratic AI Teaching Assistant combining Fine-Tuning and Retrieval-Augmented Generation for ML Education
  • Repository: [Not released publicly]
  • Demo: None (research prototype only)

Uses

Direct Use

  • To assist ML students by asking guiding questions and offering step-by-step reasoning on conceptual and coding questions.
  • To emulate TA responses for course-related Q&A forums or tutoring systems.

Downstream Use

  • Research on Socratic dialogue systems, AI pedagogy, or educational LLM design.
  • Fine-tuning base models for similar instructional contexts.

Out-of-Scope Use

  • Providing direct answers or code solutions.
  • Use outside the intended course-support context (e.g., general-purpose tutoring, grading automation, or production deployment).

Bias, Risks, and Limitations

  • Course specificity: Designed for Georgia Tech ML course materials; may perform poorly on unrelated topics.
  • Data artifacts: Training data includes informal language, emojis, and stylistic variation from real forums.
  • Scale dependency: Smaller models (e.g., 8B) underperform significantly; pedagogical reasoning emerges reliably only at large scale (70B+).
  • RAG noise sensitivity: Retrieved documents can introduce irrelevant context when they are poorly aligned with the query.
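
One common way to limit retrieval noise of this kind is to drop passages whose relevance score falls below a cutoff before they reach the prompt. The sketch below is a generic illustration, not a description of how EduBuddy handles it: the function name, the `min_score` and `max_passages` defaults, and the sample scores are all assumptions.

```python
def filter_by_score(scored_passages, min_score=0.5, max_passages=3):
    """Keep the highest-scoring passages whose score is at least min_score.

    scored_passages: iterable of (score, text) pairs, higher score = more relevant.
    """
    kept = [(score, text) for score, text in scored_passages if score >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in kept[:max_passages]]


# Made-up retrieval results, for illustration only.
retrieved = [
    (0.82, "Lecture notes on regularization"),
    (0.31, "Syllabus: office hours schedule"),
    (0.64, "Homework 3 hints on ridge regression"),
]

context = filter_by_score(retrieved)
print(context)
```

If no passage clears the cutoff, the function returns an empty list, so the caller can fall back to answering without retrieved context instead of padding the prompt with weak matches.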

Recommendations

Users should treat EduBuddy as a pedagogical research prototype, not a general-purpose tutor.
Before adaptation to new domains, retraining on course-specific dialogues and materials is recommended.

