Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring

Key Takeaways

  • This paper introduces a new system designed to make AI-powered tutoring more effective by moving away from "one-size-fits-all" instructions.
  • LLMs can personalize education, although current static-prompt tutoring systems struggle to adapt to diverse academic disciplines.
  • We develop and test a system with subject-aware prompting, based on 14 pedagogical features (e.g., tutor scaffolding, student understanding) extracted from raw transcripts.
  • We first train a prompt routing model in a simulation environment, and then deploy it for online adaptation with actual high-school students.
  • The simulation benchmark shows the router outperforming two static baselines ($0.694$ vs. $0.647$ and $0.64$, $p<0.001$).
Paper Abstract

LLMs can personalize education, although current static-prompt tutoring systems struggle to adapt to diverse academic disciplines. We develop and test a system with subject-aware prompting, based on 14 pedagogical features (e.g., tutor scaffolding, student understanding) extracted from raw transcripts. We first train a prompt routing model in a simulation environment, and then deploy it for online adaptation with actual high-school students. The simulation benchmark shows the router outperforming two static baselines ($0.694$ vs. $0.647$ and $0.64$, $p<0.001$). A/B testing ($N=656$ conversations from 359 students) shows sim-to-real transfer where the model switches from analytical to scaffolding learning strategies. Our adaptive prompt selection mechanism improves instructional efficiency, maintains pedagogical quality and reduces interactions by around 3 turns ($p=0.007$). While a greedy router achieves a comparable exercise conversion rate with the baseline ($19.1\%$ vs. $19.6\%$), a stochastic router that samples strategies leads to a higher conversion rate ($28.1\%$).

Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring
This paper introduces a new system designed to make AI-powered tutoring more effective by moving away from "one-size-fits-all" instructions. While many current AI tutors use a single, static prompt for every subject, this research develops an adaptive "router" that selects a pedagogical strategy, such as Socratic scaffolding or role reversal, based on the academic subject and the student's needs. By training this system in a simulated environment and testing it with real high-school students, the authors show that adapting the teaching strategy to the subject can improve student engagement and instructional efficiency.

How the System Works

The core of the system is a "prompt router" that acts as a decision-maker. When a student interacts with the tutor, the system analyzes the subject matter and the current conversation to choose the most appropriate teaching strategy from a pool of 20 expert-designed prompts. To train this model, the researchers created an AI evaluator that breaks down tutoring quality into 14 specific pedagogical features, such as how well the tutor scaffolds information or checks for student understanding. This evaluator provides a reliable score that helps the system learn which strategies work best, even when traditional test scores are unavailable.
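The routing idea above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the unweighted mean used to combine feature ratings, the feature keys, the strategy labels, and the lookup table of predicted scores are all assumptions made for illustration (the source names only two of the 14 features and does not say how they are aggregated).

```python
from statistics import mean


def pedagogical_score(feature_ratings: dict[str, float]) -> float:
    """Collapse per-feature ratings into one scalar.

    Assumption: an unweighted mean of ratings in [0, 1]. The real evaluator
    uses 14 features; its aggregation rule is not given in the source.
    """
    if not feature_ratings:
        raise ValueError("need at least one feature rating")
    return mean(feature_ratings.values())


def route(subject: str,
          predicted: dict[tuple[str, str], float],
          prompts: list[str]) -> str:
    """Return the prompt with the highest predicted score for this subject.

    `predicted` is a hypothetical table mapping (subject, prompt) to the
    evaluator score the router expects; unseen pairs default to 0.0.
    """
    return max(prompts, key=lambda p: predicted.get((subject, p), 0.0))


# Hypothetical ratings for the two features the source names explicitly.
ratings = {"tutor_scaffolding": 0.8, "student_understanding": 0.6}
score = pedagogical_score(ratings)

# Hypothetical predicted scores for two strategies across two subjects.
prompts = ["socratic_scaffolding", "role_reversal"]
predicted = {
    ("mathematics", "socratic_scaffolding"): 0.70,
    ("mathematics", "role_reversal"): 0.60,
    ("history", "socratic_scaffolding"): 0.50,
    ("history", "role_reversal"): 0.65,
}
math_choice = route("mathematics", predicted, prompts)
history_choice = route("history", predicted, prompts)
```

In this toy table the same router picks different strategies for mathematics and history, which is the subject-aware behavior the section describes. A trained model would replace the lookup table with learned predictions.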

Bridging the Gap Between Simulation and Reality

A major challenge in AI education is the "sim-to-real" gap, where a model performs well in a controlled simulation but struggles with the unpredictable nature of real students. To address this, the researchers used a "score calibration" technique to ensure the AI's feedback remains consistent between the simulation and the real world. They also implemented a dual-path architecture that uses both topic embeddings and subject-specific IDs, preventing the model from losing focus when switching between different disciplines like mathematics, history, or geography.
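The two ideas in this section can be sketched in Python under stated assumptions. The source does not specify how the dual-path input is built or which calibration method is used, so the code below substitutes simple stand-ins: concatenating a topic embedding with a per-subject vector, and an affine rescaling that matches the mean and standard deviation of simulated scores to target values. All function names and numbers are hypothetical.

```python
import random
from statistics import mean, pstdev


def make_subject_table(subjects, dim, seed=0):
    """Stand-in for a learned subject-ID embedding table.

    Assumption: vectors are random here; in a trained router they would be
    learned parameters, one vector per subject ID.
    """
    rng = random.Random(seed)
    return {s: [rng.gauss(0.0, 0.1) for _ in range(dim)] for s in subjects}


def router_input(topic_embedding, subject, subject_table):
    """Dual-path input: topic-embedding path plus subject-ID path.

    The subject vector stays attached even when two conversations have
    similar topic embeddings, so the subject signal is never dropped.
    """
    return list(topic_embedding) + list(subject_table[subject])


def calibrate(scores, target_mean, target_std):
    """Affine rescaling of scores to a target mean and standard deviation.

    Assumption: moment matching is used only as an illustrative calibration;
    it is not claimed to be the paper's technique.
    """
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [target_mean] * len(scores)
    return [target_mean + target_std * (s - mu) / sigma for s in scores]


table = make_subject_table(["mathematics", "history", "geography"], dim=4)
x_math = router_input([0.1, 0.2], "mathematics", table)
x_hist = router_input([0.1, 0.2], "history", table)

# Map hypothetical simulated scores onto a hypothetical real-world scale.
calibrated = calibrate([0.2, 0.4, 0.6], target_mean=0.5, target_std=0.1)
```

With identical topic embeddings, `x_math` and `x_hist` still differ in their last four entries, which is the point of keeping a separate subject-ID path.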

Key Findings and Impact

In an A/B test involving 359 Dutch high-school students ($N=656$ conversations), the adaptive system improved instructional efficiency: students reached their goals in about three fewer turns than with the static baseline ($p=0.007$), while pedagogical quality was maintained. A "greedy" version of the router, which always picks the strategy it rates highest, achieved an exercise conversion rate comparable to the baseline ($19.1\%$ vs. $19.6\%$). A "stochastic" version, which samples among strategies, reached a higher conversion rate ($28.1\%$), meaning more students chose to move on to formal exercises. The study also found that higher pedagogical scores from the AI evaluator were statistically associated with a higher likelihood of students engaging with follow-up practice.

Important Considerations

The researchers noted several limitations that guide future work. Because many students do not immediately complete formal exercises after a tutoring session, it remains difficult to measure direct knowledge gains or long-term learning outcomes. Additionally, the system occasionally struggled with data sparsity in less common subjects, leading the model to default to "safe" but perhaps less optimal teaching strategies. Future research aims to address these domain imbalances and incorporate more detailed conversation history to further refine how the AI adapts to individual student needs.
