Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring
This paper introduces a system designed to make AI-powered tutoring more effective by moving away from "one-size-fits-all" instructions. While many current AI tutors use a single, static prompt for every subject, this research develops an adaptive "router" that selects a pedagogical strategy (such as Socratic scaffolding or role reversal) based on the academic subject and the student's needs. By training this system in a simulated environment and testing it with real high-school students, the authors demonstrate how AI can tailor its teaching style to improve student engagement and instructional efficiency.
How the System Works
The core of the system is a "prompt router" that acts as a decision-maker. When a student interacts with the tutor, the system analyzes the subject matter and the current conversation to choose the most appropriate teaching strategy from a pool of 20 expert-designed prompts. To train this model, the researchers created an AI evaluator that breaks down tutoring quality into 14 specific pedagogical features, such as how well the tutor scaffolds information or checks for student understanding. This evaluator provides a reliable score that helps the system learn which strategies work best, even when traditional test scores are unavailable.
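To make the evaluator idea concrete, here is a minimal sketch of how 14 per-feature ratings could be collapsed into one scalar score. It is an illustration only: the summary does not say how the features are combined, so the equal-weight mean, the [0, 1] rating range, and the placeholder feature names (`feature_1` to `feature_14`) are all assumptions.

```python
from statistics import mean

NUM_FEATURES = 14  # the summary states 14 pedagogical features


def pedagogical_score(feature_scores):
    """Collapse per-feature ratings into one scalar score.

    ASSUMPTION: each rating lies in [0, 1] and all features are weighted
    equally. The paper may use a different scale or weighting.
    """
    if len(feature_scores) != NUM_FEATURES:
        raise ValueError(f"expected {NUM_FEATURES} features, got {len(feature_scores)}")
    for name, value in feature_scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"rating for {name!r} is outside [0, 1]: {value}")
    return mean(feature_scores.values())


# Hypothetical ratings: placeholder names, not the paper's actual features.
example_ratings = {f"feature_{i}": 0.5 for i in range(1, NUM_FEATURES + 1)}
example_ratings["feature_1"] = 1.0  # e.g. one feature rated as fully satisfied

example_score = pedagogical_score(example_ratings)
print(f"aggregate pedagogical score: {example_score:.3f}")
```

A scalar like this can serve as a training signal for the router when test scores are unavailable, which is the role the summary describes for the evaluator.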
Bridging the Gap Between Simulation and Reality
A major challenge in AI education is the "sim-to-real" gap, where a model performs well in a controlled simulation but struggles with the unpredictable nature of real students. To address this, the researchers used a "score calibration" technique to ensure the AI's feedback remains consistent between the simulation and the real world. They also implemented a dual-path architecture that uses both topic embeddings and subject-specific IDs, preventing the model from losing focus when switching between different disciplines like mathematics, history, or geography.
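One way to picture the dual-path input is as a topic embedding concatenated with an explicit subject indicator, so the subject signal reaches the router even when topic embeddings from different disciplines look alike. The sketch below is a guess at the general shape, not the authors' architecture: the one-hot subject encoding, the linear scoring layer, the subject list, and every function name are assumptions made for illustration.

```python
# ASSUMPTION: a fixed, small subject vocabulary; the paper's list is not given here.
SUBJECTS = ["mathematics", "history", "geography"]


def one_hot_subject(subject):
    """Path 2: encode the subject ID as a one-hot vector (hypothetical encoding)."""
    if subject not in SUBJECTS:
        raise ValueError(f"unknown subject: {subject!r}")
    return [1.0 if s == subject else 0.0 for s in SUBJECTS]


def dual_path_features(topic_embedding, subject):
    """Concatenate path 1 (topic embedding) with path 2 (subject ID)."""
    return list(topic_embedding) + one_hot_subject(subject)


def strategy_logits(features, weights):
    """Score each strategy with one weight row per strategy (a plain linear layer)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


# Toy example: a 2-dimensional topic embedding and 2 strategies instead of 20.
features = dual_path_features([0.2, -0.1], "history")
toy_weights = [
    [1.0, 0.0, 0.0, 0.5, 0.0],  # strategy 0: rewarded when the subject is history
    [0.0, 1.0, 0.0, 0.0, 0.0],  # strategy 1: ignores the subject path
]
logits = strategy_logits(features, toy_weights)
print(features, logits)
```

In this toy setup the subject path shifts the score of strategy 0 regardless of the topic embedding, which is the kind of discipline-specific signal the dual-path design is described as preserving.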
Key Findings and Impact
In an A/B test involving 359 Dutch high-school students, the router improved instructional efficiency, helping students reach their goals in about three fewer turns than the standard baseline. While a "greedy" version of the router (which always picks the strategy it rates most likely to be best) performed similarly to existing systems, a "stochastic" version (which occasionally samples different strategies) led to a significantly higher rate of students choosing to move on to formal exercises. The study also found that higher pedagogical scores from the AI evaluator were statistically associated with a higher likelihood of students engaging with follow-up practice.
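The difference between the greedy and stochastic variants can be sketched as two selection rules over the same strategy scores. The summary does not specify the sampling scheme, so the softmax with a temperature parameter used here is an assumption, as are the function names and the example scores.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw strategy scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp((x - peak) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(logits):
    """Greedy router: always return the index of the top-scoring strategy."""
    return max(range(len(logits)), key=lambda i: logits[i])


def pick_stochastic(logits, rng, temperature=1.0):
    """Stochastic router: sample a strategy in proportion to its softmax weight.

    ASSUMPTION: softmax sampling; the paper's actual scheme may differ.
    """
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for three strategies (the paper's pool has 20).
scores = [0.1, 2.0, 0.5]
rng = random.Random(0)  # seeded so repeated runs draw the same samples

greedy_choice = pick_greedy(scores)
sampled_choices = [pick_stochastic(scores, rng) for _ in range(1000)]
print("greedy:", greedy_choice)
print("distinct strategies sampled:", sorted(set(sampled_choices)))
```

The greedy rule returns the same strategy every time for a given input, while the sampling rule still favors the top-scoring strategy but sometimes tries the others.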
Important Considerations
The researchers noted several limitations that guide future work. Because many students do not immediately complete formal exercises after a tutoring session, it remains difficult to measure direct knowledge gains or long-term learning outcomes. Additionally, the system occasionally struggled with data sparsity in less common subjects, leading the model to default to "safe" but perhaps less optimal teaching strategies. Future research aims to address these domain imbalances and incorporate more detailed conversation history to further refine how the AI adapts to individual student needs.