NUMI: An AI-Tutor for Learning How Best to Deliver Transformative Personalized Education at Scale

There is growing interest in whether generative AI can transform education and offer personalized learning for all. Early research, however, points to challenges in implementing these tools and in structuring their use so that students engage with them effectively. To test AI’s potential and develop best practices for engaging students, we propose a student-level, within-class randomized evaluation of NUMI, an online platform designed for research that pairs mastery pacing with a guard-railed AI math tutor. In partnership with the Hamilton County Department of Education (HCDE) and at least 50–70 teachers of Grades 4–9 (≈1,500–2,100 students), we will randomly assign students each trimester to one cell of a 2×2 design: Mastery vs. no Mastery, crossed with AI-Tutor vs. no AI-Tutor. We will measure immediate learning on lagged “improvement checks,” behavioral mechanisms such as persistence after mistakes and time-on-task, and district outcomes including course grades and, where available, standardized assessments. Randomized tests embedded within the AI arms will allow us to assess a limited number of pre-specified design variations, such as prompting style and timing of support, and so identify which forms of AI guidance increase productive engagement and learning. The study builds on previous J-PAL collaborations and pivots to a flexible, customizable platform, allowing us to causally identify the value added by AI tutoring over well-implemented computer-assisted learning (CAL) and to inform policymakers and program designers about which AI and structural features deliver the largest durable, equitable gains at scale.
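The student-level, within-class assignment described above can be illustrated with a minimal sketch. It is not the study's actual randomization procedure. The roster format, the function name `assign_within_class`, the seed scheme, and the choice to balance the four cells within each class are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical arm labels for the 2x2 design: (mastery pacing, AI tutor).
ARMS = [(m, a) for m in (False, True) for a in (False, True)]


def assign_within_class(roster, trimester, seed=0):
    """Return {student_id: (mastery, ai_tutor)}, balanced within each class.

    roster: iterable of (student_id, class_id) string pairs (assumed format).
    """
    by_class = defaultdict(list)
    for student_id, class_id in roster:
        by_class[class_id].append(student_id)

    assignment = {}
    for class_id in sorted(by_class):
        # Seed per class and trimester so each draw is reproducible and
        # students are re-drawn independently every trimester.
        rng = random.Random(f"{seed}:{trimester}:{class_id}")
        students = sorted(by_class[class_id])
        rng.shuffle(students)
        arms = ARMS[:]
        rng.shuffle(arms)  # randomize which cells absorb any remainder
        # Cycle through the four cells so class-level counts differ by at most 1.
        for i, student_id in enumerate(students):
            assignment[student_id] = arms[i % len(arms)]
    return assignment
```

Calling `assign_within_class(roster, trimester=1)` and then `assign_within_class(roster, trimester=2)` yields an independent draw for each trimester. Blocking on class keeps the teacher and classroom constant across cells, which is the point of a within-class comparison.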