What we heard—and learned—at the India AI Impact Summit
Our team had a whirlwind week at the India AI Impact Summit 2026 last month, the largest annual global gathering on AI. Held in the Global South for the first time, the summit welcomed not only country delegations and technology leaders, but also students and members of the public from across India and beyond.
We hosted an official, day-long seminar: AI for Social Good: Impact that Works. We shared how measuring real-world impact is essential to realizing AI’s potential while avoiding wasted resources and preventing harm. Joined by several thousand participants throughout the day, our sessions highlighted the power of randomized evaluations to generate timely, actionable feedback that can improve AI-enabled social programs across sectors from health to agriculture.
At the summit, we saw how our experience evaluating social programs — including many that incorporate cutting-edge technology — can help governments, funders, and implementers distinguish between the hype and ideas that are actually improving lives.
Policymakers are increasingly enthusiastic about adopting AI to expand the scope of what government is able to achieve. But they are also inundated with proposals for AI tools that promise to improve teaching, caregiving, employment, and more, often with limited guidance on which tools add value, who benefits, or the conditions required for success.
While AI developers run their own evaluations, those typically take place in controlled settings with the best digital infrastructure. What we need to measure is the impact of AI-enabled programs across a variety of contexts, with different levels of digital access.
Launching the AI Evidence Playbook
The seminar marked the launch of our AI Evidence Playbook, a practical resource for policymakers, practitioners, and funders investing in or building AI-enabled programs. The playbook synthesizes lessons from randomized evaluations of programs that involve AI, alongside broader evidence on human behavior and technology adoption, to support three decisions:
- Where can AI best help improve lives?
- How can we design AI-enabled programs for real-world settings?
- How can we evaluate AI tools to avoid wasted resources and share benefits more broadly?
The playbook is a living resource that we will update regularly as new evidence emerges from the field.
Key takeaways from the Summit
We also contributed to broader conversations outside of our seminar, building relationships and asking pressing questions. We shared ideas on public-sector implementation and on designing for scale from the start. This included the plenary panel at the Google.org Impact Summit: APAC, a World Bank panel on “How AI Drives Innovation and Economic Growth,” and OpenAI’s panel on “AI and Data for Development.”
Our leadership, affiliates, and staff weren’t just there to share our thoughts; the summit was also an invaluable learning experience. Across conversations with academics, experts, and implementers, several themes emerged.
First, the biggest opportunity for AI in development may not be replacing workers, but instead helping overburdened teachers, nurses, caregivers, judges, and other frontline providers do their jobs more efficiently. Across sectors, the most compelling examples were not fully autonomous systems operating on their own, but tools that could expand the reach and effectiveness of human capacity. Examples ranged from AI-supported diagnostic tools and personalized follow-up for patients to workflow streamlining and court procedure transcription that frees up judges’ time so they can focus on the case at hand.
A second takeaway was that achieving strong technical performance is only the beginning. Whether an AI-enabled program improves lives depends on whether it works in real settings, with real constraints: whether people trust it, whether it fits into existing workflows, whether public systems can procure and use it well, and whether it actually reaches the people it is intended to serve.
For example, a machine learning algorithm predicting flood risks can be highly accurate, but only has limited value if the flood forecasts do not reach households in time and in a way people can act on. The hardest part of making AI work may not be the model itself, but the institutional adaptation, implementation, and last-mile delivery required to make it useful in practice.
Finally, context matters enormously. Local languages, data, institutions, and constraints shape whether an AI system is useful, fair, and scalable. Rather than assuming that tools developed in one setting will automatically translate to another, many of the most promising examples were rooted in efforts to build AI for the realities of low- and middle-income countries.
That includes designing around limited connectivity, adapting to local workflows, and building with users rather than for them. It also means paying attention to inclusion from the start: understanding who is able to access these tools, whose needs they are designed to address, and whether they reduce or deepen existing inequalities.
The role of randomized evaluations
These trends point to where randomized evaluations can be most useful: not just testing whether a model is technically impressive, but whether an AI-enabled program changes real outcomes for people in the settings where they live and work.
The most important questions now are not just what AI can do in principle. We need to know: Which applications actually improve lives? And how can governments, funders, and implementers tell the difference early enough to scale what works and avoid wasting scarce funds on what does not?
J-PAL is already producing this key evidence on how AI-enabled programs work in real-world settings. We have just completed our second request for proposals under Project AI Evidence for randomized evaluations of AI-enabled programs, with a dozen projects already underway. We are sharing lessons for investing in and developing programs that involve AI in our AI Evidence Playbook.
Looking ahead, we want to build on the momentum from the summit by serving as the evaluation partner for organizations (governments, nonprofits, and AI developers themselves) deploying AI applications in social programs across sectors. This involves collaborating closely with J-PAL’s many offices and embedded labs around the world to ensure that AI solutions are grounded in contextual realities.
It also means looking ahead to the broader questions AI raises for labor markets, education systems, and social protection. As AI reshapes economies and public services, J-PAL will help identify not only the conditions that ensure AI-enabled programs work in real settings, but also the policies that will help people adapt and benefit more broadly. We are committed to working with policymakers, practitioners, and funders navigating rapid technological change, and to ensuring we can all take advantage of the wealth of evidence from randomized evaluations of AI applications.