AI self-replacement: what happens when we delegate our thoughts to artificial intelligence?

By Reda Sadki

OECD Digital Education Outlook 2026 Day 2

In my Day 1 article, I wrote that the OECD Digital Education Outlook 2026 conference documented performance gains alongside learning losses, efficiency alongside declining human competence, and the emergence of what Dragan Gasevic called “metacognitive laziness.” I described a day that did not offer comfort.

Where the first day established the tension between performance and learning, the second day forced the question of what to do about it. Nine sessions brought practitioners, researchers, young people, AI companies, and policymakers face to face with the growing evidence that generative AI in education is producing a widening gap between what students can do with AI and what they understand without it. The most striking contribution came not from a professor or a minister but from Beatriz Moutinho, a young woman from Cabo Verde, who said: “I am very worried about AI replacing young people in the job market. But I am even more worried about young people preemptively replacing themselves.”

That sentence reframed the entire day: what happens when people become indistinguishable from the AI they use?

Self-replacement risk: Young people see what adults are slow to name

Beatriz Moutinho, moderating and speaking in the youth session, articulated risks that the research sessions had danced around. She described an escalation pattern: students begin by using AI for discrete tasks, progress to using it for structuring their thinking, and eventually use it to form opinions and make personal decisions. “We are giving our first drafts of our first thoughts in our brain directly to AI before even fully structuring them,” she said.

Her concept of “self-replacement” was the most original intellectual contribution of the day. It is not that AI will take young people’s jobs. It is that young people will preemptively delegate the formation of their own professional voice to AI, producing homogenised output that makes them indistinguishable from the machine. “This loss of differentiation might be something to look out for,” Moutinho said, “especially in the job market.”

She also identified what she called a “flipped AI divide”: wealthier students retain access to human support while lower-income students become increasingly reliant on AI alone. This inverts the optimistic narrative of AI as an equaliser.

Elisa Lorenzini, a student from Italy, and Kenji Inoue, a student from Japan, both reported that their schools had provided no formal AI literacy instruction. Lorenzini said her teachers prohibited AI because they did not understand it. “It would be useful if teachers knew how to use it,” she said, “because maybe they can understand why it is a useful tool even for students.”

The performance-learning gap deepens

The central finding of the OECD Digital Education Outlook 2026, presented as a keynote by lead editor Stephan Vincent-Lancrin, is blunt. General-purpose generative AI tools reliably improve short-term task performance but do not reliably produce learning gains. The mechanism is metacognitive laziness: when AI produces fluent, confident output, learners stop monitoring their own thinking.

Vincent-Lancrin reported that high school and vocational students in several countries approach 80 percent usage rates for generative AI. He described a study in which students using ChatGPT for homework scored zero additional points on a subsequent knowledge test. “Our traditional education model assumes that if we perform better, then that means we have the knowledge and skills,” he said. “Which is very problematic.”

Dragan Gasevic, presenting in the assessment session, provided the sharpest experimental evidence. A randomised controlled trial lasting nearly a full semester with medical students showed that those given immediate AI access performed no better than the AI working alone. Only students who developed their clinical reasoning skills before AI was introduced achieved genuine human-AI synergy. “Hybrid intelligence is not that you just automate a task to AI,” Gasevic said. “If your ability is completely automated, that means you are obsolete as well yourself.”

Inge Molenaar of Radboud University explained the mechanism. The fluency of AI output suppresses the metacognitive cues that normally trigger critical evaluation. “The metacognitive cues that generative AI responses give to humans do not allow us to engage or do not trigger us to engage in critical evaluation and in learning activities,” she said. “It increases the chance of accepting it and moving backwards.”

The zone of proximal development collapses: AI output is often beyond what a student can process, and instead of scaffolding learning, it replaces it.

Practitioners redesign everything from scratch

If Day 1 established the theory, Day 2 showed the practice. The opening session brought teachers from Iceland, England, and India who are living with AI in their classrooms every day.

Frida Gylfadottir and Tinna Osp Arnardottir, from a secondary school in Gardabae, Iceland, described a national pilot involving 255 teachers across 31 schools. They have redesigned assessment so that written essays count for only 20 percent of the grade, with oral draft interviews and oral defences making up the rest. “If they have not written the essay, if the text is written by AI, it is really difficult for them to point out where the thesis statement is located or the topic sentences,” Gylfadottir said. “They cannot fake it.”

Christian Turton of the Chiltern Learning Trust in England was equally direct. “Every assignment and every test, every task we used to rely on has to be rethought from scratch,” he said. Turton introduced the concept of “digital metacognition”: thinking about where the thinking happens when using AI. He also reported that his trust trialled AI marking tools and found the error rate unacceptable.

Souptik Pal of the Learning Links Foundation in India described classrooms of 100 students where differentiation without AI is nearly impossible. After two-day teacher training sessions, the majority of trained teachers began using AI for daily lesson planning. But Pal emphasised that the biggest barrier is not technical. It is attitudinal. “The most important challenge is coming with the mindset that AI will replace the teachers,” he said.

Gylfadottir captured a practitioner reality in one sentence: “The truth is right now we are spending more time, not less.”

Bricolage: assessment must change, but the evidence base is dangerously thin

Ryan Baker proposed “invigilation on an audit basis” as one way forward. Let students use AI to produce artefacts, but periodically ask them to explain their work without the technology present. “If they cannot talk about it, then they do not really understand it,” he said. Nikol Rummel described a collaborative approach in which students using different AI prompts must reconcile divergent outputs, creating what she called the “IKEA effect”: ownership through effortful engagement.

Gasevic pushed further, arguing for two parallel assessment streams: one measuring standalone human skills, and another measuring human-AI synergy. He reported that LLM-based analysis of process data, including chat logs and keystroke patterns, already achieves approximately 80 percent of expert-quality results, making scalable process assessment technically feasible.

But behind these proposals sits an uncomfortable truth that Isabelle Hau of the Stanford Accelerator for Learning made explicit in the safety session. Her systematic review found only 22 causal-quality studies on AI and learning. No longitudinal data exist. “We are currently running a massive uncontrolled experiment on our children,” said Stephie Herlin of KORA, “and you cannot improve what you do not measure.” KORA has benchmarked more than 30 AI models. Closed-source models average 49 percent on child safety scores. Open-source models average 25 percent. Seven models score zero.

AI literacy as everyone’s responsibility means it is nobody’s responsibility

The AI literacy session, moderated by Laura Lindberg of European Schoolnet, revealed a paradox that Daniela Hau of Luxembourg’s Ministry of Education stated plainly: “If we say everybody, we risk saying nobody.”

The EC-OECD AI Literacy Framework defines 22 competences across four domains. Mario Piacentini of the OECD described how this framework will be translated into a PISA 2029 assessment. Simona Petkova of the European Commission reported that young people in Europe are twice as likely to use generative AI as the general population, yet three out of four teachers do not feel well prepared to address AI in the classroom. Teachers are estimated to be more exposed to AI than 90 percent of workers across the EU.

The most significant empirical contribution came from Lixiang Yan of Tsinghua University, who presented a national study of nearly 2.4 million Chinese vocational students. Yan found that institutional AI readiness only improves student AI literacy when it runs through teachers who have developed genuine instructional competence with AI. “The teacher is the indispensable engine in this transformation,” Yan said. General attitudinal acceptance is not enough. The system must build collective instructional capability.

AI in research is already everywhere, and the risks mirror education

Dominique Guellec of the University of Strasbourg documented the penetration of AI in scientific research: the share of publications involving AI grew from 2 percent in 2015 to 8 percent in 2022, and by 2025 nearly two-thirds of all researchers were using AI. He described AI as no longer a tool but as part of the infrastructure of doing research. “There is a risk on the human side to over-rely on AI, especially when it does the writing for you,” Guellec said. “Writing is also a part of thinking.”

In a moment that captured the pace of change more vividly than any statistic, Guellec acknowledged on stage that sections of his own OECD Digital Education Outlook 2026 chapter were already outdated. “What I put in the slide, which is that AI does not yet do research-level mathematics, is already outdated,” he said.

Yuko Harayama of the Global Partnership on AI argued that the researcher’s identity needs to shift from generating solutions to evaluating them. “What you have to re-explore and re-empower will be the out-of-the-box thinking,” she said, “not just following and becoming dependent on the output coming from AI.” A study published in Science, cited in the session, found homogenisation of research topics in the fields where AI use is most intensive.

The equity question is structural, not peripheral

The session on educational GenAI in low- and middle-income areas, moderated by Cristobal Cobo of the World Bank, confronted a question that Day 1 raised but did not resolve: will AI close or widen the educational divide?

Paul Atherton laid out the infrastructure gap. Children in low-income countries are up to 14 times less likely to have internet at home. But Atherton argued that the more fundamental barrier is literacy itself. “If you cannot read, you cannot access a language model that is done through reading,” he said. The Matthew effect applies: those with the most capability to use AI gain the most.

Seiji Isotani of the University of Pennsylvania presented the most compelling positive evidence. His AIED Unplugged system reached more than 500,000 students across 20,000 schools in Brazil using only teacher mobile phones and printed feedback sheets. No student devices or internet were required. “Instead of putting the burden on governments, we put the burden on people who develop technologies,” Isotani said.

Maria Florencia Ripani argued that language and culture are not technical parameters. “Language is part of a certain culture,” she said. “It is very important to work with user-centred design and use culturally relevant elements.” She described how models in Luganda already outperform GPT-3.5 from two years ago, despite substantial performance degradation compared to English.

Juan-Pablo Giraldo Ospino of UNICEF delivered the most direct challenge: “Teachers cannot be replaced in the education system and cannot be replaced in the way our brain develops, particularly in the early years.” He warned that framing AI as a solution to teacher shortage risks exacerbating burnout, because “if we increase productivity, actually we are going to make teachers work the same hours or more to be able to teach more kids.”

Learning science points toward slow AI

The final session, on applying learning science with AI, offered the clearest design direction of the day. Ronald Beghetto of Arizona State University introduced the concept of “slow AI,” a deliberate counterpoint to the transactional “fast AI” mode in which users delegate cognitive and creative work entirely. “A lot of people think creativity is just kind of unbridled originality, but really creativity is constrained originality,” he said. His framework asks learners to do the mental work first, then turn to AI as a provocateur or scaffold, then return to human teams.

Dora Demszky of Stanford presented the first large-scale randomised controlled trial of automated feedback in physical classrooms. Teachers using her TeachFX platform received real-time feedback on their use of focusing questions, and their use of these questions increased by 15 to 20 percent. But she also noted a structural problem: “One of the issues with machine learning systems is that they are trained to say what you want to hear rather than adding the productive friction that is necessary for learning.” Sycophancy in large language models is not a bug. It is a design feature that undermines learning.

Nikol Rummel and Sebastian Strauss presented a systematic review of GenAI in collaborative learning that found only two experimental studies measuring domain-specific knowledge outcomes. The evidence base for one of the most-discussed applications of AI in education barely exists.

Beyond K-12: what the OECD Digital Education Outlook dialogue means for humanitarian and health systems

The OECD conference focused on schools. But every finding from Day 2 reaches into the world I work in, where health workers and humanitarian practitioners learn from each other across more than 130 countries in the peer learning networks coordinated by The Geneva Learning Foundation.

The Day 1 article mapped three implications. Day 2 deepened each of them and surfaced new ones.

Self-replacement is already happening in global health

Moutinho’s concept of self-replacement is not speculative in our context. It describes what I have already observed. In our Teach to Reach programmes, highly committed health workers have begun submitting narratives that clearly bear the mark of generative AI. They are not cheating. They are doing what every professional does when a tool appears that can produce faster, more polished output. But the result is a loss of the situated, experiential knowledge that makes their contributions irreplaceable.

I wrote about this as the “transparency paradox” in my work on AI, accountability, and authenticity in global health. If a health worker discloses AI use, their work is devalued as inauthentic. If they conceal it, they carry the ethical tension alone.

Moutinho’s framing adds a dimension I had not fully articulated: the risk is not only institutional but developmental. When practitioners delegate the act of writing about their own experience to AI, they may lose the capacity to recognise what they know that AI does not.

In crisis contexts, this is not an abstraction. A health worker who cannot articulate the reasoning behind a vaccination micro-plan, because the writing was done by a chatbot and the thinking was never fully formed, is a health worker less able to adapt when the plan meets reality on the ground.

The evidence gap is wider in global health than in K-12

Isabelle Hau’s finding that only 22 causal-quality studies on AI and learning exist is alarming for education. In global health and humanitarian response, the number is effectively zero. AI tools are being deployed to support health worker training, translate guidance, and even generate response protocols, but I am not aware of a single randomised controlled trial measuring whether these tools produce genuine learning gains among health professionals in low-resource settings.

Gasevic’s finding that students given immediate AI access performed no better than AI alone has a direct analogue. If a health worker uses a general-purpose chatbot to draft an outbreak response protocol without first developing the clinical reasoning that the protocol requires, the output may be fluent and authoritative while the human understanding behind it is empty. In K-12, this undermines learning. In health systems and in humanitarian response, it can cost lives.

At The Geneva Learning Foundation, we introduced our first AI co-worker, Claude Cardot, in March 2026, deliberately naming and governing the role. We are treating Claude’s onboarding as a structured experiment, asking in public whether an AI co-worker can reduce the cognitive load on a small team without diluting authenticity or erasing local voice. But we are under no illusion that this is anything other than a design question that the evidence base cannot yet answer.

The flipped AI divide is the central equity problem for global health

Moutinho’s “flipped AI divide” is the most precise description I have encountered of the equity challenge in global health AI. In the countries where The Geneva Learning Foundation works, access to advanced models is already limited by geofencing, pricing, and risk aversion by international organisations. When practitioners in these settings do use AI, they use general-purpose chatbots without pedagogical intent, institutional support, or safety standards. This is exactly the configuration that the OECD evidence shows produces performance gains without learning gains.

Meanwhile, organisations in Geneva, New York, and Washington have access to purpose-built AI tools, teams of data scientists, and legal departments that can negotiate safety standards. The result is that the most resource-rich actors get AI that is designed to support human capability, while the practitioners who face the most severe challenges get AI that is designed for consumer engagement. This is the flipped AI divide in global health.

Isotani’s AIED Unplugged model offers a counterpoint that speaks directly to our work. His system proves that it is possible to design AI for resource-constrained environments at national scale, reaching half a million students with no student devices and no classroom internet. If it is possible in Brazilian public schools, it is possible in the health systems where we work. The design principle is the same one we apply at The Geneva Learning Foundation: the burden of adaptation must fall on technology designers, not on the practitioners and communities who are often already stretched to their limits.

Peer learning is the missing architecture

Across two days of the OECD conference, one word barely appeared: peers. The conference discussed teachers, students, researchers, companies, and policymakers. It discussed tutoring, assessment, safety, and governance. What it did not discuss, with rare exceptions, was what happens when learners support each other, becoming both teachers and learners.

This is the gap that our work fills. In the peer learning networks that The Geneva Learning Foundation has built over a decade, health workers develop context-specific projects, review each other’s work using structured rubrics, and engage in facilitated dialogue that surfaces patterns across thousands of contexts. We envision AI not as a tutor or an oracle but as a co-worker that helps with tasks that peers have neither time nor bandwidth to perform at scale.

Gasevic’s experimental finding confirms the design logic we have been following. Students who developed their skills before AI was introduced achieved genuine synergy. In our networks, practitioners build their capacities through structured peer interaction before AI enters the picture. The human architecture comes first. AI amplifies and augments what the network has already built. Its boundaries are defined by the network.

Beghetto’s “slow AI” resonates with this approach. In a peer learning network, the “productive friction” that commercial AI removes is precisely what the network is designed to generate. Peer review, facilitated dialogue, and iterative project development are all forms of friction that produce learning. If we strip these out and replace them with chatbot-generated feedback, we lose what makes the system work.

A leadership agenda for Day 2

Day 1 produced a leadership agenda focused on the performance-learning distinction, the need for pedagogy before technology, and the urgency of equity. Day 2 extends it.

First, leaders must confront the self-replacement problem directly. Moutinho described it in young people. I see it in health and humanitarian professionals. The response is not to ban AI or to ignore it, but to create conditions in which practitioners can use AI openly and with pedagogical intent. This means moving from “shadow AI” to governed AI, as we are doing with Claude Cardot. It also means designing learning experiences that require practitioners to do the cognitive work before AI enters, not after.

Second, leaders must demand evidence. Twenty-two causal studies is not a sufficient foundation for policy. In global health and humanitarian response, where the evidence base is even thinner, leaders should insist that any AI deployment in training or capacity-building includes a credible evaluation design. Efficiency gains are not learning gains. The two must be measured separately.

Third, leaders must resist the flipped AI divide. If the most resource-constrained practitioners end up with unguided access to general-purpose chatbots while the most resource-rich organisations get purpose-built, safety-tested, pedagogy-driven AI tools, the result will be a deepening of the inequity that peer learning networks are designed to overcome. The Isotani model shows that another path is possible. Leaders should demand it.

Fourth, leaders must invest in peer learning infrastructure alongside AI deployment. Every finding from the OECD conference confirms that AI is most powerful when embedded in human systems that provide the friction, the context, and the accountability that AI alone cannot supply. Peer learning networks are not optional. They are the architecture that determines whether AI amplifies human capability or replaces it.

What the second day left unresolved

The second day of the OECD conference did not resolve the question that Moutinho raised. It sharpened it. If young people are preemptively replacing themselves, and if health workers in crisis settings are quietly delegating their situated knowledge to machines, then the question is not whether AI can help human beings learn and grow. It is whether we will design the systems that make that possible before the window closes.

Guellec’s observation that his own OECD chapter was outdated before the conference took place is not only a comment about the pace of change in AI. It is a warning about the pace of change required in every institution that claims to support learning. The evidence is now clear that doing nothing, or doing the wrong thing, is not neutral. It is actively harmful. And the people most at risk are, as always, those with the least institutional support and the most to lose.

References

  1. Isotani S, Bittencourt II, Challco GC, Dermeval D, Mello RF. AIED Unplugged: Leapfrogging the Digital Divide to Reach the Underserved. In: Wang N, Rebolledo-Mendez G, Dimitrova V, Matsuda N, Santos OC, editors. Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. Cham: Springer Nature Switzerland; 2023. p. 772–9. (Communications in Computer and Information Science). https://doi.org/10.1007/978-3-031-36336-8_118
  2. Kusumegi K, Yang X, Ginsparg P, De Vaan M, Stuart T, Yin Y. Scientific production in the era of large language models. Science. 2025 Dec 18;390(6779):1240–3. https://doi.org/10.1126/science.adw3000
  3. OECD. OECD Digital Education Outlook 2026: Exploring Effective Uses of Generative AI in Education. OECD Publishing; 2026. https://doi.org/10.1787/062a7394-en
  4. Sadki R. The great unlearning: notes on the Empower Learners for the Age of AI conference. Reda Sadki: Learning to make a difference; 2025. https://doi.org/10.59350/859ed-e8148
  5. Sadki R. Artificial intelligence, accountability, and authenticity: knowledge production and power in global health crisis. Reda Sadki: Learning to make a difference; 2025. https://doi.org/10.59350/w1ydf-gd85
  6. Sadki R. When funding shrinks, impact must grow: the economic case for peer learning networks. Reda Sadki: Learning to make a difference; 2025. https://doi.org/10.59350/redasadki.20995
  7. Sadki R. Why peer learning is critical to survive the Age of Artificial Intelligence. Reda Sadki: Learning to make a difference; 2025. https://doi.org/10.59350/redasadki.21123
  8. Sadki R. Introducing Claude Cardot, our first AI co-worker to support frontline health and humanitarian leaders. Reda Sadki: Learning to make a difference; 2026. https://doi.org/10.59350/6rjnm-1rd08
  9. Sadki R. OECD Digital Education Outlook 2026: How can AI help human beings learn and grow? Reda Sadki: Learning to make a difference; 2026. https://doi.org/10.59350/1bqm0-1d126
