The great multimedia content deception
Learning teams spend millions on dressing up content with multimedia.
The premise is always the same: better graphics equal better learning.
The evidence tells a different story.
The focus on the presentation and transmission of content represents a fundamental misunderstanding of how learning actually works in our complex world.
Multimedia content: the stakes have changed
In a world confronting unprecedented challenges—from climate change to global health crises, from artificial intelligence to geopolitical instability—the stakes for learning have never been higher.
We need citizens and professionals capable of critical thinking, navigating uncertainty, grappling with complex systems, and collaborating effectively with artificial intelligence as a co-worker.
Yet much of our educational technology investment continues to chase the glittering promise of multimedia enhancement, as if adding more visual stimulation and interactive elements will somehow transform passive consumers into active knowledge creators.
The traditional transmissive model—knowledge flowing one-way from expert to learner—has become counterproductive.
In a world where information is abundant but wisdom is scarce, the critical question is not how to transmit information efficiently, but how to create environments that cultivate higher-order capabilities.
If not multimedia content, then what?
Bill Cope and Mary Kalantzis identify seven affordances that distinguish effective digital learning from traditional instruction.
None treats multimedia as decoration for content delivery.
Instead, they emphasize ubiquitous learning that transcends boundaries; active knowledge production by learners themselves; multimodal meaning-making, in which learners themselves represent knowledge across modes; recursive feedback that transforms assessment into dialogue; collaborative intelligence that emerges from structured interaction; metacognitive reflection that builds learning capacity; and differentiated pathways that personalize without sacrificing community.
This framework reframes education’s purpose: not delivering content, but designing ecologies for knowledge creation.
Consuming multimedia content is not learning
The critical distinction lost in educational technology discussions is between learning resources and learning processes.
A video or simulation is content—not learning itself.
Learning is what the learner does.
At The Geneva Learning Foundation, we work with over 70,000 health practitioners globally using a structured cycle of action and reflection.
The main medium is text.
But the role of text is far more profound than content delivery.
In our climate and health programme, for example, the primary learning resource is a collection of text-based eyewitness accounts from learners in our Teach to Reach programme.
A practitioner in Nigeria shares a written story of how extreme heat forces people to sleep outdoors, increasing their exposure to malaria-carrying mosquitoes.
Learners read this and many other real-world experiences.
The learning activity is not to memorize this fact.
Instead, a learner in Brazil will analyze in writing a “chain reaction” from climate change to health consequences, grounded in her own experience with flooding and diarrheal disease.
Then, she will receive structured, written feedback from colleagues in Chad, Ghana, and India, guided by a detailed rubric.
The “content” is the collective written experience of the peer group.
Similarly, in our 16-day peer learning exercise on health equity, learners do not study abstract theories of justice from a textbook.
Instead, they write a detailed project analyzing a real-world inequity they face.
A health worker might document how their system’s design consistently fails to reach nomadic pastoralist communities.
The learning happens in the subsequent, text-based peer review, where colleagues use a rubric to help the author deepen their root cause analysis and refine their action plan.
In both cases, the engine of learning is the activity—creating, analyzing, evaluating, collaborating—and text is the medium for that activity.
We do not invest in costly multimedia production because the engagement happens in robust, structured peer interactions that drive authentic learning.
The experiences shared by learners, together with what each of them constructs individually, become the collective corpus through which learning becomes continuous and helps turn knowledge into action.
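As a rough illustration only, the peer review cycle described above could be sketched in Python as follows. This is not the Foundation’s actual platform: the learner names, the rubric criteria, the rule of drawing reviewers from other countries, and the function names are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Learner:
    name: str     # hypothetical identifier, for illustration only
    country: str


def assign_reviewers(learners, reviews_per_submission=3):
    """Pick reviewers from other countries for each author, spreading the workload."""
    workload = {learner.name: 0 for learner in learners}
    assignments = {}
    for author in learners:
        # Eligible peers work in a different country from the author
        # (an assumed rule; it also excludes the author).
        eligible = [p for p in learners if p.country != author.country]
        # Least-loaded peers first; sorted() is stable, so ties keep input order.
        chosen = sorted(eligible, key=lambda p: workload[p.name])[:reviews_per_submission]
        for peer in chosen:
            workload[peer.name] += 1
        assignments[author.name] = [peer.name for peer in chosen]
    return assignments


def summarize_feedback(reviews):
    """Average each rubric criterion and keep every written comment."""
    criteria = reviews[0]["scores"].keys()
    averages = {c: sum(r["scores"][c] for r in reviews) / len(reviews) for c in criteria}
    comments = [r["comment"] for r in reviews]
    return {"averages": averages, "comments": comments}


# Hypothetical cohort mirroring the example in the text.
learners = [
    Learner("Ana", "Brazil"),
    Learner("Moussa", "Chad"),
    Learner("Kofi", "Ghana"),
    Learner("Priya", "India"),
]
assignments = assign_reviewers(learners)
```

The point of the sketch is that everything the process needs is lightweight: a list of learners, a rubric, and written comments. The averaged scores are secondary; the comments are what the author uses to revise.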
The cognitive case for the superiority of text over multimedia content
Cognitive Load Theory explains that working memory—where we process new information—is extremely limited.
The load placed on this limited capacity comes in three types: intrinsic load (the material’s inherent difficulty), extraneous load (effort wasted on poorly designed instruction), and germane load (productive effort leading to deep learning).
Critical thinking, analysis, and metacognition impose a very high intrinsic load.
Learners are already engaged in demanding mental work.
Any instructional element adding unnecessary complexity steals finite cognitive resources from actual learning.
Multimedia “enhancements”—distracting animations, irrelevant images, redundant text—do precisely this.
They may feel engaging, but research shows this perceived engagement does not translate to better outcomes and can be detrimental.
Well-structured text is cognitively “quiet.”
It presents information cleanly, allowing learners to dedicate maximum mental energy to understanding and applying complex ideas.
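The argument above can be written as a simple budget. This is a simplified, additive reading of Cognitive Load Theory, not a formula taken from the sources cited here, and the symbols are introduced only for illustration: $C_{\text{WM}}$ stands for working memory capacity and $L$ for each type of load.

```latex
% Assumed additive model: the three loads share one limited capacity.
L_{\text{intrinsic}} + L_{\text{extraneous}} + L_{\text{germane}} \le C_{\text{WM}}

% Rearranged: the room left for productive effort.
L_{\text{germane}} \le C_{\text{WM}} - L_{\text{intrinsic}} - L_{\text{extraneous}}
```

When the task itself is demanding, $L_{\text{intrinsic}}$ is already large, so every unit of extraneous load added by decoration comes directly out of the room left for germane effort.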
The unique affordances of text
Text possesses structural characteristics exceptionally suited for higher-order thinking.
Its linear nature builds coherent, sequential, evidence-based arguments, modeling logical reasoning processes.
Unlike transient video or audio, text is stable—it can be revisited, scrutinized, annotated, and cross-referenced at the learner’s pace, enabling the deep analysis required by our peer review rubrics.
Written language excels at conveying abstract concepts, nuanced theories, and complex principles—the building blocks of fields requiring sophisticated thinking and “thick knowledge”.
Studies consistently show writing improves critical thinking skills like analysis and inference.
Comparative studies in Problem-Based Learning (PBL) reveal that adding multimedia does not reliably improve outcomes.
Some find no significant difference between text-based and multimedia-enhanced cases.
Others find video actively hinders learning by making it harder to identify and review key information during collaborative analysis.
The virtual reality paradox
Some education innovators continue to be mesmerized by the promise of virtual or augmented reality.
They are often the same individuals who previously touted “gamification” as a panacea for learning.
Virtual reality represents the ultimate multimedia format, promising immersive simulations that proponents claim will revolutionize education.
Yet the biggest investments so far have been spectacular failures.
For example, Meta’s multibillion-dollar bet on the metaverse under Mark Zuckerberg, promoted in part as a platform for immersive learning, has not demonstrated educational superiority over traditional methods.
The pattern repeats across educational technology: the more immersive and visually impressive the technology, the more it distracts from the cognitive work learning requires.
This helps explain why, by contrast, text-based generative AI chatbots so rapidly became part of teaching and learning.
Students may be amazed by virtual experiences, but amazement does not translate to learning outcomes.
The AI factor
As artificial intelligence becomes capable of generating sophisticated multimedia content, human learners need complementary skills: critical analysis of AI-generated materials, collaborative meaning-making across perspectives, and creative synthesis of complex information.
Text-based learning environments naturally develop these capabilities.
When students analyze written arguments, provide peer feedback through structured rubrics, and revise thinking based on diverse perspectives, they practice the analytical and collaborative thinking that will distinguish them in an AI-enhanced world.
The economic dead end of multimedia content
Multimedia content may become obsolete quickly, requiring constant updates.
A typical multimedia learning module is expensive to develop and maintain.
A thoughtfully structured text-based peer review process costs a fraction as much, yet it creates value every time learners engage with it, building individual skills and collective knowledge that compound over time.
In our programmes spanning multiple continents and diverse health contexts—from emergency response training to climate health education—we demonstrate measurably better learning outcomes with text-based approaches.
Our methodology focuses on evidence-based peer learning emphasizing learner autonomy, competence, and community connection—outcomes that text-based environments support more effectively than multimedia alternatives.
Beyond the false choice
This argument does not advocate technological poverty in education.
Digital platforms enable collaboration and knowledge sharing impossible in previous eras.
Innovation and investment are vital.
The key lies in distinguishing between technology that amplifies human interaction and technology that attempts to substitute for it.
Text-based learning environments scale to support thousands while maintaining human connections essential for deep learning.
They accommodate diverse learners and contexts without sacrificing intellectual rigor.
They integrate seamlessly with AI tools that help organize and synthesize ideas without replacing human judgment and creativity.
Most importantly, they focus investment where learning happens: in structured interaction between learners, feedback loops that refine understanding, collaborative processes that create knowledge, and metacognitive reflection that builds learning capacity.
The path forward
The multimedia deception persists because it aligns with intuitive but erroneous beliefs about learning and technology.
More sophisticated presentations seem like obvious improvements.
But learning operates by different rules than information transmission.
Institutions serious about educational effectiveness should reject the multimedia mirage.
This means redirecting technology budgets from content production to learning infrastructure.
It means training experts to facilitate text-based dialogue scaffolded by rubrics and experience, rather than spending time building multimedia presentations.
It means measuring learning outcomes rather than student satisfaction scores.
In a world demanding critical thinking, systems awareness, and collaborative intelligence, we need approaches that develop these capabilities directly.
The multimedia bells and whistles that capture our attention and resources actively impede the kind of learning our complex world requires.
The future of educational technology lies in thoughtful structuring of human interaction and knowledge creation.
Text provides the foundation precisely because it demands the active cognitive engagement that multimedia often circumvents.
References
Berrocal, Y., Regan, J., Fisher, J., Darr, A., Hammersmith, L., & Aiyer, M. (2021). Implementing Rubric-Based Peer Review for Video Microlecture Design in Health Professions Education. Medical Science Educator, 31, 1761–1765. https://doi.org/10.1007/s40670-021-01437-1
Cope, B., & Kalantzis, M. (2013). Towards a New Learning: The Scholar Social Knowledge Workspace, in Theory and Practice. E-Learning and Digital Media, 10, 332–356. https://doi.org/10.2304/elea.2013.10.4.332
Cope, B., & Kalantzis, M. (Eds.). (2016). e-Learning Ecologies: Principles for New Learning and Assessment. Routledge. https://doi.org/10.4324/9781315699935
Feenberg, A. (1989). The written world: On the theory and practice of computer conferencing. In R. Mason & A. Kaye (Eds.), Mindweave: Communication, Computers, and Distance Education (pp. 22–39). Pergamon Press.
Fenesi, B., Sana, F., Kim, J. A., & Shore, D. I. (2014). Learners misperceive the benefits of redundant text in multimedia learning. Frontiers in Psychology, 5, 710. https://doi.org/10.3389/fpsyg.2014.00710
Mayer, R. E. (2008). Applying the science of learning: Evidence-based principles for the design of multimedia instruction. American Psychologist, 63(8), 760-769. https://doi.org/10.1037/0003-066X.63.8.760
Pereles, A., Ortega-Ruipérez, B., & Lázaro, M. (2024). The power of metacognitive strategies to enhance critical thinking in online learning. Journal of Technology and Science Education, 14(3), 831-843. https://doi.org/10.3926/jotse.2721
Rivas, S. F., Saiz, C., & Ossa, C. (2022). Metacognitive strategies and development of critical thinking in higher education. Frontiers in Psychology, 13, 913219. https://doi.org/10.3389/fpsyg.2022.913219
Sweller, J. (2005). Implications of cognitive load theory for multimedia learning. In R. E. Mayer (Ed.), The Cambridge Handbook of Multimedia Learning (pp. 19-30). Cambridge University Press. https://doi.org/10.1017/CBO9780511816819.003
Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive Load Theory. Springer. https://doi.org/10.1007/978-1-4419-8126-4
Tarchi, C. (2021). Learning from text, video, or subtitles: A comparative analysis. Computers & Education, 160, 104034. https://doi.org/10.1016/j.compedu.2020.104034
Image: The Geneva Learning Foundation Collection © 2025