This is the second of two articles about assessment, exploring how The Geneva Learning Foundation (TGLF) measures real-world outcomes in learning initiatives. The first article examines the structural limitations of pre- and post-test designs, commonly used in global health and humanitarian response training, which cannot provide evidence of impact.
The question behind the question
When professionals working in global health or humanitarian response ask whether a learning program can “track what participants have learned,” they are usually asking something more specific.
They want to know whether their investment in training and professional development is actually making a difference.
They want a credible, quantifiable answer that they can share with funders, partners, and decision-makers.
And because pre- and post-tests are the format they have seen most often, they have come to associate that format with rigor, evidence, and accountability.
This rests on a misunderstanding of what knowledge tests can measure.
As the first article in this series documented, pre- and post-tests focused on knowledge recall cannot establish that a training program caused a change, are vulnerable to systematic biases that distort results, measure only the lowest levels of cognitive activity, and tell us nothing about whether learning will translate into changed practice.
For programs that exist to improve how professionals do their work, and through their work, improve health outcomes for the people they serve, this is a significant limitation.
The question is not whether to measure.
It is what to measure, when to measure it, how to interpret what is found, and how to construct evidence that genuinely links learning to action and action to outcomes.
The Geneva Learning Foundation (TGLF) has spent a decade developing and validating answers to each of these questions, grounded in the learning sciences, implemented at scale, and tested across some of the most challenging professional and social environments in the world.
Why knowledge scores are not a proxy for capability
The gap between knowing and doing is not a logistical problem that better training design can close.
It is a structural feature of how professional capability develops.
Consider how a healthcare provider learns to counsel patients about menopause.
She might attend a seminar and score highly on a subsequent knowledge test about hormonal changes, symptom profiles, and treatment options.
That score tells us that she can recall the relevant facts, in the short term.
It tells us nothing about whether she will raise the topic with the middle-aged women in her practice, how she will navigate a patient’s skepticism about hormone therapy, whether she will adapt her counseling to a patient who has limited literacy, or whether she will persist through the discomfort of discussing a topic that is still stigmatized in many contexts.
These capabilities do not flow from knowledge.
They are built through practice, reflection, peer exchange, and the iterative experience of trying something, observing what happens, adjusting, and trying again.
This is not a philosophical claim.
It is the consensus finding of fifty years of research on professional learning, from the work of Donald Schön on reflective practice, to the communities of practice framework of Etienne Wenger, to the contemporary evidence base on deliberate practice, transfer of learning, and workplace-based assessment.
The implication is direct: if you want to know whether a learning program is producing impact, you have to look at what people are doing, not what they know.
And if you want to attribute improved outcomes to a learning intervention, you have to build a chain of evidence that runs from participation, through documented behavioral change, to measured outcomes in the real world.
This is what TGLF’s measurement system is designed to do.
A different understanding of what learning is
TGLF’s measurement system cannot be separated from our understanding of what learning is and how it works.
The foundation’s programs are built on a learning theory developed by Reda Sadki: a framework that defines meaningful learning for complex work not as the successful retention of information transmitted by an expert, but as the active creation of knowledge and capability through structured peer interaction.
The practical implication of this position is captured in a phrase from TGLF’s program documentation: “The work is the learning, and the learning immediately informs the work.” Learning does not happen before work and transfer into it.
Learning and working are the same activity, understood as a continuous, reflective cycle in which action generates new experience, experience generates new understanding, and new understanding informs the next action.
Some people call this “learning-based work” or “work-based learning”.
TGLF’s framework draws on well-established theoretical foundations.
- Social constructivism, represented in the work of Vygotsky, Dewey, and their successors, holds that knowledge is not transmitted but constructed through interaction and experience.
- Connectivism, the learning theory formulated by George Siemens for networked environments, argues that knowledge is distributed across networks and that learning consists of the ability to construct and navigate those networks, not to store information in individual memory.
- The work of Bill Cope and Mary Kalantzis on new learning and e-learning ecologies provides a concrete framework for redesigning assessment around what they call “recursive feedback,” continuous embedded evaluation from multiple perspectives that serves learning rather than merely judging it.
This theoretical grounding has a direct practical consequence: if learning occurs through structured action and peer interaction, then evidence of learning must be found in action and its consequences.
A multiple-choice test administered at the end of a training event cannot be evidence of this kind of learning because the learning has not yet occurred.
The learning happens in the weeks and months that follow, as practitioners apply new perspectives to their work, receive feedback from peers, and document what changes.
What the value creation framework measures
The measurement system TGLF uses to capture this evidence is based on the value creation framework, originally developed by Etienne Wenger, Beverly Trayner, and Maarten de Laat (2011) for assessing learning in communities of practice.
TGLF adapted this framework in 2020 for global health and humanitarian settings, in collaboration with the Centre for Change and Complexity in Learning at the University of South Australia, and has implemented it systematically across its programs since 2021.
The framework identifies five progressive dimensions through which peer learning creates value, each measured through a combination of standardized quantitative ratings on a Likert scale and open qualitative narratives.
| Dimension | Survey question | What it measures |
|---|---|---|
| Immediate value | Participation changed me as a professional | Knowledge gain, skills development, immediate shifts in understanding |
| Potential value | Participation affected my social connections | Network strengthening, new professional relationships, peer support structures |
| Applied value | Participation helped my professional practice | Direct application of learning to work, behavioral change in practice settings |
| Realized value | Participation changed my ability to influence my world as a professional | Leadership capacity, ability to effect change in systems and organizations |
| Reframing value | Participation made me see my world differently | Transformed professional perspective, epistemic shift in how problems are understood and approached |
The first dimension, immediate value, corresponds to what a pre- and post-test attempts to measure.
But it represents one fifth of the framework and captures only the starting point of the evidence chain.
The four subsequent dimensions document progressively deeper and more durable forms of impact: changed professional networks, changed practice, changed organizational influence, and changed worldview.
A program that moves only the first dimension, producing knowledge gains without affecting how participants work, connect with colleagues, or understand their professional role, is a program that has not yet achieved its purpose.
In 2022, TGLF conducted a global baseline measurement using this framework across 10,095 participants from 99 countries in peer learning activities over 16 weeks, establishing benchmark scores against which subsequent cohorts can be compared.
This baseline represents a significant evidence infrastructure that no collection of pre- and post-test scores can replicate.
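To make the shape of this evidence concrete, here is a minimal sketch, in Python, of how the five-dimension responses could be represented and aggregated per cohort. The schema and identifiers are illustrative assumptions for this article, not TGLF’s actual instrument or codebase.

```python
from dataclasses import dataclass, field
from statistics import mean

# The five value creation dimensions (Wenger, Trayner and de Laat, 2011),
# as adapted by TGLF. The identifiers below are illustrative, not official.
DIMENSIONS = [
    "immediate",  # "Participation changed me as a professional"
    "potential",  # "Participation affected my social connections"
    "applied",    # "Participation helped my professional practice"
    "realized",   # "Participation changed my ability to influence my world"
    "reframing",  # "Participation made me see my world differently"
]

@dataclass
class ValueCreationResponse:
    """One participant's survey response: Likert ratings plus open narratives."""
    participant_id: str
    ratings: dict[str, int]  # dimension -> score on a 1-6 scale
    narratives: dict[str, str] = field(default_factory=dict)

def cohort_means(responses: list[ValueCreationResponse]) -> dict[str, float]:
    """Mean score per dimension for a cohort, ready to compare to a benchmark."""
    return {dim: round(mean(r.ratings[dim] for r in responses), 2)
            for dim in DIMENSIONS}
```

The point of pairing a numeric rating with a narrative for each dimension is that the number makes cohorts comparable while the narrative documents what the number represents in practice.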
The learning model that makes measurement meaningful
The value creation framework does not stand alone.
It is the measurement system for a learning model that is specifically designed to produce the outcomes it measures.
Understanding why the framework works requires understanding how the programs it evaluates are structured.
TGLF’s Full Learning Cycle integrates four complementary pedagogical approaches, each serving a distinct function in moving a practitioner from initial awareness to sustained, documented impact.
The Primer: active knowledge production, not passive consumption
The Primer is the entry point to TGLF’s learning ecosystem.
Its design rejects the transmission model of education, in which an expert delivers information to learners who receive and retain it, in favor of what Cope and Kalantzis call “active knowledge production”: learners as creators of knowledge, not consumers of it.
The Primer presents content in a deliberately stripped-down text format, applying Reda Sadki’s principle of “less is more” to ensure that participants’ working memory is preserved for the demanding work of analysis and reflection rather than wasted on processing multimedia presentation.
But reading the text is explicitly framed as “priming, not learning.” The learning begins when participants are asked to write: to connect the concepts they have encountered to specific situations from their own professional practice, to bear witness to their own context, and to produce a short reflective piece that represents a genuine artifact of that intensive thinking.
This written reflection then enters an anonymous structured peer review process, in which each participant evaluates the submissions of several colleagues using a shared rubric, and receives structured feedback on their own work from several peers.
The learning that occurs in this phase is qualitatively different from what a lecture or a knowledge quiz can produce: reviewing others’ work requires applying concepts repeatedly to novel practical situations, building flexible understanding rather than rote recall.
Receiving feedback from peers across multiple contexts exposes the participant to perspectives and solutions they would never have encountered in a conventional training.
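As a concrete illustration of the mechanics, the following sketch shows one simple way an anonymous, rubric-based review rotation could be assigned so that every submission receives the same number of reviews and nobody reviews their own work. It is a plausible assignment scheme, not a description of TGLF’s actual platform logic.

```python
import random

def assign_peer_reviews(submission_ids: list[str],
                        reviews_each: int = 3) -> dict[str, list[str]]:
    """Rotate anonymous review assignments among participants.

    After one random shuffle, each reviewer takes the next `reviews_each`
    submissions in circular order. Every submission then receives exactly
    `reviews_each` reviews, and nobody is assigned their own work.
    """
    ids = list(submission_ids)
    random.shuffle(ids)
    n = len(ids)
    assert n > reviews_each, "need more participants than reviews per person"
    return {
        reviewer: [ids[(i + k) % n] for k in range(1, reviews_each + 1)]
        for i, reviewer in enumerate(ids)
    }
```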
Teach to Reach: collective intelligence at scale
Teach to Reach brings thousands of professionals together in large-scale facilitated online events, structured around specific practical questions that participants answer from their own field experience.
Since January 2021, more than 80,000 health workers from over 70 countries have participated, sharing experiences on challenges ranging from immunization coverage to climate change adaptation to neglected tropical diseases.
The knowledge produced in these events is not delivered by experts to participants.
It is generated by participants themselves, through the sharing and facilitated synthesis of their own practice.
Facilitators surface patterns across hundreds of contributions, sharing collective insights back to the group and enabling cross-contextual learning at a speed and scale that no conventional training can approach.
This design directly operationalizes the connectivist principle that knowledge resides in and flows through networks, not only in individual minds.
The output is not a test score.
It is a growing evidence base of practical, field-tested knowledge produced by the community for the community, and made available to policymakers, program designers, and researchers.
The peer learning exercise: from experience to action plan
The peer learning exercise deepens the analytical work begun in the Primer and moves participants from reflection to structured planning.
Over sixteen days, participants move through four sequential phases:
- developing a written analysis of a specific, real-world professional challenge;
- providing structured peer review on three colleagues’ submissions using a shared rubric;
- revising their own analysis based on feedback received; and
- submitting a final, refined action plan.
Again, the output is not a score.
It is a documented knowledge artifact, a professionally credible action plan, that has been strengthened through structured peer feedback.
This plan becomes the foundation for the next phase of the learning cycle.
The peer review process in this exercise is also a profound learning experience in itself.
The rubric asks reviewers to engage analytically with colleagues’ challenges, moving them from intuitive to systematic assessment.
Participants consistently report that reviewing others’ work helped them see their own situation with new clarity.
The Impact Accelerator: learning that becomes action, documented
The Impact Accelerator is where the learning cycle produces its most rigorous and attributable evidence of changes in outcomes.
It takes the action plan developed in the peer learning exercise and places it in a structured implementation support cycle.
The Accelerator’s “launch pad” is structured around a recurring weekly rhythm.
On Monday, each participant sets one specific, concrete, and achievable goal for the week: not a general intention but a specific action they will complete by Friday.
On Wednesday, they check in with peers working on similar challenges, sharing what is working, what is not, and what they have learned.
On Friday, they submit a brief acceleration report documenting what they did, what happened, and what they learned from the result.
The following Monday, armed with that experience and peer feedback, they set a new goal.
This rhythm is not incidental to the measurement system.
It is the measurement system.
Each acceleration report is a piece of evidence.
Each week’s documented action, outcome, and reflection is a data point in the chain of evidence running from participation to behavioral change to improved outcomes.
The portfolio of weekly reports accumulated over months provides exactly the kind of contextually specific, temporally granular evidence of professional development that pre- and post-tests cannot produce.
After two to four weeks of the launch pad, most participants have established a rhythm that propels their implementation.
And they continue to share back to document this progress, together with evidence of changes in outcomes.
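The weekly cycle also implies a simple longitudinal data model. Here is a minimal sketch, with field names assumed for illustration rather than drawn from TGLF’s actual schema, of how each acceleration report can become one record in a participant’s evidence trail.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class AccelerationReport:
    """One week of the launch-pad rhythm: goal set Monday, reported Friday."""
    participant_id: str
    week_of: date            # the Monday on which the goal was set
    goal: str                # the specific, achievable action committed to
    actions_taken: str       # what was actually done by Friday
    observed_outcome: str    # what happened as a result
    reflection: str          # what was learned, feeding next Monday's goal

def implementation_trail(reports: list[AccelerationReport]) -> list[tuple[date, str, str]]:
    """One participant's reports, ordered into a week-by-week action-outcome record."""
    return [(r.week_of, r.actions_taken, r.observed_outcome)
            for r in sorted(reports, key=lambda r: r.week_of)]
```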
How TGLF solves the attribution problem
Demonstrating that a learning intervention caused an improvement in health or professional outcomes is the hardest problem in program evaluation.
Randomized controlled trials are rarely feasible in field settings.
Pre- and post-tests, as established in the companion article, cannot establish causation at all.
The Impact Accelerator addresses this challenge through a three-step process integrated into the program structure itself.
Before participants begin their weekly action cycles, they document a specific, measurable baseline: the current state of the situation they intend to improve.
A social worker might record how many children in her caseload are showing signs of acute trauma.
A health worker might document vaccination coverage rates in her area.
A radiation safety specialist might note the current frequency of safety incidents at her facility.
This baseline is explicit, documented, and agreed upon with peers at the outset.
Each week, acceleration reports capture both the specific actions participants took and any observable changes in their documented baseline indicators.
This creates a detailed, longitudinal record linking what participants do to what happens as a result, week by week over months.
The accumulation of this evidence produces something qualitatively different from a pre- and post-test score: a documented narrative of professional practice, with traceable connections between specific actions and specific outcomes.
The third step is the most methodologically distinctive.
When participants report improvements, they must make an explicit attribution argument to their peer group: exactly which actions led to which results, why those changes would not have happened without their specific intervention, and what evidence supports the claim.
This argument is then scrutinized by peers who understand the participant’s working context in detail.
Colleagues who have been following the same participant’s weekly reports can evaluate whether the claimed attribution is plausible, whether alternative explanations have been considered, and whether the evidence is credible.
This peer verification process is a form of evidence validation that is more contextually sensitive and more demanding than the comparison of two knowledge test scores.
Peers who know your setting, your patients or community members, your institutional constraints, and your previous actions are in a uniquely qualified position to evaluate whether your claimed outcomes are real and whether your claimed causal story holds.
They ask hard questions.
They suggest alternative interpretations.
And when the attribution claim survives this scrutiny, it is stronger for it.
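The logic of this three-step process can be summarized in a short sketch: a documented baseline, a measured change, and an attribution argument that stands only if peers judge it plausible. The structure below is an illustrative reconstruction, not TGLF’s implementation.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Baseline:
    indicator: str      # e.g. "children in caseload showing signs of acute trauma"
    value: float
    recorded_on: date   # documented and agreed with peers at the outset

@dataclass
class AttributionClaim:
    """An explicit attribution argument submitted for peer scrutiny."""
    participant_id: str
    baseline: Baseline
    current_value: float
    actions_cited: list[str]   # which documented weekly actions drove the change
    counterfactual: str        # why the change would not have happened anyway
    peer_verdicts: list[bool] = field(default_factory=list)  # True = plausible

    def change(self) -> float:
        return self.current_value - self.baseline.value

    def verified(self, quorum: int = 3) -> bool:
        """The claim stands only if enough context-aware peers find it credible."""
        return sum(self.peer_verdicts) >= quorum
```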
What the evidence actually looks like
To illustrate what this measurement system produces, consider the evidence generated by the Certificate peer learning programme on Psychological First Aid (PFA) in support of children affected by the humanitarian crisis in Ukraine, developed in partnership with the International Federation of Red Cross and Red Crescent Societies (IFRC) with the support of the European Union.
Quantitative evidence from the third Ukrainian-language cohort showed consistently high scores across all five value creation dimensions.
| Dimension | Mean score (1-6 scale) |
|---|---|
| Changed me as a professional | 4.83 |
| Affected my social connections | 4.68 |
| Helped my professional practice | 4.60 |
| Changed my ability to influence my world | 4.45 |
| Made me see my world differently | 4.40 |
These scores are notable in two ways.
First, all five dimensions are above 4.4, suggesting that the program is producing not just knowledge gain but genuine applied and realized value.
Second, the progression across dimensions reflects exactly what the learning model predicts: the most immediate dimension scores highest, and the deeper, more transformative dimensions show meaningful but slightly lower scores, consistent with the progressive nature of the framework.
The qualitative narratives from the same cohort document what these numbers represent in practice.
One participant wrote: “While working in breakout rooms, we exchanged experiences and techniques we use. So, thanks to this program, I expanded my knowledge. I have already used some techniques. It works.” Another reported: “Before the training, I thought that psychological help could only be provided by someone with relevant education. After completing the training, I understand that anyone can provide PFA.” A third noted: “I now consider myself a more qualified teacher because I have the necessary knowledge about PFA and PFA provision skills, and this is an important element in a teacher’s arsenal.”
These narratives are not anecdotes.
They are systematically collected evidence of applied and realized value.
They document specific changes in professional behavior, specific expansions of professional identity, and specific new capabilities being applied in specific work contexts.
No pre- and post-test score could generate this quality of evidence.
At the highest level of the evidence chain, the Impact Accelerator has produced documented evidence of attribution.
In Ukraine, TGLF demonstrated improved mental health and psychosocial support outcomes among children, and was able to link those improvements causally to the specific actions of practitioners who had gone through the Accelerator, distinguishing the program’s contribution from improvements that might have occurred for other reasons.
This level of evidence, linking a learning intervention to measurable improvement in the lives of the people served by program participants, is the standard to which program evaluation should aspire.
It is a standard that pre- and post-tests cannot even approach.
After the first Impact Accelerator in 2019, TGLF compared implementation progress at six months between participants who completed the full learning cycle and a control group who had developed action plans but did not participate in the supported implementation phase.
Participants in the full cycle showed substantially greater implementation progress across all tracked indicators.
This comparison provides a credible basis for attributing the implementation gains to the program.
The deficit model and its consequences
It is worth addressing directly the intuition that motivates most requests for pre- and post-tests: the sense that participants are deficient in knowledge, that the training exists to fill that deficit, and that the most logical measure of success is whether the deficit has been reduced.
This deficit model of professional learning is deeply embedded in conventional training culture, particularly in health and humanitarian contexts where technical expertise is highly valued and professional credentialing often depends on demonstrated knowledge.
The model is not entirely wrong.
Practitioners do need to know things.
But it is a radically incomplete account of how professionals develop.
It leads to training designs and evaluation approaches that systematically underestimate both the contextual knowledge that practitioners already possess and the complexity of what is required to change how they work.
Consider the case of a nursing school in Ghana whose curriculum contained no content on female genital schistosomiasis (FGS), a serious but neglected condition affecting millions of women and girls.
The head of the nursing school participated in a programme led by Bridges to Development and Bruyère, which included a peer learning exercise using TGLF’s methodology.
She subsequently revised the entire nursing curriculum to include FGS content.
From the perspective of the deficit model, the measurable outcome is that the head of school knows more about FGS.
A pre- and post-test could in principle document this knowledge gain.
But this account misses what actually happened.
The head of school did not revise the curriculum because she gained information about FGS.
She revised it because she participated in a dynamic process that engaged her as a leader, activated her sense of professional responsibility, connected her to a network of peers working on related challenges, and supported her through the practical steps of curriculum change.
The knowledge she gained was part of this process, but it was not its cause.
This is the insight that TGLF’s measurement framework is designed to capture and that pre- and post-tests are structurally unable to reach: the development of professional agency, the expansion of professional identity, the growth of networks of practice, and the ability to lead change in complex institutional environments.
These are the outcomes that move health systems and humanitarian response from good intentions to measurable improvement.
They are the outcomes that the value creation framework measures.
The knowledge question reframed
One of the most common concerns raised by potential partners and funders is the desire to measure how much participants know about a topic before and after a program: to “get a finger on the pulse of what people know as a starting point,” as one interlocutor put it.
This concern is legitimate.
Understanding where practitioners are starting from is useful for program design.
The question is whether a knowledge quiz is the best way to gather this information, and whether the starting level of factual knowledge is actually the most important thing to know.
TGLF’s experience across programs on immunization, malaria, neglected tropical diseases, mental health, radiation safety, and now menopause suggests a counterintuitive finding: the level of factual knowledge participants bring to a program is rarely the binding constraint on their ability to improve their practice and achieve better outcomes for the people they serve.
The binding constraints are more often structural: isolation from peers working on similar challenges, limited time and opportunity to reflect on practice, lack of structured support during implementation, and the absence of a community that validates their professional agency and helps them navigate institutional resistance.
Addressing these constraints is what TGLF’s programs are designed to do.
And measuring whether these constraints have been addressed is what the value creation framework is designed to capture.
When a participant reports that her participation “changed my ability to influence my world as a professional,” she is reporting that a binding constraint on her effectiveness has shifted.
This is a more consequential finding than an increase in her score on a knowledge quiz about technical content.
It is the finding that predicts, with greater confidence than any knowledge score can provide, whether she will continue to apply her learning, develop her colleagues, and improve outcomes in her community over the months and years ahead.
Addressing the measurement needs of funders
A practical concern must be acknowledged: funders and donors, particularly those who are new to an organization or a program area, often expect quantitative evidence of impact and may be accustomed to seeing pre- and post-test scores as the standard format for that evidence.
The value creation framework is fully capable of meeting this expectation.
It produces quantitative data: standardized scores across five dimensions, collected from all participants, aggregated by cohort, compared against a benchmark of 10,095 prior participants, and tracked across measurement points at 30 and 90 days post-program.
These numbers can be graphed, compared across cohorts, and reported to donors.
For example, the framework maps directly to EU4Health reporting requirements for capacity strengthening, practical skills development, and leadership competencies.
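As a sketch of how such reporting could be computed, the comparison of a cohort against the global benchmark reduces to a per-dimension difference. The benchmark numbers below are placeholders for illustration only; they are not the published 2022 baseline figures.

```python
# Placeholder benchmark values: illustrative only, not TGLF's published baseline.
BENCHMARK = {
    "immediate": 4.7, "potential": 4.5, "applied": 4.5,
    "realized": 4.3, "reframing": 4.3,
}

def compare_to_benchmark(cohort_means: dict[str, float],
                         benchmark: dict[str, float] = BENCHMARK) -> dict[str, float]:
    """Per-dimension difference between a cohort's mean scores and the benchmark."""
    return {dim: round(cohort_means[dim] - benchmark[dim], 2) for dim in benchmark}
```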
But the framework also goes further.
It produces qualitative evidence that is both richer and more persuasive than a knowledge score.
Testimonials from practitioners describing specific changes in their work, specific new capabilities they are applying, and specific outcomes they have achieved are uniquely powerful evidence for funders who want to understand what their investment is actually producing.
They are also uniquely useful for advocacy, for policy influence, and for program improvement.
The most candid assessment of the pre- and post-test format, from those who have worked with both approaches, is that organizations sometimes request pre- and post-tests not because they believe they are the most informative measure, but because they are the most familiar.
When organizations are shown the depth and specificity of evidence that the value creation framework produces, alongside its quantitative rigor, most recognize it as superior.
The challenge is not the evidence.
The challenge is the habit of reaching for the familiar.
For organizations in their early stages of building a learning program, TGLF’s position is pragmatic rather than dogmatic: if a particular donor requires a knowledge test, do it.
But build the value creation measurement infrastructure from the beginning, because it is what will produce the longitudinal, attributable evidence that a learning program needs to demonstrate its impact over time.
A pre- and post-test result from a single cohort is a snapshot.
A value creation evidence base built over years, compared against a global benchmark, linked to documented behavioral change and verified attribution of outcomes, is a scientific contribution to understanding how complex professional learning works.
What accountable measurement looks like
The ambition embedded in TGLF’s measurement approach is to make the implicit explicit: to document the chain of evidence that runs from a practitioner’s participation in a learning program, through changes in how she thinks and works, through changes in her professional networks and organizational influence, to measurable improvements in the lives of the people she serves.
Pre- and post-tests capture the very beginning of this chain, under conditions that make the measurement unreliable, and then stop.
The value creation framework, combined with the Impact Accelerator’s weekly documentation and peer-verified attribution process, follows the chain all the way to the end.
This is not a more complex version of the same thing.
It is a different understanding of what impact measurement is for.
Pre- and post-tests are designed to produce evidence that a training program ran and that knowledge scores changed.
The value creation framework is designed to produce evidence that professional practice changed, and that the world is better as a result.
For programs that exist to improve health outcomes, child welfare, humanitarian response, or any other consequential real-world goal, the second kind of evidence is the only kind that matters.
References
Learning and evidence from The Geneva Learning Foundation’s practice
Jones, I., Sadki, R., Brooks, A., Gasse, F., Mbuh, C., Zha, M., Steed, I., Sequeira, J., Churchill, S., Kovanovic, V., 2022. IA2030 Movement Year 1 report. Consultative engagement through a digitally enabled peer learning platform. The Geneva Learning Foundation. https://doi.org/10.5281/zenodo.7119648
Jones, I., Watkins, K.E., Sadki, R., Brooks, A., Gasse, F., Yagnik, A., Mbuh, C., Zha, M., Steed, I., Sequeira, J., Churchill, S., Kovanovic, V., 2022. IA2030 Case Study 7. Motivation, learning culture and programme performance. The Geneva Learning Foundation. https://doi.org/10.5281/zenodo.7004304
Sadki, R., 2014. Learning beyond training, to survive and grow. https://doi.org/10.59350/684nz-1tg64
Sadki, R., 2020. Ideas Engine: What is The Geneva Learning Foundation’s insights mechanism? https://redasadki.me/2020/09/17/ideas-engine-what-is-the-geneva-learning-foundations-insights-mechanism/
Sadki, R., 2021. Disseminating rapid learning about COVID-19 vaccine introduction. https://doi.org/10.59350/y5gwc-j7j89
Sadki, R., 2022. Learning for Knowledge Creation: The WHO Scholar Program. https://doi.org/10.59350/j4ptf-x6x22
Sadki, R., 2024. Knowing-in-action: Bridging the theory-practice divide in global health. https://doi.org/10.59350/4evj5-vm802
Sadki, R., 2024. What is double-loop learning in global health? https://doi.org/10.59350/s4xtw-b7274
Sadki, R., 2024. Why does cascade training fail? https://doi.org/10.59350/j8vg0-yng46
Sadki, R., 2025. Against chocolate-covered broccoli: text-based alternatives to expensive multimedia content. https://doi.org/10.59350/n1d17-7r990
Sadki, R., 2025. Peer learning in immunization programmes. https://doi.org/10.59350/wkr1w-y7x78
Theoretical foundations: learning in networks and communities of practice
Cope, B. and Kalantzis, M. (eds.) (2017) E-learning ecologies: principles for new learning and assessment. New York: Routledge. DOI: https://doi.org/10.4324/9781315639215
Siemens, G. (2005) ‘Connectivism: a learning theory for the digital age’, International Journal of Instructional Technology and Distance Learning, 2(1), pp. 3-10. Available at: https://www.ceebl.manchester.ac.uk/events/archive/aligningcollaborativelearning/Siemens.pdf
Schön, D.A. (1983) The reflective practitioner: how professionals think in action. New York: Basic Books.
Wenger, E., Trayner, B. and de Laat, M. (2011) Promoting and assessing value creation in communities and networks: a conceptual framework. Heerlen: Open University of the Netherlands. Available at: https://wenger-trayner.com/resources/publications/evaluation-framework/
Assessment design limitations
Cope, B., Kalantzis, M., 2013. Towards a New Learning: the Scholar social knowledge workspace, in theory and practice. E-Learning and Digital Media 10, 332. https://doi.org/10.2304/elea.2013.10.4.332
Campbell, D.T. and Stanley, J.C. (1963) Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Howard, G.S. (1980) ‘Response-shift bias: a problem in evaluating interventions with pre/post self-reports’, Evaluation Review, 4(1), pp. 93-106. DOI: https://doi.org/10.1177/0193841X8000400105
Knapp, T.R. (2016) ‘Why is the one-group pretest-posttest design still used?’, Clinical Nursing Research, 25(5), pp. 467-472. DOI: https://doi.org/10.1177/1054773816666280
Transfer of learning and behavior change
Baldwin, T.T. and Ford, J.K. (1988) ‘Transfer of training: a review and directions for future research’, Personnel Psychology, 41(1), pp. 63-105. DOI: https://doi.org/10.1111/j.1744-6570.1988.tb00632.x
Kirkpatrick, D.L. and Kirkpatrick, J.D. (2006) Evaluating training programs: the four levels. 3rd edn. San Francisco: Berrett-Koehler.
Peer learning and value creation in global health and humanitarian contexts
Bahattab, A.A.S., Zain, O., Linty, M., Amat Camacho, N., Von Schreeb, J., Hubloue, I., Della Corte, F. and Ragazzoni, L. (2024) ‘Development and evaluation of scenario-based e-simulation for humanitarian health training: a mixed-methods action research study’, BMJ Open, 14(8), e079681. DOI: https://doi.org/10.1136/bmjopen-2023-079681
Naal, H., Beaini, R., Harb, C., Chamas, A., El Asmar, K. and Saleh, S. (2024) ‘Capacity building and community of practice for women as community health workers: a mixed-methods evaluation’, Frontiers in Public Health, 12, p. 1375143. DOI: https://doi.org/10.3389/fpubh.2024.1375143
Saleh, S., Brome, D., Mansour, R., Daou, T., Chamas, A. and Naal, H. (2022) ‘Evaluating an e-learning program to strengthen the capacity of humanitarian workers in the MENA region: the Humanitarian Leadership Diploma’, Conflict and Health, 16(1), p. 27. DOI: https://doi.org/10.1186/s13031-022-00460-2
Tsiouris, F., Hartsough, K., Poimboeuf, M., Raether, C., Farahani, M., Ferreira, T. et al. (2022) ‘Rapid scale-up of COVID-19 training for frontline health workers in 11 African countries’, Human Resources for Health, 20(1), p. 43. DOI: https://doi.org/10.1186/s12960-022-00739-8
Collective impact and shared measurement
Kania, J. and Kramer, M. (2011) ‘Collective impact’, Stanford Social Innovation Review, 9(1), pp. 36-41. Available at: https://ssir.org/articles/entry/collective_impact [Accessed 14 March 2026].
