Written by Taylor Hawkins, Managing Consultant

Satisfaction scores, completion rates, and reach tell you a program ran. They don't tell you whether it was worth it: once the buzz fades, are leaders, their teams and the systems they work in left better than before? It was one of the questions I heard practitioners asking each other at the L&D Symposium last week, and it's the one we all need to sit with. Avoiding it is what lets leadership investment be treated as a nice-to-have, rather than the organisational essential it is.

There's a moment in a lot of budget conversations that passes so fast you could miss it.

The leadership program comes up. Someone shares the numbers. Attendance was strong. Satisfaction scored well. Completion sat near a hundred per cent. Heads nod. The slide moves on.

And the question that matters rarely gets asked out loud: Was it worth it? Not "did people show up." Not "did they enjoy it." Did the investment (the money, the time, the attention of busy leaders pulled out of their day-to-day responsibilities) deliver for the organisation as a whole?

It's an uncomfortable question, and one I know is too often avoided: it is hard to measure, and it feels like a stride too far into a level of responsibility that few L&D practitioners want to take on. So, across the sector, we've quietly built an entire measurement habit around not having to answer it at all.

But that avoidance is exactly what keeps L&D out of the strategic conversations it should be driving. Measuring real impact is hard. It's multi-causal and the signal is delayed. All true. And none of it explains why completion metrics persist. They persist because they're the numbers we can control.

You can guarantee a hundred per cent completion. You can engineer a strong satisfaction score. You cannot guarantee behaviour change, and you certainly can't guarantee business impact. So, we reach for the safe numbers and quietly leave the risky ones alone. Satisfaction, completion, reach: every one counts activity; none counts impact.

The gap is stark. Global organisations spend more than US$60 billion a year on leadership development, yet only about half ever evaluate any program for whether behaviour changed on the job, and even they look at just a third of their programs that way. Do the arithmetic and fewer than one in five programs are ever checked against whether anyone leads differently afterwards. We spend like it matters enormously and measure like it doesn't.
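The arithmetic behind that figure can be checked in a few lines. This is a minimal sketch that treats the two proportions quoted above as exact fractions and assumes, purely for illustration, that programs are spread evenly across organisations; the variable names are hypothetical.

```python
from fractions import Fraction

# Proportions quoted in the text, treated as exact for illustration.
orgs_that_evaluate_behaviour = Fraction(1, 2)  # "about half" of organisations
programs_they_evaluate = Fraction(1, 3)        # "just a third" of their programs

# Assumption: programs are spread evenly across organisations, so the
# share of all programs checked is the product of the two proportions.
share_of_programs_checked = orgs_that_evaluate_behaviour * programs_they_evaluate

print(share_of_programs_checked)                   # 1/6
print(f"{float(share_of_programs_checked):.1%}")   # 16.7%
print(share_of_programs_checked < Fraction(1, 5))  # True
```

One in six is roughly 17 per cent, which is where "fewer than one in five" comes from.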

The chain reaction to deliver impact over time

"Was it worth it?" isn't one question. It's four, and they map onto the Leadership Impact Chain:

  1. Leadership Behaviour: did the leader change how they lead?
  2. Conditions for Success: did that shift the conditions around their team, through improved clarity, cultural climate, and competence, so the team can succeed?
  3. Aligned Performance: did stronger conditions change how the work gets done?
  4. Business Impact: did that add up to an outcome the strategy rests on?

Most evaluation never gets past the first link. We measure whether people completed the program, liked it, and maybe whether they took a follow-up action, and call that impact. Even those who measure implementation usually do so only a few weeks after the program. A leader reports they're coaching more. Tick. Whether it holds, and whether it changes anything downstream, goes unrecorded, because the measurement has already stopped.

Measured in isolation, a single number can even point the wrong way. A leader who routes every piece of feedback through an AI tool can post rising "feedback frequency" while the human, trust-building part of the exchange quietly erodes: the metric climbs as the thing that mattered fades. Behaviour change you can't trace along the chain isn't impact. It's a number you can't bank.

Design the chain, and measurement becomes obvious

The Leadership Impact Chain is designed deliberately in reverse, from outcome back to behaviour.

Leadership behaviour sits last, not first. Start with the business outcome you're trying to create. Define the performance that delivers it. Identify the conditions that make that performance possible. Then, and only then, derive the leadership behaviour that creates those conditions.

Take a concrete case. Start with the outcome: faster, better decisions made closer to the customer. The performance that delivers it is frontline managers making the calls they currently escalate. The condition that makes those calls possible is senior leaders who genuinely let go. So the behaviour to build isn't "delegation" in the abstract; it's senior leaders practising the release of real authority, with the measure being whether decisions actually move down the line.

Build a program this way and measurement stops being an afterthought. Every link gets its own measure, and "was it worth it?" becomes answerable, because you can see where the chain holds and exactly where it breaks. The value isn't a single ROI figure. It's a diagnosis you can action. And the payoff is significant: where the value is measured, leadership development returns an average of around seven dollars for every dollar spent, within a range of three to eleven. The point isn't the precise multiple. It's that the impact was measurable at all.
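One way to picture that diagnosis is as a walk along the chain that stops at the first link whose measure falls short. The sketch below is illustrative only: the `Link` type, the `first_break` helper, the measures, and every number are hypothetical and not drawn from any real program.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Link:
    name: str        # link in the Leadership Impact Chain
    measure: str     # hypothetical measure chosen for this link
    target: float    # hypothetical threshold for "the link holds"
    observed: float  # hypothetical observed value


def first_break(chain: list[Link]) -> Optional[Link]:
    """Return the first link whose observed value misses its target, or None."""
    for link in chain:
        if link.observed < link.target:
            return link
    return None


# Ordered from behaviour to business impact, the direction in which effects flow.
chain = [
    Link("Leadership Behaviour", "share of decisions delegated", 0.40, 0.45),
    Link("Conditions for Success", "team clarity score", 0.70, 0.55),
    Link("Aligned Performance", "share of calls made at the frontline", 0.60, 0.30),
    Link("Business Impact", "on-time decision rate", 0.80, 0.62),
]

broken = first_break(chain)
print(broken.name if broken else "chain holds")  # Conditions for Success
```

In this made-up data the behaviour changed but the conditions around the team did not, so that is where attention would go first.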

It also shows where the chain usually breaks, and it's rarely the program content. New behaviour needs conditions to hold it: the 3Cs of Clarity, Climate and Competence. A leader who learns to devolve decisions into a still-siloed strategy has nowhere to put the behaviour. One who starts coaching while their peers stay transactional feels the pull back to the mean. None of that is L&D's job alone. But if L&D doesn't name those conditions and hold the wider system accountable for them, the program fails in a way that looks, on the dashboard, like success. That, more than the difficulty of measurement, may be the real reason the question gets avoided: answering it honestly shines a light on the conditions back in the business, and on the people who own them.

That's the shift that matters: from L&D as a delivery function to L&D as a systems design function. Not "we ran the program." Instead: "we designed the conditions for the behaviour to stick, and the impact to show up."

So, was it worth it?

The choice, in the end, is fairly binary. Keep measuring what's easy and controllable, and stay safe, busy, and peripheral. Or measure what matters, be willing to stop the programs that don't move the needle, and claim the strategic ground that demonstrating impact earns.

None of that is comfortable. It means putting your own programs on the table and being willing to find them wanting. It means telling the people who fund the work that the problem isn't only the training; it's also the conditions back in the business that they own. And it means trading the safety of a clean dashboard for the exposure of an honest diagnosis. It is also what turns L&D from an add-on into a partner the business can't do without.

It means asking a different question on the way in, not the way out. Not "how do we fill the room?" but "what conditions need to exist for this person to lead differently, their team to perform differently, the organisation to shift, and how will we know?"

Because "was it worth it?" is the line between strategic investment and expensive activity. Leaders are already being asked it about everything else they fund. Sooner or later, they’ll ask it about us too.

Better that we're the ones who asked it first.
