Driving Operational Successes through Careful Metric Design | by Jordan Gomes | Jan, 2023

TL;DR:

  • Data/business analysts are sometimes given the “amazing opportunity” to help create new metrics, or they spot the need for new metrics themselves and proactively take on the task. But while building metrics is easy, designing good metrics is hard.
  • More than tracking tools, metrics are ways for organizations to align stakeholders around one vision of the world and one common goal, and this comes with a set of challenges.
  • One framework can help operational teams set up their metrics properly: the input > output > outcome framework.
  • Whether or not you use that framework, once new metrics have been designed it is important to validate them and make sure they pass a couple of tests.

Defining metrics is essentially translating a strategy into a set of “quantities”. Well-defined metrics help you stay on track toward your organization’s goals. But translation errors can get costly: if the metric you design doesn’t fully capture the spirit of the strategy, the organization can easily stray from its original goal. Ultimately, you end up with people who did their jobs well (they hit the OKRs you set on the metrics you created), yet the organization is nowhere near the state you were expecting:

Imagine you are flying from Los Angeles to New York City. If a pilot leaving from LAX adjusts the heading just 3.5 degrees south, you will land in Washington, D.C., instead of New York. Such a small change is barely noticeable at takeoff — the nose of the airplane moves just a few feet — but when magnified across the entire United States, you end up hundreds of miles apart.

Photo by Mitchel Boot on Unsplash

James Clear wrote the above in Atomic Habits. Granted — this was meant to illustrate something entirely different. But the idea applies to operational metrics: your destination is your strategy, and metrics help you stay on the right path toward it. If you don’t set your metrics properly, it will be hard to stay on the journey you set for yourself.

In short, metrics allow you to quantify something. You can build a metric for literally anything, and that is actually one of the biggest challenges of metric design (quantity ≠ quality).

Metrics allow you to understand how a process is doing — they let you track historical evolution, run benchmarks, and, in the case of ‘leading indicators’, get an early read on future performance.

But above all, the true power of metrics is that they align. They give your organization a common language and a common perspective. They federate people around a common goal. And that’s why getting the right metrics can be tricky: once a metric is defined, very often it becomes a goal, and the team will try to move it.

And when that happens — that is usually when you start seeing the problems your metric has.

Photo by Pierre Bamin on Unsplash

Defining metrics comes with a lot of challenges — but two of them strike me as ones to be extremely careful with whenever you work on such a project.

The metrics incentivize the wrong behavior

Metrics can create unintended consequences that may not align with the overall goals of the company. It is quite important to carefully consider the potential impact of metrics on behavior, and to ensure that they are designed in a way that incentivizes the right behaviors and outcomes.

For instance, imagine that reducing the number of support tickets opened via email is the number one priority for your team. One solution could be to make it as hard as possible to contact support by email: “hide” the support email address on an obscure page of your website and display it as a .png, so that people have to type it out manually. Most likely this will drive down the number of contacts, but it might have other unintended consequences (e.g. increasing the number of ‘negative’ social media interactions).

This is not necessarily because people want to game the system — it is more that people don’t have a complete understanding of the organization and everything that goes into it. As soon as someone’s performance is linked to a metric, it is fair game to expect them to try to move it, so it is up to the metric designer to make sure the ‘rules of the game’ are clearly articulated.

There are a bunch of ways to fight this problem, but they also come with their own challenges:

  • You can design a pairing metric, i.e. another metric that is supposed to ‘force’ a certain behavior (for instance, a pairing metric to the # of sign-ups could be the churn rate). Challenges arise when you start having too many metrics that don’t all point in the same direction — it becomes hard to separate the signal from the noise and to decide what to do.
  • You can design a compound metric that takes several things into account, so that it cannot easily be gamed. In “Trustworthy Online Controlled Experiments” (a great book I recommend to anyone interested in A/B testing — don’t let the title scare you away!), the authors explain that you should build an “Overall Evaluation Criterion” (OEC) whenever you run an experiment. An OEC is a compound metric combining the metric(s) your experiment is supposed to move and the metric(s) you don’t want it to move negatively (e.g. cost metrics, guardrail metrics, health metrics) — ultimately giving you a binary decision rule for whether your experiment is successful (see the sketch after this list). The concept is really interesting and works well for experiments. That said, compound metrics can be hard to use to track a process — you need to understand their underlying logic to interpret and debug their movements — so you end up tracking all the individual metrics that make up the compound anyway.
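
To make the compound-metric idea concrete, here is a minimal sketch of an OEC-style decision rule in Python. Every metric name, weight, and threshold below is hypothetical; in practice they come from business judgment and historical data, not from this code.

```python
# Minimal sketch of an OEC-style binary decision rule.
# All metric names, weights, and thresholds are hypothetical.

def oec_decision(metrics: dict) -> bool:
    """Ship only if the weighted score improves AND no guardrail degrades."""
    # Weighted combination of the metrics the experiment should move.
    score = (
        0.7 * metrics["revenue_lift_pct"]      # primary metric
        + 0.3 * metrics["signup_lift_pct"]     # secondary metric
    )
    # Guardrails: metrics the experiment must NOT move negatively.
    guardrails_ok = (
        metrics["churn_delta_pct"] <= 0.5      # churn may not rise > 0.5 pts
        and metrics["latency_delta_ms"] <= 50  # pages may not slow > 50 ms
    )
    return score > 0 and guardrails_ok

# Positive lift with guardrails respected -> ship.
print(oec_decision({
    "revenue_lift_pct": 2.1,
    "signup_lift_pct": 1.4,
    "churn_delta_pct": 0.2,
    "latency_delta_ms": 12,
}))  # True
```

Note how the compound hides its parts: to explain why the rule said True or False, you still have to look at each underlying metric, which is exactly the debugging cost mentioned above.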

Ultimately, it is the task of the metric designer to put the right checks in place so that metrics are moved in a way that aligns with the company’s goals.

The metric is deceptive

Imagine: you’re an analyst at a big e-commerce company. Someone somewhere noticed a strong correlation between the # of transactions and overall revenue. Based on that study, a VP decided to set an OKR on the # of transactions and asked the different teams to move it.

The teams start working on campaigns to boost the # of transactions (re-targeting former customers, offering discounts, etc.). And they are successful — but it is not clear how this impacts revenue.

Unfortunately, a follow-up analysis reveals that most of the revenue for the company actually comes from ‘high ticket’ items, and the increase in the # of transactions happened mainly for ‘low ticket’ items — ultimately not generating any tangible impact on the revenue.

In clearer words: the team did move the # of transactions, but that didn’t translate into revenue.

In this scenario, the metric was deceptive. It failed to take into account one important aspect of the business: only some transactions — not all — drive revenue. And it ended up being quite costly: the company built the metric and the infrastructure to report it, communicated and explained it to all the different stakeholders, those stakeholders changed their operations to move it, and the company then had to run multiple studies to understand the underlying impact. In summary: a lot of investment, not a lot of return.
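
A quick segmentation analysis would have caught the problem early. Here is a minimal sketch of that sanity check, with made-up numbers and a hypothetical $500 cutoff for ‘high ticket’ items:

```python
# Hypothetical sanity check: did the lift in transaction count come from
# the segments that actually drive revenue? All numbers are made up.
import pandas as pd

transactions = pd.DataFrame({
    "period": ["before"] * 3 + ["after"] * 6,
    "amount": [900, 25, 30, 950, 20, 25, 15, 30, 22],
})
# Hypothetical cutoff separating high- and low-ticket items.
transactions["segment"] = transactions["amount"].apply(
    lambda a: "high_ticket" if a >= 500 else "low_ticket"
)

summary = transactions.groupby(["period", "segment"])["amount"].agg(["count", "sum"])
print(summary)
# Transaction count doubles (3 -> 6), but revenue barely moves
# (955 -> 1062): the growth is almost entirely low-ticket items.
```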

Metrics are a great tool, but designing them comes with a lot of challenges and is generally not an easy exercise. A few frameworks exist to help, in particular the “Input > Output > Outcome” framework, which I will discuss next. To make the topic less dry, I will use the example of an education program for high-schoolers, and show how the input > output > outcome framework can define metrics that make the program a success!

Quick disclaimer: just like any framework, it doesn’t necessarily work for everything, and it is not necessarily the best fit for your business. Ultimately a framework is simply a tool to help with decision making — you shouldn’t follow it blindly, but instead tailor it to your own situation.

Inputs are what you control

  • This is the “I” in ROI
  • This is what you bring to the table: the amount of time you spend on a task, the quantity of materials used to produce something, etc.
  • This should be fully in your control

In our example (the education program), the # of teachers, their seniority, the funding, etc. could all be the input metrics of our system.

Outputs are moved by your inputs

  • Outputs directly follow inputs: if you think in terms of a funnel, outputs are the next step of the funnel after inputs
  • They can be moved via your inputs: if inputs increase (or decrease), outputs change accordingly. They are very actionable.
  • They have a quick response rate: shortly after the inputs increase or decrease, the output changes accordingly. They are not fully under your control, though.
  • There is a causal link between your outputs and your outcome.

Those metrics are usually the hardest to define. They sit right between your inputs and your outcome, but pinning down exactly where this ‘in between’ lies can get tricky — you want to keep both the actionability and the causality.

As with a lot of things in life, it is all about balance. In our education program example, the grades of the students, their consistency, their improvement over time, etc. could be our outputs.

Outcomes are the North Star

Outcomes are the most important thing to you — what you are aiming to move with all your activities. They are the primary indicator of your business’s health, the representation of the “WHY” that drives you and your team. Moving them is usually harder than moving outputs: it requires “the help” of multiple outputs and takes time. While outputs are metrics you track day-to-day, outcomes are what you want to achieve over a longer period.

In our example, the # of students graduating high school could be the North Star.

And here you have the final flow: the # of teachers, their average seniority, and the school’s funding (inputs) help drive the students’ grades and their consistency over time higher (outputs), which ultimately leads to more students graduating high school (outcome).
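
One way to keep this flow honest is to write the tree down explicitly and check it. Below is a minimal sketch using the education-program metrics; the structure (each metric declaring what it feeds into) is the point, not the specific names, which are only illustrative.

```python
# Minimal sketch of the input > output > outcome tree for the
# education-program example, with a cheap consistency check.

metric_tree = {
    "inputs": {
        "num_teachers":          {"feeds": ["avg_grades", "grade_consistency"]},
        "avg_teacher_seniority": {"feeds": ["avg_grades"]},
        "school_funding":        {"feeds": ["avg_grades", "grade_consistency"]},
    },
    "outputs": {
        "avg_grades":        {"feeds": ["graduation_rate"]},
        "grade_consistency": {"feeds": ["graduation_rate"]},
    },
    "outcomes": {
        "graduation_rate":   {"feeds": []},  # the North Star feeds nothing further
    },
}

# Every metric must feed into a metric that exists one layer down the funnel.
layers = ["inputs", "outputs", "outcomes"]
for layer, next_layer in zip(layers, layers[1:]):
    downstream = set(metric_tree[next_layer])
    for name, spec in metric_tree[layer].items():
        missing = set(spec["feeds"]) - downstream
        assert not missing, f"{name} feeds unknown metrics: {missing}"
print("metric tree is consistent")
```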

Whether this framework is used or not — once the new metrics have been designed, it is important to validate them and make sure they pass a couple of tests:

Making sure the metrics are properly representing the phenomenon

The first step is to make sure that the metrics you are thinking of using are properly representing the phenomenon you are trying to assess. This is highly dependent on your business/activity/company and there is no secret formula here.

Is “opening an email” a good proxy for “a prospect read our communication”? Are grades a good proxy for knowledge? Generally speaking, the idea is to select some kind of measurement methodology and make sure the metric is reliable (i.e. the measurement is trustworthy) and accurate (i.e. it properly depicts the phenomenon it is supposed to depict).
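
One way to pressure-test a proxy is to compare it, on a small sample, against a more trustworthy but more expensive measurement. Here is a minimal sketch with made-up data, checking “opened the email” against a hypothetical ground truth (say, collected via a follow-up survey):

```python
# Hypothetical proxy validation: does "opened the email" stand in for
# "actually read the communication"? Both columns are made-up sample data.
import pandas as pd

sample = pd.DataFrame({
    "opened_email":  [1, 1, 1, 1, 0, 0, 1, 0, 1, 0],
    "actually_read": [1, 0, 1, 1, 0, 0, 0, 0, 1, 0],  # e.g. from a survey
})

# Agreement: how often do the proxy and the ground truth say the same thing?
agreement = (sample["opened_email"] == sample["actually_read"]).mean()
# Precision: among recorded opens, how many were real reads?
precision = sample.loc[sample["opened_email"] == 1, "actually_read"].mean()

print(f"agreement: {agreement:.0%}, precision: {precision:.0%}")
# Low agreement means "opens" mis-represents the phenomenon it is
# supposed to depict, and the metric will mislead whoever reads it.
```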

Making sure the metrics are well classified and are aligned with why they have been created

Input metrics should be directly in your control / directly actionable. Output metrics should directly “follow” your input metrics, in the sense that it should be clear how much your output changes for one additional unit of input. Then there should be a causal link between your output and your outcome. If your inputs don’t flow into your outputs, or if your outputs don’t seem to have any impact on your outcome, then the system doesn’t really work and you are at risk of driving the wrong things.

On this last point — the causality between your outputs and your outcome is one of the hardest things to prove. Depending on the level of ‘risk’ you are willing to take (and the tools you have at hand), it might require you to run a few experiments before settling on which outputs you should be driving.
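
For illustration, here is a minimal sketch of what the statistical side of such an experiment could look like: two randomized groups, the output deliberately moved for one of them, and a standard t-test on the outcome. The data is simulated; only the comparison logic is real.

```python
# Minimal sketch of testing an output -> outcome link with an experiment.
# The data is simulated; in reality "treatment" would be a real intervention
# on the output metric (e.g. a tutoring program meant to raise grades).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Outcome metric for two randomized groups of 500 units each.
control   = rng.normal(loc=70, scale=10, size=500)  # output left alone
treatment = rng.normal(loc=73, scale=10, size=500)  # output deliberately raised

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A significant difference supports (though does not by itself prove)
# the causal link between moving the output and moving the outcome.
```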

Making sure the metrics are not incentivizing “wrong behaviors”

As said previously, you don’t want to incentivize the wrong behavior. You don’t want your support associates to reduce the number of support tickets without any care for customer satisfaction. You don’t want to push your salespeople to sell without caring about the retention rate.

The idea here is to use this step to think about the worst possible ways to ‘game’ the metric — and to build pairing indicators accordingly, i.e. secondary metrics that prevent anyone from taking a ‘path of least resistance’ that could have negative consequences for your business.

If you measure a quantity (e.g. sales), you might want to pair it with a measure of quality (e.g. retention rate). If you measure something in the short term, you might want to also measure something in the long term. In High Output Management, Andy Grove talks about “pairing indicators” in a similar way:

“Indicators tend to direct your attention toward what they are monitoring. It is like riding a bicycle: you will probably steer it where you are looking. If, for example, you start measuring your inventory levels carefully, you are likely to take action to drive your inventory levels down, which is good up to a point. But your inventories could become so lean that you can’t react to changes in demand without creating shortages. So because indicators direct one’s activities, you should guard against overreacting. This you can do by pairing indicators, so that together both effect and counter-effect are measured.”
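
In practice, a pairing check can be as simple as refusing to read the quantity metric without its quality pair, in the spirit of Grove’s quote above. A minimal sketch, where the sales and retention numbers and the tolerance are all hypothetical:

```python
# Minimal sketch of a pairing indicator: never evaluate the quantity metric
# (sales) without its quality pair (retention). All numbers are hypothetical.

def paired_readout(sales_growth_pct: float, retention_delta_pct: float) -> str:
    """Flag the 'path of least resistance': quantity up, quality down."""
    if sales_growth_pct > 0 and retention_delta_pct < -1.0:  # hypothetical tolerance
        return "WARNING: sales up but retention degrading - investigate"
    if sales_growth_pct > 0:
        return "healthy growth: quantity up, quality holding"
    return "no growth: pairing check not triggered"

print(paired_readout(sales_growth_pct=8.0, retention_delta_pct=-3.2))
# -> WARNING: sales up but retention degrading - investigate
```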

Closing thoughts: where should you focus your efforts?

According to Jeff Bezos, inputs are what you should focus on. A few other business leaders share this way of thinking (for example, Keith Rabois has this interesting quote: “in order to win a football game, you don’t focus on the goal, you focus on training the team”).

I personally share this view, but I wanted to offer a more nuanced take — and because it feels like these days you need at least one mention of ChatGPT in your article, I asked ChatGPT for its “thoughts” on the matter.

In short — “it depends”.

Hope you enjoyed reading this piece! Do you have any tips you’d want to share? Let everyone know in the comment section!

And if you want to read more of me, here are a few other articles you might like:

