Setting Effective Targets for Developer Productivity Metrics in the Age of Gen AI
Using AI usage alone as a metric is not the way to go. Here is what to do instead!
Intro
Measuring developer productivity is an important topic for many organizations, especially now that many of them are looking at how to increase software development productivity by using AI.
Some organizations are also enforcing the use of AI or measuring AI usage as a KPI in performance reviews. You can find my thoughts on whether enforcing the use of AI is a good or a bad thing here:
I’ve even heard of some companies thinking about doing leaderboards of who is using the most LLM credits or who has committed the most AI-generated code to the codebase.
That is a totally wrong way to measure productivity.
To help us do it the right way and set effective targets for measuring developer productivity, I am happy to have Laura Tacho, CTO at DX, as a guest author for today’s newsletter article.
Before we hand it over to Laura, I’ll share a few of my thoughts on measuring developer productivity purely by output.
Measuring the wrong things inspires people to try to game the system
It’s really important to understand that if you measure only output and usage, people will naturally try to maximize those numbers, which produces the wrong results for the business.
And it can also inspire pure individualism, which you don’t want in your organization.
So, if you measure how much your engineers are using AI by tracking LLM credit usage, people will set up cron jobs to use as many LLM credits as possible.
That will have ZERO effect on business success. On the contrary, the company will lose a lot of money because of it.
And the same is true for measuring:
lines of code being added,
number of tasks finished,
story points being done,
number of hours being online.
All such measures inspire people to do the wrong things. I would rather look at these 4 specific things:
Are they focusing on building the RIGHT things and challenging requirements?
How much are they helping others?
How are they contributing to the success of the whole team/organization?
What improvements have they implemented, and have they ensured those improvements get adopted by other engineers?
And these are the main things I value:
Team productivity > Developer productivity
Helping others > Completing your own tasks
Building the RIGHT things > Number of things being built
Using AI or finishing a lot of tasks or story points doesn’t mean much on its own. It counts for little if you don’t provide value to the business, don’t share your knowledge with others, or your team is not working well together.
You can read my thoughts on metrics and how I measure developer productivity in these 2 articles:
Now, let’s hand it over to Laura!
Setting targets for developer productivity metrics takes careful consideration
In some cases, setting the wrong goals can backfire by creating unintended consequences.
Teams might start focusing on optimizing the numbers instead of the system, especially if there are anti-patterns like tying bonuses to individual metrics, or setting blanket targets on metrics teams can't directly control.
At the same time, leaders want to drive meaningful improvement and use goals for motivation and accountability. Teams want transparency and direction on where to focus. Even so, it can be difficult to figure out what kind of targets are realistic in the first place.
These three practices help engineering leaders avoid pitfalls and encourage their teams to use data to improve the system, leading to the right outcomes:
Set goals on the right type of metrics
Use multi-dimensional systems of measurement
Consider organizational context when setting the targets themselves
Without these three things, organizations run the risk of developers feeling mistrusted and micromanaged, teams gaming metrics rather than improving systems, and metrics becoming distorted so they no longer represent reality.
Set team goals on controllable input metrics, not output metrics
Not all metrics are immediately actionable. Some measure big-picture trends and are summary metrics influenced by many other factors.
Setting goals on these kinds of metrics, known as output metrics, can incentivize the wrong type of behavior and disempower developers, as they feel they can’t meaningfully influence the numbers.
On the other hand, controllable input metrics are a different type of metric: they are very actionable at the team level and contribute to improving the system.
Being able to tell these types of metrics apart is an important skill for any devex leader.
Output metrics: These metrics represent what you want to get to, but are not directly actionable. That’s because they’re a summary of other factors, used best as a diagnostic tool but not as something to be directly influenced by a single process, tool, or action. Some examples include:
Change Failure Rate
PR throughput
Controllable input metrics: These measure behaviors or processes that teams directly influence, which then result in changes to the output metrics. For example, code review turnaround SLAs are controllable and can improve PR throughput, and reducing flaky CI tests can improve Change Failure Rate.
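To make the distinction concrete, here is a minimal Python sketch of one controllable input metric (code review turnaround against an SLA) next to one output metric (Change Failure Rate). Everything in it is an assumption for illustration: the `Review` record, the function names, and the 24-hour SLA are hypothetical, and Change Failure Rate is computed using the common definition of failed deployments divided by total deployments.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Review:
    # Hypothetical record of a single code review request.
    requested_at: datetime
    first_response_at: datetime


def sla_compliance(reviews: list[Review],
                   sla: timedelta = timedelta(hours=24)) -> Optional[float]:
    """Controllable input: share of reviews answered within the SLA.

    The 24-hour default is an assumed example, not a recommended target.
    Returns None when there are no reviews to measure.
    """
    if not reviews:
        return None
    within = sum(1 for r in reviews
                 if r.first_response_at - r.requested_at <= sla)
    return within / len(reviews)


def change_failure_rate(total_deployments: int,
                        failed_deployments: int) -> float:
    """Output metric: share of deployments that caused a failure."""
    if total_deployments == 0:
        return 0.0
    return failed_deployments / total_deployments


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    sample = [
        Review(start, start + timedelta(hours=2)),
        Review(start, start + timedelta(hours=30)),
        Review(start, start + timedelta(hours=24)),
    ]
    print("Review SLA compliance:", sla_compliance(sample))
    print("Change Failure Rate:", change_failure_rate(20, 3))
```

A team can move the first number directly by changing how it handles reviews. The second number summarizes many factors, which is why it works better as a diagnostic than as a team-level target.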
This pattern is not unique to developer experience and can be seen in other parts of life.
Let’s imagine you have low levels of iron in your blood. This level is an output metric, and setting a goal on it – without mapping it to controllable input metrics – can make improvement seem out of reach.
Instead, you want to focus on controllable input metrics like taking supplements, eating iron-rich foods, and avoiding coffee with meals.
Doing these activities will lead to a change in the output metric, which makes them more suitable for goal-setting. Similarly, engineering teams need to identify the actionable inputs that influence the larger output metrics.
Depending on an organization's size and complexity, it might still be preferable to set goals on an output metric, like improving Change Failure Rate, in order to simplify reporting and align on a single goal.
In cases like this, it’s essential that frontline teams go through the process of metric mapping to break down the output metric into controllable input metrics, and that those input metrics have their own goals and structures of reinforcement around them.
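One way to picture metric mapping is as a small data structure that links each output metric to the controllable input metrics a team has chosen for it, each with its own goal. The sketch below is only an illustration under stated assumptions: the `InputMetric` class, the helper function, and every number in it are made up for the example and do not come from real data or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class InputMetric:
    # Hypothetical controllable input metric with its own goal.
    name: str
    current: float
    target: float
    higher_is_better: bool = True

    def on_target(self) -> bool:
        if self.higher_is_better:
            return self.current >= self.target
        return self.current <= self.target


# Illustrative mapping only; all values are invented for the example.
metric_map = {
    "Change Failure Rate": [
        InputMetric("flaky CI test rate (%)", current=8.0, target=2.0,
                    higher_is_better=False),
    ],
    "PR throughput": [
        InputMetric("reviews answered within SLA (%)", current=70.0,
                    target=90.0),
        InputMetric("PRs under 400 changed lines (%)", current=85.0,
                    target=80.0),
    ],
}


def inputs_needing_attention(
        mapping: dict[str, list[InputMetric]]) -> dict[str, list[str]]:
    """For each output metric, list the input metrics that miss their goal."""
    return {
        output: [m.name for m in inputs if not m.on_target()]
        for output, inputs in mapping.items()
    }


if __name__ == "__main__":
    for output, names in inputs_needing_attention(metric_map).items():
        print(output, "->", names)
```

The organization can still report on the single output metric, while the map shows each frontline team which inputs it can act on.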