Setting Effective Targets for Developer Productivity Metrics in the Age of Gen AI
Using AI usage as a metric on its own is not the way to go. Here is what to do instead!
Intro
Measuring developer productivity is an important topic for many organizations, especially now that many of them are looking at how to increase software development productivity with AI.
Some organizations are also enforcing the use of AI or measuring AI usage as a KPI in performance reviews. I have shared my thoughts separately on whether enforcing the use of AI is a good or a bad thing.
I’ve even heard of some companies thinking about creating leaderboards of who is using the most LLM credits or who has committed the most AI-generated code to the codebase.
That is a totally wrong way to measure productivity.
To help us do it the right way and set effective targets for measuring developer productivity, I am happy to have Laura Tacho, CTO at DX, as a guest author for today’s newsletter article.
Before we hand it over to Laura, I’ll share some of my thoughts on measuring developer productivity by pure output.
Measuring the wrong things inspires people to try to game the system
It’s really important to understand that if you measure only output and usage, people will naturally try to maximize those numbers, which produces the wrong results for the business.
And it can also inspire pure individualism, which you don’t want in your organization.
So, if you measure how much your engineers are using AI by tracking LLM credit usage, people will set up cron jobs to use as many LLM credits as possible.
That will have ZERO effect on business success. On the contrary, the company will lose a lot of money because of it.
The same is true of measuring:
lines of code added,
number of tasks finished,
story points completed,
number of hours online.
All of these measures inspire people to do the wrong things. I would rather look at these 4 specific things:
Are they focusing on building the RIGHT things and challenging requirements?
How much are they helping others?
How are they contributing to the success of the whole team/organization?
What improvements have they implemented, and are they ensuring those improvements get adopted by other engineers?
And these are the main things I value:
Team productivity > Developer productivity
Helping others > Completing your own tasks
Building the RIGHT things > Number of things built
Using AI or finishing a lot of tasks or story points doesn’t mean much if you don’t provide value to the business, if you don’t share your knowledge with others, or if your team is not working well together.
You can read more of my thoughts on metrics and how I measure developer productivity in two of my other articles.
Now, let’s hand it over to Laura!
Setting targets for developer productivity metrics takes careful consideration
In some cases, setting the wrong goals can backfire by creating unintended consequences.
Teams might start focusing on optimizing the numbers instead of the system, especially if there are anti-patterns like tying bonuses to individual metrics, or setting blanket targets on metrics teams can't directly control.
At the same time, leaders want to drive meaningful improvement and use goals for motivation and accountability. Teams want transparency and direction on where to focus. Even so, it can be difficult to figure out what kind of targets are realistic in the first place.
These three practices help engineering leaders avoid pitfalls and encourage their teams to use data to improve the system, leading to the right outcomes:
Set goals on the right type of metrics
Use multi-dimensional systems of measurement
Consider organizational context when setting the targets themselves
Without these three things, organizations run the risk of developers feeling mistrusted and micromanaged, teams gaming metrics rather than improving systems, and metrics becoming distorted so they no longer represent reality.
Set team goals on controllable input metrics, not output metrics
Not all metrics are immediately actionable. Some measure big-picture trends and are summary metrics that are influenced by many other factors.
Setting goals on these kinds of metrics – output metrics – can incentivize the wrong type of behavior and disempower developers, as they feel they can’t meaningfully influence the numbers.
On the other hand, controllable input metrics, a different type of metric, are very actionable at the team level and contribute to improving the system.
Being able to identify the difference between these different types of metrics is an important skill for any devex leader.
Output metrics: These metrics represent what you want to get to, but are not directly actionable. That’s because they’re a summary of other factors, used best as a diagnostic tool but not as something to be directly influenced by a single process, tool, or action. Some examples include:
Change Failure Rate
PR throughput
Controllable input metrics: These measure behaviors or processes that teams directly influence, which then result in changes to the output metrics. For example, code review turnaround SLAs are controllable and can improve PR throughput, and reducing flaky CI tests can improve Change Failure Rate.
This pattern is not unique to developer experience and can be seen in other parts of life.
Let’s imagine you have low levels of iron in your blood. This level is an output metric, and setting a goal on it – without mapping it to controllable input metrics – can make improvement seem out of reach.
Instead, you want to focus on controllable input metrics like taking supplements, eating iron-rich foods, and avoiding coffee with meals.
Doing these activities will lead to a change in the output metric, which makes them more suitable for goal-setting. Similarly, engineering teams need to identify the actionable inputs that influence the larger output metrics.
Depending on an organization's size and complexity, it might still be preferable to set goals on an output metric, like improving Change Failure Rate, in order to simplify reporting and align on a single goal.
In cases like this, it’s essential that frontline teams go through the process of metric mapping to break down the output metric into controllable input metrics, and that those input metrics have their own goals and structures of reinforcement around them.
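The metric mapping described above can be sketched as a small data structure. This is only an illustration under stated assumptions, not DX's tooling: the class names, the `needs_mapping` helper, the team names, the target values, and the unmapped "Lead time" entry are all hypothetical. Only the pairing of review turnaround with PR throughput and of flaky CI tests with Change Failure Rate comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class InputMetric:
    """A controllable input metric that a team can act on directly."""
    name: str
    owner: str   # hypothetical: the team that owns the goal
    target: str  # hypothetical target, written as plain text


@dataclass
class OutputMetric:
    """A summary metric used for diagnosis and top-level reporting."""
    name: str
    inputs: list[InputMetric] = field(default_factory=list)

    def needs_mapping(self) -> bool:
        # An output metric with no mapped inputs gives teams nothing to act on.
        return not self.inputs


change_failure_rate = OutputMetric(
    name="Change Failure Rate",
    inputs=[InputMetric("Flaky CI tests", "platform-team", "under 2% of runs")],
)
pr_throughput = OutputMetric(
    name="PR throughput",
    inputs=[InputMetric("Code review turnaround", "checkout-team", "within 24 hours")],
)
unmapped = OutputMetric(name="Lead time")  # hypothetical metric, not yet mapped

for metric in (change_failure_rate, pr_throughput, unmapped):
    status = "needs mapping" if metric.needs_mapping() else "mapped"
    print(f"{metric.name}: {status}")
```

Keeping the goal on the input metric, with a named owner, is what makes the top-level output goal actionable for a frontline team.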
Avoid gamification with multi-dimensional measurement and aligned incentives
A common objection to setting targets around metrics is the fear that developers will game the system.
Gamification is the phenomenon where individuals distort the data in order to make the metrics look good, without actually improving the system. Goodhart’s Law describes this phenomenon, often summarized as “when a measure becomes a target, it ceases to be a good measure.”
Gamification is dangerous for organizations because while the metrics show surface-level improvements, the reality is that the systems are usually worse off – but those negative changes are largely invisible because they aren’t being measured properly.
Setting goals amplifies the incentive for individuals to game the system, because goals create accountability and pressure to deliver specific results.
When people know they're being evaluated against a specific number, especially if rewards or advancement opportunities depend on it, the temptation to find shortcuts or manipulate metrics becomes stronger than the motivation to make genuine improvements that might take longer to reflect in the measurements.
A well-designed system of measurement and an intentional culture around using metrics can help protect against the effects of gamification. We know how humans behave when metrics are used for measurement and goal-setting. With that knowledge, it’s up to us to design better systems.
Use multidimensional measurements instead of one-dimensional metrics.
When you track multiple related metrics together, manipulating one metric usually affects others negatively, making gamification more obvious. DX Core 4 is an example of a multidimensional system of measurement.
Focus on learning and improvement rather than incentivizing or rewarding hitting specific thresholds.
Give teams time and autonomy to address the root causes affecting metrics. When teams feel pressured without having resources or authority to make real improvements, they're more likely to find ways to adjust the numbers without fixing the system.
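To make the multidimensional idea concrete, here is a minimal sketch that compares two snapshots of several related metrics and flags the case where the targeted metric improved while others got worse. The metric names, their directions, and the sample numbers are assumptions chosen for illustration. They are not the DX Core 4 definitions, and a flag is a prompt for a conversation with the team, not proof of gaming.

```python
# Hypothetical metrics: +1 means higher is better, -1 means lower is better.
DIRECTIONS = {
    "pr_throughput": +1,
    "change_failure_rate": -1,
    "developer_satisfaction": +1,
}


def regressions(before: dict[str, float], after: dict[str, float]) -> list[str]:
    """Return the metrics that moved in the wrong direction between two snapshots."""
    worse = []
    for name, direction in DIRECTIONS.items():
        delta = (after[name] - before[name]) * direction
        if delta < 0:
            worse.append(name)
    return worse


def looks_gamed(target: str, before: dict[str, float], after: dict[str, float]) -> bool:
    """Flag snapshots where the targeted metric improved while another metric regressed."""
    improved = (after[target] - before[target]) * DIRECTIONS[target] > 0
    others_worse = [m for m in regressions(before, after) if m != target]
    return improved and bool(others_worse)


# Made-up sample data for one team, before and after a throughput goal was set.
before = {"pr_throughput": 3.0, "change_failure_rate": 0.10, "developer_satisfaction": 70.0}
after = {"pr_throughput": 5.0, "change_failure_rate": 0.18, "developer_satisfaction": 64.0}

print(regressions(before, after))
print(looks_gamed("pr_throughput", before, after))
```

Tracking the metrics together is what makes the trade-off visible: a throughput number viewed alone would show only the improvement.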
Set realistic targets based on organizational context and strategy
When determining actual target values, one size doesn't fit all. Consider:
Past performance
Different teams start from different places. Instead of blanket targets across the organization, consider percentage improvements from each team's current baseline.
External benchmarks
Industry benchmarks (like the 75th percentile) provide useful reference points, but remember that context matters.
Effort curves
Improvement isn't linear. For example, moving from the 50th to 75th percentile often requires less effort than moving from the 75th to 90th percentile.
Metric characteristics
For some metrics, higher isn't always better (e.g., extremely short PR cycle times might indicate inadequate code reviews). Some metrics need SLAs or thresholds rather than continuous improvement targets.
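Two of the considerations above lend themselves to simple arithmetic: deriving each team's target from its own baseline, and checking a metric against a healthy range rather than pushing it in one direction forever. The sketch below assumes hypothetical team names, baselines, a 20% improvement, and a 1 to 48 hour band for PR cycle time. None of these numbers come from the article or from any benchmark.

```python
def baseline_target(baseline: float, improvement_pct: float, lower_is_better: bool = False) -> float:
    """Derive a team's target as a percentage improvement over its own baseline."""
    if lower_is_better:
        return baseline * (100 - improvement_pct) / 100
    return baseline * (100 + improvement_pct) / 100


def within_range(value: float, low: float, high: float) -> bool:
    """Check a metric against a healthy band instead of a one-directional target."""
    return low <= value <= high


# Hypothetical baselines: median code review turnaround in hours, per team.
baselines = {"team-a": 40.0, "team-b": 10.0}
targets = {
    team: baseline_target(hours, 20, lower_is_better=True)
    for team, hours in baselines.items()
}
print(targets)

# Hypothetical band for PR cycle time in hours: very short times may signal thin reviews.
print(within_range(0.2, low=1.0, high=48.0))
print(within_range(20.0, low=1.0, high=48.0))
```

The same 20% improvement produces a different absolute target for each team, which is the point of starting from the baseline rather than from a blanket number.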
Above all, remember that metrics don't replace strategy. They enhance it. Even with robust metrics, you still need human judgment to set appropriate goals in your specific context.
Actionable points to get you started
To apply these principles in your organization:
Clearly distinguish between controllable input metrics and output metrics
Identify the specific input metrics teams can influence
Show how these inputs connect to larger organizational goals
Set appropriate targets on those controllable metrics
Ensure teams have time and resources to address improvements
Monitor both input and output metrics to validate your approach
By following these guidelines, you can create a more productive environment focused on genuine system improvement rather than superficial number manipulation.
Last words
Special thanks to Laura for sharing her insights on this very important topic! Make sure to follow her on LinkedIn and also check out DX. They are doing a lot of great things when it comes to measuring developer productivity.
Get in touch
You can find me on LinkedIn, X, YouTube, Bluesky, Instagram or Threads.
If you would like to request a particular topic to read about, you can send me an email at info@gregorojstersek.com.