Aren't we overdoing it? The measurement, I mean. Sure, we can measure the utilization of AI tools, how much code gets generated, etc.
Risking controversy: it's largely useless.
Measuring utilization was a red herring, even when we were measuring human work. Ask any Lean Management or Theory of Constraints folks, and they (we?) will rant about it for as long as you want.
Generated code would only be interesting if the act of generation were unassisted. If one developer spent an hour writing production-ready code and another developer spent the same hour delivering the same code but used the time to prompt, re-prompt, and review it, is there really a difference?
And it just so happens that we have (or should have) a good compound metric that allows us to integrate the vast majority of these nuances into one dimension.
*How much value are we delivering over time, compared to when we weren't using AI tools?*
In that respect, it may actually be interesting to measure tool utilization one way or another, as a reference dimension.
But that's it. How much more value are we delivering?
And companies that are clueless about the actual value they create may use a less useful, albeit much easier to answer, question:
*How has our big-picture throughput changed?*
In other words, how many more value-adding features are we delivering?
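For the throughput version of the question, even a back-of-the-envelope comparison beats raw utilization stats. A minimal sketch in Python, where every number is made up purely for illustration:

```python
# Hypothetical before/after throughput comparison.
# All numbers are illustrative; plug in your own delivery data.

baseline_weeks = 12   # period before the AI tooling rollout
ai_weeks = 12         # comparable period after the rollout

features_before = 18  # value-adding features shipped in the baseline period
features_after = 21   # value-adding features shipped with AI tooling

throughput_before = features_before / baseline_weeks
throughput_after = features_after / ai_weeks

change = (throughput_after - throughput_before) / throughput_before
print(f"Throughput change: {change:+.0%}")  # Throughput change: +17%
```

Comparable periods and a consistent definition of a "value-adding feature" do all the work here; without them, the number is noise.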
With the AI tools, I can easily optimize *some* aspects of my work by 100%. However, if that creates more work for others down the line, the aggregated gain may not be nearly as impressive.
A simple example is generating a large chunk of code quickly and shifting the cognitive load of ensuring it works well and doesn't break anything to a person conducting a code review. I just got super-fast. And the fact that the team delivers at the same (or slower) pace, well, who cares?
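A toy model makes the trap visible; the hours below are assumptions, not measurements. Coding doubles in speed, review absorbs the shifted cognitive load, and end-to-end delivery gets slower:

```python
# Toy model: each feature flows through two stages, coding -> code review.
# Hours per feature are invented for illustration.

coding_before, review_before = 8.0, 2.0  # hand-written code, light review

# With AI: coding is twice as fast, but the reviewer now carries the
# cognitive load of verifying a large chunk of generated code.
coding_after, review_after = 4.0, 7.0

cycle_before = coding_before + review_before  # 10.0 hours per feature
cycle_after = coding_after + review_after     # 11.0 hours per feature

print(f"Coding stage: {coding_before / coding_after:.1f}x faster")  # 2.0x faster
print(f"End-to-end:   {cycle_before / cycle_after:.2f}x")           # 0.91x, i.e. slower
```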
So, you want to know how much better your shiny new AI tools made you? Ask product people.
Thanks for sharing this, Pawel. I am a big advocate of doing things on a case-by-case basis, because what may work for one company may not work for another. I definitely wouldn't recommend going "all in" on every metric for small-to-mid-size companies. My recommendation would be to pick a few metrics (maybe 1 or 2 to start) and go from there.
And yes, how much value is delivered is the ultimate key. The problem many engineering leaders have these days is how to manage unrealistic AI expectations. Having at least some data helps! Appreciate you sharing your thoughts on this!
I understand the need to manage expectations pouring from the top ranks who want to go all-in on AI, even if they don't understand the realities of their teams.
I get that we need to "give them something."
Yet most often it's like throwing the baby out with the bathwater. We get what we measure. Or, more precisely: "When a measure becomes a target, it ceases to be a good measure" (Goodhart's law).
I really like Satya Nadella's "Microsoft generates 30% of code with AI" juxtaposed with this: https://www.reddit.com/r/ExperiencedDevs/comments/1krttqo/my_new_hobby_watching_ai_slowly_drive_microsoft/
Are they "getting better" at AI adoption? You betcha! Are they delivering more value? Now, that's a good question, isn't it?
You could bet that the developers would rather have many of the fixes made by hand, but because they measure how much code gets generated, it is what it is.
Would it look good on Nadella's slides? Absolutely! What does it measure exactly, though? Anything that customers care about?
I don't want to sound overly critical, especially not to the engineering teams. We will see mandates such as "do everything with AI or get fired" (which aren't that different from Bezos' "all integration goes through API or you get fired"). Then, all these AI-related metrics will be *contextually* useful.
But for any team willing to measure the impact of a product? The answer is somewhere else. A rough estimation of AI adoption may only be interesting for considering whether, and how much, it helps. And for showing the bosses that it's not the 100% (let alone 1,000%) productivity improvement some would claim it to be.
This is a good introduction to how to measure the adoption of AI. At the end of the day, we will still have to rely on surveys, I think.
Looking forward to reading more options.
CEOs ask their engineering leaders: what is the percentage of AI usage in my company? Mainly because they need that number to attract investors... a.k.a. money.
Glad the article resonated, Marcos! And yes, a lot of the pressure regarding AI in organizations comes from the need for the company to "look good" to investors. And having <input some amazing AI stat> is an advantage with investors who don't really dig into how that stat actually helps the business succeed. So, yes, this is a lot harder for engineering leaders to manage, because raising $ is very high on the priority list of many companies.