How OpenAI's Codex Team Works and Leverages AI
Insights into how OpenAI's Codex team works and leverages AI: their team structure, development philosophy, and more. Based on my conversation with the Engineering Lead of OpenAI's Codex team.
This week’s newsletter is sponsored by Redis.
AI Isn’t Slowing Down. But the Constraint Is Shifting.
Redis forecasts that the next wave of AI failures won’t come from weak models, but from poor context engineering. Getting context engineering right will be make-or-break for companies building AI apps: reducing latency, controlling cost, and enabling scale.
Their 2026 predictions break down what engineering leaders must prepare for:
Agents won’t struggle with reasoning. They’ll struggle with finding the right data.
Agent frameworks that stay open, extensible, interoperable, and unfussy will be the ones that stick.
Everyone will become a coder, and the volume of apps will spike.
Before latency and cost become the constraint, read the 2026 predictions.
Thanks to Redis for sponsoring this newsletter. Let’s get back to this week’s thought!
Intro
AI is no longer just a tool engineers use; it’s also heavily influencing how engineering teams are structured, how decisions are made, and how software gets built.
I’ve seen this especially in startups and mid-size companies trying to adjust their engineering org structures by creating AI-native engineering teams, but in my opinion, few teams have embraced it as fully as OpenAI’s Codex team.
Behind products like the Codex app, IDE extensions, and the open-source coding agent is a relatively small group of around 40 people operating with an unusually high level of autonomy, speed, and trust.
Their work spans low-level systems engineering, large-scale distributed infrastructure, product design, research, and user-facing experiences, and across all of it, they heavily leverage AI.
What makes the Codex team especially interesting isn’t just what they build, but how they work. They’ve embraced AI as a core layer of their workflow: from planning and onboarding to code review, testing, and prioritization.
It’s a team with minimal hierarchy, very few meetings, and a single product manager, which lets it move at a pace that would otherwise seem unrealistic.
I recently had the pleasure of speaking with Thibault Sottiaux, Engineering Lead for the OpenAI Codex team.
This is the second part of a two-part article. Make sure to also read the first part to learn what exactly AI-native teams are and how to build them.
In this article, we’ll go through how OpenAI’s Codex team works and leverages AI → their team structure, development philosophy, internal use of Codex, and the lessons other engineering teams can apply as we move toward an AI-native future.
Let’s start!
The Codex team is around 40 people
And inside the team, there are many smaller teams working on different projects, including the open-source coding agent, the Codex IDE extensions, the Codex app, and others.
They run with a strong emphasis on empowerment and local decision-making. It’s closer to a modern version of Bell Labs.
Individuals are trusted to make decisions because the pace of change demands it.
The Codex team has 1 Product Manager for the whole team, 2 designers, and the rest are engineers with various areas of expertise.
Here is an example Thibault mentioned of how effective their PM is.
How their PM works very effectively
The Codex team is effectively run by a single product manager, who has scaled himself into something like a 100× PM by using Codex. Watching him work is unreal; it’s on another level.
He uses Codex to dig through user feedback, triage issues, and prioritize work in real time. During a one-hour bug bash session recently, the whole team was testing the app and logging issues. As the issues came in, the PM was instantly categorizing them, setting priorities, and assigning them to owners.
They went through over 100 issues in that single hour, and most were fixed within 24 hours. That kind of speed and coordination came from the PM’s planning and decision-making, but it simply wouldn’t be possible without AI.
The work they do and the profiles of engineers working on the team
A lot of the work is low-level systems engineering, mostly with Rust.
The work is done on the Codex harness, the internal software framework and runtime that powers the Codex AI coding agent: essentially the part of the system that orchestrates how Codex behaves, interacts with users, and runs tasks consistently across different interfaces.
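The source doesn’t show any of the harness internals, but conceptually a harness is the loop that takes a task, asks the model for the next action, dispatches it to a tool, and feeds the result back. Here is a minimal, hypothetical sketch of that pattern; all names (`run_task`, the action shapes, the tools) are invented for illustration and are not OpenAI’s actual code:

```python
# Hypothetical agent-harness loop: ask the model for the next action,
# dispatch it to a tool, append the result to the history, repeat.
# This is an illustrative sketch, not the real Codex harness.

def run_task(task: str, model, tools: dict, max_steps: int = 10) -> str:
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action = model(history)           # model decides the next step
        if action["type"] == "finish":
            return action["answer"]       # model says it is done
        tool = tools[action["tool"]]      # dispatch to the requested tool
        result = tool(*action.get("args", []))
        history.append(f"{action['tool']} -> {result}")
    return "max steps reached"
```

The interesting property is that every interface (CLI, IDE extension, app) can share this one loop and only swap out how tasks arrive and how results are presented.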
And then there’s work on the open-source repo, the Codex CLI coding agent, a tool you can run locally to help with coding tasks directly from your terminal.
On top of that, there’s backend infrastructure that connects the GPUs to the systems that run the models.
They also work on the Responses API, the core API for generating model outputs, which is also how Codex talks to the models to do things like read files, run commands, or analyze code.
On top of that, they’ve built their own backend that handles authentication, usage tracking, and sending requests and responses back and forth, so everything works reliably at scale. It’s a large distributed systems effort, and the team maintaining it carries on-call duty.
Then there are frontend and full-stack engineers building all their user-facing experiences where people actually use Codex.
They also have research profiles on the team, who work on research to ensure the best possible OpenAI models are created.
Development process
They start with an overall strategy, and from there, the focus is on inspiring people and giving them the autonomy and ownership to do their best work, while still holding them accountable.
The most successful projects usually come from very small teams, often just two or three people. Sometimes an entire feature is built by a single engineer who owns everything end-to-end:
planning, implementation,
launch, positioning, communication,
and then iterating based on user feedback.
That kind of full ownership is strongly encouraged on the Codex team. Most ideas also come bottom-up, driven by people on the team getting excited about trying something new.
They also work very closely with research teams. Often, they’re inspired by new research directions or things they know are coming, and they build ahead of time, whether that’s foundational infrastructure or new product experiences.
It’s overall a very creative process for them. No one has fully figured out the right way to supervise and steer agents yet, and the Codex app is just one expression of their thinking.
Their process also contains a lot of experimentation and ideation. Not everything ships, but people are encouraged and rewarded for exploring new ideas, and they only launch the very best ones.
They keep things very lean and minimize meetings as much as possible
When meetings do happen, they’re usually spontaneous and driven by real needs. The office is set up to encourage in-person collaboration, and the leadership team makes itself extremely available so issues can be resolved in minutes or hours at worst.
They make decisions very quickly. The cost of making mistakes is much lower because Codex is always available. They can try things, observe what happens, and change course fast if needed.
As a result, they optimize heavily for speed and velocity. A lot of the processes that worked even a couple of years ago just don’t scale anymore, so they’ve essentially reinvented how they work.
The onboarding buddy for new engineers on the team is Codex
This is what I have already mentioned in the article: How to Build AI-Native Engineering Teams.
Thibault shared that Codex walks new hires through onboarding, sets up their entire computer, and helps them understand the codebase, projects, and existing features. It basically acts like a highly skilled engineering mentor.
From initial setup to being productive, most of the onboarding process is handled by the AI tool. As a result, onboarding is much faster than it used to be.
At OpenAI, they also have a strong culture of shipping on day one. It’s really important for every engineer to provide value as soon as possible.
Traditional teams typically aim to ship code to production within the first week. On their team, that should happen on the first day.
With that kind of culture and systems in place, new hires can arrive with no prior context, quickly understand the system, and ship meaningful features on their first day.
Now, let’s get to how they leverage AI to be productive.
How the Codex team uses AI
They use Codex internally for basically everything.
They’ve built a lot of custom skills that are specific to developing Codex itself. For example, one skill lets Codex run full QA on its own builds. When they ship a new CLI build, Codex can check different features and verify that nothing broke.
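The article doesn’t describe how that QA skill is implemented, but at its core it amounts to running a battery of smoke checks against a fresh build and reporting what broke. A hypothetical sketch of that idea, where the check names and commands are placeholders rather than real Codex CLI invocations:

```python
import subprocess

# Hypothetical build QA: run each named smoke-check command and
# collect the names of the checks that failed (non-zero exit code).
# The commands you pass in are placeholders, not real Codex CLI calls.

def run_smoke_checks(checks: list[tuple[str, list[str]]]) -> list[str]:
    failures = []
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            failures.append(name)
    return failures
```

An agent with a skill like this can verify its own builds: if the returned list is empty, nothing regressed; otherwise it has a concrete list of features to investigate.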
Internally, they also use sub-agents a lot. It’s still an experimental feature, but they rely on it for massive parallel testing and for large-scale refactors.
On the planning side, Codex is connected to tools like Linear, Notion, and Slack, which help them collect and synthesize user feedback much faster.
Their feedback channels are extremely active, especially the Codex one. It’s hard for any human to keep up. Codex summarizes the daily themes and helps engineers prioritize what to work on next.
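In their case a model does the summarization, and the source doesn’t detail the mechanics. But even without a model, the triage step can be sketched as tagging each message with a theme and ranking themes by volume; the themes and messages below are invented for illustration:

```python
from collections import Counter

# Hypothetical feedback triage: tag each message with the first matching
# theme keyword, then rank themes by how often they appear. A real setup
# would use a model for tagging; keyword matching stands in for it here.

def rank_themes(messages: list[str], themes: list[str]) -> list[tuple[str, int]]:
    counts = Counter()
    for msg in messages:
        for theme in themes:
            if theme in msg.lower():
                counts[theme] += 1
                break  # one theme per message
    return counts.most_common()
```

The output is a ranked list of themes with counts, which is exactly the shape engineers need to decide what to work on next.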
If you walk around the office, you’ll see Codex on almost everyone’s screen. It’s editing videos for YouTube, prioritizing Linear tickets, and running large refactors, such as their recent Python-to-Rust rewrites. Codex is absolutely central to how they work.
Codex automatically reviews all pull requests
They’ve set Codex up with custom instructions so it enforces very specific standards: making sure the right structures are followed, module boundaries are respected, semantics are correct, and coverage meets the bar they’ve set.
All of that happens automatically on every PR. On top of that, they do a lot of local code review with Codex as well. Engineers can run a simple “review” command and iterate on it in a loop.
With sub-agents, this has become even more common, as people will often have multiple sub-agents review a PR before it’s ever shared. This process regularly surfaces small issues and improvements they would have otherwise missed.
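The exact mechanics of the sub-agent fan-out aren’t described in the source, but the pattern of sending the same diff to several reviewers in parallel and merging their deduplicated findings can be sketched like this; the reviewer functions are stand-ins for sub-agents, not a real API:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical review fan-out: run several reviewer functions (stand-ins
# for sub-agents) on the same diff in parallel, then merge their findings,
# dropping duplicates while preserving reviewer order.

def multi_review(diff: str, reviewers: list) -> list[str]:
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda review: review(diff), reviewers))
    findings = []
    for result in results:
        for finding in result:
            if finding not in findings:   # dedupe across reviewers
                findings.append(finding)
    return findings
```

Because each reviewer looks at the diff from a different angle (style, safety, performance), the merged list regularly surfaces issues a single pass would miss, which matches the team’s experience described above.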
A lot of time also goes into maintaining a healthy, well-structured codebase so they can keep moving fast over the long term.
What they’ve learned is that without the right guardrails and structure, it’s easy to drift in the wrong direction and eventually slow yourself down.
Codex helps them keep quality high while still accelerating.
How they structure their prompts to use AI effectively
Of course, the Codex team uses Codex, but you can use your AI tool of choice. Thibault’s recommendation is to start with the following.
Have deep design discussions with the model
That’s also why they shipped Plan Mode, for users who want more structured help. Plan Mode invites the model into a back-and-forth conversation rather than a one-shot answer.
It helps surface requirements you didn’t even realize were missing.
Often, what you ask for is underspecified. For example, saying “make the backend faster” sounds clear, but it’s not.
In Plan Mode, Codex will ask follow-up questions:
Which endpoints are on the critical path?
Can we look at the logs?
What trade-offs are we willing to make?
That conversation helps clarify what you’re actually trying to optimize.
The next step is to keep asking critical questions
Once you land on an implementation, dig into why it works, how it works, and what the alternatives are.
One of the great things about AI is that trying different approaches is almost free → you can explore multiple implementations in parallel and make informed trade-offs.
The last recommendation is to actively look for bugs
No software is bug-free. Thibault mentioned that every time he does a deep bug-finding session with Codex, he uncovers something: maybe not a bug yet, but a latent issue worth improving.
One additional thing: they also encode many of their best practices into reusable skills and share them across OpenAI. They now have hundreds of these skills.
These skills significantly speed up onboarding for new teammates, because new hires start with an already optimized, “upgraded” version of Codex rather than a blank slate.
This has become an important way they scale knowledge and accelerate how quickly people can be effective.
A lot of their code is AI-generated through Codex
There are still cases where code needs to be edited manually, but that’s becoming less and less common. Internally at OpenAI, the Codex app has been very successful, and a lot of their design, implementation, and day-to-day work has shifted into it.
As for the exact percentage of manual coding versus using Codex, they don’t have a precise number they can share at the moment.
What they can say is that the Codex app acts as a companion to other tools, especially IDEs. It can use the IDE context, like which files are open, so running both side by side works really well.
In practice, it’s becoming clear that a large portion of the work gets done directly in the Codex app, with manual edits used mainly for fine-tuning. The two workflows complement each other extremely well.
OpenAI likes to hire engineers at 2 extremes
And to wrap up, let me share an interesting insight from a recent conversation with Sulman Choudhry, Head of Engineering at OpenAI.
I’ll be doing a deep dive on the overall OpenAI engineering culture soon, so stay tuned for that!
Interestingly, Sulman mentioned that OpenAI likes to hire engineers at 2 extremes:
Really great generalists
Experts in a really specific thing who think outside of the box
I really like that, as I see big benefits in having great generalists on the team: they can multiply everyone’s effort and tackle many different problems.
And then there are a lot of benefits to having people who can think outside the box and challenge the status quo. I like to see such people as “disruptors” who can really make a difference with their specific knowledge and unique point of view.
I think more companies will go this route, and both are valid paths for engineers.
My advice is to play to your strengths and decide whether you want to be a great generalist or focus on being exceptionally good at one specific thing.
Last words
What stands out most about the Codex team isn’t just the technology they build, but the way they’ve reimagined how an engineering organization can function in an AI-native world.
They operate with a high level of trust, autonomy, and speed → decision-making is fast, ownership is clear, and individuals are empowered to build end-to-end.
And they have truly embraced AI as a way to be more productive. AI isn’t treated as a productivity hack or an assistant on the side; it’s a first-class teammate that shapes planning, execution, review, onboarding, and iteration.
The result is a team that moves with startup-level velocity while working on some of the most complex systems in the world.
Liked this article? Make sure to 💙 click the like button.
Feedback or addition? Make sure to 💬 comment.
Know someone who would find this helpful? Make sure to 🔁 share this post.
Whenever you are ready, here is how I can help you further
Join the Cohort course Senior Engineer to Lead: Grow and thrive in the role here.
Interested in sponsoring this newsletter? Check the sponsorship options here.
Take a look at the cool swag in the Engineering Leadership Store here.
Want to work with me? You can see all the options here.
Get in touch
You can find me on LinkedIn, X, YouTube, Bluesky, Instagram or Threads.
If you wish to request a particular topic you would like to read about, you can send me an email at info@gregorojstersek.com.
This newsletter is funded by paid subscriptions from readers like yourself.
If you aren’t already, consider becoming a paid subscriber to receive the full experience!
You are more than welcome to find whatever interests you here and try it out in your particular case. Let me know how it went! Topics normally cover all things engineering-related: leadership, management, developing scalable products, building teams, etc.











