The truth about promotions at big tech companies: cleaning up the ugliest "shit mountains"
Compiled by Wang Qilong
Source: youtu.be/hN5ZFzWFhhg
Produced by AI Tech Base
In the Silicon Valley engineer hierarchy, a select few stand at the absolute apex of the pyramid. They don't write flashy frontends or fancy products; they lurk deep within operating systems, wrestling with compilers, build systems, and virtual file systems. Their purpose is to ensure that monolithic codebases like Meta's, with billions of lines of code and tens of thousands of engineers, don't collapse with every keystroke.
Michael Bolin is a true "sweeping monk" (the hidden grandmaster of wuxia novels) guarding the foundations of big tech's technical hierarchy.
As a former Meta Distinguished Engineer (DE, level E9, the ceiling of big tech's technical ladder), he once led the rewrite of Facebook's Android build system. Because the codebase was so massive, he helped build a virtual file system that intercepts file operations at the OS level, just to keep engineers' laptops from grinding to a halt on a simple git status.
For a long time, Bolin defined himself as "a ruthless code-writing machine."
But this sovereign of traditional engineering had his worldview shattered after jumping to OpenAI to lead Codex, the company's coding agent (an earlier model of the same name powered GitHub Copilot).
In a recent episode of The Peterman Pod, Bolin admitted with brutal honesty, tinged with self-mockery: "I barely write code by hand anymore. Eighty to ninety percent of my code is generated by AI."
We have distilled the sharpest core assertions from this conversation for you:
The "Heroism Trap" of Big Tech Promotions: Many senior engineers fail to get promoted because they always want to write perfect "toys" from scratch, rather than take over the ugly but mission-critical legacy shit mountains. To reach E9 at Meta, you must be a "politician" willing to do the dirty work.
Culture Shock: Research-Led vs Engineering-Led: At Google and Meta, engineers are absolute kings; but at OpenAI, researchers (scientists) are kings, and engineers are just "support staff" building compute infrastructure. This gap has caused countless big tech elites to suffer culture shock when jumping to AI labs.
The Ultimate Redemption for Top Engineers Is "Writing Documentation": Bolin revealed that his most critical weapon for promotion wasn't writing code at all, but "writing technical roadmaps that executives and cross-functional teams could understand."
The Moat Is Shifting in the AI Era: Although AI writes 90% of his code, large language models are prone to "hallucinations," so deep systems understanding (memory allocation, pointers, assembly) has become his trump card for spotting the AI's errors quickly.
Here is the translated transcript of this conversation.
From Firefox Extensions to Google Calendar: "Wilderness Survival"
Ryan Peterman (Podcast host, former Instagram software engineer, hereinafter "Ryan"): Welcome to the show. Today we are honored to have Michael Bolin. He is the technical lead for OpenAI's open-source project Codex and was formerly a Distinguished Engineer (E9) at Meta. Michael, let's start with your early career. I dug deep into your personal website and found a project you were once excited about but now has broken links everywhere—"Chickenfoot." What on earth was that?
Michael Bolin (hereinafter "Bolin"): Oh my, that does take me back. It was actually my master's thesis project, a Firefox browser extension. You have to understand, writing JavaScript for Firefox back then was absolutely avant-garde for a graduation project.
Simply put, it was a small programming tool integrated into Firefox's sidebar. Its core concept was "end-user programming for the web." It offered macro commands like enter and click. You could type enter, pass in a string argument, and it would automatically find the corresponding input field on the webpage; you say click search, and it would click the search button.
Under the hood, we built massive amounts of extremely complex heuristics. It would parse the DOM structure of webpages, look for text like "first name" and "last name," then locate the text box nearest to those labels, and finally use JavaScript to auto-fill your input.
Looking back now, it's fascinating. Because what many AI Agents are doing today is exactly what we did in Chickenfoot—only now they use real natural language large models, while we back then cobbled it together with crude JavaScript scripts.
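The label-matching heuristic Bolin describes can be sketched in a few lines. This is a hypothetical reconstruction in Python (the original was JavaScript running inside Firefox, and far more elaborate): pair each run of label text with the input field that follows it in document order, then match the user's query against those labels.

```python
from html.parser import HTMLParser

class LabelFinder(HTMLParser):
    """Pair each run of label text with the next <input> that follows
    it in document order -- a crude stand-in for "nearest label"."""
    def __init__(self):
        super().__init__()
        self.last_text = ""
        self.pairs = []          # (label text, input name)
    def handle_data(self, data):
        if data.strip():
            self.last_text = data.strip().lower()
    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.pairs.append((self.last_text, dict(attrs).get("name", "")))

def find_field(html, label):
    """Return the name of the input whose nearby label matches the query."""
    parser = LabelFinder()
    parser.feed(html)
    for text, name in parser.pairs:
        if label.lower() in text:
            return name
    return None

form = '<form>First name: <input name="fn"> Last name: <input name="ln"></form>'
print(find_field(form, "last name"))  # prints: ln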
Ryan: So it was essentially parsing frontend code to provide an interactive console letting you manipulate webpages through natural language commands?
Bolin: Exactly. We utilized webpage accessibility tags, image alt text, and everything else we could grab. It worked particularly well on extremely bare-bones sites like Craigslist. Some friends even used this tool to write automation scripts and make money.
Ryan: Later you officially entered the industry, and your first stop was Google, where you immediately took on the Google Calendar project. What was the vibe at Google back then? What attracted you?
Bolin: This was the early 2000s. When I first encountered the internet in the 90s, if you wanted to search for something, you had to open five different search engines at once (Yahoo, Lycos, and so on). I clearly remember in March 2000, my roommate told me: "Hey, there's this search engine called Google from Stanford that seems better than the others."
I tried it and found not only was the search quality high, but the page was extremely clean. Yahoo's homepage was stuffed with dizzying ads and portal links, while Google's homepage was pristine and restrained. That engineering culture, focused on quality rather than short-term traffic, made me want to work there from the day I graduated.
Joining the Google Calendar team was perfect timing for me. Microsoft's IE browser held absolute dominance then, and Microsoft had even canceled further IE development. In this hostile ecosystem, Google was trying to break the deadlock with web applications. JavaScript engineering quality at the time was terrible; everyone treated frontend work as "toy code." But our team attracted genuinely excellent engineers, and we were bringing desktop-app experiences (drag-and-drop scheduling, refresh-free loading) into the browser. That was groundbreaking at the time.
Meta Chronicles (Part 1)—"Violently Dismantling" Build Systems in Million-Line Codebases
Ryan: After several years at Google, you jumped to Facebook (now Meta), which was then at its zenith. I understand you were a top JavaScript expert there, but the first major project you took on was refactoring the Android build system. How did that happen?
Bolin: Facebook back then had a crazy hackathon culture. The company had just decided: "Mobile is the future," making it a life-or-death priority.
One day a friend came to me: "Hey, I know you're good at Java; you should learn Objective-C and do mobile. Everyone who wants to build products has to understand mobile development now." That stung, because I simply didn't like Objective-C.
My initial role was helping the company develop a PhoneGap-like framework (an early technology for packaging web pages into native apps). I gathered a small group to build the mobile Facebook app in HTML5. But in practice the web-based app was a terrible experience, with agonizingly slow performance. Eventually Mark Zuckerberg personally decided to scrap the plan and announced a full pivot back to pure native development.
At this juncture, I faced a choice: either bite the bullet and learn the native languages I disliked, or find something else to do.
Fortunately, I found a pain point. At the time, Facebook's Android team was struggling daily. For every frontline engineer, the time between modifying a line of code, clicking "compile," and seeing results in the simulator (iteration cycle) was the core metric determining productivity.
But Facebook was using an antiquated build system based on Ant (Apache's early build tool). The build process had no modularity and no caching: changing a single line of code meant recompiling the entire massive codebase, which took ages.
I thought: "I know Java; this low-level build grunt work can't possibly defeat me." I originally just wanted to fix a few bugs, but the deeper I went, the clearer it became that the system's foundation was completely rotten and had to be rebuilt from scratch.
Ryan: But why didn't you just use Google's existing tools? I remember Google already had extremely powerful internal build systems (later open-sourced as Bazel).
Bolin: That's an excellent question. Indeed, many people at the time, especially those who had jumped from Google, kept saying: "We had awesome tools at Google, why don't we just copy them?"
The truth is, we did try. We even recreated a micro version of Google's build system internally. But the problem was that Facebook's codebase structure was completely different from Google's.
Google's codebase was extremely standardized, with clear boundaries for every module. But Facebook's Android codebase at the time was a massive creature full of historical baggage, with various messy resource files (XML, images) and highly customized scripts intertwined. If we forcibly applied Google's system, we'd have to rewrite all the underlying logic.
More fatal was the timeline. This was a life-or-death moment for the company's mobile pivot; management's order was: "We need to improve iteration speed immediately, even just a little bit!" We simply couldn't wait a year to rebuild a perfect Google architecture.
This was the background for my leading development of Buck (Facebook's open-source build tool). We initially wrote a few Python scripts trying to cache some intermediate compilation results. But as time passed, this patching reached its limit. So at a hackathon, I decided to completely abandon Python scripts and rewrite a strongly-typed, high-concurrency build system prototype in Java.
I clearly remember when I got that prototype running, compile times dropped immediately. The entire Android team was stunned. They hadn't cared about this underlying tool refactoring at first, but when I compressed compile time from 4 minutes to 1 minute, everyone became a loyal believer in the new system.
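The "cache intermediate compilation results" idea those early Python scripts implemented can be sketched as a content-hash cache. Everything here (the `CACHE_DIR` name, the `compile_fn` callback) is illustrative, not Buck's actual design: the key is a hash of the source bytes plus the compiler flags, so an unchanged input skips the compiler entirely.

```python
import hashlib
import os
import shutil

CACHE_DIR = ".buildcache"  # hypothetical cache location, not Buck's layout

def cache_key(source_path, flags):
    """Key = hash of source bytes plus compiler flags, so changing
    either one invalidates the cached artifact."""
    h = hashlib.sha256()
    with open(source_path, "rb") as f:
        h.update(f.read())
    h.update(flags.encode())
    return h.hexdigest()

def compile_with_cache(source_path, flags, compile_fn):
    """Return a compiled artifact, reusing the cache when the same
    (source, flags) pair has been compiled before."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    cached = os.path.join(CACHE_DIR, cache_key(source_path, flags))
    if not os.path.exists(cached):       # cache miss: run the real compiler
        shutil.copy(compile_fn(source_path, flags), cached)
    return cached
```

Real build systems extend this same idea across the whole dependency graph and share the cache over the network, which is where most of the speedup comes from.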
Meta Chronicles (Part 2)—Rewriting IDEs and Intercepting System Calls to Combat the "Curse of Scale"
Ryan: After solving the Android build system challenge, you seemingly didn't stop. I traced your subsequent work trajectory toward even larger infrastructure—you started rewriting IDEs (integrated development environments) and even created a virtual file system. What was happening?
Bolin: This was actually a logical evolution. After I built the build system (Buck), engineers compiled faster. But new problems emerged: as the codebase expanded exponentially, traditional IDEs couldn't handle it.
We were using Eclipse. For a massive monorepo with tens of millions of lines of code, merely opening the project in Eclipse meant half an hour of indexing. All the while your laptop fan screamed, memory got eaten up, and the whole machine locked up.
Our engineers started complaining: "Builds are fast now, but I can't even open the code!"
There were two voices in the industry then: one was to completely abandon local IDEs and move dev environments to cloud browsers (Web-based IDE); the other was finding lightweight local editors.
We chose a middle path. GitHub had just launched the Atom editor (whose Electron framework later underpinned VS Code). Atom was built on web technologies and was very flexible. We decided to build an IDE plugin system on top of Atom, tailored to Facebook's massive codebase, which we named Nuclide.
We stripped out extremely performance-heavy features like language parsing, autocomplete, and jump-to-definition, running them all on remote powerful servers. Local Nuclide editors only needed to handle UI display and receiving keyboard input. This was equivalent to giving every engineer an invisible "supercomputer."
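The thin-client split he describes boils down to a request/response protocol: the heavy analysis runs remotely, and the editor only serializes requests and renders responses. A toy sketch (the method name and JSON shape are invented, not Nuclide's real protocol; here the "server" is just a local function standing in for the remote machine):

```python
import json

# "Server" side: runs on a powerful remote machine; all heavy analysis
# (indexing, autocomplete, jump-to-definition) lives here.
def handle_request(raw):
    req = json.loads(raw)
    if req["method"] == "autocomplete":
        # Placeholder for an expensive lookup over the monorepo's index.
        return json.dumps({"completions": [req["prefix"] + "Handler",
                                           req["prefix"] + "Factory"]})
    return json.dumps({"error": "unknown method"})

# "Client" side: the laptop editor only serializes requests and renders
# the responses; no indexing ever happens locally.
def autocomplete(prefix):
    raw = json.dumps({"method": "autocomplete", "prefix": prefix})
    return json.loads(handle_request(raw))["completions"]

print(autocomplete("Request"))  # prints: ['RequestHandler', 'RequestFactory']
```

The same shape underlies today's Language Server Protocol: the editor stays dumb and fast, and the expensive work happens wherever the compute is.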
Ryan: This sounds extremely clever. But you mentioned the "virtual file system" earlier—what pain point forced that into existence?
Bolin: The virtual file system was born to fight the laws of physics.
You must understand, Facebook adhered to a "monorepo" philosophy. That means all code for all products (Facebook, Messenger, Instagram, etc.) was in one giant repository.
This led to catastrophic consequences: when the codebase expanded to millions of files, dozens of GBs in size, traditional version control systems (like Git or Mercurial) completely collapsed.
Imagine this scene: A new employee joins, wanting to modify just one line of frontend code. By traditional logic, they must clone all millions of files, dozens of GBs of data, to their laptop hard drive. This takes hours, and every time they run a simple git status command, the system must traverse millions of files checking for changes. This directly causes the terminal to freeze.
Facing this "curse of scale," conventional optimization methods failed. We had to go deep into the OS's lowest levels.
We created something called Eden (later evolved into Miles), a virtual file system. Simply put, we utilized Linux's FUSE (Filesystem in Userspace) mechanism, or similar mechanisms on Mac, to intercept the OS's file read requests.
When you browse the codebase on your laptop, you see the complete directory structure of millions of files. But in reality, your hard drive contains none of these files. They are all "placeholders."
Only when you actually double-click to open a file, or when the compiler needs to read a file, does our virtual file system instantly trigger a network request to "lazy load" that specific file from the server.
This was pure magic! With on-demand loading, clones that used to take hours finished in seconds, and status checks that used to traverse the whole disk became instant. At the lowest level of the OS, we fooled every upper-layer application.
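The lazy-loading trick can be modeled without touching FUSE at all. This toy sketch captures only the caching logic (real EdenFS intercepts kernel file-system calls, which a Python class obviously does not): the listing is known up front, but bytes are fetched from the "server" only on first read.

```python
class LazyFS:
    """Toy on-demand file system: the manifest knows every path up
    front, but file bytes are fetched only on first read."""
    def __init__(self, manifest, fetch):
        self.manifest = manifest   # full listing, known without any I/O
        self.fetch = fetch         # callback that pulls bytes from a server
        self.cache = {}            # files materialized locally so far
    def listdir(self):
        return list(self.manifest)  # instant: no network, no disk scan
    def read(self, path):
        if path not in self.cache:  # first access triggers one fetch
            self.cache[path] = self.fetch(path)
        return self.cache[path]

# Only files actually opened generate network traffic.
fetched = []
def fetch_from_server(path):
    fetched.append(path)
    return b"contents of " + path.encode()

fs = LazyFS(["a.c", "b.c", "c.c"], fetch_from_server)
fs.listdir()        # "clone" is instant; nothing is downloaded
fs.read("b.c")      # first read fetches from the server
fs.read("b.c")      # second read is served from the local cache
assert fetched == ["b.c"]
```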
From E8 to E9: A Bruising Gauntlet and a Failed Promotion
Ryan: This is unbelievable. From build systems to virtual file systems, you solved the hardest, lowest-level life-or-death problems for big tech. This naturally leads to the most watched segment of your career—your promotion from Principal Engineer (E8) to Distinguished Engineer (E9) at Meta.
Many engineers in big tech struggle with promotion. And the jump from E8 to E9 is almost a bruising gauntlet. Can you share the story behind yours?
Bolin: This was an extremely painful but completely transformative experience.
Many people misunderstand what being a senior engineer means. They think: as long as I'm a "10x engineer" who writes code faster and better than everyone else, I can be promoted all the way to the top.
At the early levels (junior to senior), that does work. But once you cross Staff (E6), let alone reach Principal (E8), "pure coding" becomes the biggest poison for your promotion.
At the time I had just finished building Nuclide and early underlying tools, I was full of confidence I'd soon reach E9. But my promotion application was ruthlessly rejected.
My first reaction was anger and confusion: "Aren't I the person in this company who understands these underlying architectures best? Don't I single-handedly produce ten people's worth of output? Why won't you promote me?"
But I later calmed down and had long conversations with several senior E9 and E10 veterans. They pointed out my problem with pinpoint accuracy: I had fallen into the "heroism trap."
During that period, I was used to discovering a problem, then going off alone, using extremely high technical skills to write a brand new tool from scratch, then running to tell everyone: "Look, I built a better wheel, everyone come use it!"
But this was an extremely arrogant approach. When company scale reaches thousands of engineers, you forcibly pushing a new tool means breaking everyone's existing workflow. You encounter massive resistance.
What truly creates impact at the E9 level isn't "building new wheels," but "solving those unclaimed, extremely ugly systemic problems, and taking everyone with you."
Ryan: In other words, you need to transform from a pure tech geek into a technical leader with extremely strong political acumen and evangelism capabilities?
Bolin: Exactly right. After that failed promotion, I set aside my pride in "hand-coding." I started putting massive energy into non-coding work and cross-department coordination.
Like the virtual file system Eden mentioned earlier. This project's resistance was extremely huge, because it required modifying not just the underlying client, but also full backend server cooperation, and changing thousands of people's development habits across the company.
I spent months no longer writing C++ or Java, but frantically writing documentation (Tech Specs), writing strategic plans. I had to prove to executives in documents why this extremely risky thing was absolutely necessary; I had to lobby backend storage teams to convince them developing a dedicated API for us was worth it; I had to pacify frontline business engineers, promising the new system wouldn't make them lose data when it launched.
This was essentially a jigsaw-puzzle game: find a company-level pain point, piece together resources from scattered teams like puzzle pieces, and win that battle.
When I finally led the team to clean up that extremely ugly, extremely massive legacy shit mountain, with its countless entangled interests, my E9 promotion was almost a natural consequence. It flowed like water finding its channel, rather than something I forced.
Joining OpenAI, the "Culture Shock" of Research-Led Organizations
Ryan: After reaching the pinnacle of engineering honors at Meta, you made a decision that surprised many—leaving this giant where you had fought for 11 years and were extremely proficient in its tech stack, instead joining OpenAI, which was then far smaller than it is today. What drove this decision?
Bolin: In my last year at Meta, I actually fell into some degree of burnout. I had been in the underlying architecture domain too long; I could clearly see the ceiling. All optimizations had diminishing marginal returns: spending another year might only improve some system's performance by 5%. This patching work made me lose the original excitement of changing the world.
Right at that time, late 2023, the large language model (LLM) wave completely exploded. I started frantically reading papers about Transformers, attention mechanisms in my spare time. I suddenly realized this was a completely new dimension of computing paradigm.
I felt that if I missed this bus, my technical career might stop there. Just then OpenAI was recruiting senior people who understood large-scale engineering foundations, so I submitted my resume without hesitation.
Ryan: When you step from a traditional internet giant into an AI lab at the crest of the wave, what's the biggest shock?
Bolin: Absolutely heaven-and-earth "culture shock."
At traditional giants like Meta and Google, their core culture is "engineering-led." What does this mean? It means software engineers (SWE) are the company's core assets. Product managers propose requirements, engineers decide architecture and schedule, then write the code and launch. All halos, resources, and promotion channels are built around engineers.
But the moment you step into OpenAI, you immediately sense different air—this place is "research-led."
Here, the true core trump cards are researchers with mathematics and physics backgrounds. Their daily work is deriving formulas, adjusting model structures, observing loss curves.
And we big tech top engineers with glamorous resumes, to some extent, became "support staff."
Our duty is no longer deciding product direction, but like logistics troops, desperately building the most extreme distributed compute clusters, optimizing GPU memory utilization, building data cleaning pipelines. All our engineering efforts are to allow those researchers to run their next experiments.
If you're an engineer who craves personal heroism and yearns to own the product, this gap will cause you real pain. But I personally enjoy this state very much. In this completely new domain, I'm back to being a primary-school student. I set aside my E9 airs and started learning from scratch how to talk with scientists, how to translate their wild mathematical theories into stable C++ and CUDA code running on thousands of H100 GPUs. It's thrilling.
Ryan: This inevitably leads to the core project you're responsible for at OpenAI: Codex (an earlier model of the same name was the brain behind GitHub Copilot). I heard Codex originally emerged from a hackathon?
Bolin: Yes, that was an extremely crazy weekend.
Actually OpenAI internally had engineers using early code generation models to assist work. But at an internal hackathon, a few of us suddenly thought: "What if we package this model as a tool directly callable from the command line (CLI)?"
I spent very little time writing a crude terminal wrapper. The result dropped everyone's jaw. You just type in English in the terminal: "Rename all files with the .txt extension in the current directory to .md, and remove spaces from the filenames," and it instantly generates a perfectly working Python or Bash script.
This tool immediately spread virally within the company. It proved one thing: AI doesn't just teach you to write code, it can directly do the chores for you.
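For the rename request in the anecdote above, the generated script might look something like this (a plausible sketch, not actual Codex output; the function name is ours):

```python
from pathlib import Path

def tidy_filenames(directory="."):
    """Rename every *.txt in the directory to *.md, stripping spaces
    from the filename along the way."""
    for p in Path(directory).glob("*.txt"):
        p.rename(p.with_name(p.stem.replace(" ", "") + ".md"))
```

Trivial to write, yes, but exactly the kind of five-minute chore people used to put off rather than script.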
This is also why we later decided to completely open-source the Codex test suite (Harness). We want the entire open-source community to see this isn't magic, but real engineering standards. We're trying to define the evaluation standards for future AI coding assistants this way.
Ryan: As a top underlying systems engineer, how much time do you spend coding daily now?
Bolin: This question hits the nail on the head, and is also what I've reflected on most recently.
The former me, as I said, was a code-writing machine. I extremely enjoyed that thrill of flying fingers across the keyboard, line by line knocking out exquisite logic.
But now, if you asked me: "Of the code you wrote yesterday, how much did you type by hand?" I would very honestly tell you: "Probably less than 10%. Many times this number approaches zero."
My daily workflow has completely changed. What I do now is: open the editor, write an extremely detailed English comment (Prompt), describing the data structure, interface logic, and edge cases I need to implement. Then I press a hotkey, watching AI "spit out" hundreds of lines of code in seconds.
Then my role transforms from a "writer" to a "reviewer."
I use my twenty years of accumulated systems engineering experience to examine if the AI-generated code has memory leak risks, concurrency deadlock hidden dangers, or ignores certain edge test cases. If so, I point out the problem and let it regenerate.
The biggest relief is that AI now handles what I used to hate most: writing unit tests and configuring CI/CD scripts. Now I just say, "Generate test cases with 100% coverage for the function above," and it produces complete test code, even auto-generating mock data.
This gives me massive amounts of time to think about higher-dimensional system architecture design, rather than wasting life in syntax details.
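The mock-backed test scaffolding he describes delegating to AI typically looks like this; the function under test and all its names are hypothetical, chosen only to show the pattern of fake dependencies plus edge-case coverage:

```python
import unittest
from unittest.mock import Mock

def send_report(client, rows):
    """Hypothetical function under test: upload each row, return count."""
    for row in rows:
        client.upload(row)
    return len(rows)

class TestSendReport(unittest.TestCase):
    def test_uploads_every_row(self):
        client = Mock()                      # fake dependency: no network
        rows = [{"id": 1}, {"id": 2}]        # auto-generated mock data
        self.assertEqual(send_report(client, rows), 2)
        self.assertEqual(client.upload.call_count, 2)

    def test_empty_input(self):              # edge case: nothing to send
        self.assertEqual(send_report(Mock(), []), 0)
```

The human's job shifts to checking that the generated cases actually cover the edge conditions that matter, rather than typing out the boilerplate.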
A Veteran's Parting Advice: Understand the Low Level, Strike from a Higher Dimension
Ryan: This is truly shocking. For young engineers still in school or just entering the workforce, hearing an E9 veteran say he no longer handwrites code might cause them to panic: "Since AI can write code, what's the point of me desperately studying data structures and algorithms?" What advice would you give them?
Bolin: I completely understand this panic. But I must make one point clear: deep technical fundamentals will remain your sturdiest moat for a long time.
Why? Because current AI is essentially still a probability-based model. It hallucinates; it generates seemingly perfect but actually fatally buggy code.
If a junior engineer relies entirely on AI, then when a production system crashes and they're facing gigabytes of logs and garbled stack traces, they'll be helpless. They fundamentally won't know what happened underneath.
The reason I can confidently rely on AI to write 90% of my code is precisely because I have that 10% underlying control. I understand C++ memory allocation principles, I understand OS thread scheduling, so I can instantly judge whether AI is "talking nonsense" with one glance at the generated code.
In this era, AI is a machine gun with infinite ammunition in your hands. But if you don't know where to aim, or even how to clear a jam, that gun can turn on you.
Ryan: Regarding personal ability improvement, what special experiences can you share?
Bolin: I strongly suggest all engineers wanting to improve systems depth go play with CTF (Capture The Flag, cybersecurity competitions).
This is my personal secret. Solving those fiendishly difficult low-level security vulnerabilities forces you to deeply understand assembly, how registers work, and every byte of a network protocol. This goal-driven, puzzle-like training is a hundred times more effective than grinding through textbooks.
Also, if I had to recommend technical books, I push two. One is about underlying OS (like the classic dinosaur book), and another is about technical writing.
Ryan: Technical writing? That doesn't sound very "hardcore."
Bolin: This is precisely the most advanced form of hardcore.
Like I said when summarizing how to promote to E9: When you're trying to leverage a cross-departmental project with hundreds of people, writing good code is useless. You need extremely clear, persuasive, logically rigorous business/technical documents to convince those VPs and finance directors who don't understand code at all.
If you only know how to write code, you're at best a senior craftsman; but if you can use words to build a grand technical vision and make everyone willingly follow you, you're a true technical leader.
Ryan: Thank you so much, Michael. You've shown us an extremely vivid personal microcosm crossing from the traditional software era to the AI era.
Bolin: Thank you for having me. I'm happy to share the pitfalls I've fallen into and the scars I've earned. To all the engineers still debugging late into the night: good luck.