AI Can Already Write 80% of Code, But Agents Have a Fatal Flaw! OpenAI Codex Tech Lead: Asking the Wrong Question is Worse Than Not Knowing How to Write


Translated by He Zike | Planned by Tina | Edited by Cai Fangfang

"Most engineers adapt to tools, while a few rewrite them out of dissatisfaction."

Michael Bolin, the technical lead for OpenAI Codex, is a classic example of the latter. From Google Calendar to Facebook's Buck build system, the Eden virtual file system, and now OpenAI's Codex, this engineer's career trajectory has spanned the key evolution of software engineering infrastructure over the past decade.

In a recent episode of The Developing Dev podcast, host Ryan engaged in a deep dialogue with Michael Bolin, covering 20 years of engineering practice. In this interview, he reviewed his transition from a "JavaScript engineer" to leading the development of tool systems, candidly discussing misjudgments, capability boundaries, and the cost of growth. More importantly, he attempted to answer a question all engineers are currently facing: In an era where AI is reshaping development methods, which capabilities are still worth persisting in, and which must be re-understood?

In his view, the real gap has never been about the speed of writing code, but rather what problems you choose to solve and how you define a "better system."

Several key takeaways worth noting include:

  • Many engineering breakthroughs stem from "dissatisfaction with the status quo" and rapid hands-on verification.
  • An engineer's influence ultimately depends on whether they solve problems the company truly cares about.
  • In the age of AI programming, 80%-90% of code can be generated by models, but critical parts still require human oversight.
  • Compared to writing code, the ability to ask the right questions is becoming increasingly important.
  • In the long run, the execution of programming agents will migrate more to the cloud rather than remaining local.
  • While AI seems to know everything, the ability to understand how underlying systems work remains crucial at this stage.

The following is a translation of the entire interview, edited by InfoQ without altering the original meaning.

From Engineer to Tool Creator: A Growth Path Driven by Questions

Ryan: I dug deep into your website and found that almost everything has a lot of information to mine. There was one project you poured a lot of heart and soul into, but now all the links are dead, and I can't find any related data. So, what exactly was Chickenfoot?

Michael: That's a long story. It was my master's thesis project, a Firefox extension, and one of very few thesis projects at the time written in JavaScript. It was essentially a small programming tool embedded in the Firefox sidebar, like a real-time interpreter that end users could call up at any time. The core idea was to enable programming against the web.

It exposed functions like "enter" and "click" that took string arguments and automatically located the right element; for example, "click search" would execute a click on the search button. The bulk of the work was building the underlying heuristics: given "enter first name," it would recognize the keyword "first name," locate the nearest text box, and then use JS to fill in the content.
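As a rough illustration of that kind of heuristic (this is a hypothetical sketch, not Chickenfoot's actual code, and the field names are made up): score each form field's visible label against the user's phrase by word overlap, then pick the best match.

```python
# A hypothetical sketch of Chickenfoot-style keyword matching (not the
# original implementation): given form fields with visible labels, find
# the field whose label best matches the user's phrase.

def score(phrase, label):
    """Word overlap between the user's phrase and a field label."""
    p, l = set(phrase.lower().split()), set(label.lower().split())
    return len(p & l) / max(len(p), 1)

def find_field(phrase, fields):
    """Return the field whose label best matches the phrase, the way
    enter("first name") would locate the nearest matching text box."""
    best = max(fields, key=lambda f: score(phrase, f["label"]))
    return best if score(phrase, best["label"]) > 0 else None

fields = [
    {"label": "First Name", "id": "fn"},
    {"label": "Last Name", "id": "ln"},
    {"label": "Search", "id": "q"},
]

print(find_field("first name", fields)["id"])  # fn
```

The real system layered more heuristics on top (proximity of label to input, element types), but the core idea is the same: map a natural-language-ish phrase to a concrete page element.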

Looking back, it's quite interesting. Much of what we did then is actually very similar to the principles behind current AI programming assistants—except now we've truly achieved natural language processing, without relying on the JS workarounds of the past.

Ryan: Interesting. So its function was to parse the front-end interface and translate user instruction descriptions into corresponding operations through the interactive interface.

Michael: Exactly. We used techniques like accessibility labels and toggles to turn text into functions, and this scheme worked particularly well on Craigslist—after all, that's one of the simplest websites. And I actually had a friend who used this tool to automate tasks and even made real money thanks to this efficiency advantage, which was indeed very interesting.

Ryan: You joined Google right out of the industry and participated in the Google Calendar project with great enthusiasm. What attracted you to Google? What memories did that experience leave you with?

Michael: I got into the internet back in the 90s. I remember browsing the web and having to switch between a bunch of search engines to find the information I needed. I still clearly remember in March 2000, my roommate told me, "Hey, there's a new search engine that looks pretty good." It was hosted under a Stanford University domain at the time.

I immediately realized Google Search was superior. After studying it closely, I found its interface was worlds apart from other search engines—Yahoo's was cluttered and messy, while Google deliberately pursued a minimalist design that showed much stronger design principles. Later, as everyone knows, many companies started emulating this style. Then people around me started going to work at Google, and I thought: "Okay, they're hiring a bunch of top talent."

I really hoped to work with those people; they truly, truly understood the Web. In contrast, Microsoft at the time didn't understand it at all and announced the discontinuation of the IE project. I felt this was the portal to the Web, yet Microsoft planned to dismantle it. Google was clearly more forward-looking regarding the Web, plus the engineering quality of the projects and the massive impact they brought ultimately made Google my most desired destination upon graduation.

Ryan: What was the cultural atmosphere like at Google back then? I remember you mentioned in an article that there was also a divergence between product and infrastructure business lines at Google.

Michael: Many companies, especially after reaching a certain scale, have a very obvious tendency: the business area that the founding team was originally best at and most successful in often gets "favored" for a long time.

This was very typical at Google. Whether it was information retrieval or underlying infrastructure, these were the core capabilities supporting the company's growth and naturally held higher status internally.

The reason I was attracted to Google initially was largely because they launched products like Gmail. These products seemed more "user-experience-oriented" at the time, with directions that were relatively open and full of imaginative space. But within the company, their status could never compare to core businesses like Search.

For example, the Google Calendar project I participated in was mainly aimed at the consumer market, although it also had some enterprise sales scenarios. But from a business perspective, it wasn't the company's revenue core. To some extent, we were more like a "service-support" product team rather than the type of business that directly generates revenue. That was roughly the situation back then.

Ryan: You eventually left Google. Judging from your posts, your time at Google was a bittersweet experience. So what prompted you to choose to leave?

Michael: I worked at Google for four years. To be honest, the key factor that made me choose to leave was personal planning. First, after working for four years, I had some spare money, so I had more choices to consider. More importantly, I discovered I had a bad habit: I always liked to pour all my energy into projects that were very important to me personally but not necessarily important to Google. For instance, the Calendar project I was responsible for fell into this category.

Later, I switched to working on the efficiency tool Google Tasks, which was a tiny functional module under Calendar. The user base was two orders of magnitude smaller than Calendar, but I was still passionate about it. Equally fascinating to me were the JS infrastructure and the Closure tool suite. These projects were certainly exciting, and I enjoyed the development process and felt proud of my contributions—I even wrote a book specifically for Closure, so I was full of drive. But from a career development perspective, this wasn't the wisest choice, right?

I thought to myself: I'm working on high-quality engineering problems, so why is it always other people who get recognized? What's the point of working so hard? Maybe choosing the right problem matters more than sheer effort. So I realized it was perhaps time to try a different path: either focus on the work I'm passionate about, or devote myself to the direction the company values most.

Ryan: Later, you returned to a big tech company, joining Facebook at the time. I understand you were already an expert in the JS field, and your first major project at Meta was building a toolchain for the Android codebase. Can you tell us the behind-the-scenes story of your involvement in this project?

Michael: At that time, there was a very clear direction within the company: Facebook was going to make a phone.

Although there had been some failed attempts before, the atmosphere this time was clearly different; everyone generally felt "this time we're really going to make it happen." The company's plan was to partner with HTC to customize Android and make some new attempts on that basis.

For me, having just joined the company, this was very exciting. I had done quite a bit of Java-related work before—although overall I leaned more toward JavaScript, this project gave me the opportunity to work with more Java.

At that time, there was also a direction inside the company called "Face Web," which was essentially hoping to move the HTML5 and Facebook Web experience directly to mobile phones. But very quickly, everyone realized this path wouldn't work. Meanwhile, one thing became increasingly clear: the mobile end would become the key battlefield determining the company's success or failure.

It was also around this time that a friend told me: "I know you really like JS, but it's best to devote more energy to drilling into Java or Objective-C now, otherwise switch to product management." Looking back, this was indeed a very important and very timely piece of advice.

I thought to myself: I really don't like Objective-C, Java is better. So, I joined the project. Our timeline was extremely tight at the time because, unlike most other projects, this one had a hard deadline. Other projects could be released once completed normally, but this one required submitting results to HTC on time to ensure they had enough time to burn the code into phones around March 1st.

So the whole process was a mad dash, and the initial Android codebase had actually been taken over directly from an outside contractor. Big tech companies are pretty much the same in this respect; Facebook didn't want to develop native apps itself, so it paid someone else to do it. As a result, after the app shipped, the contractor washed their hands of it and dumped the code on us. Honestly, that pile of junk should have just been thrown away, but they kept holding onto it—we inherited that mess, and the demands on iteration speed were quite high.

I believe anyone who has done long-term web development is used to the edit-then-refresh workflow. But the Android build system was... particularly rough. The Ant-based build simply didn't modularize well; the best we could do was forcibly split the code into four or five modules.

Every development cycle was painful. I thought to myself: We must reorganize this build system. I've done quite a bit of work in Java; it shouldn't be this slow, it shouldn't be this laggy during iterative builds. Facebook has a hackathon culture, so I decided to organize a hackathon to directly build a new build system. Aiming for the style of Google's build system, let's get it done cleanly and efficiently.

Actually, there was another build system in the company at that time called FB Build, itself a knock-off of Google's build system. It was written in Python and only supported C. But at that point I figured I could either fix this thing, give up and coast... or just quit. After all, fixing up old projects is the most annoying kind of work.

Ryan: If you hadn't fixed it, would you have quit?

Michael: At least I would have applied to switch projects, or found a way to make myself happy, so I could come to work happily every day. I come to work to write code, to get things done, and to maximize my abilities as much as possible.

It's quite interesting. Looking back, I still admire the people around me at that time—almost everyone told me that what I was doing was a terrible idea. Basically, no one was optimistic about it, except for one person.

At that time, I held the position of Senior Android Engineer, and no one really stopped me. People would express opposition, but they wouldn't directly say "no." This felt a bit different from my experience at Google—at Google, many things would be explicitly vetoed.

So I continued in this direction. Very quickly, we produced a significantly better version—performance improved by about double. Once the results came out, everyone's attitude changed accordingly. Many people started to realize: "Okay, this direction is indeed better, so let's go with this."

Ryan: I've noticed a certain inertia at big tech companies: no one wants to touch those mountains of accumulated problems. Many other engineers noticed the same issues, but as long as they felt there was some way to work around them, they couldn't be bothered to start a new project from scratch. Besides, Google already had its own build system over there, so maybe you couldn't win anyway. So what made you firmly believe your project could beat the alternatives and become the default option?

Michael: There were a few main points. First, as I said, I've worked on other Java projects, so I felt the original build tools shouldn't be this slow. Or rather, from the perspective of a software engineer's experience, this kind of underlying implementation shouldn't be so inefficient. Actually, most of the opposition was based on rigid logic: if we deviate from the standard solution, we will lose standard support. Or, what if next week the standard performance improves by 100 times, but the new solution can't inherit it, what then?

Thinking about it carefully, it's ridiculous. After all, Facebook engineers had developed their own PHP virtual machine and language; they had clearly embraced innovation before. I don't know why this case was the exception. What I want to say is that the entire mobile project at that time was feeling its way forward; I was full of anxiety and under severe time pressure.

Senior personnel seemed to want to treat this project as a scientific experiment, but the problem was we were facing a hard deadline. But could this really maximize the use of our time? Fortunately, it succeeded in the end. Additionally, I deliberately downplayed the infrastructure nature of the project at the time, only emphasizing "building a build system for Android," and absolutely dared not expose too much business expansion ambition.

I had no plans to step on other teams' turf, because that would definitely trigger more friction. So in the design I made sure it could support more teams, but I never pushed adoption. Then, about a year later, the iOS team came over of their own accord and asked, "Our build system is terrible; can we collaborate on Buck together?" I responded on the spot: "Of course, come on over, friend."

Ryan: This point is very interesting—you lacked credibility when you first joined, as all newcomers have to go through this process. Later, to advance your own project, you had to gain support from more people. And everyone said don't do it this way; you had to convince them and make them believe this was the right direction. So on what basis did you drive this change under the condition of lacking credibility?

Michael: I actually took a shortcut. At that time, I had a colleague named John Perlow; he was also a Senior Android Engineer and also from Google. He said, "Yes, you should do this; act quickly before anyone opposes it." He also mentioned, "If you get it done, I'll support you." With such an early supporter and a highly productive programmer, my development cycle really did speed up.

He was one of the earliest people to affirm me, helping me a lot. But I also have to admit I made big mistakes back then. You mentioned lacking credibility—when I first jumped from Google, I thought this place gathered elites from Bell Labs, pure top talent. But after arriving at Facebook, I discovered, aren't these people just college graduates... What do they know about technology?

But as mistakes happened, I gradually developed respect for them. For example, when I talked about how things were done at Google, they would often say "We don't care at all"—and in most cases, they were actually right. Yes, methods that work effectively in some places may not work equally well in others.

Ryan: One last thing I want to ask: the tool you developed performed several times better than similar solutions. How was this technical intuition formed? What key designs did you make to make it so efficient?

Michael: I think the most critical point was that I really sat down and sorted through the implementation logic of Google's tool from beginning to end: What exactly is it doing? Where is the problem?

Very quickly, I discovered it had a huge problem: whenever there was any change, it would rebuild everything from scratch from the beginning. This was the root cause of its slowness, especially in incremental build scenarios, where performance was very poor.

So I started breaking it down at a deeper level: Which things depend on which? Which steps actually don't need to be repeated? There were indeed some complexities here, such as Android's resource processing, which is itself quite special and the logic is very complex. Precisely because it was complex, the system initially adopted a "simple and crude" strategy: once there was a change, clear everything and start over.

But when I truly went deep into it, I found it could be optimized—if certain inputs hadn't changed, then the results of the corresponding steps could be completely cached and didn't need to be re-executed. Once this caching mechanism was introduced, the overall speed improved significantly.
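The caching idea described above can be sketched in a few lines (this is a minimal illustration under my own assumptions, not Buck's actual implementation): key each build step by a hash of its inputs, and skip the step on a hash match.

```python
import hashlib

# A minimal sketch (not Buck's actual code) of input-keyed build caching:
# a step re-runs only when the hash of its inputs changes; otherwise the
# cached output is reused.

cache = {}  # (step name, input hash) -> output

def input_key(inputs):
    """Stable hash over a dict of input-file names and contents."""
    h = hashlib.sha256()
    for name in sorted(inputs):
        h.update(name.encode())
        h.update(inputs[name].encode())
    return h.hexdigest()

def build_step(name, inputs, run):
    key = (name, input_key(inputs))
    if key in cache:
        return cache[key], True      # cache hit: step skipped
    out = run(inputs)
    cache[key] = out
    return out, False                # cache miss: step executed

# A stand-in for a real compile step.
compile_java = lambda inputs: "classes-for-" + input_key(inputs)[:8]

out1, hit1 = build_step("compile", {"A.java": "class A {}"}, compile_java)
out2, hit2 = build_step("compile", {"A.java": "class A {}"}, compile_java)
print(hit1, hit2)  # False True
```

Real build systems also hash the toolchain and the transitive dependency graph into the key, which is what makes caching safe for steps like Android resource processing.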

Another issue was modularity. At the time, the system basically supported only four modules; anyone who wanted to add a new one needed to write about 200 lines of Ant XML build script, and almost no one truly understood those configurations. The result was that no one was willing to split out modules—because once you did, you were on the hook for maintaining that pile of complex configuration.

And one very important thing Buck did was to make the act of "adding a module" very simple. So, everyone became more willing to split modules, and the number of modules increased accordingly. After the modules increased, builds could be executed incrementally with finer granularity, further improving overall efficiency.
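For a sense of the contrast, a Buck module declaration in a BUCK file looks roughly like this (the target and path names here are made up for illustration); a few lines instead of ~200 lines of Ant XML per module:

```python
# Illustrative Buck build file (BUCK); target and path names are invented.
# Declaring a module is a short target definition, which is what made
# fine-grained module splitting practical.
android_library(
    name = "messaging",
    srcs = glob(["src/com/example/messaging/**/*.java"]),
    deps = [
        "//libs/base:base",
    ],
    visibility = ["PUBLIC"],
)
```

Because each target declares its own sources and dependencies, the build system can rebuild only the targets whose inputs actually changed.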

So essentially, this was not just a technical optimization, but a change in mindset.

Ryan: To put it plainly, it's about reducing repetitive labor.

Michael: Exactly.

"Choosing the Right Problem": Refactoring IDEs and Virtual File Systems

Ryan: After solving the Android build aspects, you turned to other directions within the company. I noticed you started participating more in IDE-related work. What problems did you see in the IDE field at that time that prompted you to want to go deep into doing things in this area?

Michael: After finishing Buck, I briefly went to do some iOS Messenger development. At that time, I thought: I've done Android, might as well expand and try another direction. Although to be honest, I've never really liked iOS development—and still don't.

Many people may not know that before Objective-C introduced ARC (Automatic Reference Counting), developers had to manage reference counts manually. Nowadays the compiler handles this automatically, but back then, every time you created an object or added a reference, you had to write the retain/release bookkeeping yourself. Many people today have never seen that kind of code, but the iOS Messenger codebase we took over through an acquisition was very old and written that way, which made maintenance very painful.

If it were now, perhaps tools like Codex could help clean up this code, but at that time, we could only grit our teeth and change it. Plus, the user experience of Xcode itself didn't suit me very well, such as the design of separating header files and implementation files, which I've never really liked.

Additionally, from a broader perspective, whether Android or iOS, Facebook's app scale was always the largest. We basically stuffed all functions into a single App for unified release. This is different from Google's strategy—they would split into Drive, Sheets, and other apps, and because they control the platform themselves, they can pre-install a whole suite of apps in the system.

The result of this difference is: Facebook always hits the scale bottleneck of mobile development tools earlier than other companies.

This is painful for development, but from the perspective of development tools it's actually quite interesting—because we were forced to solve problems others hadn't encountered yet, and these weren't "research projects"; they directly affected the business. The situation with Xcode was similar. We talked to Apple at the time and told them Xcode couldn't really support us at this scale, and their response was: "Then your project shouldn't be this big; it should be split smaller." In that situation, building our own tools started to look reasonable.

At that time, my thinking was also very direct: What does an IDE essentially do? It's nothing more than dealing with compilers (like Clang) and language services. So we can completely build a better-to-use "shell" on top of this.

Moreover, the company was also shifting from Git to Mercurial for version control at that time, and I realized that mainstream IDEs were unlikely to natively support these customized needs. Plus, we were already using Buck as a build system; these were all highly customized internal tools, and Xcode couldn't possibly support these "Facebook-specific" workflows well. So, looking at the whole picture, investing energy to create a development experience more suitable for us made sense.

In contrast, I didn't have similar motivation on the Android side because IntelliJ itself was already done very well, and we had already found a usage pattern that could support large-scale development. But on iOS, Xcode was indeed harder to use at that time.

Ryan: So at that time, on one hand, Xcode wasn't working and didn't meet your needs; on the other hand, there was actually another IDE in the company being made by another team, right? I remember it was a Web IDE?

Michael: Yes, it was a Web IDE (laughs). I'm not laughing at the direction itself; the direction itself is fine. The problem is, it was built on an abandoned Google open-source project, and this project was written in GWT (Google Web Toolkit)—meaning you write code in Java, and it automatically generates JavaScript.

At that time, I actually tried to continue building on their foundation, attempting to establish some "credit" first, and even helped them optimize some build speeds.

But later, I returned to a familiar judgment method: look at iteration speed, look at the tech stack itself. Then I discovered—this project itself was built on an abandoned open-source project, and it was using technology like GWT. Meanwhile, our company was already the birthplace of React.

Then the question arose: Why don't we let our development tools be built on the tech stack we are best at and recognize the most?

My first reaction at that time was: You guys actually chose the wrong path. So I did something very similar to what I did with Buck before—I started a new project myself. However, my strategy at that time was the same: no direct conflict, no attempt to "take over everything." I just said: "I'll make a new editor over here, but we'll just focus on the iOS scenario first."

Ryan: You were deliberately avoiding friction. But that Web IDE already had many users back then, right?

Michael: That's right, about a thousand engineers were using it.

Ryan: But in the end, the company chose your route (which later became Nuclide). You had almost no users at that time; why did they choose you in the end?

Michael: I think there are two main reasons.

First, the technical route itself. One point I emphasized at that time was: we are making a desktop application (desktop IDE), not a Web IDE. Because if you really want to replace Xcode, developers will definitely hope: to be able to connect directly to the simulator, plug in the phone for debugging, and operate the local environment directly. Theoretically, the Web can do these things, but the cost would be much higher and much more complex.

Second, "historical credit." Buck succeeded before, so everyone was willing to bet once again. Simply put: "You succeeded last time, so we can try again this time."

Ryan: It seems this experience, combined with your other performance at work, later led to your promotion to E8—equivalent to what the industry calls a Principal Engineer. How did you feel at that time?

Michael: I was naturally very excited at that time, but more important was a sense of "alignment." When I was at Google, I actually always felt a bit out of rhythm—not just technically, but there was some deviation between the things I did and the directions the company truly valued.

Arriving at Facebook, this promotion was actually a confirmation for me: I had not only grown technically but also began to understand one thing better—which kind of work is both important and aligns with the company's direction. This understanding itself is as important as technical ability, or even more so.

Ryan: I remember Nuclide was open-sourced, and Buck seems to be as well? So what was the original intention behind choosing to open source?

Michael: That's right. In comparison, Buck's open-sourcing was the more representative case. Nuclide never became particularly popular externally.

Companies like this benefit enormously from open source themselves, so the thinking goes: if something isn't our core moat, why not share it? The same thinking applies to other things I've done in my career, including Codex. Even if no one uses your tool directly, publishing the implementation as a reference is valuable to others.

Of course, in more ideal situations, you can also gain external contributions, such as people submitting PRs or fixing bugs; these are all very helpful. I remember Uber and Airbnb later used Buck.

In a sense, it's quite natural—Facebook is one of the largest-scale applications, so we encounter many problems first; then the next wave of companies starts encountering these problems and will come to see how we solved them.

Another interesting point is that Google internally actually uses Blaze, and externally it's Bazel. At that time, we also thought, if we open source Buck, could we somehow "force" them to open up more things? Later they did make some openings; although not entirely because of us, it did have some driving effect.

Another realistic factor is recruitment. Open sourcing is also showing externally: what we are doing, what we are good at. If you want to do the most cutting-edge things in this field, this is the place you should come.

Ryan: So was the decision to open source bottom-up? Was it engineers actively proposing it, or management pushing it after recognizing the value? Do you have dedicated internal policy documents?

Michael: It seems there isn't such a document, but both situations you mentioned probably exist. For example, typical success stories like React and PyTorch created huge value for the company. But there are also other long-term projects; when the economic situation was good, there wasn't much controversy, but as the macroeconomic environment worsened, managers would also complain that engineers were investing too much energy in open source.

Overall, most open-source projects are driven from the bottom up and usually don't encounter too much resistance. Along the way you can also give a few tech talks and write a few blog posts; that content actually has long-term value, is very helpful for recruiting, and has a longer shelf life than many people imagine.

Ryan: Since you've been promoted to E8, I guess your expectations for yourself have also risen. So now is it time to find a problem that matches the E8 level? What did you do after the promotion?

Michael: Haha, I was overreaching a bit at that time. I wanted to solve the problem of Web loading speed—that really is a big challenge; the loading speed of facebook.com back then was far from ideal, and the architecture was somewhat outdated. But the problem was too big, and I didn't actually have much relevant experience. Most of the Facebook team responsible for the Web had been working in that area for many years, while I had mainly done mobile and developer tools, so I wasn't familiar with the field.

I remember sitting down with another colleague at that time, planning to compile the V8 engine from source code to see if we could optimize the JS generation mechanism to make it more adaptable to V8. We blindly tried various methods, and in the end, all came to nothing.

Looking back now, different people are suitable for different types of problems. I am better at projects that require writing a lot of code from scratch, while the problem at that time focused more on data analysis, cross-team communication, and coordination—this was really not my strong suit.

Ryan: You mentioned that this stage of your career belongs to the "Hero's Journey." What does this term specifically refer to?

Michael: I'm a bit embarrassed to say—it refers to the fantasy some engineers hold that "someone should solve this technical problem," and that expectation fed my self-inflation. Many engineers feel this engineering problem has existed for too long—why has no one dealt with it? And my thinking was very simple: JS is what I know best. So I threw myself into it.

But I couldn't get it done; I failed completely. Looking back, I think it was an important lesson, and at least I took away the point again: although I can indeed solve many problems, the things I truly enjoy and can complete excellently are actually few. Of course, I'll still try to gradually expand into other fields, but judging by the results, I should focus on what I truly love; I can't be the best at everything.

I think everyone can understand this principle and should calmly accept their own limitations.

Ryan: So how did you later find problems that truly matched the E8 level? What happened next?

Michael: I think this also had some element of luck. We organized small engineer meetings at that time to brainstorm potential bottlenecks that might appear in the future. I mentioned that the continuous expansion of the codebase would eventually trigger scaling problems, and the department manager at that time, who later became my supervisor, Brian O'Sullivan, obviously took it to heart.

He decided to gather people to develop a virtual file system, intending to solve this problem in advance. So, Adam Simpkins, Wes Furlong, and I joined the team. These two are top engineers, and for a long time, I even felt I was the weakest member of the project.

Ryan: You mentioned the virtual file system. For a big tech company like Meta, what are the benefits of building such a system in-house?

Michael: We had adopted a monorepo model, with all code concentrated in one repository. But most people only ever need certain subsets of the repository, checking out specific files at any given time. So our core idea was to build supporting tooling around a virtual file system: when cloning the repository or switching commits, the system no longer needs to write every file in the repository to disk. Eliminating that default behavior of traditional checkouts keeps the number of on-disk files from exploding as the repository grows.

And this work involves two key parts. The first is building the virtual file system itself: when the operating system requests a file's content at a given commit, the system generates it on demand and serves the data instantly, so the user sees the complete file layout they would have had on disk.

The other part is my area of expertise: anticipating the tools in the toolchain that habitually read every file. A tool like grep, for example, will happily read everything. I had to work out how to adjust the development process and tool design around the virtual file system, because if we kept using the original tools that force every file to materialize, the new virtual file system would be pointless.

Ryan: At a high level, its essence is lazy loading for a huge file system. Not only is it more efficient, there's no need to process everything up front.

Michael: Exactly.
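
The lazy-loading idea described above can be sketched in a few lines. This is a toy illustration of the general pattern only, not Eden's actual design; all names here are invented:

```python
# Toy sketch of lazy file materialization: listing comes from cheap commit
# metadata, while file contents are fetched only on first read.

class LazyCheckout:
    def __init__(self, tree, fetch_blob):
        self.tree = tree              # path -> blob id (cheap metadata)
        self.fetch_blob = fetch_blob  # blob id -> bytes (expensive, lazy)
        self._cache = {}

    def listdir(self):
        # Listing the checkout needs no file contents at all.
        return sorted(self.tree)

    def read(self, path):
        # Content is materialized only on first access, then cached.
        if path not in self._cache:
            self._cache[path] = self.fetch_blob(self.tree[path])
        return self._cache[path]
```

The point of the sketch is the asymmetry: `listdir` never triggers a fetch, so a clone or commit switch costs metadata only, and the expensive work happens per file, on demand.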

Ryan: You just mentioned that what you are better at is integrating all functions onto this basic framework?

Michael: Yes. Actually, when I was collaborating with Hanson Wang (who is also a member of the Codex team currently), I realized that traditional solutions, when needing to achieve ultra-fast file search through an IDE or editor, would mostly traverse the entire file system first to search for files.

I immediately felt this was a big problem. So we started thinking: how do we get fast file search without giving up the virtual file system's advantages, exploring new solutions beyond the status quo? Eventually, we built a service named Miles (short for "my files").

It indexes all new commits on the main branch through cron jobs, tracking file additions and deletions—and only indexing file names rather than content, because file names alone are sufficient. Hanson also proposed a clever scheme for maintaining the index, achieving fuzzy matching functionality. That is to say, the new system not only supports substring matching, but even when files use camelCase naming, entering only uppercase letters or having typos, the system can still accurately identify them.
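
Hanson Wang's actual index-maintenance scheme isn't public, but the matching behavior described (substring matching plus camelCase shortcuts) can be illustrated with a toy matcher; the function below is my own sketch, not the Miles implementation:

```python
# Toy fuzzy filename matcher: plain substring match, plus matching a
# query against the camelCase initials of the filename.

def fuzzy_match(query, filename):
    q = query.lower()
    # Case-insensitive substring match.
    if q in filename.lower():
        return True
    # camelCase initials: "fbc" matches "FooBarController".
    initials = "".join(c for c in filename if c.isupper()).lower()
    return q in initials
```

A production matcher would also tolerate typos (e.g. via edit distance) and rank results, but the two rules above already cover the behaviors described.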

We came up with a very interesting representation for recording every file that had ever been checked out, together with markers noting whether a given file existed at a given commit. When we send the system a query (for example, the current commit version, plus which files have been added or deleted locally), it returns the corresponding results.

I remember that when processing queries over more than a million files, the response time was only about 10 to 20 milliseconds. This was much faster than the default responses from Xcode or MDS, effectively solving the performance problem.

Michael: Because it was so fast, Miles was opened up for internal service calls and came to be widely used in all kinds of other scenarios. By the time I left, the Miles service was running on at least 30 servers globally, a sizable deployment serving diverse needs well beyond file search.

Ryan: You just mentioned quite a few implementation details; after all, this is the kind of thing most people only meet on LeetCode, not at work. But I've puzzled over it for a while and still don't quite understand: how was your index structured? Did you use a trie?

Michael: A trie is indeed powerful, but we used two parallel arrays in our solution: one storing the file names, and the other, I believe, integers pointing into an index; I can't remember the specifics. In addition, we set up a 64-bit mask covering the 26 lowercase letters, 26 uppercase letters, 10 digits, the hyphen, and so on. Whenever one of those characters appears in a file name, the corresponding bit is set.

This way we can scan the list quickly, rejecting a large number of non-matches, in a design that parallelizes well. All the arrays use a flat, parallel layout; in terms of cache efficiency, the CPU gets excellent performance when reading memory linearly, and the structure naturally lends itself to partitioning the arrays for parallel processing.
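
The 64-bit mask trick can be sketched as follows. The exact bit layout below is my assumption, but the rejection logic is the essence: if any character of the query has no bit in the candidate's mask, the candidate cannot match and is skipped without any string comparison:

```python
# One bit per character in a 63-character alphabet (fits in a 64-bit word):
# 26 lowercase + 26 uppercase + 10 digits + hyphen.
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-"
BIT = {ch: 1 << i for i, ch in enumerate(ALPHABET)}

def mask(s):
    # OR together the bits of every character that appears in the string.
    m = 0
    for ch in s:
        m |= BIT.get(ch, 0)  # characters outside the alphabet are ignored
    return m

def may_match(query_mask, name_mask):
    # Fast prefilter: every query character must appear somewhere in the
    # name. False positives go on to a real match; false negatives cannot occur.
    return query_mask & ~name_mask == 0
```

Precomputing `mask` for every file name gives a flat array of integers that the CPU can stream through linearly, which is exactly the cache-friendly, partitionable layout described above.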

Ryan: I know your involvement in the Eden and Miles projects eventually led to further promotion. But before the promotion, did you accumulate some experience about enhancing personal influence and handling opinion conflicts?

Michael: Yes, this was also the challenge I faced operating at the E8 level. I had always been the one writing code, but most colleagues at the same level or above no longer wrote code at all. They focused almost entirely on building influence and on cross-team collaboration: writing project documents, reconciling opinions from all sides, and so on.

So as an E8, to achieve the expected level of influence, writing code alone won't cut it. I had to spend time influencing others; that was good for me and also satisfied my management.

Of course, sometimes we hold our own views too stubbornly; I certainly did back then, and my attitude became too aggressive, which ultimately cost me dearly. My promotion was delayed for a while during that period, and I was reminded to adjust my approach.

One piece of background: after Microsoft acquired GitHub, I was very anxious, because our Nuclide project was largely built on the GitHub ecosystem. I felt the project was doomed; VS Code would surely swallow its independent niche, and that is indeed what happened later. I was worried sick at the time, convinced the project was at serious risk, so I desperately urged everyone to change direction, without considering that many people on the team were actually satisfied with their current work and didn't want to be disrupted by sudden changes.

Later my boss pulled me aside and gave me a stern talking-to; I accepted the feedback and made a point of learning how to handle such situations better.

Ryan: So what is the most important lesson you learned in this aspect?

Michael: For me, I'm now much clearer about which situations trigger my emotional reactions, such as certain technical decisions. When one comes up, I notice immediately and remind myself: okay, absolutely do not act on impulse. Or, when I realize I can't hold a normal conversation or my emotional state is poor, I'll go talk to the other party's manager instead of barging up to an engineer's desk and shouting: "I have an idea, here's the deal..."

Many times I will first say: "I'm a bit excited about this issue right now, and my expression might not be very good; help me see how to advance this more appropriately." This method is actually more effective.

Ryan: An interesting point is that you foresaw the rise of VS Code and that Nuclide's underlying foundation would be displaced, yet your promotion was delayed over that judgment. After the fact you were proven right; you were arguing for the correct call but were told to "calm down." So when things played out that way and you realized you had been right all along, how did you feel?

Michael: About a year later I did talk it through with the people involved, as a kind of retrospective; things had been quite awkward before that. Fortunately the outcome wasn't bad; we finally resolved it.

Ryan: So the most important thing is the processing method, not the stance itself.

Michael: That's right, indeed.

AI is Reshaping Development Methods: The Real Changes Brought by Codex

Ryan: You seemed to work quite happily at Meta, but ultimately left. So what attracted you to OpenAI?

Michael: There were multiple factors. I first interviewed at OpenAI in late 2023. At the time I was responsible for the large-model-based developer tools at Meta, and we had shipped Metamate, an in-house assistant along the lines of a lightweight GitHub Copilot, and published related papers and talks, such as the CodeCompose project.

But the reality was that we kept receiving feedback asking why we didn't just use GPT-4, and we could only explain that we were on Llama 2; the gap between the two was quite obvious. I don't do model research myself; what I want is to turn these capabilities into products and experiences. So it felt natural that if I wanted to do that, I should go where the best models are.

As for the second point, the weight OpenAI placed on top talent reminded me of Google back in the day; people of that caliber don't choose wrong. And so it proved. At OpenAI I get to work with senior teams, including many level 8 and 9 experts from the Meta side, and I can contribute freely.

The third point is that both the timing and OpenAI itself are very special. I've mentioned to many people that joining OpenAI at that time was like joining Google in 2000. Note, not Google in 1998, but Google in 2000. At this time, the company had already established a foothold, and product-market fit was beginning to show results; this stage is very attractive for individuals.

There's also a more personal reason. I originally chose Facebook for its huge consumer market; after all, I had focused on consumer products before, and the calendar tool I built was widely welcomed by users. But after arriving at Facebook I never got the chance to touch consumer business at all; the developer tools I later worked on served the company's internal engineers, about 20,000 people.

Part of wanting to go to OpenAI later was to seize the chance to return to the consumer field, or at least to a huge user base. The Codex project I'm now responsible for has passed a million weekly active users; I can't remember the exact number, but the growth curve is genuinely steep. That scale of reach far exceeds the 20,000 to 40,000 developers inside Meta.

Ryan: That's right, absolutely correct. Most development tools in the industry only reach this scale.

Michael: Yes.

Ryan: In my opinion, Meta is more of an engineering-driven company where engineers are the core, and many things are driven from the bottom up. While many AI labs are more research-driven, with research being the first priority; this also has its rationality, after all, the model itself is key. As an engineer and not a researcher, how do you view the difference between research-led culture and engineering-led culture?

Michael: This is indeed a change that takes adapting. If anyone claims they can switch smoothly between these two cultures, I think they're lying. But one thing is certain: people who have spent time at FAANG-scale companies tend to have one good habit, which is consciously cultivating their own influence. That matters a great deal, and I sincerely pursue it too. I love my work on Codex and the harness; these are meaningful, respected results. But if the model itself isn't excellent, no amount of optimization on the harness side means much.

So after joining OpenAI, I felt particularly great; we collaborate closely with the research team, sit very close, and many things are promoted together. This is also an important reason why I left Meta to invest myself—I hope to work with colleagues who make models to build products and explore new technical boundaries together.

Perhaps a similar model could be achieved at Meta, but the actual effect ultimately cannot be compared.

Ryan: You just mentioned participating in the launch of the Codex project when you joined. I heard that when the Codex CLI was first released, the market response didn't fully meet expectations, but later the project gradually got on track. Can you share this journey?

Michael: Of course. The journey was full of twists. In April 2025, we released Codex CLI. We did a live demonstration in the Easter-egg segment at the end of the launch livestream and open-sourced the Core3Pro project at the same time; many people jumped in to try it. Everyone had high expectations for this brand-new programming assistant, and its performance wasn't bad, but the release was admittedly quite rushed.

Overall, this was very helpful in drawing attention to the project; after open-sourcing we received a large number of PRs. I remember the project picked up ten to twenty thousand stars within a week or two. It was an interesting experience and earned a lot of genuine love from users. The problem was that the team at the time probably didn't have the capacity to drive the project forward on its own; after all, the company needed to push multiple efforts simultaneously.

In the month after that, seven engineers plus a few researchers (I can't remember the exact number) released the web version of Codex, letting users run Codex directly in containers and even start new projects from their phones. That was really cool. The staffing was more adequate this time, and I firmly believed the project's long-term vision was worth looking forward to. But judging by the results, it was a bit "ahead of the users"; users weren't fully ready yet.

In contrast, everyone at that time was more inclined towards local programming agents, so although our Web version gained a wave of growth, the stickiness was not as expected.

Throughout the summer, we continued to advance both product lines. Into midsummer, the local agents still had the stronger product-market fit. But I personally always felt the local solution was a stopgap; running agents demands more machines, and a laptop alone can't carry it.

So over the summer we made major adjustments: we expanded the programming team and brought in more developers. GPT-5 was about to be released, and the prospects looked very bright. I was personally excited because, besides the CLI, I had built several prototypes before, and this time the manpower was finally sufficient. We also started the VS Code extension in parallel, because I insisted that although terminals suit many scenarios, their interaction model has real limits. Building a beautiful user interface in a terminal always requires compromises, while in an IDE it can be done more naturally and completely.

August was the breakout period: GPT-5 came out, and we also released a brand-new terminal interface. That same month the open-weight GPT model appeared, and we supported it in the TUI; the design of the open-weight models and open training framework was truly impressive. Later in the month the VS Code extension shipped, and we entered a new phase of frantic iteration.

It was the convergence of all these elements that pushed us across the inflection point into vertical growth. It was an exciting ride, and the data can be verified in the code repository: contributor counts, commit counts, all the usual metrics make the change plain to see. Looking back, it was quite a remarkable journey.

Ryan: You just mentioned two forms of programming agents: local version and cloud version, and you seem to firmly believe that in the long run, the future hope lies not in the local version but in cloud deployment. Why do you judge it this way?

Michael: The usage scenarios that truly make people "unable to leave" now are often like this: for example, whenever a new GitHub issue or a Linear task comes in, you automatically trigger the agent to handle it. Although there are indeed cost issues here, and it might be abused, if it's in a private repository within the company, this is actually a very natural usage.

In that case, the agent is more like part of an automated pipeline than a tool you only interact with locally on your machine. And that means these tasks can't all run on your laptop.
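
The "agent as part of an automated pipeline" pattern can be sketched minimally; the event shape and the `run_agent` entry point below are hypothetical, standing in for a webhook handler and a cloud worker:

```python
# Toy sketch: a tracker event (e.g. a new issue) enqueues an agent run
# in the cloud instead of running on a developer's laptop.

import queue

task_queue = queue.Queue()

def on_issue_opened(event):
    # Called by e.g. a GitHub webhook; we only enqueue work here.
    task_queue.put({"repo": event["repo"], "prompt": event["title"]})

def worker(run_agent):
    # A cloud worker drains the queue; run_agent is the hypothetical
    # entry point that spins up a container and runs the coding agent.
    results = []
    while not task_queue.empty():
        results.append(run_agent(task_queue.get()))
    return results
```

In a real deployment the queue would be a durable service and each task would run in an isolated container, but the decoupling is the same: the trigger lives in the pipeline, not in anyone's terminal.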

From this perspective, as an individual developer, you might still spend more time interacting with the agent locally; but if you look at "the actual computing power consumed by the agent," I think the bulk will still happen in the cloud. Deploying these things in the early stage might be a bit troublesome, but once set up, the experience is actually very good.

Ryan: Understood. So what you mean is not that the local form will disappear, but looking at the whole industry, the computing power consumed by agents will mainly migrate to the cloud.

Michael: That's right. I remember that when the VS Code extension was first released, one of its core features was that during a conversation, you could click a button to hand the conversation off to the cloud (provided it was configured). It's easy to imagine more scenarios like that in the future: you start something locally, throw it to the cloud to run, and take it back when it's done.

Ryan: Since the beginning of this year, Codex usage has grown by 5 times, and the current user scale exceeds a million. I'm curious, since starting to use the new version of Codex, has your own AI workflow changed significantly?

Michael: It has indeed changed. I now use Codex far more frequently than I ever imagined. I used to rely heavily on the VS Code extension, needing all the code in front of me in the sidebar; back then I felt those pieces had to be integrated, since I pay close attention to the code itself. Of course, for a genuinely one-off prototype I wouldn't read the code much anyway.

This freedom to choose is great, and absolutely worth being excited about. But for the Codex codebase itself, I still insist on personal control; that matters, because this project affects a lot of people. Gradually you form a judgment: which changes can be safely handed to the model, and which places you need to watch, or even "watch it do." So my current approach is to write the requirements more completely up front, then hand them to the model to execute.

At the same time, I keep four or five checkouts of the Codex repository locally and use the Codex app to multitask across them; that has noticeably improved my efficiency, because you can basically stay in one window and push several tasks forward at once. To some extent it really does feel a bit like playing a game; you catch yourself thinking: what's my throughput right now? How many balls can I keep in the air at once?

Of course, sometimes it can be a bit chaotic, especially when context switching is frequent, but you can obviously feel the output improving. Sometimes there is even a bit of "guilt" to write code by hand because you think, if I described the problem clearly to the model at the beginning, it might be faster.

It's like when you only meant to change three lines of code, and 30 minutes later you're still not done; everyone has been there, and now much of that can actually be avoided.

Ryan: So now, of the code you write, roughly how much is written by yourself, and how much is generated by the model?

Michael: Oh my, the proportion of code generated by models must be 80% to 90% now; that's no joke. It shows especially in work like debugging tests and continuous integration. I love having the model generate print-debugging code and the like; it's great and frees up a lot of effort.

Ryan: Then let's talk about which problems are suitable for large language models to handle and which are not. For example, how do you judge which tasks require you to write personally? That is, what are the 10% to 20%, and what are the remaining 80% and 90%?

Michael: This is indeed a good question. Every time I sit down to program, I ponder: is it necessary to write this part by hand? And the answer is almost always negative.

There is one class of work that is more low-level, such as the harness layer of Codex; I mainly write Rust there, and it involves many operating-system-level details.

In actual work, I spend a lot of time on the sandbox mechanism. It is precisely this mechanism that guarantees the safety and integrity of our work, ensuring the model doesn't break out of its preset boundaries. For that kind of work, I lean toward writing by hand, because I have to be sure nothing slips through.
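
As a toy illustration of the boundary-enforcement idea only: real sandboxes rely on OS mechanisms, and every name below is mine, not Codex's. The shape, though, is the same: each action an agent proposes is checked against a policy before anything executes.

```python
# Toy write-policy check: only paths under the sandboxed workspace may be
# touched. A real sandbox must also normalize paths, handle symlinks, and
# enforce this at the OS level rather than in application code.

ALLOWED_ROOTS = ("/workspace/",)

def is_write_allowed(path):
    return path.startswith(ALLOWED_ROOTS)

def run_tool_call(cmd, target_path, execute):
    # Gate the agent's proposed action before executing it.
    if not is_write_allowed(target_path):
        return "rejected: outside sandbox"
    return execute(cmd)
```

The reason this layer gets hand-written care is visible even in the toy: a single missed check (say, a path that escapes via `..`) defeats the whole boundary.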

To make test coverage thorough enough, sometimes I'll do the initialization myself, and once the basic skeleton is in place (the modules I've thought through repeatedly), I let the AI fill in the rest.

Beyond that, many things are very well suited to handing over to the model, such as refactoring code and splitting PRs; work that used to eat a lot of time can now basically be delegated. For example, I often have a fairly large PR; I know it's too big, so I just have the model split it into multiple commits that are easier to review. That kind of work saves a lot of time.

Ryan: What about code review? At OpenAI, roughly how much code is manually reviewed? Or is there already an agent helping you handle part of it? For example, test cases like this probably don't need manual review, right?

Michael: I like a workflow where the agent first does multiple rounds of self-review until it's confident enough, then hands it to a human. But ultimately, before code is truly merged, we still do manual review; that step is not skipped.

In practice, like other teams we have configuration files for the agents, but we still run into places where the model's knowledge is incomplete: context that needs supplementing, or experience that was never explicitly written down, things humans remember but the model doesn't know. So review still genuinely catches problems.

Another obvious change is that everyone now uses AI to write PR summaries, which has raised the whole team's description quality considerably. By the time I review, the code has already been through a pass of Codex, with a structured summary clearly stating the reason for and content of the change. That visibly speeds up PR review and dissolves what used to be enormous review pressure.

Ryan: That's just great. I feel like half of diffs used to ship with an empty description, and the test plan would simply say something vague like "arc build," which told you nothing at all.

Michael: Indeed, very baffling.

Ryan: Let's talk about this—why did you decide to open source Codex CLI?

Michael: The reason is simple: this is a tool that runs locally with very high permissions, and that is exactly where open source matters. I'm not an extreme open-source advocate, but I deeply sympathize with the thought: since this thing is going to run on my machine, I naturally care how it works. In AI programming especially, letting users read the code and understand its behavior is critical; after all, there are still plenty of doubts around new things like AI agents. So I felt open-sourcing was a necessary step.

Besides, we gain a large number of valuable contributions and bug reports from it, surfacing problems we might otherwise have missed. At the same time, we're showing the world our implementation, how the features are built up piece by piece in code. I published a blog post earlier on how the agent loop works; I wanted to write more afterwards, but time was genuinely limited.

There was also a funny little episode: two candidates came in for interviews, and one of them held up code claiming they had written it, but I immediately recognized it as code I had written. The other candidate was much better, and even complimented the fact that I still write code personally, which frankly made my day.

Ryan: The blog post you mentioned introduced the working principle of Codex; so how does Codex discover available resources in the environment? When I run these tools, I see it can think for itself and find various things in the terminal; it's too magical. How is this actually achieved?

Michael: Several methods are actually at play here. For one, Codex's base training makes it particularly good at using ripgrep to find all kinds of information. Beyond that, if an AGENTS.md or README file in the repository notes that certain tools are important, it will try to use them preferentially.

Additionally, if you use MCP and associate MCP servers with the current project, the tool definitions are injected into the context at the very start of the conversation; so strictly speaking, the model isn't "discovering" that part itself, it's handed to it directly.
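
A sketch of what "injected into the context" can look like in practice: tool schemas are serialized into the very first message, so the model sees them before doing anything. The schema shape below is illustrative, not the exact MCP wire format:

```python
# Build a conversation whose first (system) message carries the tool
# definitions, so the model never has to discover them itself.

import json

def build_context(tools, user_prompt):
    tool_block = json.dumps([
        {"name": t["name"],
         "description": t["description"],
         "parameters": t.get("parameters", {})}
        for t in tools
    ], indent=2)
    return [
        {"role": "system", "content": "Available tools:\n" + tool_block},
        {"role": "user", "content": user_prompt},
    ]
```

This also clarifies the division Ryan summarizes next: injected definitions are a property of the harness, while exploring the repository with tools like ripgrep is behavior of the model.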

Ryan: Understood. Part is explicitly injecting context in Harness, and the other part is the model exploring and calling itself. Is my understanding correct?

Michael: That's exactly it.

When AI Can Already Write 80% of Code, What Should Be the Core Competency of Engineers?

Ryan: Looking back at your career experience, the technical span is actually very large, from front-end, JavaScript, to build systems, virtual file systems, and now Codex. You must have been constantly learning in this process; are there any technical books that had a big impact on you?

Michael: There's a book on operating systems, about a thousand pages thick, published by Addison-Wesley, though I can't recall the author offhand. At the time I was on the virtual file system project; I hadn't actually written much C in my career, my undergraduate work was more theoretical, and I had never really gone deep into the lower layers.

So suddenly I found myself building a very low-level system, which is why I joked at the time that I was the weakest engineer on the project. Once someone said something and I realized I understood none of it, which was a bit awkward, so I asked what I should read, and my manager at the time recommended this one.

I bought it, read it cover to cover, and basically carried it everywhere until I finished. It's interesting: many people can go far in software engineering without ever truly understanding how computers work, because there are so many abstraction layers in between.

On one hand that liberates productivity, but on the other it brings limitations. I later found that if you actively walk down toward the lower layers, many problems can be re-understood or even dismantled. For example, you'll realize there's redundant overhead between certain layers; remove it, and you might get an order-of-magnitude performance improvement.

So now I would suggest everyone actively go deeper to understand these things; this is very obvious for improving problem-solving abilities. Besides this, I also quite like some Rust-related books published by O'Reilly; they are written relatively solidly and systematically.

There's one more suggestion that isn't a book, but I gained a lot from it: do some CTF (Capture The Flag) security competitions. They're rather like a "computer decathlon," touching many different fields. Some problems require you to understand assembly, while others might have you picking apart a badly written PHP admin page.

This method forces you to contact knowledge at different levels, and it itself is like a game, more interesting, and easier to persist in than simply reading books.

Ryan: Can you explain a bit more about what CTF is?

Michael: CTFs take different forms, but they're usually competitions in the information-security field. The most common is "Jeopardy" style: a set of designed challenges, each worth a certain score, and you solve as many as possible within the allotted time, either individually or in teams.

Each question usually hides a "flag," which is a string of a specific format. Only when you truly complete reverse analysis, vulnerability exploitation, or other problem-solving steps can you get this string, and then submit it to prove you solved the question.

So it's a bit like an "escape room in the terminal"; you can only rely on the computer itself to solve problems. This kind of practice forces you to touch content you wouldn't encounter in normal work, such as debugging underlying programs, analyzing binaries, understanding system behavior, etc.
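
As a concrete illustration of the flag convention: the hidden string typically follows a fixed pattern (the common `flag{...}` style is used here as an example), which you extract from program output or memory once you've broken the challenge:

```python
# Extract a CTF flag of the common flag{...} form from arbitrary text.

import re

def find_flag(blob):
    m = re.search(r"flag\{[^}]*\}", blob)
    return m.group(0) if m else None
```

Each event defines its own prefix, but the idea is the same: the flag is unguessable, so submitting it proves you actually completed the exploit or reverse-engineering steps.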

From the perspective of improving abilities, it not only exercises your technical breadth but also lets you form a more "adversarial" way of thinking, which is actually very helpful in many complex problems.

Ryan: So your suggestion is, if someone wants to become a stronger engineer, they can do some exercises like CTF because it forces you to solve various types of problems, right?

Michael: Yes, that's what I think. In the process you pick up skills that are otherwise hard to encounter, such as debugging low-level programs and analyzing system behavior; they don't necessarily come up in daily development.

And it forces you to switch between different technical fields, a breadth that's actually hard to get from regular work. If you just write React every day, you're unlikely to ever open GDB, do reverse analysis, or truly understand how the lower layers work, yet these abilities are very valuable for many key problems.

In a sense, this is also a training method to let yourself "penetrate abstraction layers downward," letting you not only know how to use tools but also know how tools work behind the scenes.

Ryan: Many friends in the early stages of their careers, when seeing that Codex can handle everything in the terminal, will sigh "Then I don't need to learn GDB; anyway, Codex understands it all." In this environment where AI tools are developing rapidly, how would you advise them to think about their engineering capability construction?

Michael: That's right; I feel many people are trapped by this problem nowadays. I have also thought about this a lot myself, but I can't find a perfect answer. Personally, I will still return to a relatively simple judgment: I still feel that one should actively "go downward," penetrate these abstraction layers, and understand how the system works at a deeper level; this matter is still crucial.

Of course, this judgment might change in the future, but at least for now, the quality of the questions you ask the agent still directly affects the result you get. Ask the wrong question and you likely won't get a good engineering solution. Perhaps further down the line this layer too will be abstracted away, but not yet.

The overall trend is definitely toward higher levels of abstraction; the question is when we will truly reach that stage, and that's hard to say right now. Progress is indeed faster than many people expected, but I still feel that the ability to ask the right questions is the most important thing to master.

As for what that ability means for newcomers, I honestly haven't figured it out completely myself. I can make these judgments now largely because accumulated experience has given me a kind of "intuition" and "taste" for how to ask. For people just starting out, I really can't give a definite answer. Besides, nobody can predict the future with certainty; everything is still undecided.

Ryan: Looking back at your career, the bar for these senior positions is frankly extreme. For most people, E7 or Senior Staff is already a very difficult level to reach, and you went two levels beyond that. In your day-to-day work, how do you deal with those high expectations? Do they put pressure on you?

Michael: For me, the pressure has never gone away. Everyone has been through performance reviews, right? And at this level, just as I evaluate other people's levels and impact, I want to build as fair and consistent a judgment framework as possible in my own mind: each level has to correspond to a certain standard and scope of influence.

That is, what standard this level maps to, and what standard that level should map to. My brain is constantly running the simulation: if someone else were sitting in my evaluation seat discussing standards, they would have to be fair and consistent too. That is enormous pressure. For example, once you reach E8, you're suddenly measured against D1-level directors, and you think: how do I compete on impact with an executive who has hundreds of people under them?

It really is difficult. Many individual contributors achieve their goals through a kind of "indirect management," such as writing specification documents or coordinating collaboration across teams. The reason they choose this path rather than becoming managers, and I've worked with colleagues like this, is that they carry technical credibility, or personally built the project in question, so when they communicate with a team, their influence far exceeds that of an ordinary engineering manager.

And this situation isn't rare. When those senior contributors can influence dozens or hundreds of people, that is effectively D1-level influence. As ordinary engineers, though, we need to choose projects prudently. You can't just think "this project makes me happy"; after all, nobody wants to lose their job or be singled out for criticism by the big boss at an all-hands meeting.

Actually, when I scope out projects, I weigh this repeatedly: there are features I'd love to build myself purely for the fun of it, but after serious consideration I decide to hand them to others. On the current Codex project, for example, I think about which code I should write personally to maximize my impact. If handing something off can achieve 80% of the effect, then someone else should take it over.

But even so, leading a five-person project makes it very hard to meet the contribution expectations of E8 or E9. So the key is to find projects with a multiplier effect. The virtual file system is an excellent example: we all knew its arrival would unlock countless possibilities and spare teams from being blocked by performance bottlenecks.

There's another key point my boss once emphasized: we often underestimate the value of senior managers. They are the ones who match senior core engineers with the right projects. Some senior engineers are excellent problem solvers and brilliant coders, but they aren't the source of the ideas; they are the ideal people to take on high-difficulty technical projects. I don't want to name names, but you know what I mean. Often the person involved doesn't even realize it themselves; it's the manager who decides that "this project has to be done by that person."

Ryan: I once asked your former colleague Adam Ernst whether he had ever encountered an engineer he particularly admired, and why. He mentioned your name and specifically emphasized your outstanding ability to launch projects. It's striking that you drove so many excellent projects; that kind of execution, having an idea and then building a prototype, is not easy.

And you can always argue persuasively that your solution is the best one. So for other engineers who have found a problem, have a preliminary idea, and want to build a project from scratch, what advice do you have?

Michael: Honestly, I feel the inspiration for many excellent projects comes from a slight dissatisfaction with the status quo. You feel that something just shouldn't be the way it is, and that feeling itself is a very strong starting point.

Interestingly, my defining trait is bursts of energy: I put my head down and work, without looking left or right, or even considering the so-called best implementation. A classic example is the early version of Google Calendar. On a whim I decided to cram a weather feature in. Yes, displaying weather icons directly on the calendar. Then I just started doing it.

At the time I was mainly a JS developer and hadn't touched the backend at all; I forced myself to cobble code together to get the feature working. Then the tech lead asked, "Wait, how are you storing your weather data?" I answered, "I just dumped in a pile of XML." He said, "We need to discuss a Protocol Buffers scheme first." That's when it dawned on me that binary formats save space.

Anyway, that's just how I am: I only think about getting the current feature working, without asking anyone whether there's a better approach. As long as it runs, it's fine.
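As an aside on the space savings Bolin alludes to, here is a rough sketch, using plain struct packing as a stand-in for Protocol Buffers and a hypothetical daily-forecast record, of how much smaller a binary encoding can be than the equivalent XML:

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical daily-forecast record: (day of year, high temp, low temp).
forecast = (112, 68, 51)

# XML encoding: human-readable, but every tag name and digit costs bytes.
root = ET.Element("forecast",
                  day=str(forecast[0]),
                  high=str(forecast[1]),
                  low=str(forecast[2]))
xml_bytes = ET.tostring(root)

# Binary encoding (plain struct packing here, standing in for
# Protocol Buffers): one uint16 and two uint8s, four bytes total.
bin_bytes = struct.pack("<HBB", *forecast)

print(f"XML:    {len(xml_bytes)} bytes")
print(f"binary: {len(bin_bytes)} bytes")
```

Real Protocol Buffers adds varint encoding and field tags, but the order-of-magnitude difference is the same: the binary record is a few bytes, while the XML version spends most of its size on markup.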

Ryan: It seems it's precisely this trait that makes you particularly suited to digging into the root of a problem and solving it on your own. Almost every project goes the same way: "How can things possibly be like this? Something has to be built to fix it," and then you get to work.

Michael: Yes, that's just how headlong I am.

Ryan: I think you have another trait, which is that most people stay in their comfort zones. Take the Buck project you mentioned earlier: surely plenty of people felt the build was ridiculously slow, but their reaction was "Forget it, I'll grab something from the cafeteria and come back." What made you believe you could dramatically improve the performance?

Michael: I owe that one to my time at Google. Although I never worked on the Blaze team, didn't understand the project's implementation details, and never touched its code, I knew for certain that a better solution existed somewhere in the world. That existence proof matters: knowing it exists, even without knowing whether I could achieve it, was enough to make me try.

On top of that, a lot of earlier technical wins helped me build confidence. I often boast about being a "programming machine"; even now, with Codex turning everyone into a programming machine, I still hold on to that confidence. Rapidly building a prototype can at least answer a question or verify a basic assumption, namely whether the thing I believe should exist can actually be realized.

Simply put, if your determination is firm enough, you can usually find a way.

Ryan: I've noticed a fairly common pattern: after seeing world-class infrastructure at a big company, people who move elsewhere feel that "this is missing, that is missing," and start rebuilding those things themselves. Also, I've read many of your articles and found the writing extremely clear, a model of technical writing. For engineers who want to improve their writing, do you have any suggestions?

Michael: I think the key is still to read more; reading is the best starting point for improving your writing. Consciously or not, you absorb the patterns of good writing as you read, such as how works are structured. Reading also cultivates bigger-picture thinking: what do we actually want to convey? What do readers actually need to understand? Beyond that, you absolutely must draft a detailed outline in advance; plenty of people have emphasized this, but its importance is still underestimated.

In the writing itself, the key question is: does the content form a linear logical chain? You should also keep asking yourself: is the jump from this point to that point too large? Are there links a reader might miss that are actually essential? Once you can answer those doubts, you can predict fairly well where the logic might fracture, which helps you avoid writing things people can't follow.

When you foresee such a gap, you need to skillfully add examples that guide the reader across the logical leap. At least in technical writing, I think that is a very important point.

Ryan: You once wrote a note on career development that really resonated with me; it opened with a three-step plan for expanding your influence. Can you walk through those three steps? I think everyone should use them as a reference in their own careers.

Michael: The first step is to figure out what you truly enjoy doing. As I mentioned earlier, broadening your interests is certainly good, but being honest with yourself matters just as much, because this question is not that easy to answer. Many of my own "hero's journeys" failed, and plenty of things I did were not done out of love. The second step is to figure out what your employer truly values, what is most valuable to it.

Ryan: This is too important.

Michael: For example, when I was at Google, I didn't do this well. I was working on things I was passionate about, but they weren't central to Google's core advertising business.

The third step is to find the intersection of the two and concentrate your energy there as much as possible. The better you can do this, the easier it usually is to produce work with real impact.

Of course, the practical problem is that sometimes this intersection doesn't exist; in that case, you may need to consider changing environments to find a place where it can.

Ryan: Last question: with everything you know now, if you could go back to the start of your career, what advice would you give your younger self?

Michael: I should have opened my mind earlier and learned more things. I also should have been harder on myself. I think many people share this experience: at the start there is simply so much to learn that once you've picked up your first programming language, especially once you've truly mastered it, you develop a preference for it and keep finding excuses to prove it's the best. No matter the scenario, I would say, "No, no, there's nothing wrong with this language itself."

After all, it represents the starting point where we first got things done and solved real problems, bound up with that wonderful feeling of "finally being able to make a move." But it also becomes a trap, making us cling desperately to that territory. After spending so long getting a foothold and finally being able to work efficiently, who wants to start learning all over again for some new transformation or breakthrough?

I was like that too; I found I had drilled too deep into JS. If I had maintained more curiosity and flexibility toward other kinds of projects and learning back then, the results would have been better. I got there eventually, but actively stepping out of my comfort zone would definitely have let me break through sooner.

Ryan: From past experience, you once hated Objective-C during the Xcode era, and later even devised a way to compile Objective-C to Java and back. And your insistence on mastering C was probably triggered by something too; in any case, you forced yourself to understand it. Fortunately, Codex is maturing fast and can greatly lower the learning threshold. Perhaps in the future we can simply ask it to output code in JS, Rust, or any other language.

Michael: Indeed, Codex's growing maturity has opened up many more possibilities.

Ryan: Great. Thank you for your time; I got a lot out of this conversation.

Michael: You're welcome, Ryan, and thank you for your invitation.

Original Link:

https://www.developing.dev/p/openai-codex-tech-lead-on-how-his

Disclaimer: This article is organized by AI Frontline and does not represent the platform's views. Reproduction without permission is prohibited.
