Blog #1: George Washington, Carb Loading, and the Future of AI


May 4, 2026


I “joke” that I’m the dumbest guy in the company. I’m surrounded by math Olympiad winners, engineers who’ve been building and shipping AI for over a decade, a Fields Medalist, a Turing Award winner, and a boss who worked at CERN as a teenager, has a PhD in algebraic topology (I still don’t know what that means), and was early at Google’s Quantum AI lab.


Most explanations of AI either get buried in jargon or get simplified to the point where they’re misleading. So I’m going to try something different. I’m going to explain it the way I understand it, in plain English, using whatever analogies actually stick, even if they’re a little rough.


And when I get things wrong or cut a corner or two too many, Dr. Boris Hanin from Princeton, who advises us, will jump in and mark it up in the margins just like he does my internal memos.

The goal is simple. Help the rest of us non-quantum PhDs understand the systems that are starting to shape how the world actually works…


A high school student uses AI to help write a term paper on George Washington. The introductory paragraph looks great. Clear writing, confident tone, everything you’d expect from a well-researched paper. But early on, the AI model gets a small detail wrong. It says Washington served in government up until the War of 1812.


It sounds reasonable enough if you don’t have your US history down pat, so it goes unnoticed by the student.


From there, the narrative really starts to drift. Washington’s later years extend well past his death in 1799. His role in shaping the early republic starts to blend with the efforts of his grandson. By the time the paper gets to the Jacksonian Era, the timeline is quietly distorted. The relationships, the decisions, even the sequence of events start to blur together.


Nothing sounds obviously wrong at face value if you don’t know the material. Each paragraph flows into the next. But it’s all built on a detail that didn’t get caught. By the end, the paper reads well, but it’s historically inaccurate in ways that matter.


That’s where we are with AI today. One small, believable mistake at the start doesn’t just stay contained. Modern language-based systems build on it, reinforce it, and carry it forward into something that sounds coherent, but isn’t actually correct.


The easiest way to understand this is to think of large language models as really confident interns. They think fast, communicate clearly, and present solutions with conviction. For many use cases, that is incredibly useful. But if you actually know the domain, you can see where they are subtly off. More importantly, once they start down the wrong path, they don’t stop and reassess. They don’t ask for help, and they don’t understand the consequences of being wrong. They just keep going, very confidently, toward a flawed conclusion. That is not a flaw in the system so much as a consequence of how it is built. These models are trained to predict what a correct answer should look like based on patterns in data. They are not built to guarantee that the answer is correct; they are built to mimic intelligence. This is a feature, not a bug.
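
To put rough numbers on that compounding effect, here is a back-of-the-envelope sketch. Both figures are assumptions I made up for illustration, not measured properties of any real model:

```python
# Back-of-the-envelope: if a model is right 99% of the time at each small
# step, how often does a long chain of steps stay error-free end to end?
# Both numbers here are made-up assumptions for illustration.
per_step_accuracy = 0.99   # hypothetical per-step reliability
steps = 300                # very roughly, the words in a few paragraphs

chance_fully_correct = per_step_accuracy ** steps
print(f"{chance_fully_correct:.1%}")  # ~4.9% -- small errors compound fast
```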


However, that distinction starts to matter the moment you move from a high school US history term paper into the critical systems that underpin the global economy. If you are writing marketing copy or summarizing documents, being mostly right is often enough. But if you are managing an energy grid, routing financial transactions, or operating a manufacturing system, being slightly wrong is unacceptable. In those environments, correctness is not a feature. It is the requirement. This is the gap that is opening up right now, not between hype and reality, but between capability and trust.


Most people still talk about AI as if it is a single system that will just keep getting bigger and eventually do everything, often called the single-agent theory. Think HAL from 2001: A Space Odyssey or Jarvis from The Avengers. That is not how this evolves. At least we don’t think so, and a growing cadre of AI and business leaders across the globe agree.


What is actually taking shape today is the growing likelihood that a diverse ecosystem of different types of AI models, each handling a specific role, will soon power the most important corners of the global economy.



AI is kind of like a sandwich


I think of it as a deli menu. A metaphor that our Founder and CEO hates, which I am obliged to note in writing.


Disclaimers aside, most of the industry today is trying to build everything like a bread sandwich, with large language models as the carbs. LLMs serve best as an interface layer, allowing humans to interact with machines in a way that feels natural. That is a real breakthrough, and it is why everything feels like it is accelerating. But bread is not a full meal. If you try to make it the meal, the protein, and the kitchen all at once, it falls apart under pressure.


To make these systems work in the real world, you need additional layers. You need systems that can take messy, noisy real-world data and translate it into something machines can actually reason over. If you are managing an energy grid, demand is not a paragraph. It is time-series data, latency, geography, and hard physical constraints. A language model can describe that, but it does not natively understand it. This layer, world models, which you might think of as the condiments in the sandwich, is what will allow AI to move beyond language and into real-world systems.
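
For a concrete sense of the difference, here is a sketch of what grid demand looks like as structured data rather than prose. Everything here is hypothetical; the field names are invented for illustration, not taken from any real grid system:

```python
from dataclasses import dataclass

# A hypothetical sketch of what "demand is not a paragraph" means. The field
# names are invented for illustration, not taken from any real grid system.
@dataclass
class GridReading:
    region: str          # geography
    timestamp_utc: str   # position in the time series
    demand_mw: float     # observed load, in megawatts
    capacity_mw: float   # a hard physical constraint

    def within_limits(self) -> bool:
        # A language model can describe this rule in prose;
        # a world model has to actually enforce it.
        return 0.0 <= self.demand_mw <= self.capacity_mw

reading = GridReading("west", "2026-05-04T12:00Z", demand_mw=412.0, capacity_mw=500.0)
print(reading.within_limits())  # True
```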


Then you need the core of the system, the part that reasons about what is valid and what is not. What you ultimately want is a system that does not just generate answers, but evaluates them against rules or constraints and filters out the ones that do not hold up. That is a different posture. It is less about producing output and more about validating it. This is the direction we are pursuing with energy-based reasoning models. Instead of predicting the next word in a sequence, our systems evaluate entire solution spaces and assess whether a given outcome satisfies the rules of the system. It is closer to reasoning than prediction.
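
Here is a deliberately toy sketch of that posture, reusing the term-paper example from earlier. The energy function below is invented for illustration, and real energy-based reasoning models are far more sophisticated, but the shape of the idea is the same: score whole candidate answers against the rules, and keep only the ones that hold up.

```python
# A toy version of "evaluate, don't just generate." The energy function is
# invented for illustration: lower energy means fewer violated rules.
def energy(candidate: dict) -> float:
    violations = 0
    if candidate["end_year"] > 1799:  # Washington died in 1799
        violations += 1
    if candidate["start_year"] > candidate["end_year"]:
        violations += 1
    return float(violations)

candidates = [
    {"start_year": 1789, "end_year": 1797},  # historically plausible
    {"start_year": 1789, "end_year": 1812},  # the term-paper error
]

# Keep only candidates for which every rule holds.
valid = [c for c in candidates if energy(c) == 0.0]
print(valid)  # [{'start_year': 1789, 'end_year': 1797}]
```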


It is important to be clear that this work is still early. Like any new architecture, it needs to be tested in real-world environments before it can be broadly deployed. That is why we are starting with design partners in areas where correctness can be clearly defined and where failure has real consequences. Our initial focus is on formal verification and verified code generation. In that domain, the rules are explicit. Code either does exactly what it is supposed to do or it does not. There is no gray area. It is also a space where the gap between what current systems can suggest and what can safely be deployed is easy to see. If you can close that gap, you do more than improve software development; you expand the range of systems that organizations are willing to trust machines to operate.
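
To make "no gray area" concrete, here is a minimal, hypothetical illustration of code checked against an explicit specification. This toy check only samples a few inputs; real formal verification tools go much further and prove the property for every possible input:

```python
# A minimal illustration of "no gray area": a specification the code either
# satisfies or does not. This toy check samples a few inputs; real formal
# verification proves the property for every possible input.
def absolute_value(x: int) -> int:
    return x if x >= 0 else -x

for x in [-5, 0, 7]:
    result = absolute_value(x)
    assert result >= 0                  # the output is never negative
    assert result == x or result == -x  # and keeps the same magnitude
print("specification holds on the sampled inputs")
```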


When you step back, what emerges is not a single “god model” but a stack. Language models handle interaction. World models and structured data systems provide context. Reasoning systems enforce correctness. That is the full deli sandwich.


Not every system will require all three layers. In some environments, machines will be primarily interacting with other machines. Engineers will define the constraints, systems execute, and outputs are returned in structured form. That is closer to an open-faced sandwich, or a tartine, where interaction is minimal and execution is the focus. In other cases, the interface layer disappears entirely. Systems operate autonomously, processing data, enforcing constraints, and producing outcomes without human prompting. That is the equivalent of a lettuce wrap, where the core ingredients do the work without the need for a user interface.


The more critical the system, the less you want a human guessing at prompts in the loop. That is the direction many parts of the real economy will move over time. Right now, we are still in an experimental phase. Many deployments are closer to testing than full integration, which is typical for any early-stage technology. But it is important to be clear about where things stand. We have built systems that are very good at sounding right. The next phase is building systems that are actually right. That shift, from impressive outputs to reliable outcomes, will determine how and where AI is ultimately deployed across the global economy.
