Theory 2.0
Old: Scientist develops theory. New: Scientist designs pipeline for AI system to develop theory.
“Well, why not say that all the things which should be handled in theory are just those things which we also can hope to observe somehow.” . . . I remember that when I first saw Einstein, I had a talk with him about this. . . . [H]e said, “That may be so, but still, it’s the wrong principle in philosophy.” And he explained that it is the theory finally which decides what can be observed and what cannot, and, therefore, one cannot, before the theory, know what is observable and what not. Werner Heisenberg, recalling a conversation with Einstein in 1926, interviewed by Thomas S. Kuhn, February 15, 1963
Researchers expect a robot to abide by the laws of physics. This is an outdated view of science. A robot navigating through nature doesn’t abide by the laws of physics; it is the laws of physics. Or, more precisely, the robot’s behavior is what future scientists will call the laws of physics. Those laws might still be formulated in mathematics, but that is an old-fashioned way of describing patterns in nature, kind of like using Latin to describe science. In the future, the laws of physics will be expressed in the behavior of AI systems. So, for example, instead of saying this robot is following a Jacobian when reaching out to grab an apple and using force F to overcome inertia I, we’ll say this apple belongs to the class of objects that are grabbed in a particular way by a particular robot.
Theoretical physics research will morph from conjecturing about the laws of nature to designing the data structures, algorithms, and models that best learn and solve certain physics problems.
For example, instead of trying to find a function that describes the evolution of two hydrogen atoms, scientists will think about the data structure needed to train a model to predict this type of evolution. Data, model architecture, and algorithm design are going to be the work of theory.
The painted word turns into a virtual world
In his seminal essay “The Painted Word,” Tom Wolfe aptly describes the ascent of modern art in the first half of the 20th century. The centerpiece of Wolfe’s argument is that paintings and art are perceived through theory. We see what we think. Therefore, art is really what happens in the brain when words and observation collide. But it’s always the theory that precedes the art. In that sense, the artist is just an illustrator of words, and the true artists are the editors, curators, and art historians who fill the void between what we see and what we think.
Heisenberg puts a hard stop on observation
Wolfe was an avid observer of New York elites and focused on art and money. But his point has a much wider meaning. Wolfe captures the Zeitgeist of the 20th century. Art is just a narrow lens through which the entire culture of the 20th century can be seen, felt, and grasped.
And it didn’t start in the galleries on the West Side of New York nor in the salons of rich patrons on the Upper East Side. It started much earlier, in the corridors of German universities around the turn of the century, when the world as it was known was turned upside down. In his conversation with Heisenberg, Einstein lays the foundation of what the 20th century will eventually become: a century of theory. This would have phenomenal implications for humanity and bring about the twins of progress and destruction. When humans rid themselves of the shackles of observation and instead embark on the liberating path of imagination and theory, they have much more potential. But they also run the risk of destroying themselves. Theory can bring about amazing innovations such as cheap, abundant energy through nuclear fission. But it also brings Hiroshima and the Gulag. Thus, the fingerprint of the 20th century is the act of seeing through theory, a uniquely human invention that gives us superpowers.
In this essay, I will describe the wider implications of modernist culture in the 20th century, from art to Wall Street and science. It’s not just in the arts that theory determines what we see. The same applies to investing, entrepreneurship, and, most crucially, the sciences, as Einstein says in his dialog with Heisenberg. Today, we are at the crossroads of theory. Inspired by advances in artificial intelligence and machine learning, we are entering a new era where theory becomes less crucial and rapid trial and error at scale takes over. I call this new period the “Era of Hyper-Conjecture.” Whether it’s robots learning how to drive or digital assistants learning how to write poems, the process of developing theories is changing. Instead of creating theories, scientists will focus on developing pipelines for AI systems to create theories.
Recently, my fully end-to-end-trained self-driving Tesla stopped at a red light and turned left after the green arrow popped up. My son asked me whether the car knew what an arrow was. I said, “No, the car doesn’t care. An arrow, as far as the Tesla is concerned, is the thing that makes the car behave in a certain way. It learned this through millions of iterations.”
Move 37
When DeepMind demonstrated AlphaGo Zero, a machine that beat the best players in the game of Go, they shocked the world. But the shocking thing was not that a computer could beat a human in a board game. It was the fact that AlphaGo Zero was never formally taught how to play Go. It learned all by itself. No explicit knowledge was required—no theory. All the machine needed was data, energy, and some time to figure out the best path through the maze of many optional paths to victory. AlphaGo Zero was trained with reinforcement learning: a set of rules that allows an agent to learn from mistakes and improve its position in the game. RL, as reinforcement learning is called, allows for adaptive improvement through trial and error. The machine gets rewards for doing something right and punishments for the opposite. It is precisely the role of the scientist to design appropriate rewards and punishments so the agent can better optimize its trajectory through the maze of all possibilities.
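To make the reward-and-punishment loop concrete, here is a minimal, self-contained sketch in Python. Everything in it is a stand-in: the “game” is a toy corridor where the agent must reach the right end, and the reward values are invented for illustration. Note that the scientist’s design work is the reward function, not the strategy.

```python
import random

# Toy world: positions 0..5; the agent wins by reaching position 5.
GOAL, ACTIONS = 5, (-1, +1)  # actions: step left or step right

def reward(next_pos):
    """The scientist's contribution: a prize for winning, a small
    penalty per step so the agent learns to win efficiently."""
    return 1.0 if next_pos == GOAL else -0.01

# Value table: how good each (position, action) pair looks so far.
Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}

for _ in range(2000):  # many cheap trials
    pos = 0
    while pos != GOAL:
        # Explore occasionally; otherwise exploit current knowledge.
        a = random.choice(ACTIONS) if random.random() < 0.1 \
            else max(ACTIONS, key=lambda x: Q[(pos, x)])
        nxt = min(max(pos + a, 0), GOAL)
        best_next = max(Q[(nxt, x)] for x in ACTIONS)
        # Nudge the estimate toward reward plus discounted future value.
        Q[(pos, a)] += 0.1 * (reward(nxt) + 0.9 * best_next - Q[(pos, a)])
        pos = nxt

# After training, the learned policy at every position is "go right".
print({s: max(ACTIONS, key=lambda x: Q[(s, x)]) for s in range(GOAL)})
```

No rule of the game is hard-coded beyond the reward; the winning behavior emerges from trial and error alone.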
RL is still an emerging field in the broader area of artificial intelligence. I once joked with some researchers that RL is the hip-hop of the 1980s. Back then, hip-hop was an underground music scene in the shadows of rock and pop music. But it kept going and emerged as the quintessential musical and cultural melting pot of our time. Today, RL operates in the shadows of more prominent AI architectures, such as transformers, which are trained on statistical approximations with lots of data. In essence, modern AI systems are compressed representations of the real world with highly optimized algorithms for prediction. For example, if I show the AI a picture of an apple, the AI searches its representation of the world for a match until it finds something that resembles an apple. It then classifies the picture as an apple. A similar method is used in the more sophisticated approaches of ChatGPT and Perplexity, two popular AI engines. Here, we input more than just an image or a word. We can input a whole sentence, and the machine will output something that most likely resembles a useful reaction to the sentence, such as an answer. The problem with all these approaches is that the AI must have some sort of representation of the real world stored in its memory, something researchers call a “world model.” Think of modern AI such as ChatGPT or Perplexity as compressed representations of the knowledge of the internet. We then search that compressed knowledge base for insights.
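As a loose illustration of “searching the compressed representation for a match,” here is a toy nearest-neighbor lookup in an embedding space. The embeddings and labels are made up for the example; real systems learn such representations from enormous datasets.

```python
import numpy as np

# Toy "world model": a compressed memory mapping concepts to vectors.
memory = {
    "apple":  np.array([0.9, 0.1, 0.0]),
    "banana": np.array([0.1, 0.9, 0.0]),
}

def classify(embedding):
    """Return the stored concept most similar to the input (cosine)."""
    def cosine(v):
        return float(v @ embedding) / (np.linalg.norm(v) * np.linalg.norm(embedding))
    return max(memory, key=lambda label: cosine(memory[label]))

# A picture of an apple, already encoded as a vector by some encoder.
print(classify(np.array([0.8, 0.2, 0.1])))  # -> "apple"
```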
RL is different. Here, we operate without a world model, or, to be more precise, we don’t need to know anything about the world in order to operate in it. We don’t need to know what an apple is, how fast it falls from the tree, or whether its texture is soft or hard when we bite it. All we need are lots and lots of training sessions. We must find a way to navigate through the real world and learn from our mistakes. This is how humans learned to navigate the real world over the millennia before us. RL is the quintessential tool in this new era of Hyper-Conjecture. We will train robots to navigate the real world and make increasingly complex decisions based on billions of trials in simulation. Eventually, they will get it. This is the central thesis of this essay. Humanity will rid itself of the shackles of theory and embark on a new frontier where knowledge, intelligence, and even wisdom will be derived from agents that are trained in simulated and/or real environments where they have the luxury to make mistakes, learn, and eventually get really good at navigating the real world.
Now, Theory 2.0 doesn’t mean we are done with thinking about how stuff works. On the contrary, we must focus much more on the inner workings of things. But this time, it’s different. Theory, as we know it, is a mapping of reality to our model of the world. We see things, and then we conjecture what the underlying dynamics might be. When Galileo saw the apple fall from the tree, he guessed what might be going on there. He tried to map his observation to a model of the world and came up with a theory of motion, which later culminated in Newton’s gravity.

Today, Theory 2.0 is concerned with the three fundamental building blocks of AI, which I call the AI stack: data, architecture, and model. The central doctrine of AI is data -> architecture -> model. Theory 2.0 will work on those three building blocks. Instead of thinking about the formula for the motion of an apple, physicists in Academia 2.0 will think about what kind of data is necessary, what architecture is most appropriate, and which model is most effective in approximating an apple falling from the tree. The ultimate goal of Physics 2.0 is to design a robot that catches the apple. Watching the robot catch the apple will be the equivalent of writing a paper. The explanation will be the code on GitHub. Whether this model comes up with a better theory of gravity or not is irrelevant. What matters is that the AI finds the best possible approximation of an apple falling from the tree and then catches it in the most efficient way. Theory 2.0 is concerned with the input to the AI stack, not the mapping of observation to theory. In that sense, theory will become much more like art or movies. We observe stuff and get a sense of what it means. Some might still attempt to theorize about it, but who cares? What matters is how the robot reacts.
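As a toy illustration of the data -> architecture -> model doctrine, consider the falling apple. The sketch below is purely illustrative and deliberately simple: synthetic drop data stands in for real measurements, and a fitted polynomial stands in for a neural network. The point is that the “law” (the gravitational constant) is read off the trained model rather than postulated up front.

```python
import numpy as np

# 1. Data: simulated observations of an apple's height over time,
#    generated with noise to mimic imperfect sensors.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.4, 200)                               # seconds
h = 10.0 - 0.5 * 9.81 * t**2 + rng.normal(0, 0.05, t.size)   # meters

# 2. Architecture: a degree-2 polynomial in t (a stand-in for a network).
X = np.stack([np.ones_like(t), t, t**2], axis=1)

# 3. Model: least-squares fit. The t^2 coefficient recovers roughly
#    -g/2 (about -4.9), i.e., gravity is learned, not postulated.
coef, *_ = np.linalg.lstsq(X, h, rcond=None)
print(coef)

def predict_height(t_query):
    """The trained model replaces the formula: ask it where the apple is."""
    return float(coef @ np.array([1.0, t_query, t_query**2]))
```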
Imagine a robot on Mars catching an apple. This process will look very different from earthly apple catching. Scientists will design robots to do just that: catch an apple on Mars. They will read the laws of physics on Mars from the behavior of the robot.
Outline
This essay is structured as follows: First, I will give a brief overview of the cultural and scientific developments that led to the pivotal changes at the turn of the 20th century. Specifically, I will focus on the parallels between Picasso and Heisenberg. Second, I will describe the impact of modernity on art and science, and third, I will expand on its influence on Wall Street and the broader economy. In the fourth section, I will focus on the evolution of AI and its inner workings. For more than half a century, AI has been dominated by Alan Turing’s imitation game and the idea of building digital agents that can trick humans into believing they are dealing with other humans. In other words, AI has been about virtual agents acting as if they were humans. Today, we are embarking on a new trajectory, with AI working its way through the maze of nature, and the ultimate arbiter is not humans but physics and nature. The tool for that is called reinforcement learning. Eventually, all AI will be called reinforcement learning, or RL will simply be labeled AI; either way, the two will converge. Finally, fifth, I conclude with a discussion of the profound impact on culture and science this intellectual shift will bring about. In particular, I will examine the shift away from theory about nature towards theory about how to design algorithms that navigate through nature in order to learn from it. I call this new era “Theory 2.0,” or the “Era of Hyper-Conjecture.”
1. What I see is what I think
When Van Gogh and his friends set out to the countryside to paint flowers and lakes, they didn’t do anything unusual. Artists had been working outdoors for centuries, trying to replicate what they saw. Realism was not just the artistic narrative du jour; it was the cornerstone of culture and science. When the impressionists came along, they simply changed their perspective from what they saw to what they felt, or “what they were impressed with.” Seeing meant understanding. This subtle difference brought about widespread changes. In that sense, the artist was nothing but a scientist with a different laboratory. Instead of alchemy or physics, they used brushes and paint. But the goal was the same: to understand nature. “What I see is what I think.” And thinking is understanding. “If I can paint it, I can understand it.”
By the time impressionism came along, this type of thinking was well established in the salons and universities of Europe. But who started this? Things like that don’t just happen. They evolve. Think about it: how did something that started as bacteria, then evolved into fish, frogs, birds, apes, and eventually humans come up with the idea to see and understand? How did that happen? It most likely started somewhere around the time of ancient Greece, probably even before. But let’s stick with the Greeks, since they are the most famous. They looked at the sky and figured that there was something to learn from what was going on there. Greek philosophers, scientists, and artists started to look, and lo and behold, they started to learn from looking. In fact, the mere desire to observe things constitutes an intellectual achievement. My cat doesn’t spend energy observing the sun. Humans do.
Enter Jesus and his Judeo-Christian sect. They didn’t agree with that notion and created something else. Instead of guessing and conjecturing about nature, Jesus et al. created a monotheistic religion, which dictated what people should think when they look. Everything was dictated by God and the Scripture. It took humanity more than a thousand years to abandon this view. When Galileo looked at the apple falling from the tree, he conjectured that there was something fascinating going on there. “The apple doesn’t just fall from the tree; the earth is also falling towards the apple.” This made no sense to contemporaries and particularly upset the clergy, since they were in charge of knowing about the world and explaining it to regular people. Galileo was a threat, and they attacked him for that. To see and form a theory was not okay. There was no room for theory because the Scripture already had everything written down that there was to know. “Who is this guy, Galileo, to come up with new knowledge?”
2. From “Look first, then think” to “Think first, then look”
But Europe was way too competitive to let an opportunity like that fly by. Columbus had discovered America not long before, and fortunes were to be made with advancements in science, technology, and warfare. So Galileo’s idea to use theory as a description of nature gained traction and proliferated through the universities of Europe. Whether the church liked it or not, it couldn’t deter the academics from jumping on the theory bandwagon. When Newton came along with his work on the mathematics of heavenly bodies, the dam broke, and science and technology were officially born. Out of that fascinating work, engineering evolved, and humans suddenly started to build wonders of the world. Machines, ships, engines, bridges, and eventually whole industries were built on the shoulders of theory.
Let’s look at this a bit more closely. What is theory exactly, and how does it help in developing steamships or bridges? Theory is mapping a model to the real world. You see things, and then you try to put them into words so others can understand them. Galileo had a clever idea back then, around the turn of the 17th century. He thought that expressing theory in the language of mathematics might be a good idea. Why? First, it was much more convenient than Latin, the then-prevalent language of professional researchers. Human language was and still is messy. Using numbers and symbols simplifies the communication of complex physical concepts. So Galileo established mathematics as the first universal description language for science. This was probably his biggest contribution to the world, since mathematics proliferated through the academy like wildfire. But there is a second reason why math is so successful in science: it works so well. In his 1960 essay “The Unreasonable Effectiveness of Mathematics in the Natural Sciences,” physicist Eugene Wigner wrote about this mystery. His conclusion is that the reason math is useful to science is hard to explain. But scientists don’t care. It works.
Newton took it to the next level and constructed the first universal theory of nature in his Philosophiæ Naturalis Principia Mathematica. Let’s think about this for a second. Newton didn’t just come up with the clever idea of gravity. He also managed to formulate complex relationships in a language others were able to understand and replicate. Mathematics is the language of science because it’s extremely efficient. And it works. Well, most of the time. But more on that later. Newton unleashed a revolution in science and technology because his concepts were understandable, replicable, and succinct.
Observation became the key concept with which to fight the doctrine of the church. “It’s not about what the scripture says, but what I see," a typical scientist would say in the 19th century. Victorian England, the epicenter of science back then, was obsessed with observing, collecting, referencing, filing, and organizing. Take Darwin, for example. He embarked on a ship to observe the Galapagos, and while doing so, he came up with the theory of evolution. But observation didn’t just serve the sciences well. The arts also benefited from a new perspective, away from scholastic doctrine towards what the artist actually sees. Impressionism followed realism because it offered more opportunity. It’s not just what you see but also what you feel while observing.
Enter Werner Heisenberg and the Uncertainty Principle. In his pivotal work on the inner workings of quantum mechanics, Heisenberg formally established that there is a fundamental limit to observation. There is literally a smallest unit of observation beyond which the act of observing doesn’t even make sense anymore. Another important aspect of Heisenberg’s Uncertainty Principle is that an observation only makes sense in relation to something else. In fact, the whole edifice of quantum physics is built on the idea that there must be an observer outside of the system to look at it for the system to have scientific meaning. So, observation matters, but it has fundamental limits. It’s impossible to overstate the impact of Heisenberg’s principle on contemporary science and culture. If observation has limits, then what else is there? How about the mind? How about theory?
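For reference, the standard position-momentum form of the principle states that the uncertainties of position and momentum cannot both be made arbitrarily small:

$$ \Delta x \, \Delta p \geq \frac{\hbar}{2} $$

where $\hbar$ is the reduced Planck constant. Below this bound, sharper measurement of one quantity necessarily blurs the other; that is the “hard stop” on observation.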
Picasso and Einstein both offered answers to this question. They would go on to define the Zeitgeist of the 20th century. While the former used canvas, the latter used chalk and a blackboard. But both men centered on one crucial idea, which is that we see what we think. “Observation is theory-laden,” to use David Deutsch’s description of the same idea.
“I don’t want to fool the eye. I want to fool the mind,” says Picasso. Painting, according to Picasso, was a means to express thoughts. Even though the Spaniard was very physical when it came to his paintings, refusing to be sucked into lengthy excursuses of theory and pondering about his art, he was well aware of the impact his revolutionary style had on the art world. Paintings such as “Les Demoiselles d’Avignon” and the follow-on cubist pieces reflected a new direction where art and painting were an illustration of thought. Most crucially, this thought was not objective but subjective. In other words, looking at art is a personal experience. Whatever the artist tries to express eventually lies in the eye of the beholder. Observation matters in the sense that the observer is part of the creation. Art is thought first and then seen, not the other way around.
The same dynamics overtook science. Heisenberg’s uncertainty principle put a hard stop on observability and thus paved the way for theory to take the throne in science. Theoretical physics became a thing and propelled its most prominent proponents, such as Einstein, Fermi, and later Feynman, to rock star status. Observation, according to Einstein, is theory-laden. What you see depends on what you think. This was a massive shift in attitude away from the values of the Enlightenment and back to the more anthropocentric world of theory and human conjecture. Theory opens many more doors and opportunities, and in the words of David Deutsch, it is literally “The Beginning of Infinity,” as the British philosopher eloquently formulated in his book of the same title. But theory also brings risks. Communism and fascism are based on theories; Mao’s Great Leap Forward was a theory, and today’s tendencies toward intolerance and harassment in American liberal politics are based on theories that often defy the very foundation of liberal democracy. In short, theory comes with upside and downside.
3. Investing is theory applied to finance
When investors fund entrepreneurs, they often rely on nothing more than theory. Most of the time, it doesn’t work out for them, but every now and then, a new trillion-dollar company is born. Wealth creation, progress, growth, democracy, and every other good thing about our economic system rely heavily on theory. There is a fine line between theory and illusion, between hope and hype. It’s the core skill of a seasoned investor to differentiate between exactly those two. Nothing happens without theory. Whether it’s Bill Gates developing a new software operating system, Steve Jobs building a small, internet-enabled pocket computer, Jeff Bezos building a cloud computing infrastructure, or Elon Musk building a software-defined electric car, all these endeavors require belief in theory. It’s an almost uncanny ability of humans to invest billions of dollars in such theories. On the one hand, it’s great and produces enormous wealth. On the other hand, it’s borderline irresponsible. Human ingenuity and progress depend on exactly this fine line. To paraphrase Tom Wolfe, “Theory is the first-class coach on the freight train of history.”
Where does this belief come from? How have humans developed the ability to believe in their thoughts and take enormous risks? When Starship launches from Boca Chica on the shores of the Gulf of Mexico, SpaceX engineers rely heavily on theory to make it work. What is going on here?
Theory is the mapping of ideas to reality. It’s like a muscle; you train it, and it gets better. What exactly happens at the level of synapses and neurons in the human brain is not clear. But we know for sure that an emergent property comes out of the process of generating theory: ideas. Those ideas sometimes proliferate, like memes, and change the world. Ideas are the workhorses of theory; just as proteins originate from DNA code, ideas come out of theory. But where does the latter come from? Maybe it’s just luck. Maybe some sort of loop that amplifies itself, as Douglas Hofstadter suggests in his book “I Am a Strange Loop.” Whatever it is, it follows some sort of feedback loop where input comes in, gets processed, and then, through some sort of self-referential feedback mechanism, generates memory and knowledge. From there, theory emerges, and from there, ideas.
4. The emergence of AI and Theory 2.0
It’s remarkable how early AI pioneers like Yann LeCun, Geoffrey Hinton, and Yoshua Bengio built models based on this idea. Whether they were inspired by Hofstadter remains an open question. His seminal book “Gödel, Escher, Bach” appeared in 1979, which, from a timing perspective, works perfectly. Yann LeCun et al. introduced the canonical convolutional neural net with the spicy name “LeNet” in the late 1980s. The goal of LeNet was to build an artificial reader of human handwriting. They were literally working on a problem for the U.S. Postal Service, which was struggling to identify handwritten addresses on letters. LeNet was built on a convolutional neural net architecture, but most importantly, it relied on a feedback loop where signals come in and get processed through a mesh of layered neurons until they reach a final output layer. This result then gets compared with a real example, or ground truth. In order to improve the result, the network then back-propagates through all the layers of neurons and updates their parameters according to a specified optimization function. This negotiation between input and output continues in a closed loop until the network stops improving. In other words, the loop continues until there is no more point in looping forward and backward. In technical jargon, this means that the network converges.
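Here is a compressed sketch of that loop, written in modern PyTorch for legibility. The tiny network below is a stand-in, not the original LeNet-5 architecture, and the function name is my own:

```python
import torch
import torch.nn as nn

# A toy convolutional net: signals pass through a mesh of layered neurons.
model = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5),  # 28x28 input -> 24x24 feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),                 # -> 12x12
    nn.Flatten(),
    nn.Linear(6 * 12 * 12, 10),      # 10 output classes (digits 0-9)
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

def training_step(images, labels):
    """One pass of the forward/backward negotiation described above."""
    logits = model(images)           # forward: input to final output layer
    loss = loss_fn(logits, labels)   # compare result with ground truth
    optimizer.zero_grad()
    loss.backward()                  # back-propagate through every layer
    optimizer.step()                 # update parameters per the optimizer
    return loss.item()               # repeat until this stops improving
```

In practice, training loops over batches of labeled images until the loss plateaus, which is the convergence the paragraph above describes.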
This type of looping, according to Hofstadter, might be the key to knowledge and consciousness. The core idea here is that information gets processed in a knowledge-defining loop. Now, the question naturally comes up: what kind of information and what kind of knowledge? Philosophers have debated this topic for millennia. It boils down to the question of purpose and what it means to be intelligent.
"I need my eyes to see and my I to be." D. Hofstaedter
AI research is still in the early stages of climbing this steep intellectual hill. But some ideas have already been formulated. In particular, Berkeley professor Sergey Levine comes to mind with his notion of a “Robinson Crusoe robot.” Levine asks: What does it mean to build a machine that can be left on a deserted island and that, when he visits the island two years later, has organized itself with all the necessary ingredients for survival? For example, it figures out how to harness energy to refill its batteries; it learns how to build shelters to protect itself from the sun and rain; and so on. The key concept here is to build a general robot that has human-like attributes. Humans can be left stranded in the South Seas or in the Amazon jungle, and they have the ability to at least try to survive. Modern robots don’t. This difference might be just what defines intelligent life. It’s not about skills or the ability to predict things. It’s neither ChatGPT regurgitating Shakespeare, Perplexity delivering amazing search results, nor a Tesla driving around San Francisco by itself. The key to sentient intelligence lies in the ability of the machine to develop a sense of “I.” The sense of “I,” according to Hofstadter, is what separates sentient beings from plants, algae, or viruses.
Is my Tesla sentient? Recently, I subscribed to FSD, which is the closest thing I have experienced to a robot that feels human. The car nudges forward at a stop sign, waits, and then quickly turns. It slows down in a school zone and crawls over speed bumps. Still, it doesn’t feel sentient. But where is the line? AI researchers often talk about unsupervised learning, the technique of analyzing data without labeling it and letting the algorithm make sense of it. ChatGPT and Tesla are both trained on unsupervised learning architectures.
Here, I introduce another term: “unsupervised intelligence.” It’s when machines learn to navigate the world without having been programmed for specific tasks and/or goals. Levine’s “Robinson Crusoe” robot is an example of unsupervised intelligence. Imagine sending a rover to Mars. This machine will not have any connectivity with Earth. It must be able to function without specific supervision. Unsupervised intelligence is still new and barely explored. In my opinion, research in this area will answer many questions about consciousness and sentient beings.
5. Design the machine that creates the theory
Back to where AI stands today. Breakthroughs such as GPT, which stands for generative pre-trained transformer, have solved the problem of meta-learning. These models learn, from many different sources of data, how to extrapolate a given input. In other words, when prompted, they are able to generate something that makes sense. Today, these models are primarily used in the text domain, but it’s only a matter of time until transformer technology penetrates the robotics and physics domains with so-called multi-modal transformers. The latter, in my opinion, is just a matter of compute, memory, architecture, and a bit of model innovation. It doesn’t require new breakthroughs, just engineering. Once multimodal transformers are up and running, physics will change forever.
Imagine a prompt: “Show me a robot that catches an egg.” This robot will, in simulation, act based on billions of iterations it previously learned about how to catch an egg. Physicists will observe this robot and formulate a new version of gravity, which is “When catching an egg, this is how a robot behaves.”

Now let’s take this to the next level. Imagine a simulator that simulates billions of versions of the egg falling. The key to this simulator is that it was trained on real “eggs falling.” This simulator would be able to generate billions more examples of eggs falling: eggs falling straight, being pushed by the wind, experiencing resistance due to snow, etc. The key to this simulator is that it was trained end-to-end, without any explicit physics. Nobody tells the simulator how an egg is exposed to fluid dynamics when carried by the wind. It just learns from millions of examples to generate billions more.

Now, our robot must be programmed in a certain way so that it captures the essence of catching an egg. Current state-of-the-art models deploy something like Q-learning, which is a form of reinforcement learning. Q-learning has captured the imagination of researchers since DeepMind presented AlphaGo Zero, a machine that was able to beat the best Go players. The key innovation in AlphaGo Zero was that it was never formally taught anything about the game of Go. It was neither imitating human players nor was it given any prior information on how to successfully navigate the board. All AlphaGo Zero did was play billions of iterations against a simulator and learn how to play from there, using something akin to Q-learning. The result was extraordinary. Not only did AlphaGo Zero beat all human players, it also beat other artificial intelligence agents. Even more exciting was the so-called move 37, which AlphaGo Zero’s predecessor, AlphaGo, chose in its 2016 match against Lee Sedol and which, at first sight, did not make any sense to human experts. Only many moves later did human experts realize that move 37 was key to winning the game. In other words, the machine came up with its own move, independent of what humans would ever have come up with. Move 37 is one of those moments in the history of science when many things change.
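For reference, the canonical tabular Q-learning update (a simplification: AlphaGo Zero itself combined a policy/value network with Monte Carlo tree search) nudges the estimated value of a state-action pair toward the observed reward plus the discounted value of the best next action:

$$ Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right] $$

Here $s$ is the current state, $a$ the action taken, $r$ the reward received, $s'$ the next state, $\alpha$ the learning rate, and $\gamma$ the discount factor. Iterated over billions of trials, this simple rule lets strategy emerge without any theory of the game.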
There is another move-37-type example from a very different experiment. Davide Scaramuzza, a professor at the University of Zurich, set out to compete with the best human drone pilots. His goal was to build a drone guided by artificial intelligence that could beat them. Needless to say, Scaramuzza’s drones obliterated the human competition. But what is more interesting about his experiment is the analysis after the race. He compared the trajectories of the human-operated drones to the ones his robot drones chose and found that the robots did something very unusual. When turning into a gate, they sometimes did almost a somersault, which gave them a little more momentum for the next turn. This move, according to human experts, is very risky and would not have occurred to humans because their success rate would be so low. In other words, the robot was able to practice on millions of simulated races and learn a move that humans could not have discovered without crashing countless drones.
Imagine a downhill ski racer going down the Streif in Kitzbühel, the most notorious downhill ski race in the world. Imagine the racer choosing to do a sideways somersault instead of turning into a particularly steep and narrow section. This might save him a few hundredths of a second. But a human would never attempt something like that because he would most likely die. The machine, however, has the luxury of testing such moves in millions of iterations in simulation and then, once it has figured out all the physics, applying them in the real world. A robot trained in simulation would most likely win the Kitzbühel race the same way Scaramuzza’s robots won the drone race.
The key phrase here is “once it has figured out all the physics.” My point is that a machine would learn physics by iterating over millions, if not billions, of trials and errors until it figured out the optimal way to do things. The same applies to the robot catching an egg.
In this new world, physics is not about equations and theoretical concepts but about building robots that interact with the world in simulation. In other words, the key contribution of physicists in the future will be to design pipelines so AI systems can learn in simulation how to interact with the environment, then translate the learnings from simulation to the real world. This is easier said than done. When AlphaGo Zero inspired the world of AI with a superhuman Go player, the system was trained in simulation. But AlphaGo Zero had the luxury of training in a static environment: Go boards never change. Scaramuzza’s drones, by contrast, had to fly in the real world, and the hypothetical Kitzbühel racing robot would have to race in real snow, exposed to real physics. Building simulations of such environments is hard. It’s a problem not yet solved as of this writing in 2024. It’s the so-called inverse graphics problem: instead of rendering a 3D simulation into a 2D image, you take a 2D image of a real-world 3D scene and recover the underlying 3D scene with all the physics intact. Solving this problem is what the science of the 21st century will look like.
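Until the inverse graphics problem is solved, one common workaround for crossing the sim-to-real gap is domain randomization, a technique I am naming here for illustration (the argument above doesn’t prescribe one): rather than building one perfect simulator, you randomize the simulator’s physics so the learned behavior cannot overfit to any single imperfect world. A hedged sketch, with invented parameter names and ranges:

```python
import random

def randomized_sim_params():
    """Sample a slightly different 'world' for each training episode.
    All names and ranges below are illustrative stand-ins."""
    return {
        "gravity":       random.uniform(9.7, 9.9),   # m/s^2
        "snow_friction": random.uniform(0.05, 0.3),
        "wind_gust":     random.uniform(0.0, 8.0),   # m/s
        "sensor_delay":  random.uniform(0.0, 0.05),  # seconds
    }

# A policy trained across thousands of such randomized worlds has, in
# effect, "figured out the physics" well enough to survive the real one.
for episode in range(3):
    print(f"episode {episode}: {randomized_sim_params()}")
```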
In this new world of science, researchers do not spend much time understanding how nature works. They put their energy into building simulators and designing robots that interact with those simulators. That’s the key. I call this new type of science Science 2.0. Instead of formulating theories about physics, scientists will spend their efforts designing data pipelines to feed simulators and reinforcement learning architectures for AI systems to interact with those simulators.
Physics will become the science of designing AI systems and simulators and observing how they interact with virtual environments. Instead of formulas and equations, physics will be expressed in behavioral language: “The egg falling induces the robot to act in this particular way.” Instead of formulas, the language of physics will consist of trajectories. The key difference between Physics 1.0 and Physics 2.0 is that in the former, researchers try to formulate behavior a priori in terms of mathematical formulas, whereas in the latter, the goal is to build an AI stack so that when the model is prompted with a query, it offers the optimal trajectory.
The role of AI is to serve as a function approximator
Where does this leave technology? How can we build a Golden Gate Bridge or a self-driving Tesla with this kind of new science? We don’t. We let the robot build it. More precisely, instead of developing mathematical shortcuts to nature, which is how science and technology work today, we will design AI systems to come up with the best solution to engineering problems. The prompt would be something like, “How would you build a bridge across the Golden Gate strait in San Francisco?” The robot has access to a pre-trained representation of the real world, with objects like steel beams and their relationships with each other as well as with wind, water, snow, and all other elements. Think GPT in four dimensions (3D plus time). Now, the AI will iterate over billions of attempts at how to build the bridge until it finds a path towards building a functioning one. In this new world, technology means optimizing robots to execute tasks based on the experience they gained from iterating in simulators that were trained to generate reality. The loop of generative AI and physics is closed through simulation.
The new type of science resembles “Constructor Theory,” as formulated by David Deutsch and Chiara Marletto.
Conclusion
Humans see themselves through the lens of science. When Boltzmann formulated the theory of gases as a probabilistic approximation of the behavior of molecules, Impressionist painters chose to approximate lines with points. When Heisenberg put a hard stop on the ability to observe things, artists chose to observe through their thoughts. “We see what we think” was the motto. Today, we are at another crossroads in science, where AI is taking over the analysis of nature. Science in the era of generative AI will consist of developing data pipelines to feed simulators that can train AI to solve engineering problems. In this world, science will find a new language, one that moves away from mathematical formulas towards a behavioral description of systems. Observing nature will be akin to observing how AI moves through nature. Scientists in the 21st century will focus less on asking questions about how atoms react with each other or how jet engines process kerosene. They will invest their ingenuity in designing appropriate data pipelines and algorithms that enable systems to iterate in simulation and learn behaviors. Those behaviors are what we today call technology.

Key to this new type of science are three fundamental tasks every scientist will be confronted with. First, they have to represent the world as accurately as they can in the virtual world. Second, they have to develop algorithms, such as Q-learning, that can iteratively find trajectories in simulation that solve a specific problem. Third, they have to translate the learnings from simulation back to the real world. Some tools, like the petri dish, the Bunsen burner, or the electron microscope, will become relics for science museums. Others, such as math, will retain a crucial role, and new tools, such as the Universal Scene Description (USD) data format, will become standard in science labs. Theory 2.0 is about designing systems that create solutions for science problems.