Bitter Lesson Everything
What works for AI research also works for everybody else
In a future shaped by AI, the only way to stay competitive is to align with its core strength: leveraging computation at scale. As Richard Sutton argues in his seminal essay “The Bitter Lesson”, general methods that leverage computation ultimately prove the most effective. This perspective, while rooted in AI, offers a universal strategy for thriving in an increasingly computational world.
If you can’t beat it, join it
The core insight of “The Bitter Lesson” is that the most effective path forward in AI research is to embrace general methods that harness the power of computation, rather than crafting narrow, bespoke solutions for short-term goals. Sutton argues that attempts to manually optimize for specific tasks often fall short as computational power grows, enabling simpler, scalable algorithms to dominate.
The same principle applies beyond research. For companies, it means building adaptable, computation-driven systems rather than rigid, one-off strategies. For careers, it suggests prioritizing skills and tools that scale with technological progress, like mastering AI-driven platforms or data-driven decision-making. For individuals, it’s a call to adopt flexible, learning-oriented mindsets that scale with computation.
Tesla’s FSD is a case study in a Bitter Lesson-pilled product strategy
When Tesla introduced FSD in 2016, the technology was far from achieving true autonomous driving. Yet, the Tesla team made a bold strategic decision to invest heavily in custom-designed inference chips—specifically, the Hardware 3 (HW3) and later Hardware 4 (HW4) processors. These chips, tailored for neural network inference, were over-engineered for the FSD software’s capabilities at the time and came with a significant cost, estimated at $1,000 to $1,500 per vehicle. To many, this seemed like a risky bet on a technology that didn’t yet exist.
However, this approach aligns perfectly with Richard Sutton’s insight: prioritize general methods that leverage computation over short-term, bespoke solutions. Tesla’s inference chips were designed to handle massive computational loads, anticipating that future advancements in AI model architectures and training would unlock their full potential. Fast-forward to 2025, and FSD version 14 represents a general solution for autonomous driving. Today, Tesla owners can purchase FSD and experience their vehicles driving autonomously, navigating complex urban environments with remarkable competence. The system continues to improve rapidly, driven by iterative updates to neural networks and the vast data collected from Tesla’s fleet.
Tesla’s Optimus: Conquer labor with compute
Tesla’s Optimus, a humanoid robot designed to perform human-like tasks, is often dismissed by investors and analysts as a clunky, low-value project in its early stages. Yet, Elon Musk has stated that he dedicates much of his time to Optimus, viewing it as Tesla’s most significant Bitter Lesson-driven endeavor. Why? Because Optimus is not about immediate functionality but about building a platform for general intelligence that scales with computation.
Optimus is designed to pursue high-level goals, such as keeping an office clean or ensuring warehouse safety, using goal-oriented reinforcement learning. Unlike traditional robotics, where specific skills (e.g., sweeping floors or folding towels) are explicitly programmed, Optimus is trained on abstract objectives. The robot autonomously develops the necessary skills through a reward function that evaluates its success in achieving these goals. For instance, tasked with keeping an apartment clean, Optimus learns to sweep, fold, or organize by iterating toward the goal, not by following pre-coded instructions.
This approach reflects the Bitter Lesson’s core insight: avoid brittle, human-designed solutions and instead rely on general methods that leverage vast computation. However, real-world reinforcement learning is impractical—physical trial-and-error is slow, costly, and risky. This is where simulation comes into play. By developing high-fidelity physics engines and virtual environments, Tesla can train Optimus in simulation, allowing it to undergo millions of “rollouts” (trials) to refine its skills without real-world consequences. This mirrors Sutton’s vision of true intelligence: a system that learns continuously, free from human-imposed assumptions or priors.
Musk’s optimism about Optimus lies not in its current capabilities but in its potential. In five years, with advances in simulation, compute power, and AI architectures, Optimus could become a versatile, general-purpose robot capable of adapting to diverse tasks. To realize this, Tesla is investing heavily in the necessary infrastructure:
Large-Scale Data Centers: Massive computational resources to train AI models and run simulations.
Durable Robot Hardware: Robots built to operate reliably over long periods.
Low-Power, High-Performance Inference Engines: Onboard chips that enable real-time, offline decision-making.
These investments, while costly and speculative today, position Tesla to capitalize on the exponential growth of computational power, aligning with the Bitter Lesson’s philosophy of betting on scale.
NVIDIA’s Role: Enabling the Bitter Lesson
NVIDIA complements Tesla’s vision by providing the computational backbone for such ambitious projects. Through its Omniverse platform, NVIDIA offers high-fidelity simulation environments powered by OpenUSD and RTX rendering, ideal for training robots like Optimus in virtual worlds. NVIDIA’s GPUs and AI chips, such as the H100 and Blackwell series, deliver the raw compute needed for reinforcement learning at scale. Additionally, tools like Isaac Sim (built on Omniverse) enable developers to simulate robotics scenarios with photorealistic physics, directly supporting the Bitter Lesson’s emphasis on learning through computation rather than human-crafted rules.
NVIDIA’s own strategy is deeply Bitter Lesson-pilled. Rather than focusing on niche, specialized hardware, NVIDIA develops general-purpose GPUs and AI platforms that power diverse applications—from autonomous vehicles to robotics to generative AI. This flexibility allows NVIDIA to dominate industries where computational scale is the key to progress, reinforcing the Bitter Lesson’s principle of prioritizing general, compute-driven solutions.
Why It Matters
Tesla’s Optimus and NVIDIA’s computational platforms illustrate a shared commitment to the Bitter Lesson: build systems that harness computation to learn and adapt autonomously, even if the payoff is years away. While Optimus may seem rudimentary today, its development mirrors Tesla’s earlier bet on Full Self-Driving (FSD). Just as FSD’s over-engineered inference chips seemed excessive in 2016 but enabled autonomous driving by 2025, Optimus’s infrastructure is a long-term investment in a future where humanoid robots are ubiquitous.
The distinction between goal-oriented learning and skill-specific programming is critical. By training robots to pursue abstract objectives in simulated environments, Tesla and NVIDIA are unlocking the potential for machines that don’t just mimic human tasks but continuously improve through computation. This approach avoids the pitfalls of short-term, human-designed solutions that quickly become obsolete as compute power grows.
How can humans scale with compute?
The Bitter Lesson teaches us to prioritize general methods that leverage computation over specialized, short-term solutions. For individuals, this means developing skills that align with the tools and systems driving the future. One standout example is simulation, which is rapidly becoming the cornerstone of innovation across industries. Just as manufacturing once defined industrial progress, simulation will redefine how we design and build everything—from cars and rockets to cities and robots.
Simulation allows us to prototype, test, and refine complex systems in virtual environments before physical production. This approach is compute-intensive, relying on high-fidelity physics engines, general AI architectures, and vast datasets to model real-world dynamics. By mastering skills in simulation—such as building physics engines, designing generative AI models, or creating data-driven virtual environments—you position yourself at the forefront of industries like aerospace, automotive, robotics, and beyond.
A Case Study: Working on Starship at SpaceX
Consider a career at SpaceX, working on the Starship program, which aims to make humanity multi-planetary. What skills would make you indispensable? The traditional path might emphasize mechanical engineering or rocket manufacturing. However, in a compute-driven future, the ability to design and test Starship in virtual environments will be critical. Here’s why:
Physics Engines Grounded in Real Data: Developing simulations that accurately model aerodynamics, material stress, or orbital mechanics requires physics engines trained on real-world data. These engines must scale with compute to handle increasingly complex scenarios, such as landing on Mars.
General Architectures: AI models that power simulations should be flexible and adaptable, capable of learning new tasks without being overly specialized. This mirrors the Bitter Lesson’s emphasis on general methods over hand-engineered solutions.
End-to-End Virtual Design: Future products like Starship will be prototyped, tested, and optimized in virtual environments. Only after rigorous simulation will physical production begin, guided by robots seamlessly integrated with the simulation pipeline.
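The “physics engines grounded in real data” point above can be sketched in a few lines. This is a deliberately minimal illustration, not an aerospace-grade tool: we generate noisy “telemetry” from a falling body with linear drag, then calibrate the simulator by searching for the drag coefficient whose simulated trajectory best matches the measurements. All constants and the noise model are assumptions for the sake of the example.

```python
import random

def simulate(drag, dt=0.01, steps=300, g=9.81):
    """Tiny 'physics engine': velocity of a falling body with linear drag,
    integrated with the explicit Euler method."""
    v, trace = 0.0, []
    for _ in range(steps):
        v += (g - drag * v) * dt
        trace.append(v)
    return trace

# 1. "Real-world" telemetry: the true system plus sensor noise.
random.seed(42)
TRUE_DRAG = 0.8
telemetry = [v + random.gauss(0, 0.05) for v in simulate(TRUE_DRAG)]

# 2. Ground the engine in data: choose the drag value whose simulated
#    trajectory minimizes squared error against the telemetry.
def loss(drag):
    return sum((s - t) ** 2 for s, t in zip(simulate(drag), telemetry))

candidates = [c / 100 for c in range(10, 201)]  # grid search over [0.10, 2.00]
fitted = min(candidates, key=loss)
print(f"true drag={TRUE_DRAG}, fitted drag={fitted}")
# Once calibrated, the same engine can predict scenarios never measured,
# e.g. longer falls or different release conditions.
```

Real simulation pipelines replace the grid search with gradient-based or Bayesian calibration and the toy integrator with high-fidelity solvers, but the loop is the same: measure, fit the simulator to reality, then design against the calibrated model.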
By investing in skills like computational physics, machine learning for simulation, or virtual environment design, you not only become invaluable to companies like SpaceX but also gain the potential to innovate independently—perhaps even competing with industry giants.
Simulation as the New Manufacturing
Today, manufacturing is critical to producing tangible products like cars, rockets, or electronics. While physical production will remain essential, the design process is shifting dramatically. With increasing compute power, products will be:
Prototyped Virtually: Virtual models will allow engineers to iterate designs rapidly, testing countless configurations without physical prototypes.
Tested Virtually: Simulations will stress-test products under extreme conditions, from rocket launches to autonomous vehicle navigation, using AI-driven scenarios.
Scaled Virtually: Production parameters will be optimized in simulation, ensuring efficiency and precision before manufacturing begins.
Produced by Robots: Manufacturing will be executed by robots that integrate directly with simulation data, minimizing errors and adapting to real-time feedback.
This shift reduces costs, accelerates innovation, and democratizes design. A student or professional skilled in simulation can contribute to groundbreaking projects without needing access to physical factories.
How to Apply the Bitter Lesson to Your Career
To align your career with the Bitter Lesson, focus on skills and mindsets that scale with compute:
Learn Data-Driven Tools Such as Simulation: Master platforms like NVIDIA Omniverse, Unreal Engine, or physics-based simulation frameworks (e.g., MuJoCo, Isaac Sim). These tools are becoming industry standards for virtual design.
Understand AI and Machine Learning: Study reinforcement learning, generative AI, and neural network architectures. These are the engines behind adaptive, compute-driven systems.
Embrace General Skills: Avoid hyper-specialization in niche tools that may become obsolete. Instead, build versatile skills in data analysis and computational modeling.
Think Long-Term: Like Tesla’s bet on FSD chips or Optimus, invest in skills that may seem speculative today but will pay off as compute scales. Simulation and computational physics are prime examples.
Adapt: Technologies evolve quickly, so cultivate a mindset of lifelong learning to keep pace. Don’t fall for the sunk cost fallacy.
The Bitter Lesson isn’t just about AI—it’s a blueprint for thriving in a compute-driven future.
By focusing on skills that leverage computation, like simulation, you can contribute to transformative projects while staying resilient in a rapidly changing world. Whether you’re designing the next Starship, optimizing a factory, or building a startup, the ability to harness compute through general, scalable methods will make you indispensable.
In summary, the Bitter Lesson for everyone is clear: just as AI research thrives by betting on computation, so too can you. Embrace simulation as the new manufacturing, develop skills that scale with compute, and prepare for a future where continuous learning and virtual design redefine what’s possible.