Andrej Karpathy Outlines the Shift From Vibe Coding to Agentic Engineering

Andrej Karpathy

May 1, 2026 · Updated May 7, 2026

Andrej Karpathy described a transition toward an agent-native economy in which LLMs become a primary substrate for computing rather than only a way to speed up existing work. He presented agentic engineering as an emerging, high-ceiling discipline for building reliable autonomous systems, with natural language instructions taking over some work previously done by classical code.

Andrej Karpathy, founder of Eureka Labs, outlined a shift toward agentic engineering as the successor to the vibe coding era. He described a future where an LLM can absorb an application's entire logic, as in his menugen example (image in, image out, no classical code needed), and where plain-English .md instructions replace complex installation scripts, with the model targeting the installation to the user's setup and debugging it inline.
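As a rough illustration of the "install .md instead of install .sh" idea, the sketch below gathers a few facts about the local machine and combines them with plain-English installation notes into a single prompt. Everything here is an assumption made for illustration: the function names `describe_host` and `build_install_prompt`, the prompt wording, and the `exampletool` notes are hypothetical, and the step of sending the prompt to a model is deliberately left out because the source does not specify one.

```python
import platform


def describe_host() -> dict:
    """Collect a few facts about the local machine using only the stdlib."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python": platform.python_version(),
    }


def build_install_prompt(instructions_md: str, host: dict) -> str:
    """Combine plain-English install notes with host facts into one prompt.

    Hypothetical helper: the prompt layout is an assumption, not a format
    described in the source.
    """
    host_lines = "\n".join(f"- {key}: {value}" for key, value in sorted(host.items()))
    return (
        "You are helping install software on this machine.\n"
        "Host facts:\n"
        f"{host_lines}\n\n"
        "Installation notes (Markdown):\n"
        f"{instructions_md.strip()}\n\n"
        "Propose the commands for this host one step at a time, "
        "and explain how to recover if a step fails."
    )


# Hypothetical install notes; the tool name and steps are made up.
EXAMPLE_INSTALL_MD = """
# Installing exampletool
1. Make sure Python 3.10 or newer is available.
2. Create a virtual environment and install exampletool into it.
3. Run `exampletool --version` to confirm it works.
"""

if __name__ == "__main__":
    # Print the prompt that would be shown to an LLM of the reader's choice.
    print(build_install_prompt(EXAMPLE_INSTALL_MD, describe_host()))
```

The design choice worth noting is that the Markdown stays the single source of truth, while the host facts are what would let a model adapt the same notes to different machines.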

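A second theme is the jaggedness of model capabilities: the same model can coherently refactor a 100,000-line codebase and still tell you to walk to the car wash to wash your car. Karpathy previously tied this to how verifiable a domain is; he now adds the economics of reinforcement learning (RL, a training phase where models learn from feedback). Revenue and market size dictate which domains frontier labs package into RL training data, so tasks outside that distribution are off-road and comparatively unreliable.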

To prepare for this agent-native economy, shift from writing only Software 1.0 code toward Markdown-based knowledge bases that make information as legible as possible to LLMs, which can compute over unstructured knowledge from arbitrary sources. Karpathy also describes decomposing products and services into sensors, actuators, and logic, and he hints at a possible future in which mostly neural computing is assisted by classical CPUs acting as coprocessors.
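One way to picture the sensors, actuators, and logic decomposition is the minimal sketch below. It is not taken from the talk: the `Product` class, the thermostat example, and the 19 °C threshold are hypothetical, and `rule_based_logic` is a fixed rule standing in for whatever agent or model would supply the logic in practice.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Observation = Dict[str, str]
Decision = Optional[Tuple[str, str]]  # (actuator name, argument) or None


@dataclass
class Product:
    """A product split into sensors (read), actuators (act), and logic (decide)."""

    sensors: Dict[str, Callable[[], str]]
    actuators: Dict[str, Callable[[str], str]]
    logic: Callable[[Observation, List[str]], Decision]
    log: List[str] = field(default_factory=list)

    def step(self) -> Decision:
        # Read every sensor into a plain, text-only observation.
        observation = {name: read() for name, read in self.sensors.items()}
        # Let the logic pick at most one actuator call.
        decision = self.logic(observation, sorted(self.actuators))
        if decision is not None:
            name, argument = decision
            self.log.append(self.actuators[name](argument))
        return decision


def rule_based_logic(observation: Observation, actuator_names: List[str]) -> Decision:
    """Placeholder for an agent: a fixed rule instead of a model call."""
    if "set_heater" in actuator_names and float(observation["room_temp_c"]) < 19.0:
        return ("set_heater", "on")
    return None


def make_thermostat(reading: str) -> Product:
    """Build a toy thermostat whose single sensor always reports `reading`."""
    return Product(
        sensors={"room_temp_c": lambda: reading},
        actuators={"set_heater": lambda state: f"heater turned {state}"},
        logic=rule_based_logic,
    )


if __name__ == "__main__":
    thermostat = make_thermostat("17.5")
    print(thermostat.step())
    print(thermostat.log)
```

Keeping sensor readings and actuator arguments as plain strings is a deliberate simplification here, on the assumption that text is the easiest interface to hand to a language model.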


Andrej Karpathy (@karpathy) · Apr 30

Fireside chat at Sequoia Ascent 2026 from a ~week ago. Some highlights:

The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons:

1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image and an LLM can natively do the thing.
2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc.
3. LLM knowledge bases as an example of something that was *impossible* with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc.

I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3).

The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain, here I expand on this as having to also do with economics because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying or you're off-roading in the jungle with a machete, in relative terms. Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to...

Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors.


