What is vibe coding?
Vibe coding is a way of building software where you describe what you want in natural language, let an AI generate the code, run it, and iterate on the result, all without reading, reviewing, or understanding the code the model produces. The term was coined by AI researcher Andrej Karpathy on 2 February 2025, in a post that described a mode where you "fully give in to the vibes, embrace exponentials, and forget that the code even exists." The tweet was, by Karpathy's own later admission, a "shower of thoughts throwaway", yet it named something millions of developers were already doing and sparked a global conversation about what software development looks like when writing code is no longer the bottleneck.
The practical consequence Karpathy didn't dwell on: vibe coding leans entirely on AI coding agents running inside your editor or terminal, and every one of those agents bills by the token. Forgetting that the code exists doesn't make the token meter stop. It makes it run faster.
This article covers what vibe coding is, how it differs from traditional and agentic coding, where it genuinely shines, where it quietly bites, and (the part most guides skip) how to keep the token bill from turning a fun weekend hack into an expensive one.
How does vibe coding actually work?
Vibe coding is a conversational loop with four steps: describe, generate, run, report.
- Describe. You write a natural-language prompt ("build a markdown note app with local storage") and paste it into a tool like Cursor, Claude Code, or GitHub Copilot Workspace.
- Generate. The model produces the code. You don't read it; you accept it.
- Run. You execute the result. Something works, something breaks, or something looks wrong.
- Report. You paste the error, the weird behaviour, or the next feature request back into the chat. Repeat.
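The four steps above can be sketched as a small loop. This is a minimal illustration, not any tool's actual implementation: `vibe_loop`, `fake_generate`, and `run_snippet` are hypothetical names, the "model" is a stub that returns a broken snippet first and a fixed one once it sees an error report, and `max_turns` is an assumed safety cap.

```python
from typing import Callable, List, Optional, Tuple


def vibe_loop(
    prompt: str,
    generate: Callable[[List[str]], str],
    run: Callable[[str], Tuple[bool, str]],
    max_turns: int = 5,
) -> Tuple[Optional[int], List[str]]:
    """Describe -> generate -> run -> report, until the code runs or turns run out."""
    transcript: List[str] = []
    message = prompt                            # Describe: the initial request
    for turn in range(1, max_turns + 1):
        transcript.append(message)
        code = generate(transcript)             # Generate: accepted without review
        transcript.append(code)
        ok, output = run(code)                  # Run: execute and see what happens
        if ok:
            return turn, transcript
        message = f"It failed with: {output}"   # Report: paste the error back
    return None, transcript


def fake_generate(transcript: List[str]) -> str:
    """Hypothetical stand-in for a model call: buggy first, fixed after an error report."""
    if any("failed" in entry for entry in transcript):
        return "result = 6 * 7"
    return "result = 6 / 0"


def run_snippet(code: str) -> Tuple[bool, str]:
    """Execute a snippet and report success plus either the result or the error."""
    scope: dict = {}
    try:
        exec(code, scope)  # acceptable for this stub only; never exec untrusted code
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(scope.get("result"))


turns, transcript = vibe_loop(
    "build a markdown note app with local storage", fake_generate, run_snippet
)
print(turns, len(transcript))  # 2 4
```

The transcript grows by two entries per turn, and every later call sends all of it back to the model. That growth is where the token cost discussed later in this article comes from.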
How is vibe coding different from traditional coding?
Traditional coding means reading and writing every line. You understand the abstractions, the data flow, the edge cases. AI tools such as autocomplete and inline suggestions speed up the typing but don't change the ownership model: you are responsible for what gets committed.
Vibe coding inverts that. The model writes; you steer. The trade-off is explicit: you move faster on the surface and accumulate debt underneath. Bugs that a careful code review would catch don't get caught, because no careful code review happens. Security issues that a developer would recognise as dangerous pass unnoticed. Technical debt compounds silently because the codebase grows through prompt-and-accept cycles, not through deliberate design decisions.
That's not a reason to avoid it; it's a reason to scope it correctly (more on that below). Many experienced developers use vibe coding for throwaway scripts, prototypes, and anything they'd delete in a week without remorse. The problem is when the prototype becomes the product.
How is vibe coding different from agentic coding?
The distinction matters for understanding the token cost, so it's worth being precise.
Vibe coding is a mindset: low oversight, natural-language-first, accept-what-comes. You can do it with any tool that generates code. The AI is still largely reactive: it responds to your prompts, one exchange at a time.
Agentic coding is a mode of operation: the AI agent plans a multi-step task, executes actions (reads files, runs commands, writes tests, edits code), observes results, and loops autonomously, without a human prompt driving each step. You give the agent a goal; it decides how to achieve it.
Most vibe coding today is powered by agentic tools (Claude Code, Cursor's Composer, Codex), but the tools are more capable than the practice demands. You can use an agent as a vibe-coding interface (prompt → accept → next prompt) or as a genuine autonomous executor (goal → agent plans → agent ships). Agentic coding is what happens when you use the second mode deliberately, with oversight. Vibe coding is what happens when you use either mode with none.
The reason this matters for cost: an agentic loop, even a short one, can execute dozens of sub-steps before surfacing a result. Each sub-step reads files, runs tools, and writes output back into the context window. Every token used in those steps, input and output, is billed. Vibe coding amplifies this because the low-oversight approach means the agent gets vague goals and compensates by reading more context to orient itself.
Where does vibe coding shine?
Used deliberately, vibe coding is genuinely powerful for a specific set of problems:
- Prototyping and idea validation. You want to know if something is feasible before investing real engineering effort. Speed matters more than quality. A vibe-coded prototype that tells you the idea doesn't work in an afternoon is worth far more than a polished codebase you spent two weeks on.
- One-off automation and scripts. Tools you'll run once, or utilities that will never see a production server. If it breaks, you delete it. The risk profile of the code matches the stakes.
- Translating non-code knowledge. Designers, PMs, researchers, and subject-matter experts who can describe what they want in detail but don't write code are the natural vibe coding audience. For them, the gap between "I can describe this" and "I can ship this" collapsed in 2025.
- Learning by reading generated output. Some developers use vibe coding as a teaching mode: generate a solution, then study what the model produced to understand the pattern. This only works if you do actually read the code, which breaks the pure vibe-coding contract, but it's a useful hybrid.
- Competitive jam builds. Hackathon projects, game jam entries, short-deadline demos: all contexts where "does it run?" is the only acceptance criterion.
Where does vibe coding bite?
The failure modes are predictable once you understand the trade-off:
- Security in production. A vibe-coded authentication flow, file-upload handler, or API integration almost certainly has vulnerabilities. Not because the model is bad at security (it often knows the secure patterns) but because the feedback loop never includes "does this validate input correctly?" The model optimises for making the test case pass, not for the edge case an attacker will probe.
- Debugging at scale. When a vibe-coded project grows to several thousand lines, the model's context window starts to strain. You have no mental model of the codebase, so you can't diagnose bugs yourself. You paste errors back into the chat, the model makes a local fix, a different thing breaks, you paste again. The loop can spin for hours on a problem an experienced developer would have spotted in minutes by reading the stack trace.
- Maintenance and handoff. If you need to hand the project to another developer, or return to it six months later, the absence of any human-authored understanding is a serious liability. The code is coherent enough to run but often not coherent enough to reason about.
- The token bill. This one is quantifiable, and it's the piece most vibe coders discover too late.
Why is vibe coding expensive in tokens?
Every AI coding tool (Claude Code, Cursor, Codex, GitHub Copilot Workspace, Windsurf, Cline, Aider) bills by the token. The price varies by model and provider, but the structure is the same: input tokens (context you send) plus output tokens (code the model generates), with input being the high-volume side and output priced at roughly 5× more per token. Vibe coding drives up token usage in several compounding ways:
- Vague prompts produce exploratory agents. When you describe what you want loosely, the agent compensates by reading more of the codebase to orient itself. Each file it opens, even files it doesn't need, goes into the context. A 600-line module is roughly 6,000–8,000 tokens, paid again on every subsequent turn until the session ends or compacts. See how to reduce AI coding agent token usage for the mechanics.
- Long sessions without compaction. The conversation history grows with every exchange. The model re-reads the entire transcript each turn. A three-hour vibe-coding session can accumulate hundreds of thousands of tokens of re-read context, most of it accepted output from earlier turns that is irrelevant to the current task.
- Error-paste-loop amplification. The describe → run → report cycle, when it goes wrong, can loop many times before converging. Each iteration adds the error output, the model's diagnosis, and its proposed fix to the context. Raw error output from a failing build is often tens of thousands of tokens of noise the model largely ignores. (Check the cost data at AI coding agent token costs.)
- No pruning discipline. A developer who understands the codebase asks the agent to look at exactly the right file. A vibe coder often prompts "look at the relevant code" and the agent, lacking the developer's navigation intuition, opens too much. The mismatch between what the agent reads and what it needs is the fundamental cost driver.
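To see how re-read context compounds, here is a rough back-of-the-envelope model. It is a sketch under stated assumptions, not a billing calculator: the function names are hypothetical, the $3 per million input tokens is an illustrative figure rather than any provider's list price, the 5× output multiplier comes from the pricing structure described above, and caching and compaction are ignored.

```python
from typing import Tuple


def session_tokens(turns: int, base_context: int, growth_per_turn: int,
                   output_per_turn: int) -> Tuple[int, int]:
    """Return (input_tokens, output_tokens) billed over a session with no compaction."""
    input_tokens = 0
    context = base_context
    for _ in range(turns):
        input_tokens += context                       # whole context is re-sent this turn
        context += growth_per_turn + output_per_turn  # pasted errors and accepted output stay in
    return input_tokens, turns * output_per_turn


def session_cost(input_tokens: int, output_tokens: int,
                 input_price_per_mtok: float, output_multiplier: float = 5.0) -> float:
    """Dollar cost, with output priced at output_multiplier times the input rate."""
    output_price = input_price_per_mtok * output_multiplier
    return (input_tokens * input_price_per_mtok + output_tokens * output_price) / 1_000_000


# Assumed session: 40 turns, an 8,000-token module loaded up front,
# 1,500 tokens of pasted output and 500 tokens of generated code per turn.
inp, out = session_tokens(turns=40, base_context=8_000,
                          growth_per_turn=1_500, output_per_turn=500)
print(inp, out)                                   # 1880000 20000
print(f"{session_cost(inp, out, 3.0):.2f}")       # 5.94
```

Under these assumed numbers, input accounts for about 95% of the bill even though output is priced five times higher per token. The model wrote only 20,000 tokens of code, but it re-read nearly two million.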
Real-world reports from developers who've tracked this are striking: sessions that burned millions of tokens in a single day, individual runs hitting $30–50 in API costs before the feature was done, projects where the API bill exceeded the value of the software. One developer writing on Medium documented using millions of tokens on Claude Code in a matter of days, then called vibe coding "a trap", not because the output was bad but because the bill was invisible until it wasn't. Developer and blogger Russ White, writing in May 2026, documented session costs in detail, finding that unmanaged vibe-coding sessions routinely cost far more than equivalent work done with explicit context control (source).
How do you keep vibe coding cheap?
The cost of vibe coding is almost entirely a function of how much context the agent reads, not how much code it writes. That's the lever. Here's how to pull it:
Narrow the scope before you prompt. "Fix the broken login button" burns far fewer tokens than "look at the auth system and fix it." Precision is free and it restricts the agent's read surface immediately. The agent retrieves less, writes less diagnosis, and solves faster.
Use semantic retrieval instead of whole-file reads. Tools that let the agent search the codebase by meaning (finding the function that handles a specific behaviour) replace broad file reads with targeted symbol pulls. Instead of reading five files to find the one function, the agent reads the one function. This is the biggest single lever for navigation-heavy sessions: done well, a task that would have cost 25,000 tokens of file reads can cost closer to 2,000.
Filter command output before it hits the context. Don't let a failing npm install or a noisy test run pour its full output into the context window. The agent needs the error, not the 10,000-token transcript leading to it. Pipe noisy commands through a filter; terse flags (--porcelain, --quiet) on git and similar tools cost nothing and save constantly.
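The output-filtering lever can be sketched as a small text filter that keeps only lines that look like errors, plus a line of context on each side. This is a minimal sketch, not how any particular tool does it: `filter_output`, the keyword pattern, and the default limits are all assumptions to tune for your own toolchain.

```python
import re

# Assumed heuristic: lines containing these words are worth showing to the agent.
ERROR_PATTERN = re.compile(
    r"\b(err|error|failed|failure|exception|traceback)\b", re.IGNORECASE
)


def filter_output(raw: str, context_lines: int = 1, max_lines: int = 40) -> str:
    """Keep error-looking lines plus nearby context; fall back to the tail if none match."""
    lines = raw.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            start = max(0, i - context_lines)
            end = min(len(lines), i + context_lines + 1)
            keep.update(range(start, end))
    if not keep:
        # Nothing matched: the last lines are usually where a failure surfaces.
        return "\n".join(lines[-max_lines:])
    selected = [lines[i] for i in sorted(keep)]
    return "\n".join(selected[:max_lines])


# A noisy, made-up install log: 200 progress lines around one real error.
noisy = "\n".join(
    [f"fetch pkg-{i}" for i in range(100)]
    + ["npm ERR! code E404"]
    + [f"fetch pkg-{i}" for i in range(100, 200)]
)
print(filter_output(noisy))
# fetch pkg-99
# npm ERR! code E404
# fetch pkg-100
```

A keyword filter like this is crude and will miss errors phrased in other ways, so the tail fallback matters. The point is only that the agent sees three lines here instead of 201.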
Compact aggressively. Most AI coding tools have a context-compaction mechanism. Use it between tasks, not when you're already out of context. Compacting after a discrete task — "I've added the login feature, now let's handle the reset flow" — keeps the growing transcript from compounding.
Audit your MCP tools. If you run several MCP servers, every connected tool definition is re-sent to the model on every turn whether or not you use it. Removing tools you don't actively use, or loading them lazily only when relevant, removes an invisible recurring cost that scales with how many integrations you've added.
Keep the stable context cacheable. Instructions, project conventions, and reference material that don't change within a session should sit in a stable prefix the model's prompt cache can serve at a fraction of the cost of a fresh read. On Claude, cached input tokens cost roughly 10% of uncached ones.
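A quick calculation shows why a cacheable prefix matters. The sketch below is a simplification under stated assumptions: `prefix_cost` is a hypothetical helper, the $3 per million input tokens is illustrative, the 10% cached rate is the rough figure quoted above, and any cache-write surcharge or cache expiry is ignored.

```python
def prefix_cost(prefix_tokens: int, turns: int, price_per_mtok: float,
                cached_ratio: float = 1.0) -> float:
    """Cost of sending the same stable prefix on every turn.

    The first turn is billed in full; later turns are billed at cached_ratio
    times the normal input price (1.0 means no caching at all).
    """
    if turns <= 0:
        return 0.0
    billed_tokens = prefix_tokens + (turns - 1) * prefix_tokens * cached_ratio
    return billed_tokens * price_per_mtok / 1_000_000


# Assumed session: a 20,000-token prefix (instructions, conventions) over 50 turns.
uncached = prefix_cost(20_000, 50, 3.0)
cached = prefix_cost(20_000, 50, 3.0, cached_ratio=0.10)
print(f"{uncached:.2f} {cached:.3f}")  # 3.00 0.354
```

The saving only holds while the prefix stays byte-for-byte stable, which is why instructions that change mid-session are better kept out of it.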
If applying all of this by hand sounds like it defeats the casual appeal of vibe coding, that's the gap Tokenade closes. It applies these levers automatically — semantic search, output filtering, structure-first reads, lazy tool loading — without requiring you to change how you prompt. The overhead of token efficiency disappears, which means you can keep the loose, exploratory feel of vibe coding without the bill that normally comes with it. Free up to ~20 million tokens, then Pro at $9.90/month.
What goes wrong (anti-patterns)
- Accepting broken code and iterating by feel. The loop of "run → paste error → accept fix → run again" can converge on working code, but it can also spiral. Each iteration adds more context, each fix can introduce new breakage, and without any mental model of the codebase, you can't tell if you're making progress or drifting. Set a turn budget: if the agent hasn't resolved an error in three attempts, stop and read the error yourself.
- Treating generated code as reviewed code. Especially dangerous for anything that touches authentication, file I/O, external APIs, or user-supplied input. "It works in my test" is not security review. Vibe-coded paths to production should go through at least a manual audit of the surfaces that matter.
- Letting the session run too long. The longer a session runs without compaction, the more you're paying to re-read old context. Long sessions also correlate with degrading output quality as the context window fills with earlier, possibly contradictory, instructions. Break at natural task boundaries.
- Connecting every MCP server you've ever installed. More integrations are not better if you're not using them. Every loaded tool definition inflates every turn, for every task, whether or not that tool is relevant. Be deliberate about what's loaded.
- Optimising the output, not the input. Asking the model to "be concise" or "write less explanation" yields a real but modest saving. Output is expensive per token but low in volume; input is cheaper per token but the volume is the whole problem. Cutting what you send in is the game; cutting what the model writes is a rounding error.
Frequently asked questions
What is vibe coding?
Vibe coding is a software development approach where you describe what you want to an AI in natural language and accept what it produces, without reading or maintaining the generated code. The term was coined by Andrej Karpathy on 2 February 2025 and quickly became the dominant shorthand for low-oversight, intent-driven AI development.
Is vibe coding good or bad?
It depends on the use case. Vibe coding is genuinely effective for rapid prototyping, throwaway scripts, and contexts where shipping speed matters more than code quality. It's risky for anything that reaches production users, handles sensitive data, or needs to be maintained over time. The honest answer is that it's a tool with a specific risk profile: powerful inside that profile, dangerous outside it.
Why is vibe coding expensive?
Vibe coding is expensive because the AI agents it relies on bill by the token, and the low-oversight, exploratory style of vibe coding drives agents to read more context than they need. Vague goals lead to wide file reads; long sessions accumulate re-read history; error-paste loops compound the cost with each iteration. The bill is invisible until it isn't: developers have reported spending tens to hundreds of dollars in a single productive-feeling session without realising it. The fix is to control what the agent reads, not to change how you prompt.
Does vibe coding require learning to code?
No, and that's both its appeal and its limitation. You can build working software with natural-language intent alone; Karpathy's original framing was that LLMs had become good enough that writing code was no longer the bottleneck. But the absence of coding knowledge also removes the ability to diagnose failures, evaluate security, or maintain the result. Vibe coding without any technical background works until it doesn't, and when it doesn't, the path forward is opaque.
Which tools are used for vibe coding?
Any AI coding assistant can support the vibe-coding approach. Popular choices include Claude Code, Cursor, GitHub Copilot Workspace, Windsurf, Codex CLI, and Cline. The tools differ in how much autonomy they offer, how they handle context, and how they bill, but the describe → generate → run → report loop works across all of them. See best AI coding tools for a comparison.
See also:
- Agentic coding — how autonomous agents extend what vibe coding can do, and what it costs.
- How to reduce AI coding agent token usage — the mechanics behind every lever mentioned above.
- Context engineering for AI coding agents — the discipline that makes agents work with less.
- AI coding agent token costs — the prices behind the token math.
- What is a token? — the unit everything is billed in.