AI tokens just got way more expensive. The token crunch is here!
It is more important than ever to be smart about your token use. There’s a natural rhythm to working with AI: you give a task, it moves forward, and you follow up and refine. Long conversations feel natural, almost like pair programming, but you’ll have to fight that tendency, because the approach has a cost: every turn in the conversation requires resending all of your previous turns. (By “turn” I mean one request-and-response exchange.)
AI conversations are stateless: the model keeps no memory between requests, so each new message travels together with the full history that came before it. The AI never remembers its place; every request to the LLM starts over from the beginning.
Visualization: Stateless conversations quickly use up a lot of tokens.
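To put rough numbers on this, here is a minimal sketch of the arithmetic. It assumes a made-up conversation of five turns, each with a 500-token prompt and a 1,500-token response, and it ignores system prompts, tool output, and any prompt caching your provider may offer. Treat the figures as illustrative, not as a billing estimate.

```python
def input_tokens_per_request(turns):
    """Return the input tokens sent on each request of a stateless conversation.

    `turns` is a list of (prompt_tokens, response_tokens) pairs. The pair
    format and the numbers below are assumptions for illustration only.
    """
    sent = []
    history = 0  # tokens from all earlier prompts and responses
    for prompt_tokens, response_tokens in turns:
        # Each request carries the full history plus the new prompt.
        sent.append(history + prompt_tokens)
        history += prompt_tokens + response_tokens
    return sent


turns = [(500, 1500)] * 5
per_request = input_tokens_per_request(turns)
multi_turn_total = sum(per_request)
one_shot_total = sum(prompt for prompt, _ in turns)

print(per_request)       # [500, 2500, 4500, 6500, 8500]
print(multi_turn_total)  # 22500
print(one_shot_total)    # 2500
```

In this toy case the five-turn conversation sends nine times the input tokens of five independent prompts of the same size, and the gap widens with every additional turn.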
One Shot Development
Here I’ll cover some practical techniques for keeping conversations short and completing tasks in a single turn. No back and forth, only tasks and then tasks that build on previous tasks. You’ll have something to commit after every turn.
Benefit 1: Token savings, lower cost.
Benefit 2: Higher accuracy. You can expect higher accuracy by focusing your energy on single-turn conversations. Why? When you guide a model in small steps, you’re asking it to hold a growing conversation in its context window, reconcile potentially contradictory instructions, and maintain intent across turns.
Benefit 3: Build your skill set. Multi-turn development trains you to be vague, because you know you can always follow up. One-shotting trains you to be precise and comprehensive.
Techniques
- Maintain a tree of README files
- Use consistent vocabulary
- Create and use skills and commands
- Manage memories
- Use strong verification methods
- Be detailed
- Use a planning step
- Continue with a planning step
Maintain a tree of README files
Keep the project context the model needs in structured, discoverable documentation. Claude Code and other agents tend to scour your codebase in every conversation, but you can minimize that by providing good documentation.
I’ll call it a tree of README files. Here is an example structure:
project/
├── docs/
│   ├── rules/                  ← rules for various types of development
│   │   ├── front-end-rules.md  ← about your front-end framework and conventions
│   │   ├── back-end-rules.md   ← about your server-side endpoints and conventions
│   │   ├── e2e-rules.md        ← how to write end-to-end tests
│   │   ├── db-rules.md         ← how to write queries or interact with ORM
│   │   ├── schema-rules.md     ← how to write database schema
│   │   └── scaffolding.md      ← how and where to scaffold new files
│   ├── plans/                  ← have AI write plans here
│   │   └── implement-a.md
│   └── modules/                ← details about specific application modules
│       └── module-a.md
├── AGENTS.md                   ← project overview, conventions, rules and links to the docs above
└── CLAUDE.md                   ← symlink to AGENTS.md
Principles:
- Write with both agents and humans in mind.
- Refer to README files from AGENTS.md and other files as applicable. For example, your AGENTS.md file should talk about your back-end frameworks and point agents to look at docs/rules/back-end-rules.md for more information. Then in back-end-rules.md, discuss database interaction and point agents to look at docs/rules/db-rules.md for more information.
- In your prompts, mention relevant implementation plans and modules that the agent should consider before moving forward.
- Use consistent vocabulary throughout. Add a domain knowledge glossary to AGENTS.md.
- Ask AI to analyze your codebase and compose first drafts for these markdown files, then tweak and finalize. Example prompt:
Analyze the front-end patterns for this project and produce
instructions suitable for AI agents and humans to follow
established conventions. Record these instructions in
docs/rules/front-end-rules.md and make a note in AGENTS.md
referring to this new document.
- Keep files short: no more than 4 KB to 8 KB each. When files get too long, ask AI to make them more concise, split them into smaller units, or both.
- Ask AI to update these files when things change. Stale documentation can be worse than no documentation. Make it a habit to review the module README every time you add features. Schedule a regular housekeeping task to have AI review the files for inconsistencies and contradictions.
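The size guideline is easy to check mechanically as part of that housekeeping. Below is a minimal sketch, assuming your documentation is Markdown and taking 8 KB as the cutoff. The function name and the constant are made up for illustration.

```python
from pathlib import Path

MAX_BYTES = 8 * 1024  # upper end of the 4 KB to 8 KB guideline


def oversized_docs(root, max_bytes=MAX_BYTES):
    """Return (path, size_in_bytes) for Markdown files larger than max_bytes."""
    results = []
    # rglob walks the directory tree under `root` recursively.
    for path in sorted(Path(root).rglob("*.md")):
        size = path.stat().st_size
        if size > max_bytes:
            results.append((path, size))
    return results
```

Calling oversized_docs("docs") from the project root returns an empty list when every file fits. Anything it reports is a candidate for the "make it more concise or split it" request described above.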
Use consistent vocabulary
Agents are more focused and less prone to error if you use consistent vocabulary throughout your entire project. As mentioned above, consider adding a glossary to AGENTS.md that defines knowledge specific to your project. No need to outline general programming concepts, but do define business jargon, custom frameworks and uncommon patterns.
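One way to keep vocabulary from drifting is to scan your docs for synonyms the glossary is meant to replace. The sketch below is only an illustration: the glossary entries are invented, and the matching is a simple case-insensitive whole-word search that also accepts a trailing "s".

```python
import re

# Hypothetical glossary: preferred term -> synonyms to avoid.
GLOSSARY = {
    "customer": ["client", "shopper"],
    "order": ["purchase"],
}


def find_discouraged_terms(text, glossary=GLOSSARY):
    """Return {preferred_term: [discouraged synonyms found in text]}."""
    hits = {}
    for preferred, synonyms in glossary.items():
        found = [
            word
            for word in synonyms
            # Whole-word match, optionally plural, ignoring case.
            if re.search(rf"\b{re.escape(word)}s?\b", text, re.IGNORECASE)
        ]
        if found:
            hits[preferred] = found
    return hits


sample = "Each client can view their past purchases."
print(find_discouraged_terms(sample))
# {'customer': ['client'], 'order': ['purchase']}
```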
Create and use skills and commands
Claude Code and other agents support skills: reusable prompt fragments that encode how to do something specific in your project. They’re not code; they’re instructions, written in plain English, that live as files in your repo. Learn how to find community-written skills and how to write your own.
Agents see each skill’s description in every conversation and decide when to use it. In that way, skills are a bit like tools.
Commands are similar, but the agent doesn’t know about them on startup.
Manage memories
Memories are persistent facts—things that are true about your environment, your preferences, or your project’s state that the model should always know without being told each time.
In Claude Code, for example, memories live in .claude/memory.md.
Every time a memory is created, consider adding the details to your tree of README files instead. As memories accumulate, they can start to take up a lot of space.
Use strong verification methods
At a minimum, have the agent format, lint, and run unit tests.
- Instruct your AI to first format your files with a tool such as Prettier or Biome.
- Then lint with the TypeScript compiler (tsc) and optionally something like ESLint or Biome.
- Ask the agent to verify that unit tests still pass after writing any code.
- Provide instructions for running any other scripts or tools that verify code.
- This is also where you might add a code-review step by a separate agent.
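The format, lint, and test steps can be wrapped in one script so the agent has a single command to run. The sketch below assumes a TypeScript project with Prettier installed, tsc used as the lint step, and an npm "test" script. The function names and the step list are illustrative, so swap in whatever your project actually runs.

```python
import subprocess

# Assumed commands; replace them with your project's own tooling.
CHECKS = [
    ("format", ["npx", "prettier", "--write", "."]),
    ("lint", ["npx", "tsc", "--noEmit"]),
    ("test", ["npm", "test"]),
]


def run_command(command):
    """Run one command and return its exit code."""
    return subprocess.run(command).returncode


def run_checks(checks=CHECKS, runner=run_command):
    """Run checks in order; return the name of the first failing step, or None."""
    for name, command in checks:
        if runner(command) != 0:
            return name  # stop early so the agent fixes one problem at a time
    return None

# Example: failed = run_checks()
# A return value of None means every step passed.
```

Stopping at the first failure keeps the output short, which matters because whatever the script prints ends up in the agent's context.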
The stronger your verification methods, the fewer loops your agent will take. An Agentic Loop is the process where an agent will write code, verify, and make updates. Keep the agent’s rework to a minimum because it uses tokens the same way multi-turn conversations do. Remember that the Agentic Loop is stateless just like regular conversations.
Be detailed
Vague language produces vague output. Take time to write good requirements and you will save time overall.
Say what you mean, completely. “Add authentication to the checkout flow” is an intent. Better is “Add JWT verification middleware to the POST /checkout endpoint, following the patterns in docs/modules/authentication.md, rejecting unauthenticated requests with a 401 and the standard error envelope.”
State what you don’t want. For example, “Do not use approach X” saves a round of cleanup. If there are wrong answers, rule them out in the prompt.
State what is out of scope. For example, “Do not worry about front-end messaging about the new authentication requirement; we’ll tackle that in a future step.”
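If you find yourself forgetting one of these parts, a tiny template can help. This is a minimal sketch, not a prescribed format: the function name, the section headings, and the "Approach X" placeholder are assumptions made for illustration.

```python
def build_prompt(task, references=(), avoid=(), out_of_scope=()):
    """Assemble a one-shot prompt from a task and optional constraint lists."""
    sections = [task.strip()]
    for heading, items in (
        ("Read first", references),
        ("Avoid", avoid),
        ("Out of scope", out_of_scope),
    ):
        if items:  # skip headings that have nothing under them
            lines = "\n".join(f"- {item}" for item in items)
            sections.append(f"{heading}:\n{lines}")
    return "\n\n".join(sections)


prompt = build_prompt(
    task=(
        "Add JWT verification middleware to the POST /checkout endpoint, "
        "rejecting unauthenticated requests with a 401 and the standard "
        "error envelope."
    ),
    references=["docs/modules/authentication.md"],
    avoid=["Approach X"],  # placeholder, as in the text above
    out_of_scope=["Front-end messaging about the new authentication requirement"],
)
print(prompt)
```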
Use a planning step
For larger initiatives, use one turn to create a plan and store it permanently in
docs/plans. Ask the agent to break up the initiative into small steps
that can each be completed in one shot.
Continue with a planning step
Sometimes a one-shot task becomes unexpectedly complex, or the agent runs into a
problem or ambiguity. Take a second turn to put the discovered situation into a
new prompt (or a new docs/plans file for a larger initiative).
Conclusion
Shift your mindset from “I’ll iterate toward what I want” to “I’ll plan ahead and specify exactly what I want.” It takes practice, but these techniques will make a big difference.
Invest this time up front and you’ll save time overall. You’ll use fewer tokens and get better results at the same time.
One shot. That’s all you need.