A comprehensive guide to working smarter with AI
Press → or use arrow keys to continue
📚 A collaborator who has processed millions of books and repositories
🎯 Trained to predict the next token based on patterns
🔍 Recognizes patterns from vast training data
⚡ Responds instantly but doesn't truly "understand" in a human sense
🎭 Generates plausible output through statistical probability
| Limitation | Context Engineering Solution |
|---|---|
| ❌ Probabilistic reasoning | ✅ Extended Thinking — enable reasoning tokens for complex problems |
| ❌ No real-time learning | ✅ Maintain CLAUDE.md — update project context after each session |
| ❌ No lasting memory | ✅ Skills & Markdown docs — document decisions and lessons in persistent files |
| ❌ Can't verify execution | ✅ Bash tool — execute and verify code immediately |
| ❌ No browsing (standalone) | ✅ MCP Servers / WebFetch — integrate real-time external data |
| ❌ Context window limits | ✅ RAG + Sub-Agents — load only relevant files; isolate large tasks |
| ❌ No persistent state | ✅ Session summaries — end with "save to PROGRESS.md", start new with context |
| ❌ Expensive repeated context | ✅ Prompt Caching — cache stable context at ~10% of normal token cost |
| Model | Context Window | Best For |
|---|---|---|
| Claude Opus 4.7 | 200K tokens | Complex reasoning, architecture decisions |
| Claude Sonnet 4.6 | 200K tokens | Balanced everyday coding tasks |
| Claude Haiku 4.5 | 200K tokens | Fast, lightweight, high-volume automation |
| GPT-4o | 128K tokens | General purpose, multimodal |
| Gemini 1.5 Pro | 1M+ tokens | Extremely large documents |
README.md — Architecture overview
CONVENTIONS.md — Coding standards
DECISIONS.md — Why we chose X
TROUBLESHOOTING.md — Common fixes
Placed at project root — Claude Code reads it automatically every session
Also supports ~/.claude/CLAUDE.md for global preferences
.claude/skills/.claude/skills/[skill-name]/SKILL.md
| Tool | Purpose | Example Use |
|---|---|---|
| Read | Read file contents | Read source file before editing |
| Edit | Precise string replacement | Fix a specific bug in a file |
| Write | Create or overwrite files | Create a new component |
| Bash | Run shell commands | Run tests, git operations |
| Grep | Search file contents with regex | Find all usages of a function |
| Glob | Find files by pattern | List all *.test.ts files |
| WebFetch / WebSearch | Fetch URLs or search the web | Read live API documentation |
Query schemas, run SQL, analyze query performance
Create issues/PRs, search repos, read commit history
Send messages, read channels, create notifications
Read/write documents, search files
Company APIs, internal tools, legacy systems
On-demand access — no need to paste entire schemas into context
"cache_control": {"type": "ephemeral"} in API calls to mark cacheable blocks
Shell commands that run automatically in response to Claude Code events — configured in .claude/settings.json
Dedicated reasoning tokens Claude uses before producing its final answer — enabling deeper, more reliable multi-step analysis
In conversation:
Via API:
Claude spawns independent agents with their own isolated context windows to handle parallel or isolated subtasks
When: New features, refactoring, architecture
What: Analysis → Plan → Approval → Code
Speed: Slower, deliberate (System 2)
When: Bug fixes, small refactors, docs updates
What: Immediate precise file edits
Speed: Fast, automatic (System 1)
| Project Size | Use RAG? | Reason |
|---|---|---|
| Small (<20 files) | No | Everything fits comfortably in context |
| Medium (20–100 files) | Yes | Selective loading saves significant context |
| Large (100+ files) | Essential | Only efficient path forward |
| Documentation Search | Yes | Find answers without reading everything |
| Legacy Codebase | Essential | Navigate unfamiliar code quickly |
✓ Keep CLAUDE.md updated at project root
✓ Use Plan Mode for complex & architectural work
✓ Provide clear, scoped context in requests
✓ Use Hooks for automated repeatable actions
✓ Verify all AI-generated code before shipping
✓ Break large tasks into focused conversations
✓ Use Sub-Agents for parallel analysis
✓ Cache large, stable context to reduce cost
✗ Dump entire codebases — use RAG instead
✗ Skip planning for big architectural changes
✗ Mix multiple unrelated concerns in one chat
✗ Trust AI output blindly — always test
✗ Rely on conversation history as documentation
✗ Use Edit Mode for multi-file architecture changes
✗ Ask Claude to "remember" things — use hooks or CLAUDE.md
✗ Let AI make final architectural decisions
❌ Loads ALL user files (20 files, 30K tokens)
❌ Loads ALL API endpoints (15 files, 20K tokens)
❌ Loads ALL tests (25 files, 25K tokens)
❌ Loads database models (10 files, 10K tokens)
❌ Loads documentation (5 files, 15K tokens)
Total: 75 files, 100K tokens consumed upfront
Only 100K left for actual conversation
Result: Messy, scattered, incomplete responses
Create implementation_plan.md in Plan Mode
Implement Phase 1 from plan
JWT, password hashing, endpoints
Verify everything works end-to-end
✅ Each conversation uses <50% context
✅ Clear documentation trail — easy to review and debug
✅ Fully working auth system delivered
"The best code is not written — it's orchestrated."
You're the conductor Claude is your orchestra