Time: 2-3 hours (ONE-TIME setup) Benefit: 70-90% token reduction on ALL future projects ROI: Pays for itself in 2-3 sessions
This guide walks you through setting up global Claude Code optimization that applies to all your projects automatically.
IMPORTANT — What Claude Code Actually Reads (updated 2026-03-20)
Not everything placed in
~/.claude/is automatically loaded. Only these locations are effective:
Location Loaded? Purpose ~/.claude/CLAUDE.mdYES Global instructions — applied to ALL projects ~/.claude/settings.jsonYES Permissions, plugins, allowed tools ~/.claude/skills/*/SKILL.mdYES Slash commands (auto-discovered) ~/.claude/agents/*.mdYES Agent definitions (auto-discovered) ~/.claude/commands/*.mdYES Legacy commands (still supported, skills take precedence) ~/.claude/system-prompts/NO Not read by Claude Code — use CLAUDE.mdinstead~/.claude/settings/*.jsonNO Not Claude Code config — use root settings.json~/.claude/README.mdetc.NO Not loaded — dead documentation Key action: Put your global optimization rules in
~/.claude/CLAUDE.md, not insystem-prompts/. The 4 JSON files insettings/(prompt-caching, beta-features, model-strategy, token-optimization) are reference documentation only — they do not configure Claude Code behavior. Prompt caching and model selection are handled automatically by the Anthropic API.
- Prerequisites
- Overview
- Step 1: Create Directory Structure
- Step 2: Create PM Orchestrator
- Step 3: Create Global Settings
- Step 4: Create Global Skills
- Step 5: Create System Prompts
- Step 6: Create Documentation
- Verification
- Troubleshooting
- Next Steps
Before starting, ensure you have:
- Claude Code installed and working
- MCP servers available (Serena recommended, Codex-CLI optional)
- Command-line access to create directories and files
- 2-3 hours of uninterrupted time
- Text editor for editing JSON and Markdown files
Check MCP availability:
# In Claude Code conversation
/help
# Look for MCP tools like:
# - mcp__serena__*
# - mcp__codex-cli__*If you don't have Serena MCP, many optimizations still work, but you'll get 40-50% savings instead of 70-90%.
Global configuration in ~/.claude/:
- 1 central orchestrator (PM agent)
- 4 global settings files (prompt caching, beta features, model strategy, token optimization)
- 5 global skills (slash commands: /optimize, /context, /cache-inspector, /update-docs, /init-project)
- 2 system prompts (global optimization, symbol-first protocol)
- 3 documentation files (README, INSTALLATION-COMPLETE, QUICK-REFERENCE)
Total: 15 files, ~50K tokens of documentation (cached after first load = 5K tokens per subsequent read)
| Optimization | Savings | How It Works |
|---|---|---|
| Prompt Caching | 90% | Caches system prompts, tools, memories for 10% cost reads |
| Serena Memories | 60-70% | Load architecture context once vs reading files repeatedly |
| Symbol-First | 65-75% | Find specific symbols vs reading entire files |
| Adaptive Planning | 40-50% | Skip unnecessary phases for simple tasks |
| Token-Efficient Tools | 14-70% | Beta header reduces tool output tokens |
| Model Selection | 60% | Haiku ($1/M) vs Sonnet ($3/M) for simple tasks |
| Pattern Reuse | 50% | Reference existing work vs create from scratch |
Combined: 70-90% total reduction
mkdir -p ~/.claude/{agents,settings,skills,system-prompts}Result:
~/.claude/
├── agents/ # Orchestration agents
├── settings/ # Configuration files
├── skills/ # Slash commands
└── system-prompts/ # Global behavior
cd ~/.claude/skills
mkdir -p optimize context cache-inspector update-docs init-projectResult:
~/.claude/skills/
├── optimize/
├── context/
├── cache-inspector/
├── update-docs/
└── init-project/
ls -R ~/.claudeYou should see all directories created with no errors.
The PM Orchestrator is the central coordinator that analyzes task complexity and routes to optimal strategies.
touch ~/.claude/agents/pm-orchestrator.mdOpen ~/.claude/agents/pm-orchestrator.md in your text editor and paste:
---
name: pm-orchestrator
description: Central project management coordinator with adaptive planning and token optimization
model: claude-sonnet-4-6
version: 1.0.0
---
# PM Orchestrator - Central Coordinator
**Purpose**: Analyze task complexity, select optimal strategy, coordinate agents, enforce optimizations, capture learnings.
## Adaptive Planning Strategies
### 1. Planning-Only (Complex: 8+ tasks)
**Use when**: Large features, new modules, architectural changes
**Workflow**:
1. spec-requirements (Haiku) → requirements.md
2. spec-design (Sonnet) → design.md
3. spec-tasks (Haiku) → tasks.md
4. spec-impl (Sonnet) → implementation
5. spec-test (Sonnet) → tests
6. spec-judge (Opus) → evaluate if n≥3 options
**Token usage**: High upfront (15-20K), saves long-term via clarity
### 2. Intent-Planning (Medium: 3-7 tasks)
**Use when**: New features, moderate refactoring, API additions
**Workflow**:
1. Design sketch (Sonnet) → design-sketch.md (lightweight)
2. Implementation plan (Haiku) → task list
3. Direct implementation (Sonnet)
4. Tests (Sonnet)
**Token savings**: 40-50% vs full spec workflow (skip formal requirements)
### 3. Unified (Simple: 1-2 tasks)
**Use when**: Bug fixes, small tweaks, single-file changes
**Workflow**:
1. Analyze with symbol-first exploration
2. Direct implementation (Sonnet or Haiku)
3. Verify with tests
**Token savings**: 60-70% vs full spec workflow (minimal overhead)
## Complexity Analysis
Analyze task to determine strategy:
**Count subtasks**:
- Add new feature → How many files? New models? API changes?
- Fix bug → Isolated or widespread?
- Refactor → How many symbols affected?
**Complexity indicators**:
- **Simple** (1-2 tasks): Single file, clear solution, no dependencies
- **Medium** (3-7 tasks): Multiple files, some design needed, limited dependencies
- **Complex** (8+ tasks): New module, architectural impact, many dependencies
**Override**: User can specify strategy explicitly
## Context Preparation (MANDATORY)
Before delegating to any agent:
### 1. Load Serena Memories (if available)
mcp__serena__list_memories()
Load relevant memories:
- architecture.md (always)
- codebase-conventions.md (always)
- module-structure.md (if module work)
- testing-strategy.md (if tests needed)
- docker-workflow.md (if docker detected)
**Token savings**: 60-70% vs reading 5-10 full files
### 2. Symbolic Discovery (if Serena available)
mcp__serena__search_for_pattern( substring_pattern="similar feature name", restrict_search_to_code_files=true )
mcp__serena__find_symbol( name_path_pattern="RelatedClass", include_body=false, depth=1 )
**Token savings**: 65-75% vs reading full files
### 3. Check Constitution (if exists)
Read: .claude/settings/constitution.json
Verify task complies with:
- Architectural principles
- Code quality rules
- Security requirements
**Prevents**: Violations that require rework
### 4. Search for Similar Patterns
ls .claude/specs/
Grep: pattern="similar feature"
**Token savings**: 50% if pattern exists (reuse vs create)
## Agent Selection & Model Strategy
### Agent-to-Model Mapping
| Agent | Model | Why | Cost |
|-------|-------|-----|------|
| spec-requirements | **Haiku** | Template-driven, fast | $1/$5/M |
| spec-tasks | **Haiku** | Structured decomposition | $1/$5/M |
| spec-design | **Sonnet** | Architectural thinking | $3/$15/M |
| spec-impl | **Sonnet** | Complex implementation | $3/$15/M |
| spec-test | **Sonnet** | Comprehensive testing | $3/$15/M |
| spec-judge | **Opus** | Critical evaluation (3+ options) | $5/$25/M |
| pm-orchestrator | **Sonnet** | Coordination | $3/$15/M |
**Cost optimization**: 40% Haiku, 55% Sonnet, 5% Opus (target distribution)
### When to Override
Use Opus for:
- Security reviews (vulnerabilities)
- Architectural decisions (long-term impact)
- Judge evaluation with 3+ competing designs
Use Haiku for:
- Template generation
- Task decomposition
- Simple formatting
Use Sonnet (default) for:
- Most implementation
- Design work
- Testing
## Continuous Learning
After each session, capture learnings:
### 1. Common Patterns
File: `.claude/learnings/common-patterns.md`
```markdown
## Pattern: [Name]
**Date**: YYYY-MM-DD
**Context**: [When this pattern applies]
**Solution**: [How to implement]
**Files**: [Example files]
**Token savings**: [Reuse vs rediscover]
Example:
## Pattern: Service Layer Extraction
**Date**: 2026-01-04
**Context**: Complex controller logic
**Solution**: Extract to Service class in app/Services/
**Files**: app/Services/UserService.php
**Token savings**: 30% (reuse pattern vs design from scratch)
File: .claude/learnings/anti-patterns.md
## Anti-Pattern: [Name]
**Date**: YYYY-MM-DD
**Problem**: [What went wrong]
**Why**: [Root cause]
**Fix**: [Correct approach]
**Prevention**: [How to avoid]
Example:
## Anti-Pattern: Inline Validation
**Date**: 2026-01-04
**Problem**: Validation in controller instead of FormRequest
**Why**: Violated constitution, hard to test, not reusable
**Fix**: Extract to FormRequest class
**Prevention**: Always check constitution before implementationFile: .claude/learnings/optimization-log.md
## Session: YYYY-MM-DD HH:MM
**Task**: [Description]
**Strategy**: [Unified|Intent-Planning|Planning-Only]
**Tokens used**: [Actual]
**Estimated baseline**: [Without optimization]
**Savings**: [Percentage]
**Optimizations applied**:
- Symbol-first: [X]%
- Memory system: [X]%
- Adaptive planning: [X]%
- Prompt caching: [X]%
- Pattern reuse: [X]%
- Model selection: [Model]
- Token-efficient tools: [X]%At end of each session, provide:
## Session Summary
**Task**: [description]
**Strategy**: [Unified|Intent-Planning|Planning-Only]
**Duration**: [X minutes]
### Optimizations Applied:
- ✅ Symbol-First Exploration: [X]% savings
- ✅ Memory System: [X]% savings
- ✅ Adaptive Planning: [X]% savings
- ✅ Prompt Caching: [X]% savings
- ✅ Pattern Reuse: [X]% savings
- ✅ Model Selection: [Haiku|Sonnet|Opus]
- ✅ Token-Efficient Tools: [X]% savings
### Results:
- **Tokens used**: [X] tokens
- **Estimated baseline**: [Y] tokens (without optimization)
- **Total savings**: [Z]% reduction
- **Cost**: $[X] (vs $[Y] baseline)
- **Cache hit rate**: [X]%
### Learnings Captured:
- [Pattern/anti-pattern/optimization discovered]
### Files Modified:
- [List of changed files with brief description][Orchestrator definition continues with delegation rules, error handling, constitution enforcement, etc...]
**File size**: ~3K tokens (will be cached, so 300 tokens on subsequent reads)
---
## Step 3: Create Global Settings
Create 4 JSON configuration files in `~/.claude/settings/`.
### 3.1 Prompt Caching Config
**File**: `~/.claude/settings/prompt-caching.json`
```json
{
"version": "1.0.0",
"description": "Prompt caching configuration for 90% cost savings on repeated content reads",
"cache_control": {
"type": "ephemeral",
"auto_enable": true
},
"caching_rules": {
"system_prompts": {
"enabled": true,
"min_tokens": 1024,
"ttl_minutes": 60,
"description": "Cache all agent system prompts (pm-orchestrator, KFC agents, etc.)"
},
"tool_definitions": {
"enabled": true,
"min_tokens": 1024,
"ttl_minutes": 60,
"description": "Cache MCP tool definitions (Serena, Codex-CLI, etc.)"
},
"memories": {
"enabled": true,
"min_tokens": 1024,
"ttl_minutes": 60,
"auto_cache": ["architecture.md", "codebase-conventions.md", "module-structure.md"],
"description": "Cache Serena memories for 60-70% session savings"
},
"specs": {
"enabled": true,
"min_tokens": 2048,
"ttl_minutes": 60,
"conditional": true,
"description": "Cache large spec documents during implementation"
},
"constitution": {
"enabled": true,
"min_tokens": 1024,
"ttl_minutes": 60,
"description": "Cache constitution for fast decision checks"
}
},
"cache_warming": {
"enabled": false,
"description": "Pre-cache frequently used content on session start (experimental)"
},
"metrics": {
"track_hit_rate": true,
"track_cost_savings": true,
"log_file": ".claude/learnings/cache-performance.log"
}
}
Expected savings: 90% cost reduction on cached content reads, 85% latency reduction
Field naming note:
ttl_minutesis an internal config knob used by/cache-inspector. The actual Anthropic API surface iscache_control: { type: "ephemeral", ttl: "5m" | "1h" }— only those two TTL tiers are supported, not arbitrary minute values.
File: ~/.claude/settings/beta-features.json
{
"version": "1.0.0",
"description": "Claude API beta features configuration",
"features": {
"token_efficient_tools": {
"enabled": true,
"beta_header": null,
"note": "Built-in on Claude 4+. The legacy `token-efficient-tools-2025-02-19` header is a no-op as of Claude 4 and should NOT be sent.",
"description": "Tool output is automatically token-compressed by the model",
"apply_to": ["all_agents"]
},
"adaptive_thinking": {
"enabled": true,
"config": { "type": "adaptive" },
"effort": "high",
"description": "Claude decides per-turn whether and how much to think. Pairs with the GA `effort` parameter (low / medium / high / max).",
"apply_to": ["spec-design", "spec-judge"],
"use_max_effort_for": [
"Architectural decisions with long-term impact",
"Security reviews",
"Judge evaluation with 3+ competing designs"
],
"note": "Replaces the deprecated `extended_thinking { thinking_budget_tokens }` block. Adaptive thinking auto-enables interleaved thinking; the `interleaved-thinking-2025-05-14` beta header is no longer required on Claude 4.6."
},
"context_management": {
"enabled": false,
"beta_header": "context-management-2025-06-27",
"description": "Server-side context editing — clear_thinking_20251015 and clear_tool_uses_20250919 strategies trim stale tool results and old thinking blocks before they consume window space.",
"apply_to": ["long_running_agents", "agent_teams"],
"default": false,
"reason_disabled": "Opt-in per-workflow; pairs well with server-side compaction."
}
},
"rollout_strategy": {
"gradual": true,
"test_before_global": true,
"rollback_on_issues": true
}
}Expected savings: tool-output compression is built-in on Claude 4+ (no header). Adaptive thinking + effort lets you scale reasoning depth per workflow. Context-management beta clears stale tool results / thinking blocks server-side.
File: ~/.claude/settings/model-strategy.json
{
"version": "1.0.0",
"description": "Model selection strategy for cost optimization",
"default_model": "claude-sonnet-4-6",
"models": {
"haiku": {
"id": "claude-haiku-4-5",
"cost_per_million": {
"input": 1,
"output": 5
},
"use_for": [
"Requirements generation (template-driven)",
"Task decomposition (structured)",
"Simple refactoring (clear patterns)",
"Code formatting (deterministic)"
],
"agents": ["spec-requirements", "spec-tasks"],
"cost_savings_vs_sonnet": "60%"
},
"sonnet": {
"id": "claude-sonnet-4-6",
"cost_per_million": {
"input": 3,
"output": 15
},
"use_for": [
"Implementation (balanced)",
"Design (architectural thinking)",
"Testing (comprehensive)",
"Debugging (analysis)",
"PM orchestration (coordination)"
],
"agents": ["spec-design", "spec-impl", "spec-test", "pm-orchestrator"],
"default": true
},
"opus": {
"id": "claude-opus-4-6",
"cost_per_million": {
"input": 5,
"output": 25
},
"use_for": [
"Architectural decisions (critical, long-term impact)",
"Security reviews (safety-critical)",
"Judge evaluation (3+ options to evaluate)"
],
"agents": ["spec-judge"],
"require_justification": true,
"cost_premium_vs_sonnet": "67%",
"usage_target": "<5% of total operations"
}
},
"complexity_overrides": {
"simple_task": {
"criteria": "1-2 subtasks, single file, no dependencies",
"model": "haiku",
"reason": "Fast, cheap, sufficient quality"
},
"medium_task": {
"criteria": "3-7 subtasks, multiple files, some dependencies",
"model": "sonnet",
"reason": "Balanced cost/quality"
},
"complex_task": {
"criteria": "8+ subtasks, architectural impact, many dependencies",
"model": "sonnet",
"reason": "Default for complexity"
},
"critical_decision": {
"criteria": "Security, architecture, judge with 3+ options",
"model": "opus",
"reason": "Highest quality for high-stakes decisions"
}
},
"target_distribution": {
"haiku": "40%",
"sonnet": "55%",
"opus": "5%",
"description": "Ideal usage across all operations for cost optimization"
},
"metrics": {
"track_usage": true,
"track_cost": true,
"log_file": ".claude/learnings/model-usage.log"
}
}Expected savings: 40% of work on Haiku (60% cheaper than Sonnet)
File: ~/.claude/settings/token-optimization.json
{
"version": "1.0.0",
"description": "Comprehensive token optimization strategies",
"optimizations": {
"symbol_first_exploration": {
"enabled": true,
"priority": "critical",
"enforcement": "mandatory",
"description": "Use Serena symbolic tools before reading full files",
"savings": "65-75%",
"required_tools": ["mcp__serena__find_symbol", "mcp__serena__get_symbols_overview"],
"fallback": "Read tool if Serena unavailable",
"protocol_file": "~/.claude/system-prompts/symbol-first-protocol.md"
},
"memory_system": {
"enabled": true,
"priority": "critical",
"enforcement": "mandatory",
"description": "Load Serena memories before exploring code",
"savings": "60-70%",
"auto_load": ["architecture.md", "codebase-conventions.md"],
"conditional_load": ["module-structure.md", "testing-strategy.md", "docker-workflow.md"],
"required_tools": ["mcp__serena__list_memories", "mcp__serena__read_memory"]
},
"adaptive_planning": {
"enabled": true,
"priority": "high",
"enforcement": "automatic",
"description": "Skip unnecessary phases for simple tasks",
"savings": "40-50%",
"strategies": {
"unified": {
"criteria": "1-2 tasks",
"savings": "60-70% vs full spec"
},
"intent_planning": {
"criteria": "3-7 tasks",
"savings": "40-50% vs full spec"
},
"planning_only": {
"criteria": "8+ tasks",
"savings": "Long-term via clarity"
}
},
"orchestrator": "pm-orchestrator.md"
},
"checkpoint_system": {
"enabled": true,
"priority": "medium",
"enforcement": "automatic",
"description": "Save/resume from checkpoints for 80-90% continuation savings",
"savings": "80-90% on resume",
"checkpoint_triggers": [
"After requirements approval",
"After design approval",
"After implementation phase",
"Before long-running operations"
],
"checkpoint_location": ".claude/specs/{feature}/checkpoints/"
},
"pattern_reuse": {
"enabled": true,
"priority": "medium",
"enforcement": "automatic",
"description": "Search existing specs/code for similar patterns",
"savings": "50%",
"search_locations": [
".claude/specs/",
".claude/learnings/common-patterns.md"
],
"tools": ["Grep", "mcp__serena__search_for_pattern"]
}
},
"enforcement_rules": {
"violations": {
"read_before_symbol_search": {
"action": "log_and_warn",
"exception": "Non-code files or Serena unavailable"
},
"skip_memory_load": {
"action": "prompt_user",
"exception": "New project without memories"
},
"unnecessary_opus_usage": {
"action": "log_and_justify",
"exception": "Security, architecture, judge (3+ options)"
}
}
},
"success_metrics": {
"symbol_first_adoption": {
"target": "95%",
"calculation": "symbol_first_calls / (symbol_first_calls + full_file_reads)"
},
"cache_hit_rate": {
"target": "80%",
"measurement": "cached_reads / total_reads"
},
"token_reduction": {
"minimum": "30-50%",
"aggressive": "50-70%",
"maximum_with_caching": "70-90%"
},
"model_cost_efficiency": {
"target": "40% Haiku usage",
"measurement": "haiku_operations / total_operations"
},
"opus_usage_limit": {
"target": "<5%",
"measurement": "opus_operations / total_operations"
}
},
"logging": {
"enabled": true,
"log_file": ".claude/learnings/optimization-log.md",
"include_metrics": true,
"include_violations": true,
"include_fallbacks": true
}
}Expected savings: Orchestrates all optimizations for 70-90% total reduction
Create 5 slash command skills. Due to length, I'll show the structure for each:
File: ~/.claude/skills/optimize/SKILL.md
Source: Copy from skills/optimize/SKILL.md in this directory.
Key features:
- Analyzes task complexity and routes to the correct planning strategy (Unified / Intent-Planning / Planning-Only)
- Enforces symbol-first exploration and memory loading
- Selects the cheapest model that produces acceptable quality
- Reports token savings at the end of each session
File: ~/.claude/skills/context/SKILL.md
Source: Copy from skills/context/SKILL.md in this directory.
Actions:
load— Load all relevant memories for the current projectsave [name]— Create or update a named memorylist— Show all available memories with timestampsrefresh [name]— Regenerate a stale memory from the current codebaseinspect— Show loaded context + cache hit rateclear— Unload all memories from context
File: ~/.claude/skills/cache-inspector/SKILL.md
Source: Copy from skills/cache-inspector/SKILL.md in this directory.
Actions:
status— Current cache entries, hit rate, and estimated savingsanalyze— Performance trends and cost breakdownoptimize— Actionable recommendations to improve hit ratereport— Full report saved to.claude/learnings/cache-performance.mdclear— Clear cache entries (for testing only)
File: ~/.claude/skills/update-docs/SKILL.md
Source: Copy from skills/update-docs/SKILL.md in this directory.
Actions:
research [topic]— Search web for latest patterns and API changescollect [url]— Fetch content from a specific documentation URLanalyze— Compare research findings to existing docs, identify outdated contentupdate [target]— Apply targeted updates to a file or scopevalidate— Scan docs for stale model IDs, broken URLs, old API patterns
File: ~/.claude/skills/init-project/SKILL.md
Source: Copy from skills/init-project/SKILL.md in this directory.
Actions:
detect— Auto-detect tech stack (Rails, Laravel, Next.js, FastAPI, Flutter, iOS, etc.)fetch [framework]— Fetch current best practices for the detected stackconstitution— Generate.claude/settings/constitution.jsonwith architectural rulesmemories— Create initial Serena memories by analyzing the codebaseoptimize— Configure project-level token optimization settings--full— Run all steps in sequence (recommended, 10-15 min)
Create 2 global system prompts that apply to all projects and agents.
File: ~/.claude/system-prompts/global-optimization.md
Source: Copy from system-prompts/global-optimization.md in this directory.
Key sections:
- Symbol-first exploration (MANDATORY) — never read full files before symbolic discovery
- Memory-first context loading (MANDATORY) — load Serena memories before code exploration
- Prompt caching (AUTOMATIC) — 90% savings on re-reads of large content
- Adaptive planning strategy (AUTOMATIC) — Unified / Intent-Planning / Planning-Only
- Model selection strategy (AUTOMATIC) — 40% Haiku / 55% Sonnet / 5% Opus target
- Pattern reuse (AUTOMATIC) — search existing specs and learnings before creating new code
- Constitution enforcement — validates task against
.claude/settings/constitution.json - Continuous learning — captures patterns and anti-patterns after sessions
- Session summary template — reports token savings per task
File: ~/.claude/system-prompts/symbol-first-protocol.md
Source: Copy from system-prompts/symbol-first-protocol.md in this directory.
Key sections:
- Core principle — NEVER read a full file first; always find the specific symbol
- Step-by-step protocol — verify Serena → get overview → find symbol → targeted read
- Common patterns — update method, add method, trace dependencies, refactor module
- Advanced techniques — substring matching, path filtering, cross-codebase search
- Fallback strategy — efficient
Read+Grepapproach when Serena is unavailable - Token savings table — 85-96% reduction examples for common tasks
- Troubleshooting guide — symbol not found, wrong project, tracing definitions
Create 3 documentation files to help you understand and use the system.
File: ~/.claude/README.md
Create this file to document your global ~/.claude/ setup for future reference.
Recommended sections:
- Quick start (15 min for new projects)
- Directory structure with purpose of each file
- Available skills (slash commands with brief descriptions)
- Token savings breakdown
- Configuration files reference
- Monitoring & metrics commands
- Maintenance schedule (monthly
/update-docs, weekly/cache-inspector analyze)
File: ~/.claude/INSTALLATION-COMPLETE.md
Create this as a checklist to verify your installation is correct.
Recommended sections:
- What was installed (file list)
- Expected outcomes per skill
- Verification checklist (
ls ~/.claude/skills/*/SKILL.md) - Next steps for first project activation
File: ~/.claude/QUICK-REFERENCE.md
Create a quick reference card:
# Claude Code Optimization - Quick Reference
## Global Skills (Slash Commands)
### `/optimize [task]`
Maximum token efficiency mode
Example: `/optimize "Add email verification feature"`
### `/context [action]`
Memory management
- load - Load all memories
- save [name] - Create memory
- list - Show available
- refresh [name] - Update stale
- inspect - Show loaded + cache
- clear - Clear loaded
### `/cache-inspector [action]`
Cache performance monitoring
- status - Current status
- analyze - Performance analysis
- optimize - Get recommendations
- report - Detailed report
### `/update-docs [action]`
Update documentation from research
- research [topic] - Search web
- collect [source] - Fetch URL
- analyze - Compare with docs
- update [target] - Update docs
- validate - Check accuracy
### `/init-project [action]`
Initialize new project
- detect - Auto-detect language
- fetch [lang] - Get best practices
- constitution - Generate rules
- memories - Create templates
- optimize - Configure settings
- --full - Complete init
## Token Savings Targets
| Optimization | Savings |
|--------------|---------|
| Prompt Caching | 90% |
| Memories | 60-70% |
| Symbol-First | 65-75% |
| Adaptive Planning | 40-50% |
| Token-Efficient Tools | 14-70% |
| Model Selection | 60% |
| Pattern Reuse | 50% |
**Combined**: 70-90% total reduction
## Quick Checks
**Cache status**: `/cache-inspector status`
**Memory list**: `/context list`
**Verify optimization**: Check session summary
## Common Workflows
**New project**:
1. `/init-project --full` (10 min)
2. `/context load`
3. `/optimize "first task"`
**Existing project**:
1. `/context load`
2. `/optimize "your task"`
**Update docs**:
1. `/update-docs research [topic]`
2. `/update-docs analyze`
3. `/update-docs update [target]`
## Troubleshooting
**Low savings?**
- Check `/cache-inspector analyze`
- Verify memories exist: `/context list`
- Ensure Serena activated: `/context load`
**Serena not working?**
- `/context load` (auto-activates)
- Check MCP availability
- Fallback to Read tool (still get 40-50% savings)
## Files Reference
**Settings**: `~/.claude/settings/`
- prompt-caching.json
- beta-features.json
- model-strategy.json
- token-optimization.json
**Agents**: `~/.claude/agents/`
- pm-orchestrator.md
**Skills**: `~/.claude/skills/`
- optimize/, context/, cache-inspector/, update-docs/, init-project/
**System Prompts**: `~/.claude/system-prompts/`
- global-optimization.md
- symbol-first-protocol.md
---
**Questions?** See ~/.claude/README.md for full documentationAfter completing all steps, verify your installation:
See checklist.md in this directory for complete verification steps.
Quick verification:
# Check all files exist
ls ~/.claude/agents/pm-orchestrator.md
ls ~/.claude/settings/*.json
ls ~/.claude/skills/*/SKILL.md
ls ~/.claude/system-prompts/*.md
ls ~/.claude/*.md
# Count files (should be 15+)
find ~/.claude -type f | wc -lIn Claude Code conversation:
/optimize "test"
# Should recognize command and explain optimization mode
/context list
# Should list available memories (may be empty if no project activated)
/cache-inspector status
# Should show cache configuration
/update-docs validate
# Should check documentation files
/init-project detect
# Should attempt to detect project type
If all commands work, installation is successful!
Cause: Skill file doesn't exist or has wrong name
Fix:
# Check file exists
ls ~/.claude/skills/optimize/SKILL.md
# Verify filename is exactly "SKILL.md" (case-sensitive)
# Recreate if neededCause: Serena MCP not installed or not activated
Fix:
- Serena not required for global setup
- You'll activate Serena per-project later
- Many optimizations still work (40-50% savings vs 70-90%)
Cause: prompt-caching.json has errors
Fix:
# Validate JSON syntax
cat ~/.claude/settings/prompt-caching.json | python -m json.tool
# Fix any syntax errors
# Recreate if neededCause: pm-orchestrator.md has errors or not loaded
Fix:
- Verify file exists
- Check YAML frontmatter syntax
- Ensure no extra characters
If issues persist:
- Re-read this guide carefully
- Check the verification checklist
- Review ~/.claude/README.md
- Try setup-agent.md for automated setup instead
Now that global setup is complete, activate for your current project:
-
Navigate to project directory:
cd ~/projects/your-project
-
Follow project activation guide: See
../02-project-activation/guide.md -
Or use automation: Copy contents of
../02-project-activation/activation-agent.mdand paste into Claude Code
Time: 10-15 minutes
For every new project:
- Navigate to project
- Run
/init-project --full(10 min) - Start coding with 70-90% token savings
That's it! Global setup done once, all future projects benefit automatically.
Track your installation success:
Setup completion:
- 15+ files created
- All 5 skills working
- PM Orchestrator file valid
- All 4 settings JSON valid
- Documentation readable
First project activation:
- Serena activated
- 3-5 memories created
- First
/optimizetask completed - 60-70% token reduction observed
After 5 sessions:
- Cache hit rate >80%
- Consistent 70-90% token reduction
- PM Orchestrator routing correctly
- Learnings captured
Setup time: 2-3 hours (one-time)
Savings per session:
- Baseline: 85K tokens = $2.55
- Optimized: 18K tokens = $0.54
- Saved: $2.01 per session
ROI: After 2-3 sessions, setup time is recovered
Annual savings (360 sessions):
- $723.60 for single project
- $2,170.80 for 3 projects
- Setup pays for itself in first week!
Congratulations! You've completed the global Claude Code optimization setup.
Next: Activate for your first project using ../02-project-activation/guide.md
Remember: This is a ONE-TIME setup. All future projects benefit automatically with just 10-15 minutes of activation.