diff --git a/docs/01-why-tokens-matter.md b/docs/01-why-tokens-matter.md
index eb4de36..eb16a0b 100644
--- a/docs/01-why-tokens-matter.md
+++ b/docs/01-why-tokens-matter.md
@@ -51,23 +51,29 @@ Every token you send or receive has a cost. Here's how:
 
 Understanding what Copilot does behind the scenes helps you optimize:
 
-```text
-┌─────────────────────────────────────────────────┐
-│                 Context Window                  │
-│                                                 │
-│  ┌──────────────────┐  ┌─────────────────────┐  │
-│  │  INPUT TOKENS    │  │  OUTPUT TOKENS      │  │
-│  │                  │  │                     │  │
-│  │  System prompt   │  │  The response       │  │
-│  │  + copilot-      │  │  you receive        │  │
-│  │    instructions  │  │                     │  │
-│  │  + file context  │  │                     │  │
-│  │  + conversation  │  │                     │  │
-│  │    history       │  │                     │  │
-│  │  + YOUR prompt   │  │                     │  │
-│  └──────────────────┘  └─────────────────────┘  │
-└─────────────────────────────────────────────────┘
-```
+
 
 - **System prompt:** Copilot's own instructions (you can't control this)
 - **`copilot-instructions.md`:** Your project-level instructions — loaded on **every** interaction
diff --git a/docs/01-why-tokens-matter.zh-TW.md b/docs/01-why-tokens-matter.zh-TW.md
index 244c568..5978108 100644
--- a/docs/01-why-tokens-matter.zh-TW.md
+++ b/docs/01-why-tokens-matter.zh-TW.md
@@ -51,23 +51,29 @@ Token 是大型語言模型讀寫時使用的基本單位。它不是單字,
 
 了解 Copilot 背後實際做了什麼,才能知道該怎麼最佳化:
 
-```text
-┌─────────────────────────────────────────────────┐
-│                 Context Window                  │
-│                                                 │
-│  ┌──────────────────┐  ┌─────────────────────┐  │
-│  │  INPUT TOKENS    │  │  OUTPUT TOKENS      │  │
-│  │                  │  │                     │  │
-│  │  System prompt   │  │  你收到的回應       │  │
-│  │  + copilot-      │  │                     │  │
-│  │    instructions  │  │                     │  │
-│  │  + file context  │  │                     │  │
-│  │  + conversation  │  │                     │  │
-│  │    history       │  │                     │  │
-│  │  + 你的 prompt   │  │                     │  │
-│  └──────────────────┘  └─────────────────────┘  │
-└─────────────────────────────────────────────────┘
-```
+
 
 - **System prompt:** Copilot 自身的內建指示(你無法控制)
 - **`copilot-instructions.md`:** 專案層級指示,**每次互動都會載入**
diff --git a/docs/08-mcp-tool-costs.md b/docs/08-mcp-tool-costs.md
index 0374c1f..686039f 100644
--- a/docs/08-mcp-tool-costs.md
+++ b/docs/08-mcp-tool-costs.md
@@ -55,17 +55,24 @@ This isn't free. Each tool definition costs approximately:
 
 Here's where it gets expensive:
 
-```text
-Tools loaded = servers × tools_per_server × tokens_per_tool
-
-Example (heavy setup):
-  10 MCP servers × 5 tools each × 200 tokens avg = 10,000 tokens
-
-Agent mode runs 5-25 steps per task.
-Tool definitions reload EVERY step.
-
-10,000 tokens × 15 steps = 150,000 tokens just for tool definitions.
-```
+
 
 That's 150K tokens doing nothing but telling the agent what tools exist. Before any actual work happens.
 
diff --git a/docs/08-mcp-tool-costs.zh-TW.md b/docs/08-mcp-tool-costs.zh-TW.md
index eb46273..e0459cf 100644
--- a/docs/08-mcp-tool-costs.zh-TW.md
+++ b/docs/08-mcp-tool-costs.zh-TW.md
@@ -47,15 +47,24 @@ Buffer: 40.4k (20%)
 
 真正貴的是它會被重複載入:
 
-```text
-Tools loaded = servers × tools_per_server × tokens_per_tool
-
-Example:
-10 MCP servers × 5 tools × 200 tokens = 10,000 tokens
-
-Agent mode 走 15 steps:
-10,000 × 15 = 150,000 tokens
-```
+
 
 也就是說,還沒做任何真正工作,就先花了 15 萬個 token 讓 agent 知道有哪些工具可用。
 
diff --git a/docs/09-comparisons-data.md b/docs/09-comparisons-data.md
index c9a0213..adfb7dd 100644
--- a/docs/09-comparisons-data.md
+++ b/docs/09-comparisons-data.md
@@ -151,15 +151,36 @@ Does compression hurt output quality? The research says: **rarely, and only at e
 
 The savings curve is not linear. The first 30% of compression (dropping filler) is free. The next 20% (fragments, abbreviations) is nearly free. Beyond that, each additional compression point risks quality.
 
-```text
-Savings vs. Quality Risk:
-
-Quality ████████████████████████████████████░░░░░░░░░
-Risk    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░███████████████
-        0%        20%       40%       60%       80%
-                  Token Savings →
-        lite      full      ultra     extreme
-```
+
 
 **Sweet spot: full caveman (30-50% input token savings; 40-55% output savings with terse system instructions).** Maximum return, negligible risk.
 
diff --git a/docs/09-comparisons-data.zh-TW.md b/docs/09-comparisons-data.zh-TW.md
index b7c7035..07c1537 100644
--- a/docs/09-comparisons-data.zh-TW.md
+++ b/docs/09-comparisons-data.zh-TW.md
@@ -99,6 +99,37 @@
 再往後 20% 也通常很划算。
 超過那個點後,每多壓一點,都更可能帶來誤解。
 
+
+
 **建議甜蜜點:** Full caveman。
 
 ---
diff --git a/docs/10-practical-setup.md b/docs/10-practical-setup.md
index 71caac9..c65687e 100644
--- a/docs/10-practical-setup.md
+++ b/docs/10-practical-setup.md
@@ -371,28 +371,39 @@ Each mode has a fundamentally different token cost profile:
 
 Understanding the loop helps you minimize steps:
 
-```text
-Step 1: Load context
-  ├── System prompt (~500 tokens)
-  ├── copilot-instructions.md (~50-1500 tokens)
-  ├── Tool definitions (~2,000-20,000 tokens)
-  ├── Conversation history (growing)
-  └── YOUR prompt
-  → Send to LLM → Get response
-
-Step 2: LLM decides to call a tool
-  ├── Tool call (function + params) → output tokens
-  ├── Tool result → input tokens (next step)
-  └── Reasoning about result → output tokens
-
-Step 3: Another tool call (or generate response)
-  ├── ALL of Step 1's context reloaded
-  ├── + Step 2's tool call and result
-  └── + growing conversation
-  → Send to LLM again
-
-... repeat 5-25 times
-```
+
 
 **Key insight:** Context grows with every step. Step 15 carries all the context from steps 1-14 plus the original prompt. This is why long agent sessions get expensive fast.
 
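Reviewer note: the removed text blocks carried two pieces of arithmetic, the tool-definition cost (`servers × tools_per_server × tokens_per_tool`, reloaded every step) and the growing per-step context behind the Key insight above. A minimal Python sketch of both follows, as a rough aid for checking the numbers rather than a model of how Copilot actually bills tokens. `AgentSetup`, its field names, the 1,000-token instructions file, the 100-token prompt, and the 800 tokens of history added per step are all hypothetical; only the 10 × 5 × 200 setup, the ~500-token system prompt, and the 15-step example come from the docs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSetup:
    """Hypothetical cost model; field names are illustrative, not a Copilot API."""
    servers: int = 10                 # from the docs' "heavy setup" example
    tools_per_server: int = 5         # from the docs' example
    tokens_per_tool: int = 200        # from the docs' example (average)
    system_prompt: int = 500          # "~500 tokens" in the docs
    instructions: int = 1_000         # assumed; docs give a ~50-1500 range
    user_prompt: int = 100            # assumed
    tokens_added_per_step: int = 800  # assumed tool call + result per step


def tool_definition_tokens(setup: AgentSetup) -> int:
    """Tools loaded = servers x tools_per_server x tokens_per_tool."""
    return setup.servers * setup.tools_per_server * setup.tokens_per_tool


def input_tokens_at_step(setup: AgentSetup, step: int) -> int:
    """Input tokens sent at a 1-based step: fixed context plus accumulated history."""
    fixed = (
        setup.system_prompt
        + setup.instructions
        + tool_definition_tokens(setup)
        + setup.user_prompt
    )
    history = setup.tokens_added_per_step * (step - 1)
    return fixed + history


def total_input_tokens(setup: AgentSetup, steps: int) -> int:
    """Sum of input tokens over every step of one agent task."""
    return sum(input_tokens_at_step(setup, s) for s in range(1, steps + 1))


if __name__ == "__main__":
    setup = AgentSetup()
    print(tool_definition_tokens(setup))       # 10000
    print(tool_definition_tokens(setup) * 15)  # 150000, tool definitions alone
    print(total_input_tokens(setup, 15))       # 258000 under the assumptions above
```

Under these assumed numbers, tool definitions account for 150,000 of the 258,000 input tokens across 15 steps, which is consistent with the docs' point that trimming MCP servers and reducing step count are the two biggest levers.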
diff --git a/docs/10-practical-setup.zh-TW.md b/docs/10-practical-setup.zh-TW.md
index 641abfa..2032ad3 100644
--- a/docs/10-practical-setup.zh-TW.md
+++ b/docs/10-practical-setup.zh-TW.md
@@ -234,6 +234,40 @@ Agent Mode 常比 Ask Mode 貴上很多倍。
 
 每多一步,完整 context 都可能再重送一次,而且還會帶上前一步的結果,因此後期步驟會越來越貴。
 
+
+
 ### 4.5.3 如何減少 Agent 步數
 
 - **Prompt 要精準,並加上 acceptance criteria**
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index 672518d..4666050 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -46,6 +46,245 @@
   margin: 0 auto 1.5rem;
 }
 
+.md-typeset .token-context-diagram {
+  margin: 1.5rem 0;
+}
+
+.md-typeset .token-context-diagram__frame {
+  padding: 1.25rem;
+  border: 1px solid var(--md-default-fg-color--lightest);
+  border-radius: 1rem;
+  background: linear-gradient(180deg, rgba(15, 118, 110, 0.08), rgba(49, 70, 89, 0.04));
+}
+
+.md-typeset .token-context-diagram__title {
+  margin: 0 0 1rem;
+  text-align: center;
+  font-size: 0.9rem;
+  font-weight: 700;
+  letter-spacing: 0.08em;
+  text-transform: uppercase;
+  color: var(--md-default-fg-color--light);
+}
+
+.md-typeset .token-context-diagram__columns {
+  display: grid;
+  grid-template-columns: minmax(0, 1.2fr) minmax(0, 0.9fr);
+  gap: 1rem;
+}
+
+.md-typeset .token-context-diagram__panel {
+  display: flex;
+  flex-direction: column;
+  padding: 1rem 1.1rem;
+  border: 1px solid color-mix(in srgb, var(--md-accent-fg-color) 18%, white);
+  border-radius: 0.85rem;
+  background-color: var(--md-default-bg-color);
+  box-shadow: 0 0.5rem 1.2rem rgba(15, 23, 42, 0.06);
+}
+
+.md-typeset .token-context-diagram__panel > h3 {
+  margin: 0 0 0.75rem;
+  font-size: 0.95rem;
+  text-transform: uppercase;
+  letter-spacing: 0.06em;
+  flex-shrink: 0;
+}
+
+.md-typeset .token-context-diagram__panel > ul {
+  margin: 0;
+  padding-left: 1.15rem;
+}
+
+.md-typeset .token-context-diagram__panel > ul li + li {
+  margin-top: 0.35rem;
+}
+
+.md-typeset .token-context-diagram__panel-body {
+  flex: 1;
+  display: flex;
+  align-items: center;
+  margin: 0;
+  padding-bottom: 1.5rem;
+}
+
+.md-typeset .token-context-diagram__panel-body > p {
+  margin: 0;
+  font-size: 1rem;
+  font-weight: 600;
+}
+
+.md-typeset .guide-visual {
+  margin: 1.5rem 0;
+  padding: 1.25rem;
+  border: 1px solid var(--md-default-fg-color--lightest);
+  border-radius: 1rem;
+  background: linear-gradient(180deg, rgba(15, 118, 110, 0.08), rgba(49, 70, 89, 0.04));
+}
+
+.md-typeset .guide-visual__title {
+  margin: 0 0 1rem;
+  text-align: center;
+  font-size: 0.9rem;
+  font-weight: 700;
+  letter-spacing: 0.08em;
+  text-transform: uppercase;
+  color: var(--md-default-fg-color--light);
+}
+
+.md-typeset .guide-visual__grid {
+  display: grid;
+  gap: 1rem;
+}
+
+.md-typeset .guide-visual__grid--2 {
+  grid-template-columns: repeat(2, minmax(0, 1fr));
+}
+
+.md-typeset .guide-visual__grid--3 {
+  grid-template-columns: repeat(3, minmax(0, 1fr));
+}
+
+.md-typeset .guide-visual__card {
+  display: flex;
+  flex-direction: column;
+  gap: 0.75rem;
+  padding: 1rem 1.1rem;
+  border: 1px solid color-mix(in srgb, var(--md-accent-fg-color) 18%, white);
+  border-radius: 0.85rem;
+  background-color: var(--md-default-bg-color);
+  box-shadow: 0 0.5rem 1.2rem rgba(15, 23, 42, 0.06);
+}
+
+.md-typeset .guide-visual__card > h4,
+.md-typeset .guide-visual__card > p {
+  margin: 0;
+}
+
+.md-typeset .guide-visual__card > h4 {
+  font-size: 0.95rem;
+  text-transform: uppercase;
+  letter-spacing: 0.06em;
+}
+
+.md-typeset .guide-visual__card--step > h4 {
+  display: grid;
+  gap: 0.15rem;
+}
+
+.md-typeset .guide-visual__step-label,
+.md-typeset .guide-visual__step-copy {
+  display: block;
+}
+
+.md-typeset .guide-visual__card--step .guide-visual__list {
+  align-self: stretch;
+  box-sizing: border-box;
+  margin-left: 0 !important;
+  margin-right: 0 !important;
+  margin-inline-start: 0 !important;
+  margin-inline-end: 0 !important;
+  width: 100%;
+  padding-left: 1.35rem !important;
+  padding-inline-start: 1.35rem !important;
+  padding-right: 0.35rem;
+}
+
+.md-typeset .guide-visual__card--step .guide-visual__list li {
+  margin-left: 0 !important;
+  margin-inline-start: 0 !important;
+  text-align: left;
+}
+
+.md-typeset .guide-visual__list {
+  margin: 0;
+  padding-left: 1.15rem;
+}
+
+.md-typeset .guide-visual__list li + li {
+  margin-top: 0.35rem;
+}
+
+.md-typeset .guide-visual__math {
+  margin: 0;
+  padding: 0.85rem 1rem;
+  border-radius: 0.75rem;
+  background: rgba(49, 70, 89, 0.08);
+  font-family: var(--md-code-font-family);
+  font-size: 0.9rem;
+  line-height: 1.5;
+}
+
+.md-typeset .guide-visual__metric {
+  font-size: 1.4rem;
+  font-weight: 800;
+  line-height: 1.2;
+}
+
+.md-typeset .guide-visual__note {
+  color: var(--md-default-fg-color--light);
+  font-size: 0.8rem;
+}
+
+.md-typeset .guide-visual__flow {
+  display: flex;
+  align-items: center;
+  gap: 0.75rem;
+}
+
+.md-typeset .guide-visual__flow::before {
+  content: "→";
+  color: var(--md-accent-fg-color);
+  font-weight: 800;
+}
+
+.md-typeset .guide-visual__curve {
+  display: grid;
+  gap: 0.85rem;
+}
+
+.md-typeset .guide-visual__curve-row {
+  display: grid;
+  grid-template-columns: 5.5rem minmax(0, 1fr);
+  gap: 0.75rem;
+  align-items: center;
+}
+
+.md-typeset .guide-visual__curve-label {
+  font-weight: 700;
+}
+
+.md-typeset .guide-visual__curve-bar {
+  position: relative;
+  height: 1rem;
+  overflow: hidden;
+  border-radius: 999px;
+  background: rgba(49, 70, 89, 0.12);
+}
+
+.md-typeset .guide-visual__curve-fill {
+  height: 100%;
+  border-radius: inherit;
+  background: linear-gradient(90deg, rgba(15, 118, 110, 0.95), rgba(15, 118, 110, 0.45));
+}
+
+.md-typeset .guide-visual__curve-fill--risk {
+  background: linear-gradient(90deg, rgba(148, 163, 184, 0.35), rgba(239, 68, 68, 0.9));
+}
+
+.md-typeset .guide-visual__scale,
+.md-typeset .guide-visual__ticks {
+  display: flex;
+  justify-content: space-between;
+  gap: 0.5rem;
+  font-size: 0.78rem;
+  color: var(--md-default-fg-color--light);
+}
+
+.md-typeset .guide-visual__ticks {
+  font-weight: 600;
+}
+
 .md-typeset blockquote {
   border-left-color: var(--md-accent-fg-color);
 }
@@ -99,6 +338,24 @@
 }
 
 @media screen and (max-width: 45rem) {
+  .md-typeset .token-context-diagram__frame {
+    padding: 1rem;
+  }
+
+  .md-typeset .token-context-diagram__columns {
+    grid-template-columns: 1fr;
+  }
+
+  .md-typeset .guide-visual {
+    padding: 1rem;
+  }
+
+  .md-typeset .guide-visual__grid--2,
+  .md-typeset .guide-visual__grid--3,
+  .md-typeset .guide-visual__curve-row {
+    grid-template-columns: 1fr;
+  }
+
   .site-footer-meta {
     grid-template-columns: 1fr;
     justify-items: center;