diff --git a/docs/01-why-tokens-matter.md b/docs/01-why-tokens-matter.md
index eb4de36..eb16a0b 100644
--- a/docs/01-why-tokens-matter.md
+++ b/docs/01-why-tokens-matter.md
@@ -51,23 +51,29 @@ Every token you send or receive has a cost. Here's how:
Understanding what Copilot does behind the scenes helps you optimize:
-```text
-┌─────────────────────────────────────────────────┐
-│ Context Window │
-│ │
-│ ┌──────────────────┐ ┌─────────────────────┐ │
-│ │ INPUT TOKENS │ │ OUTPUT TOKENS │ │
-│ │ │ │ │ │
-│ │ System prompt │ │ The response │ │
-│ │ + copilot- │ │ you receive │ │
-│ │ instructions │ │ │ │
-│ │ + file context │ │ │ │
-│ │ + conversation │ │ │ │
-│ │ history │ │ │ │
-│ │ + YOUR prompt │ │ │ │
-│ └──────────────────┘ └─────────────────────┘ │
-└─────────────────────────────────────────────────┘
-```
+
+
+ Context Window
+
+
+ Input tokens
+
+ - System prompt
+ - copilot-instructions.md
+ - File context
+ - Conversation history
+ - Your prompt
+
+
+
+ Output tokens
+
+ The response you receive
+
+
+
+
+
- **System prompt:** Copilot's own instructions (you can't control this)
- **`copilot-instructions.md`:** Your project-level instructions — loaded on **every** interaction
diff --git a/docs/01-why-tokens-matter.zh-TW.md b/docs/01-why-tokens-matter.zh-TW.md
index 244c568..5978108 100644
--- a/docs/01-why-tokens-matter.zh-TW.md
+++ b/docs/01-why-tokens-matter.zh-TW.md
@@ -51,23 +51,29 @@ Token 是大型語言模型讀寫時使用的基本單位。它不是單字,
了解 Copilot 背後實際做了什麼,才能知道該怎麼最佳化:
-```text
-┌─────────────────────────────────────────────────┐
-│ Context Window │
-│ │
-│ ┌──────────────────┐ ┌─────────────────────┐ │
-│ │ INPUT TOKENS │ │ OUTPUT TOKENS │ │
-│ │ │ │ │ │
-│ │ System prompt │ │ 你收到的回應 │ │
-│ │ + copilot- │ │ │ │
-│ │ instructions │ │ │ │
-│ │ + file context │ │ │ │
-│ │ + conversation │ │ │ │
-│ │ history │ │ │ │
-│ │ + 你的 prompt │ │ │ │
-│ └──────────────────┘ └─────────────────────┘ │
-└─────────────────────────────────────────────────┘
-```
+
+
+ Context Window
+
+
+ 輸入 token
+
+ - System prompt
+ - copilot-instructions.md
+ - File context
+ - Conversation history
+ - 你的 prompt
+
+
+
+
+
+
- **System prompt:** Copilot 自身的內建指示(你無法控制)
- **`copilot-instructions.md`:** 專案層級指示,**每次互動都會載入**
diff --git a/docs/08-mcp-tool-costs.md b/docs/08-mcp-tool-costs.md
index 0374c1f..686039f 100644
--- a/docs/08-mcp-tool-costs.md
+++ b/docs/08-mcp-tool-costs.md
@@ -55,17 +55,24 @@ This isn't free. Each tool definition costs approximately:
Here's where it gets expensive:
-```text
-Tools loaded = servers × tools_per_server × tokens_per_tool
-
-Example (heavy setup):
- 10 MCP servers × 5 tools each × 200 tokens avg = 10,000 tokens
-
-Agent mode runs 5-25 steps per task.
-Tool definitions reload EVERY step.
-
-10,000 tokens × 15 steps = 150,000 tokens just for tool definitions.
-```
+
+ Reloaded Tool Cost
+
+
+ Formula
+ Tools loaded = servers x tools_per_server x tokens_per_tool
+ That whole bundle reloads on every agent step.
+
+
+ Heavy setup example
+ 10 MCP servers x 5 tools x 200 tokens = 10,000 tokens
+
+ 10,000 tokens x 15 steps
+
+ 150,000 tokens
+
+
+
That's 150K tokens doing nothing but telling the agent what tools exist. Before any actual work happens.
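To make the arithmetic concrete, here is a minimal Python sketch of the formula above. The function names (`tool_definition_tokens`, `reload_cost`) are hypothetical helpers made up for illustration, and the 200-token average is the example figure from this section, not a measured value.

```python
def tool_definition_tokens(servers: int, tools_per_server: int, tokens_per_tool: int) -> int:
    """Tokens needed to describe every loaded tool once."""
    return servers * tools_per_server * tokens_per_tool


def reload_cost(tokens_per_step: int, steps: int) -> int:
    """Tokens spent when the same tool definitions are resent on every agent step."""
    return tokens_per_step * steps


# Heavy setup from the example: 10 servers, 5 tools each, ~200 tokens per tool.
per_step = tool_definition_tokens(servers=10, tools_per_server=5, tokens_per_tool=200)
total = reload_cost(per_step, steps=15)
print(per_step, total)  # 10000 150000
```

Under the same assumptions, trimming the setup to 3 servers gives 3 x 5 x 200 x 15 = 45,000 tokens, which is why disabling unused servers is usually the first thing to try.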
diff --git a/docs/08-mcp-tool-costs.zh-TW.md b/docs/08-mcp-tool-costs.zh-TW.md
index eb46273..e0459cf 100644
--- a/docs/08-mcp-tool-costs.zh-TW.md
+++ b/docs/08-mcp-tool-costs.zh-TW.md
@@ -47,15 +47,24 @@ Buffer: 40.4k (20%)
真正貴的是它會被重複載入:
-```text
-Tools loaded = servers × tools_per_server × tokens_per_tool
-
-Example:
-10 MCP servers × 5 tools × 200 tokens = 10,000 tokens
-
-Agent mode 走 15 steps:
-10,000 × 15 = 150,000 tokens
-```
+
+ 工具成本會一直重載
+
+
+ 公式
+ Tools loaded = servers x tools_per_server x tokens_per_tool
+ 整包工具定義會在每個 agent step 再載一次。
+
+
+ 重度設定範例
+ 10 MCP servers x 5 tools x 200 tokens = 10,000 tokens
+
+ 10,000 tokens x 15 steps
+
+ 150,000 tokens
+
+
+
也就是說,還沒做任何真正工作,就先花了 15 萬個 token 讓 agent 知道有哪些工具可用。
diff --git a/docs/09-comparisons-data.md b/docs/09-comparisons-data.md
index c9a0213..adfb7dd 100644
--- a/docs/09-comparisons-data.md
+++ b/docs/09-comparisons-data.md
@@ -151,15 +151,36 @@ Does compression hurt output quality? The research says: **rarely, and only at e
The savings curve is not linear. The first 30% of compression (dropping filler) is free. The next 20% (fragments, abbreviations) is nearly free. Beyond that, each additional compression point risks quality.
-```text
-Savings vs. Quality Risk:
-
-Quality ████████████████████████████████████░░░░░░░░░
-Risk ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░███████████████
- 0% 20% 40% 60% 80%
- Token Savings →
- lite full ultra extreme
-```
+
+ Savings vs. Quality Risk
+
+
+ 0%
+ 20%
+ 40%
+ 60%
+ 80%
+
+
+ lite
+ full
+ ultra
+ extreme
+
+
**Sweet spot: full caveman (30-50% input token savings; 40-55% output savings with terse system instructions).** Maximum return, negligible risk.
diff --git a/docs/09-comparisons-data.zh-TW.md b/docs/09-comparisons-data.zh-TW.md
index b7c7035..07c1537 100644
--- a/docs/09-comparisons-data.zh-TW.md
+++ b/docs/09-comparisons-data.zh-TW.md
@@ -99,6 +99,37 @@
再往後 20% 也通常很划算。
超過那個點後,每多壓一點,都更可能帶來誤解。
+
+ 效益與風險曲線
+
+
+ 0%
+ 20%
+ 40%
+ 60%
+ 80%
+
+
+ lite
+ full
+ ultra
+ extreme
+
+
+
**建議甜蜜點:** Full caveman。
---
diff --git a/docs/10-practical-setup.md b/docs/10-practical-setup.md
index 71caac9..c65687e 100644
--- a/docs/10-practical-setup.md
+++ b/docs/10-practical-setup.md
@@ -371,28 +371,39 @@ Each mode has a fundamentally different token cost profile:
Understanding the loop helps you minimize steps:
-```text
-Step 1: Load context
- ├── System prompt (~500 tokens)
- ├── copilot-instructions.md (~50-1500 tokens)
- ├── Tool definitions (~2,000-20,000 tokens)
- ├── Conversation history (growing)
- └── YOUR prompt
- → Send to LLM → Get response
-
-Step 2: LLM decides to call a tool
- ├── Tool call (function + params) → output tokens
- ├── Tool result → input tokens (next step)
- └── Reasoning about result → output tokens
-
-Step 3: Another tool call (or generate response)
- ├── ALL of Step 1's context reloaded
- ├── + Step 2's tool call and result
- └── + growing conversation
- → Send to LLM again
-
-... repeat 5-25 times
-```
+
+ Agent Mode Loop
+
+
+ Step 1: Load context
+
+ - System prompt (~500 tokens)
+ - copilot-instructions.md (~50-1500 tokens)
+ - Tool definitions (~2,000-20,000 tokens)
+ - Conversation history (growing)
+ - Your prompt
+
+ Send to LLM → get response
+
+
+ Step 2: Call tool
+
+ - Tool call (function + params) → output tokens
+ - Tool result → input tokens
+ - Reasoning about result → output tokens
+
+
+
+ Step 3: Repeat
+
+ - All of Step 1 context reloads
+ - + prior tool call and result
+ - + growing conversation
+
+ Repeat 5-25 times
+
+
+
**Key insight:** Context grows with every step. Step 15 carries all the context from steps 1-14 plus the original prompt. This is why long agent sessions get expensive fast.
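A rough way to see the growth is to model each step's input as a fixed base plus whatever earlier steps appended. The sketch below is only an illustration: `agent_input_tokens` is a hypothetical helper, the 12,000-token base (system prompt, instructions, tool definitions, and prompt) is an assumed figure within the ranges above, and the 1,000 tokens added per step is an assumed average for one tool call plus its result.

```python
def agent_input_tokens(base_context: int, growth_per_step: int, steps: int) -> list[int]:
    """Input tokens sent at each step: the fixed context plus everything accumulated so far."""
    return [base_context + growth_per_step * step for step in range(steps)]


# Assumed figures: 12,000-token base context, 1,000 tokens appended per step, 15 steps.
step_inputs = agent_input_tokens(base_context=12_000, growth_per_step=1_000, steps=15)
print(step_inputs[0], step_inputs[-1], sum(step_inputs))  # 12000 26000 285000
```

With these assumed numbers, the last step sends more than twice the input of the first, and the session total is far larger than any single step suggests. Cutting steps therefore saves more than a proportional share of tokens.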
diff --git a/docs/10-practical-setup.zh-TW.md b/docs/10-practical-setup.zh-TW.md
index 641abfa..2032ad3 100644
--- a/docs/10-practical-setup.zh-TW.md
+++ b/docs/10-practical-setup.zh-TW.md
@@ -234,6 +234,40 @@ Agent Mode 常比 Ask Mode 貴上很多倍。
每多一步,完整 context 都可能再重送一次,而且還會帶上前一步的結果,因此後期步驟會越來越貴。
+
+ Agent Mode 迴圈
+
+
+ Step 1:載入 context
+
+ - System prompt(約 500 tokens)
+ - copilot-instructions.md(約 50-1500 tokens)
+ - Tool definitions(約 2,000-20,000 tokens)
+ - Conversation history(持續增加)
+ - 你的 prompt
+
+ 送進 LLM → 取得回應
+
+
+ Step 2:呼叫工具
+
+ - Tool call(function + params)→ output tokens
+ - Tool result → 下一步的 input tokens
+ - 對結果做判斷 → output tokens
+
+
+
+ Step 3:再次重送
+
+ - Step 1 的 context 全部重載
+ - + Step 2 的工具呼叫與結果
+ - + 持續成長的對話內容
+
+ 重複 5-25 次
+
+
+
+
### 4.5.3 如何減少 Agent 步數
- **Prompt 要精準,並加上 acceptance criteria**
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index 672518d..4666050 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -46,6 +46,245 @@
margin: 0 auto 1.5rem;
}
+.md-typeset .token-context-diagram {
+ margin: 1.5rem 0;
+}
+
+.md-typeset .token-context-diagram__frame {
+ padding: 1.25rem;
+ border: 1px solid var(--md-default-fg-color--lightest);
+ border-radius: 1rem;
+ background: linear-gradient(180deg, rgba(15, 118, 110, 0.08), rgba(49, 70, 89, 0.04));
+}
+
+.md-typeset .token-context-diagram__title {
+ margin: 0 0 1rem;
+ text-align: center;
+ font-size: 0.9rem;
+ font-weight: 700;
+ letter-spacing: 0.08em;
+ text-transform: uppercase;
+ color: var(--md-default-fg-color--light);
+}
+
+.md-typeset .token-context-diagram__columns {
+ display: grid;
+ grid-template-columns: minmax(0, 1.2fr) minmax(0, 0.9fr);
+ gap: 1rem;
+}
+
+.md-typeset .token-context-diagram__panel {
+ display: flex;
+ flex-direction: column;
+ padding: 1rem 1.1rem;
+ border: 1px solid color-mix(in srgb, var(--md-accent-fg-color) 18%, white);
+ border-radius: 0.85rem;
+ background-color: var(--md-default-bg-color);
+ box-shadow: 0 0.5rem 1.2rem rgba(15, 23, 42, 0.06);
+}
+
+.md-typeset .token-context-diagram__panel > h3 {
+ margin: 0 0 0.75rem;
+ font-size: 0.95rem;
+ text-transform: uppercase;
+ letter-spacing: 0.06em;
+ flex-shrink: 0;
+}
+
+.md-typeset .token-context-diagram__panel > ul {
+ margin: 0;
+ padding-left: 1.15rem;
+}
+
+.md-typeset .token-context-diagram__panel > ul li + li {
+ margin-top: 0.35rem;
+}
+
+.md-typeset .token-context-diagram__panel-body {
+ flex: 1;
+ display: flex;
+ align-items: center;
+ margin: 0;
+ padding-bottom: 1.5rem;
+}
+
+.md-typeset .token-context-diagram__panel-body > p {
+ margin: 0;
+ font-size: 1rem;
+ font-weight: 600;
+}
+
+.md-typeset .guide-visual {
+ margin: 1.5rem 0;
+ padding: 1.25rem;
+ border: 1px solid var(--md-default-fg-color--lightest);
+ border-radius: 1rem;
+ background: linear-gradient(180deg, rgba(15, 118, 110, 0.08), rgba(49, 70, 89, 0.04));
+}
+
+.md-typeset .guide-visual__title {
+ margin: 0 0 1rem;
+ text-align: center;
+ font-size: 0.9rem;
+ font-weight: 700;
+ letter-spacing: 0.08em;
+ text-transform: uppercase;
+ color: var(--md-default-fg-color--light);
+}
+
+.md-typeset .guide-visual__grid {
+ display: grid;
+ gap: 1rem;
+}
+
+.md-typeset .guide-visual__grid--2 {
+ grid-template-columns: repeat(2, minmax(0, 1fr));
+}
+
+.md-typeset .guide-visual__grid--3 {
+ grid-template-columns: repeat(3, minmax(0, 1fr));
+}
+
+.md-typeset .guide-visual__card {
+ display: flex;
+ flex-direction: column;
+ gap: 0.75rem;
+ padding: 1rem 1.1rem;
+ border: 1px solid color-mix(in srgb, var(--md-accent-fg-color) 18%, white);
+ border-radius: 0.85rem;
+ background-color: var(--md-default-bg-color);
+ box-shadow: 0 0.5rem 1.2rem rgba(15, 23, 42, 0.06);
+}
+
+.md-typeset .guide-visual__card > h4,
+.md-typeset .guide-visual__card > p {
+ margin: 0;
+}
+
+.md-typeset .guide-visual__card > h4 {
+ font-size: 0.95rem;
+ text-transform: uppercase;
+ letter-spacing: 0.06em;
+}
+
+.md-typeset .guide-visual__card--step > h4 {
+ display: grid;
+ gap: 0.15rem;
+}
+
+.md-typeset .guide-visual__step-label,
+.md-typeset .guide-visual__step-copy {
+ display: block;
+}
+
+.md-typeset .guide-visual__card--step .guide-visual__list {
+ align-self: stretch;
+ box-sizing: border-box;
+ margin-left: 0 !important;
+ margin-right: 0 !important;
+ margin-inline-start: 0 !important;
+ margin-inline-end: 0 !important;
+ width: 100%;
+ padding-left: 1.35rem !important;
+ padding-inline-start: 1.35rem !important;
+ padding-right: 0.35rem;
+}
+
+.md-typeset .guide-visual__card--step .guide-visual__list li {
+ margin-left: 0 !important;
+ margin-inline-start: 0 !important;
+ text-align: left;
+}
+
+.md-typeset .guide-visual__list {
+ margin: 0;
+ padding-left: 1.15rem;
+}
+
+.md-typeset .guide-visual__list li + li {
+ margin-top: 0.35rem;
+}
+
+.md-typeset .guide-visual__math {
+ margin: 0;
+ padding: 0.85rem 1rem;
+ border-radius: 0.75rem;
+ background: rgba(49, 70, 89, 0.08);
+ font-family: var(--md-code-font-family);
+ font-size: 0.9rem;
+ line-height: 1.5;
+}
+
+.md-typeset .guide-visual__metric {
+ font-size: 1.4rem;
+ font-weight: 800;
+ line-height: 1.2;
+}
+
+.md-typeset .guide-visual__note {
+ color: var(--md-default-fg-color--light);
+ font-size: 0.8rem;
+}
+
+.md-typeset .guide-visual__flow {
+ display: flex;
+ align-items: center;
+ gap: 0.75rem;
+}
+
+.md-typeset .guide-visual__flow::before {
+ content: "→";
+ color: var(--md-accent-fg-color);
+ font-weight: 800;
+}
+
+.md-typeset .guide-visual__curve {
+ display: grid;
+ gap: 0.85rem;
+}
+
+.md-typeset .guide-visual__curve-row {
+ display: grid;
+ grid-template-columns: 5.5rem minmax(0, 1fr);
+ gap: 0.75rem;
+ align-items: center;
+}
+
+.md-typeset .guide-visual__curve-label {
+ font-weight: 700;
+}
+
+.md-typeset .guide-visual__curve-bar {
+ position: relative;
+ height: 1rem;
+ overflow: hidden;
+ border-radius: 999px;
+ background: rgba(49, 70, 89, 0.12);
+}
+
+.md-typeset .guide-visual__curve-fill {
+ height: 100%;
+ border-radius: inherit;
+ background: linear-gradient(90deg, rgba(15, 118, 110, 0.95), rgba(15, 118, 110, 0.45));
+}
+
+.md-typeset .guide-visual__curve-fill--risk {
+ background: linear-gradient(90deg, rgba(148, 163, 184, 0.35), rgba(239, 68, 68, 0.9));
+}
+
+.md-typeset .guide-visual__scale,
+.md-typeset .guide-visual__ticks {
+ display: flex;
+ justify-content: space-between;
+ gap: 0.5rem;
+ font-size: 0.78rem;
+ color: var(--md-default-fg-color--light);
+}
+
+.md-typeset .guide-visual__ticks {
+ font-weight: 600;
+}
+
.md-typeset blockquote {
border-left-color: var(--md-accent-fg-color);
}
@@ -99,6 +338,24 @@
}
@media screen and (max-width: 45rem) {
+ .md-typeset .token-context-diagram__frame {
+ padding: 1rem;
+ }
+
+ .md-typeset .token-context-diagram__columns {
+ grid-template-columns: 1fr;
+ }
+
+ .md-typeset .guide-visual {
+ padding: 1rem;
+ }
+
+ .md-typeset .guide-visual__grid--2,
+ .md-typeset .guide-visual__grid--3,
+ .md-typeset .guide-visual__curve-row {
+ grid-template-columns: 1fr;
+ }
+
.site-footer-meta {
grid-template-columns: 1fr;
justify-items: center;