Commit d1e97a3

ochafik and claude committed
pdf-server: improve annotation discoverability + add E2E tests

- display_pdf result text now explicitly lists annotation capabilities (highlights, stamps, notes, etc.) instead of vague "navigate, search, zoom, etc."
- Restructured interact tool description: annotations promoted to top, with clear type reference, JSON example, and bold section headers
- Added pdf-annotations.spec.ts with 6 E2E tests covering:
  - Result text mentions annotation capabilities
  - interact tool available in dropdown
  - add_annotations renders highlight
  - Multiple annotation types render (highlight, note, stamp, freetext, rectangle)
  - remove_annotations removes from DOM
  - highlight_text finds and highlights text

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1 parent 9c15304 commit d1e97a3

2 files changed: +348 -30 lines changed

examples/pdf-server/server.ts

Lines changed: 37 additions & 30 deletions
@@ -844,7 +844,7 @@ Accepts:
         content: [
           {
             type: "text",
-            text: `Displaying PDF (viewUUID: ${uuid}): ${normalized}. Use the interact tool with this viewUUID to navigate, search, zoom, etc.`,
+            text: `Displaying PDF (viewUUID: ${uuid}): ${normalized}.\n\nUse the interact tool with this viewUUID to:\n- ANNOTATE: add highlights, underlines, notes, stamps (APPROVED/DRAFT/…), rectangles, freetext, strikethroughs\n- HIGHLIGHT TEXT by search query (highlight_text action)\n- NAVIGATE pages, SEARCH text, ZOOM\n- GET PAGES: extract text and/or screenshots from page ranges (get_pages action)\n- FILL FORM fields\n- DOWNLOAD the annotated PDF\n\nThe viewer supports full annotation capabilities — use add_annotations to mark up the document.`,
           },
         ],
         structuredContent: {
@@ -864,37 +864,44 @@ Accepts:
     "interact",
     {
       title: "Interact with PDF",
-      description: `Send an action to an existing PDF viewer. Actions are queued and batched.
-
-Actions:
-- navigate: Go to a page. Requires \`page\`.
-- search: Search text and highlight matches in UI. Requires \`query\`. Results (with excerpts, pages, offsets) appear in model context.
-- find: Search text silently (no UI change). Requires \`query\`. Results appear in model context only.
-- search_navigate: Jump to a search match. Requires \`matchIndex\` (from search/find results).
-- zoom: Set zoom level. Requires \`scale\` (0.5–3.0).
-- add_annotations: Add annotations to the PDF. Requires \`annotations\` array. Each annotation has \`id\` (string), \`type\`, and \`page\` (1-indexed).
-- update_annotations: Partially update existing annotations. Requires \`annotations\` array (id + type required, other fields optional).
-- remove_annotations: Remove annotations by ID. Requires \`ids\` array.
-- highlight_text: Find text and highlight it. Requires \`query\`. Optional \`page\` (defaults to all pages), \`color\`, \`content\` (tooltip).
-- fill_form: Fill form fields. Requires \`fields\` array of { name, value }.
-- get_pages: Get text and/or screenshots from pages without navigating. Uses \`intervals\` (page ranges with optional start/end, e.g. [{start:1,end:5}], [{}] for all). Optional \`getText\` (default true), \`getScreenshots\` (default false). Max 20 pages. Returns page content directly.
-
-Annotation types (all use PDF points, 72 dpi, bottom-left origin; colors are CSS strings):
-- highlight: \`{id, type:"highlight", page, rects:[{x,y,width,height}], color?, content?}\` — yellow semi-transparent overlay
-- underline: \`{id, type:"underline", page, rects:[{x,y,width,height}], color?}\` — red underline
-- strikethrough: \`{id, type:"strikethrough", page, rects:[{x,y,width,height}], color?}\` — line through text
-- note: \`{id, type:"note", page, x, y, content, color?}\` — sticky note icon with tooltip
-- rectangle: \`{id, type:"rectangle", page, x, y, width, height, color?, fillColor?}\` — box outline/fill
-- freetext: \`{id, type:"freetext", page, x, y, content, fontSize?, color?}\` — text at position
-- stamp: \`{id, type:"stamp", page, x, y, label, color?, rotation?}\` — label is one of: APPROVED, DRAFT, CONFIDENTIAL, FINAL, VOID, REJECTED
-
-Example — add a highlight and a stamp:
+      description: `Interact with a PDF viewer: annotate, navigate, search, extract pages, fill forms.
+
+**ANNOTATION** — You can add visual annotations to any page. Use add_annotations with an array of annotation objects.
+Each annotation needs: id (unique string), type, page (1-indexed).
+Coordinates use PDF points (72 dpi), bottom-left origin.
+
+Annotation types:
+• highlight: rects:[{x,y,width,height}], color?, content? — semi-transparent overlay on text regions
+• underline: rects:[{x,y,width,height}], color? — underline below text
+• strikethrough: rects:[{x,y,width,height}], color? — line through text
+• note: x, y, content, color? — sticky note icon with tooltip
+• rectangle: x, y, width, height, color?, fillColor? — outlined/filled box
+• freetext: x, y, content, fontSize?, color? — arbitrary text label
+• stamp: x, y, label (APPROVED|DRAFT|CONFIDENTIAL|FINAL|VOID|REJECTED), color?, rotation? — stamp overlay
+
+Example — add a highlight and a stamp on page 1:
 \`\`\`json
-{"action":"add_annotations","viewUUID":"...","annotations":[
-{"id":"h1","type":"highlight","page":1,"rects":[{"x":72,"y":700,"width":200,"height":12}],"color":"rgba(255,255,0,0.5)"},
-{"id":"s1","type":"stamp","page":1,"x":300,"y":500,"label":"APPROVED","color":"#00aa00","rotation":-15}
+{"action":"add_annotations","viewUUID":"","annotations":[
+{"id":"h1","type":"highlight","page":1,"rects":[{"x":72,"y":700,"width":200,"height":12}]},
+{"id":"s1","type":"stamp","page":1,"x":300,"y":500,"label":"APPROVED","color":"green","rotation":-15}
 ]}
-\`\`\``,
+\`\`\`
+
+**HIGHLIGHT TEXT** — highlight_text: auto-find and highlight text by query. Requires \`query\`. Optional: page, color, content.
+
+**ANNOTATION MANAGEMENT**:
+• update_annotations: partial update (id+type required). • remove_annotations: remove by ids.
+
+**NAVIGATION & SEARCH**:
+• navigate: go to page (requires \`page\`)
+• search: highlight matches in UI (requires \`query\`). Results in model context.
+• find: silent search, no UI change (requires \`query\`). Results in model context.
+• search_navigate: jump to match (requires \`matchIndex\`)
+• zoom: set scale 0.5–3.0 (requires \`scale\`)
+
+**PAGE EXTRACTION** — get_pages: extract text/screenshots from page ranges without navigating. \`intervals\` = [{start?,end?}], e.g. [{}] for all. \`getText\` (default true), \`getScreenshots\` (default false). Max 20 pages.
+
+**FORMS** — fill_form: fill fields with \`fields\` array of {name, value}.`,
       inputSchema: {
         viewUUID: z
           .string()
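
The new interact description says annotation coordinates are in PDF points (72 dpi) with a bottom-left origin. As a rough illustration of what that implies for anything drawing on a top-left-origin canvas, the sketch below converts such a rect to a pixel box. The names `PdfRect`, `CssBox`, and `pdfRectToCssBox` are hypothetical and are not taken from the pdf-server source. It also assumes an unrotated page and that `y` refers to the rect's bottom edge, neither of which this commit confirms.

```typescript
// Hypothetical types and helper for illustration only; not part of pdf-server.
interface PdfRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface CssBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

/**
 * Convert a rect in PDF points (bottom-left origin) to a box in viewport
 * pixels (top-left origin). Assumes an unrotated page and that rect.y is
 * the rect's bottom edge.
 */
function pdfRectToCssBox(
  rect: PdfRect,
  pageHeightPt: number,
  scale: number,
): CssBox {
  return {
    left: rect.x * scale,
    // The rect's top edge sits at y + height in PDF space; flip it.
    top: (pageHeightPt - (rect.y + rect.height)) * scale,
    width: rect.width * scale,
    height: rect.height * scale,
  };
}

// The highlight rect from the JSON example, on a 612 x 792 pt page at 1.5x zoom.
const box = pdfRectToCssBox({ x: 72, y: 700, width: 200, height: 12 }, 792, 1.5);
console.log(box); // { left: 108, top: 120, width: 300, height: 18 }
```

Under these assumptions, a rect with a large `y` such as 700 lands near the top of the rendered page, which is why the example highlight appears in the upper part of page 1.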

tests/e2e/pdf-annotations.spec.ts

Lines changed: 311 additions & 0 deletions
@@ -0,0 +1,311 @@
+import { test, expect, type Page } from "@playwright/test";
+
+// Increase timeout for these tests — PDF loading from arxiv can be slow
+test.setTimeout(120000);
+
+/**
+ * PDF Annotation E2E Tests
+ *
+ * Tests the annotation capabilities of the PDF server through the basic-host UI.
+ * Verifies that annotations can be added, rendered, and interacted with.
+ */
+
+/** Wait for the MCP App to load inside nested iframes. */
+async function waitForAppLoad(page: Page) {
+  const outerFrame = page.frameLocator("iframe").first();
+  await expect(outerFrame.locator("iframe")).toBeVisible({ timeout: 30000 });
+}
+
+/** Get the app frame locator (nested: sandbox > app) */
+function getAppFrame(page: Page) {
+  return page.frameLocator("iframe").first().frameLocator("iframe").first();
+}
+
+/** Load the PDF server and call display_pdf with the default PDF. */
+async function loadPdfServer(page: Page) {
+  await page.goto("/?theme=hide");
+  await expect(page.locator("select").first()).toBeEnabled({ timeout: 30000 });
+  await page.locator("select").first().selectOption({ label: "PDF Server" });
+  await page.click('button:has-text("Call Tool")');
+  await waitForAppLoad(page);
+}
+
+/**
+ * Extract the viewUUID from the display_pdf result panel.
+ * The tool result is displayed as JSON in a collapsible panel.
+ */
+async function extractViewUUID(page: Page): Promise<string> {
+  // Wait for the Tool Result panel to appear — it contains "📤 Tool Result"
+  const resultPanel = page.locator('text="📤 Tool Result"').first();
+  await expect(resultPanel).toBeVisible({ timeout: 30000 });
+
+  // The result preview shows the first 100 chars including "viewUUID: ..."
+  // Click to expand the result panel to see the full JSON
+  await resultPanel.click();
+
+  // Wait for the expanded result content to appear
+  const resultContent = page.locator("pre").last();
+  await expect(resultContent).toBeVisible({ timeout: 5000 });
+
+  const resultText = (await resultContent.textContent()) ?? "";
+
+  // Extract viewUUID from the JSON result
+  // The text content includes: "Displaying PDF (viewUUID: <uuid>): ..."
+  const match = resultText.match(/viewUUID["\s:]+([a-f0-9-]{36})/);
+  if (!match) {
+    throw new Error(
+      `Could not extract viewUUID from result: ${resultText.slice(0, 200)}`,
+    );
+  }
+  return match[1];
+}
+
+/**
+ * Call the interact tool with the given input JSON.
+ * Selects the interact tool from the dropdown, fills the input, and clicks Call Tool.
+ */
+async function callInteract(page: Page, input: Record<string, unknown>) {
+  // Select "interact" in the tool dropdown (second select on the page)
+  const toolSelect = page.locator("select").nth(1);
+  await toolSelect.selectOption("interact");
+
+  // Fill the input textarea with the JSON
+  const inputTextarea = page.locator("textarea");
+  await inputTextarea.fill(JSON.stringify(input));
+
+  // Click "Call Tool"
+  await page.click('button:has-text("Call Tool")');
+}
+
+/** Wait for the PDF canvas to render (ensures the page is ready for annotations). */
+async function waitForPdfCanvas(page: Page) {
+  const appFrame = getAppFrame(page);
+  await expect(appFrame.locator("canvas").first()).toBeVisible({
+    timeout: 30000,
+  });
+  // Wait a bit for fonts and text layer to stabilize
+  await page.waitForTimeout(2000);
+}
+
+test.describe("PDF Server - Annotations", () => {
+  test("display_pdf result mentions annotation capabilities", async ({
+    page,
+  }) => {
+    await loadPdfServer(page);
+
+    // Wait for result to appear
+    const resultPanel = page.locator('text="📤 Tool Result"').first();
+    await expect(resultPanel).toBeVisible({ timeout: 30000 });
+
+    // Expand the result panel
+    await resultPanel.click();
+    const resultContent = page.locator("pre").last();
+    await expect(resultContent).toBeVisible({ timeout: 5000 });
+    const resultText = (await resultContent.textContent()) ?? "";
+
+    // Verify the result text mentions annotation capabilities
+    expect(resultText).toContain("ANNOTATE");
+    expect(resultText).toContain("add_annotations");
+    expect(resultText).toContain("highlights");
+    expect(resultText).toContain("stamps");
+    expect(resultText).toContain("annotation capabilities");
+  });
+
+  test("interact tool is available in tool dropdown", async ({ page }) => {
+    await loadPdfServer(page);
+
+    // Verify the interact tool is available in the tool dropdown
+    const toolSelect = page.locator("select").nth(1);
+    const options = await toolSelect.locator("option").allTextContents();
+    expect(options).toContain("interact");
+  });
+
+  test("add_annotations renders highlight on the page", async ({ page }) => {
+    await loadPdfServer(page);
+    await waitForPdfCanvas(page);
+
+    const viewUUID = await extractViewUUID(page);
+
+    // Add a highlight annotation on page 1
+    await callInteract(page, {
+      viewUUID,
+      action: "add_annotations",
+      annotations: [
+        {
+          id: "test-highlight-1",
+          type: "highlight",
+          page: 1,
+          rects: [{ x: 72, y: 700, width: 300, height: 14 }],
+          color: "rgba(255, 255, 0, 0.4)",
+        },
+      ],
+    });
+
+    // Wait for the interact result
+    await page.waitForTimeout(1000);
+
+    // Verify the annotation appears in the annotation layer inside the app frame
+    const appFrame = getAppFrame(page);
+    const annotationLayer = appFrame.locator("#annotation-layer");
+    await expect(annotationLayer).toBeVisible({ timeout: 5000 });
+
+    // Check that a highlight annotation element was rendered
+    const highlightEl = appFrame.locator(".annotation-highlight");
+    await expect(highlightEl.first()).toBeVisible({ timeout: 5000 });
+  });
+
+  test("add_annotations renders multiple annotation types", async ({
+    page,
+  }) => {
+    await loadPdfServer(page);
+    await waitForPdfCanvas(page);
+
+    const viewUUID = await extractViewUUID(page);
+
+    // Add multiple annotation types at once
+    await callInteract(page, {
+      viewUUID,
+      action: "add_annotations",
+      annotations: [
+        {
+          id: "test-highlight",
+          type: "highlight",
+          page: 1,
+          rects: [{ x: 72, y: 700, width: 300, height: 14 }],
+          color: "rgba(255, 255, 0, 0.4)",
+        },
+        {
+          id: "test-note",
+          type: "note",
+          page: 1,
+          x: 400,
+          y: 600,
+          content: "Important finding!",
+          color: "#ffeb3b",
+        },
+        {
+          id: "test-stamp",
+          type: "stamp",
+          page: 1,
+          x: 300,
+          y: 400,
+          label: "APPROVED",
+          color: "#4caf50",
+          rotation: -15,
+        },
+        {
+          id: "test-freetext",
+          type: "freetext",
+          page: 1,
+          x: 100,
+          y: 300,
+          content: "See section 3.2",
+          fontSize: 14,
+          color: "#1976d2",
+        },
+        {
+          id: "test-rect",
+          type: "rectangle",
+          page: 1,
+          x: 50,
+          y: 200,
+          width: 500,
+          height: 100,
+          color: "#f44336",
+        },
+      ],
+    });
+
+    await page.waitForTimeout(1500);
+
+    const appFrame = getAppFrame(page);
+
+    // Verify each annotation type is rendered
+    await expect(appFrame.locator(".annotation-highlight").first()).toBeVisible(
+      {
+        timeout: 5000,
+      },
+    );
+    await expect(appFrame.locator(".annotation-note").first()).toBeVisible({
+      timeout: 5000,
+    });
+    await expect(appFrame.locator(".annotation-stamp").first()).toBeVisible({
+      timeout: 5000,
+    });
+    await expect(appFrame.locator(".annotation-freetext").first()).toBeVisible({
+      timeout: 5000,
+    });
+    await expect(appFrame.locator(".annotation-rectangle").first()).toBeVisible(
+      { timeout: 5000 },
+    );
+  });
+
+  test("remove_annotations removes annotation from DOM", async ({ page }) => {
+    await loadPdfServer(page);
+    await waitForPdfCanvas(page);
+
+    const viewUUID = await extractViewUUID(page);
+
+    // Add an annotation
+    await callInteract(page, {
+      viewUUID,
+      action: "add_annotations",
+      annotations: [
+        {
+          id: "to-remove",
+          type: "highlight",
+          page: 1,
+          rects: [{ x: 72, y: 700, width: 300, height: 14 }],
+        },
+      ],
+    });
+
+    await page.waitForTimeout(1000);
+
+    const appFrame = getAppFrame(page);
+    await expect(appFrame.locator(".annotation-highlight").first()).toBeVisible(
+      {
+        timeout: 5000,
+      },
+    );
+
+    // Remove the annotation
+    await callInteract(page, {
+      viewUUID,
+      action: "remove_annotations",
+      ids: ["to-remove"],
+    });
+
+    await page.waitForTimeout(1000);
+
+    // Verify the annotation is gone
+    await expect(appFrame.locator(".annotation-highlight")).toHaveCount(0, {
+      timeout: 5000,
+    });
+  });
+
+  test("highlight_text finds and highlights text", async ({ page }) => {
+    await loadPdfServer(page);
+    await waitForPdfCanvas(page);
+
+    const viewUUID = await extractViewUUID(page);
+
+    // Use highlight_text to find and highlight "Attention" in the PDF
+    await callInteract(page, {
+      viewUUID,
+      action: "highlight_text",
+      query: "Attention",
+      color: "rgba(0, 200, 255, 0.4)",
+    });
+
+    await page.waitForTimeout(2000);
+
+    const appFrame = getAppFrame(page);
+    // highlight_text creates highlight annotations, so we should see at least one
+    await expect(appFrame.locator(".annotation-highlight").first()).toBeVisible(
+      {
+        timeout: 10000,
+      },
+    );
+  });
+});
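
The `extractViewUUID` helper in this file uses one regular expression to pull the UUID out of the expanded result panel, whether the panel shows the plain result text or a JSON rendering of it. The standalone sketch below isolates that matching step so it can be tried without Playwright. Only the regex comes from the test file. The function name `parseViewUUID` and the sample strings are made up for illustration.

```typescript
// Regex copied from extractViewUUID in pdf-annotations.spec.ts.
const VIEW_UUID_RE = /viewUUID["\s:]+([a-f0-9-]{36})/;

/** Return the first viewUUID found in `text`, or null if there is none. */
function parseViewUUID(text: string): string | null {
  const match = text.match(VIEW_UUID_RE);
  return match?.[1] ?? null;
}

// Made-up sample inputs; the UUID and URL are placeholders.
const sampleUuid = "123e4567-e89b-12d3-a456-426614174000";
const asText = `Displaying PDF (viewUUID: ${sampleUuid}): https://example.com/a.pdf.`;
const asJson = JSON.stringify({ viewUUID: sampleUuid });

console.log(parseViewUUID(asText)); // plain-text form: matches
console.log(parseViewUUID(asJson)); // JSON form ("viewUUID":"..."): matches
console.log(parseViewUUID("no id here")); // null
```

Because the character class is `[a-f0-9-]`, an uppercase UUID would not match. That is fine as long as the server emits lowercase UUIDs, which this sketch assumes.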
