
feat(screenshot): target a single window via window_name / window_pid #233

Open
exalsch wants to merge 13 commits into CursorTouch:main from exalsch:feat/window-targeted-screenshot

@exalsch exalsch commented May 8, 2026

Summary

Adds window_name (fuzzy title match) and window_pid (exact process id) parameters to the Screenshot and Snapshot tools. When either is set, the tool resolves that window, brings it to the foreground (unless focus_window=False), reads its bounding rectangle, and captures only that area. window_name / window_pid and display are mutually exclusive.

Stacked on #232. This branch builds on feat/screenshot-flash-overlay; the diff currently shows both PRs' changes. Once #232 lands, this branch will rebase cleanly to show only the window-targeting commits.

Why

Right now there is no way to ask the server for "just this window". The only capture controls are display=[N] (whole monitor) or full virtual screen. Agents that want a specific app's screenshot have to guess the right display, capture the whole thing, and trust the LLM to crop visually. With window_name / window_pid the server does the lookup and returns just the relevant pixels. Combined with #232's flash overlay this also gives the user a clear visual confirmation of which window was just sampled.

Implementation

  • New module src/windows_mcp/desktop/window_resolver.py
    • enumerate_visible_windows(): EnumWindows filtered to visible, valid HWNDs.
    • resolve_window(name=..., pid=..., windows=...): PID match takes precedence; name match uses fuzzywuzzy.process.extractOne with a 70 score cutoff (consistent with Desktop._find_window_by_name). Untitled PID-matched windows are tolerated; titled ones are preferred.
    • get_window_rect(hwnd): DwmGetWindowAttribute(DWMWA_EXTENDED_FRAME_BOUNDS) first so the DWM drop-shadow on Aero windows doesn't inflate the rect; GetWindowRect fallback.
  • Desktop.resolve_window_capture_rect(name=..., pid=..., focus=True) glues resolver + focus together, reusing the existing bring_window_to_top logic, and returns (uia.Rect, title). Sleeps 50 ms after focus so DWM has time to repaint before the rect is read.
  • Desktop.get_state(...) gains an optional capture_rect: uia.Rect | None parameter that overrides display_indices and is rejected if both are supplied.
  • capture_desktop_state plumbs window_name / window_pid / focus_window through, validates the mutual exclusion, and surfaces the resolved title in the response metadata as Target Window: <name>.
  • Screenshot and Snapshot tool descriptions updated to document the new parameters.
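
The resolution order described above (PID first, then fuzzy title) can be sketched roughly as follows. This is an illustrative approximation, not the code in window_resolver.py: `WindowInfo` is a hypothetical record type, the error class hierarchy is assumed, and `difflib.SequenceMatcher` stands in for `fuzzywuzzy.process.extractOne` so the sketch needs only the standard library.

```python
from difflib import SequenceMatcher
from typing import NamedTuple


class WindowInfo(NamedTuple):
    """Hypothetical record for one enumerated top-level window."""
    hwnd: int
    pid: int
    title: str


class WindowNotFoundError(LookupError):
    """Raised when no visible window satisfies the request (hierarchy assumed)."""


def _score(query: str, title: str) -> int:
    # Stand-in for fuzzywuzzy's 0-100 score; difflib keeps the sketch stdlib-only.
    return round(100 * SequenceMatcher(None, query.lower(), title.lower()).ratio())


def resolve_window(windows, name=None, pid=None, cutoff=70):
    """Return (hwnd, title); a PID match takes precedence over a name match."""
    if pid is not None:
        matches = [w for w in windows if w.pid == pid]
        if not matches:
            raise WindowNotFoundError(f"no visible window for pid {pid}")
        # Prefer a titled window, but tolerate an untitled one.
        chosen = next((w for w in matches if w.title), matches[0])
        return chosen.hwnd, chosen.title
    if name is None:
        raise ValueError("either name or pid is required")
    titled = [w for w in windows if w.title]
    if not titled:
        raise WindowNotFoundError("no titled windows to match against")
    best = max(titled, key=lambda w: _score(name, w.title))
    if _score(name, best.title) < cutoff:
        raise WindowNotFoundError(f"no window title matches {name!r}")
    return best.hwnd, best.title
```

Passing the window list in as an argument, as the `windows=...` parameter in the description suggests, keeps the matching logic testable without touching Win32.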

Tests

  • tests/test_window_resolver.py: 12 cases, including enumeration filtering, PID prefers titled, PID falls back to first match, PID-not-found error, fuzzy name match, fuzzy cutoff rejection, untitled-only error, PID-takes-precedence-over-name, DWM success path, DWM fallback to GetWindowRect, and DWM-raises fallback.
  • Integration: existing test_snapshot_display_filter.py continues to pass — the new capture_rect parameter is purely additive and the display_indices-based flow is unchanged.
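
The three DWM-related cases in that list correspond to a simple fallback chain. A minimal sketch, assuming the two Win32 queries are injected as callables (`dwm_bounds` and `plain_rect` are hypothetical names) and assuming an empty DWM rect counts as a failure; the real module calls DwmGetWindowAttribute and GetWindowRect directly:

```python
from typing import Callable, Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom


def get_window_rect(
    hwnd: int,
    dwm_bounds: Callable[[int], Rect],
    plain_rect: Callable[[int], Rect],
) -> Rect:
    """Prefer DWM extended frame bounds; fall back to the plain window rect."""
    try:
        rect = dwm_bounds(hwnd)
    except OSError:
        # The DWM query raised: use the fallback.
        return plain_rect(hwnd)
    left, top, right, bottom = rect
    if right <= left or bottom <= top:
        # Assumption: an empty rect from DWM is treated as a failure too.
        return plain_rect(hwnd)
    return rect
```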

Test plan

  • Screenshot(window_name="cotire") returns just that window
  • Screenshot(window_pid=12345) returns just that PID's titled window
  • Screenshot(window_name="...", focus_window=False) returns the window without focusing it (errors clearly if minimized)
  • Screenshot(window_name="...", display=[0]) rejects with a clear error
  • Snapshot(use_vision=True, window_name="...") returns the window with the UI tree filtered to that region
  • DWM extended-frame rect is used (no extra drop-shadow padding) for Aero windows
  • pytest tests/test_window_resolver.py passes (12/12)
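
For the mutual-exclusion item in the plan, the check could look roughly like this. `validate_capture_target` is a hypothetical helper name and the error type is an assumption; the PR only states that the combination is rejected with a clear error.

```python
from typing import Optional, Sequence


def validate_capture_target(
    window_name: Optional[str] = None,
    window_pid: Optional[int] = None,
    display: Optional[Sequence[int]] = None,
) -> bool:
    """Return True when a window is targeted; raise if display is also set."""
    targets_window = window_name is not None or window_pid is not None
    if targets_window and display is not None:
        raise ValueError(
            "window_name/window_pid and display are mutually exclusive; pass only one"
        )
    return targets_window
```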

🤖 Generated with Claude Code

After every screenshot, draw a glowing orange-red border over the
captured area on a transparent always-on-top Tk overlay for ~2.5s.
Region captures get a solid border that fades out at the end; full-
screen captures show a bell-fading inner border on the union of all
monitor rects. The overlay is started *after* capture and cancelled
before any subsequent capture so it never appears in the captured
image.

Set WINDOWS_MCP_DISABLE_FLASH=1/true/yes/on to suppress the effect.
The conftest disables the flash for the test suite to avoid Tk
threads racing with pytest teardown.
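
A minimal sketch of how the environment-variable gate might be read. The accepted values come from the commit message; trimming whitespace and ignoring case are assumptions, and the `environ` parameter exists only to make the sketch easy to test.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}


def flash_disabled(environ=os.environ) -> bool:
    """True when WINDOWS_MCP_DISABLE_FLASH is set to 1/true/yes/on."""
    value = environ.get("WINDOWS_MCP_DISABLE_FLASH", "")
    # Assumption: surrounding whitespace and letter case do not matter.
    return value.strip().lower() in _TRUTHY
```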
@qodo-code-review

Review Summary by Qodo

Add window-targeted screenshot capture and post-capture flash overlay

✨ Enhancement


Walkthroughs

Description
• Add window-targeted screenshot capture via window_name (fuzzy match) and window_pid (exact
  PID)
• Implement post-capture visual feedback with orange-red flash overlay on transparent always-on-top
  window
• New window_resolver.py module for window enumeration, resolution, and rect retrieval using DWM
  extended bounds
• Plumb window targeting through Screenshot and Snapshot tools with mutual exclusion validation
  against display
Diagram
flowchart LR
  A["Screenshot/Snapshot Tool"] -->|window_name or window_pid| B["capture_desktop_state"]
  B -->|resolve_window_capture_rect| C["window_resolver"]
  C -->|enumerate_visible_windows| D["Win32 EnumWindows"]
  C -->|fuzzy/PID match| E["resolve_window"]
  E -->|DwmGetWindowAttribute| F["get_window_rect"]
  F -->|fallback| G["GetWindowRect"]
  B -->|capture_rect| H["Desktop.get_state"]
  H -->|get_screenshot| I["flash_overlay.show_capture_flash"]
  I -->|Tk daemon thread| J["Visual Confirmation Border"]

File Changes

1. src/windows_mcp/desktop/flash_overlay.py ✨ Enhancement +189/-0
   New module for post-screenshot visual feedback overlay

2. src/windows_mcp/desktop/window_resolver.py ✨ Enhancement +115/-0
   New module for window resolution by name or PID

3. src/windows_mcp/desktop/service.py ✨ Enhancement +87/-20
   Integrate window resolver and flash overlay into Desktop service

4. src/windows_mcp/tools/_snapshot_helpers.py ✨ Enhancement +27/-8
   Add window targeting parameters to capture_desktop_state helper

5. src/windows_mcp/tools/snapshot.py ✨ Enhancement +21/-8
   Add window_name, window_pid, focus_window parameters to tools

6. tests/conftest.py 🧪 Tests +8/-0
   Disable flash overlay during test suite to prevent Tk race conditions

7. tests/test_flash_overlay.py 🧪 Tests +117/-0
   New unit tests for flash overlay lifecycle and env-var gating

8. tests/test_window_resolver.py 🧪 Tests +148/-0
   New unit tests for window enumeration, resolution, and rect retrieval

9. CLAUDE.md 📝 Documentation +2/-0
   Document window resolver and flash overlay environment variables

10. README.md 📝 Documentation +3/-2
    Document window targeting and flash overlay in tool descriptions


@qodo-code-review

qodo-code-review Bot commented May 8, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (1) 📎 Requirement gaps (0)



Action required

1. Tests missing type hints 📘 Rule violation ⚙ Maintainability
Description
New test functions/fixtures are added without parameter and return type annotations. This violates
the requirement for fully type-annotated function signatures and reduces static analysis/IDE
support.
Code

tests/test_flash_overlay.py[R17-28]

+@pytest.fixture(autouse=True)
+def _reset_active_overlay():
+    """Each test starts and ends with no overlay registered."""
+    with flash_overlay._lock:
+        flash_overlay._active_overlay = None
+    yield
+    with flash_overlay._lock:
+        ov = flash_overlay._active_overlay
+        flash_overlay._active_overlay = None
+    if ov is not None:
+        ov.stop_event.set()
+
Evidence
PR Compliance ID 4 requires type hints on all added/modified function signatures. The new test
fixture/function signatures are unannotated (no parameter/return types) in the added test modules.

CLAUDE.md
tests/test_flash_overlay.py[17-28]
tests/test_window_resolver.py[18-25]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Newly-added test functions/fixtures do not include type annotations for parameters and return types.
## Issue Context
Compliance requires all added/modified function signatures to be fully type-annotated.
## Fix Focus Areas
- tests/test_flash_overlay.py[17-117]
- tests/test_window_resolver.py[18-148]



2. Public APIs lack Google docstrings ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
New public functions/classes are missing docstrings or use non-Google-style docstrings. This reduces
maintainability and violates the docstring standard for public APIs.
Code

src/windows_mcp/desktop/window_resolver.py[R109-115]

+def is_iconic(hwnd: int) -> bool:
+    return bool(win32gui.IsIconic(hwnd))
+
+
+def restore_if_minimized(hwnd: int) -> None:
+    if is_iconic(hwnd):
+        win32gui.ShowWindow(hwnd, win32con.SW_RESTORE)
Evidence
PR Compliance ID 5 requires Google-style docstrings for public functions/classes. In the new
modules, public APIs such as is_iconic()/restore_if_minimized() have no docstrings, and
show_capture_flash() uses a narrative docstring without Google-style Args/Returns sections.

CLAUDE.md
src/windows_mcp/desktop/window_resolver.py[109-115]
src/windows_mcp/desktop/flash_overlay.py[53-66]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Public functions/classes added in this PR are missing docstrings or are not documented in Google-style format.
## Issue Context
Compliance requires Google-style docstrings for public APIs to keep documentation consistent and easy to maintain.
## Fix Focus Areas
- src/windows_mcp/desktop/window_resolver.py[25-115]
- src/windows_mcp/desktop/flash_overlay.py[37-81]



3. Daemon Tk thread crash risk ✓ Resolved 🐞 Bug ☼ Reliability
Description
The flash overlay runs a Tk mainloop on a daemon thread and is not joined on shutdown; the test
suite disables it because it can crash the interpreter during teardown, indicating a real lifecycle
reliability hazard.
Code

src/windows_mcp/desktop/flash_overlay.py[R67-80]

+    if _flash_disabled() or not rects:
+        return
+    rects = [tuple(r) for r in rects]
+    overlay = _Overlay()
+    overlay.thread = threading.Thread(
+        target=_run_overlay,
+        args=(rects, full_screen, overlay),
+        name="windows-mcp-flash",
+        daemon=True,
+    )
+    with _lock:
+        global _active_overlay
+        _active_overlay = overlay
+    overlay.thread.start()
Evidence
The overlay is explicitly launched on a daemon thread and runs root.mainloop() in that thread. The
repository’s test configuration disables the overlay globally because this pattern can crash the
interpreter during teardown, demonstrating a known stability issue with the current lifecycle
approach.

src/windows_mcp/desktop/flash_overlay.py[1-6]
src/windows_mcp/desktop/flash_overlay.py[71-76]
src/windows_mcp/desktop/flash_overlay.py[175-177]
tests/conftest.py[5-9]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The flash overlay starts Tk on a **daemon** thread and runs `Tk.mainloop()` there. The test suite disables this feature because it can crash the interpreter during teardown, which strongly suggests the current lifecycle is not shutdown-safe.
### Issue Context
- `show_capture_flash()` spawns a daemon thread.
- `_run_overlay()` creates a `Tk()` root and runs `root.mainloop()`.
- Tests disable the overlay globally due to interpreter crash risk.
### Fix Focus Areas
- src/windows_mcp/desktop/flash_overlay.py[37-80]
- src/windows_mcp/desktop/flash_overlay.py[83-189]
- tests/conftest.py[5-9]
### Suggested fix directions (pick one)
1) **Single long-lived UI thread (recommended):**
- Start one non-daemon UI thread once (with a queue of overlay requests).
- Ensure orderly shutdown via `atexit` (signal + join).
2) **Non-daemon + explicit join on shutdown:**
- Make the thread non-daemon.
- Register an `atexit` hook that calls `cancel_active_flash()` and joins the overlay thread (bounded wait).
3) **Avoid Tk entirely:**
- Use a Win32-native overlay approach so you don’t need Tk’s mainloop in a background thread.
### Acceptance criteria
- No daemon Tk thread remains running at interpreter shutdown.
- Tests no longer need to globally disable the overlay to prevent crashes.
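
The first fix direction (one long-lived UI thread fed by a queue, joined at exit) could be sketched as below. `OverlayWorker`, `start_worker`, and the `draw` callback are hypothetical names. One deliberate deviation from the suggestion: the sketch keeps `daemon=True` and relies on the `atexit` hook to join the thread, because, as far as I understand CPython's finalization order, non-daemon threads are waited on before `atexit` hooks run, so a non-daemon worker blocked on the queue would need some other shutdown trigger.

```python
import atexit
import queue
import threading
from typing import Callable

_STOP = object()  # sentinel that tells the worker loop to exit


class OverlayWorker:
    """Hypothetical single UI thread that serves overlay requests from a queue."""

    def __init__(self, draw: Callable[[tuple], None]) -> None:
        self._draw = draw
        self._requests: "queue.Queue[object]" = queue.Queue()
        # Kept as a daemon so a blocked worker cannot stall interpreter exit;
        # shutdown() below is what joins it in the normal case.
        self._thread = threading.Thread(
            target=self._loop, name="overlay-worker", daemon=True
        )
        self._thread.start()

    def _loop(self) -> None:
        while True:
            item = self._requests.get()
            if item is _STOP:
                return
            self._draw(item)

    def submit(self, rect: tuple) -> None:
        """Queue one overlay request; requests are served in order."""
        self._requests.put(rect)

    def shutdown(self, timeout: float = 2.0) -> bool:
        """Signal the loop to exit and wait (bounded); True if the thread ended."""
        self._requests.put(_STOP)
        self._thread.join(timeout)
        return not self._thread.is_alive()


def start_worker(draw: Callable[[tuple], None]) -> OverlayWorker:
    worker = OverlayWorker(draw)
    atexit.register(worker.shutdown)  # safe to call more than once
    return worker
```

Because every overlay is drawn on the same thread, a GUI toolkit with thread affinity would only ever be touched from that one thread.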



4. Focus failure not detectable ✓ Resolved 🐞 Bug ≡ Correctness
Description
Desktop.resolve_window_capture_rect relies on catching exceptions from bring_window_to_top to
trigger fallback behavior, but bring_window_to_top catches/logs exceptions internally without
re-raising, so focus failures can silently proceed to capturing the rect/screenshot of potentially
wrong content.
Code

src/windows_mcp/desktop/service.py[R1073-1080]

+        hwnd, title = window_resolver.resolve_window(name=name, pid=pid)
+        if focus:
+            try:
+                self.bring_window_to_top(hwnd)
+            except Exception:
+                logger.debug("bring_window_to_top failed for %s", title, exc_info=True)
+                window_resolver.restore_if_minimized(hwnd)
+            sleep(0.05)
Evidence
The new window-targeted capture path wraps bring_window_to_top in a try/except and uses the except
block for fallback restore. However, bring_window_to_top’s implementation catches exceptions and
logs them without raising, so the new try/except cannot reliably detect failure and will proceed as
if focusing succeeded.

src/windows_mcp/desktop/service.py[1060-1085]
src/windows_mcp/desktop/service.py[538-614]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`resolve_window_capture_rect()` assumes it can detect focus failures by catching exceptions from `bring_window_to_top()`. But `bring_window_to_top()` swallows exceptions internally, so `resolve_window_capture_rect()` can’t reliably know whether the focus/raise succeeded and may capture the wrong pixels.
### Issue Context
- `resolve_window_capture_rect()` uses `try/except` around `bring_window_to_top()` to decide whether to run fallback restore.
- `bring_window_to_top()` catches exceptions and logs, but does not re-raise.
### Fix Focus Areas
- src/windows_mcp/desktop/service.py[1060-1090]
- src/windows_mcp/desktop/service.py[538-614]
### Suggested fix
Choose one:
1) **Return a success boolean** from `bring_window_to_top()` and have `resolve_window_capture_rect()` check it.
2) **Re-raise** inside `bring_window_to_top()` after logging so callers can handle failure.
3) Add a **post-condition check** in `resolve_window_capture_rect()` (e.g., verify `win32gui.GetForegroundWindow() == hwnd` after attempting focus, and raise `WindowNotFoundError`/`ValueError` if not).
Also consider calling `window_resolver.restore_if_minimized(hwnd)` unconditionally when `focus=True` (before focus attempt) to avoid relying on exception-driven fallback.
### Acceptance criteria
- If focus/raise fails, the tool returns a clear error instead of silently capturing whatever is currently on screen.




Remediation recommended

5. Flash mispositions on negatives ✓ Resolved 🐞 Bug ≡ Correctness
Description
flash_overlay._run_overlay builds the Tk geometry string as "+{left}+{top}", which becomes "+-N"
when the capture region has negative virtual-screen coordinates, causing the overlay window
placement to fail or be incorrect on common multi-monitor layouts.
Code

src/windows_mcp/desktop/flash_overlay.py[R97-124]

+        left = min(r[0] for r in rects)
+        top = min(r[1] for r in rects)
+        right = max(r[2] for r in rects)
+        bottom = max(r[3] for r in rects)
+        width = right - left
+        height = bottom - top
+        if width <= 0 or height <= 0:
+            return
+
+        root = tk.Tk()
+        root.withdraw()
+        root.overrideredirect(True)
+        root.attributes("-topmost", True)
+        try:
+            root.attributes("-disabled", True)
+        except tk.TclError:
+            pass
+
+        transparent_color = "#010203"
+        try:
+            root.configure(bg=transparent_color)
+            root.attributes("-transparentcolor", transparent_color)
+            canvas_bg = transparent_color
+        except tk.TclError:
+            canvas_bg = root.cget("bg")
+
+        root.geometry(f"{width}x{height}+{left}+{top}")
+
Evidence
The overlay explicitly accepts rects in virtual-screen coordinates (which can be negative on
multi-monitor configurations) and derives left/top from those rects. It then interpolates left/top
directly into a "+{left}+{top}" geometry string, producing malformed geometry tokens like "+-1920"
when left/top are negative.

src/windows_mcp/desktop/flash_overlay.py[53-66]
src/windows_mcp/desktop/flash_overlay.py[97-104]
src/windows_mcp/desktop/flash_overlay.py[123-124]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The flash overlay’s Tk `geometry()` string is constructed as `f"{width}x{height}+{left}+{top}"`. When `left`/`top` are negative (virtual desktop coordinates on multi-monitor setups), this produces `+-` sequences and can break/misplace the overlay.
### Issue Context
`show_capture_flash()` documents that `rects` are in virtual-screen coordinates, so negative coordinates are expected.
### Fix Focus Areas
- src/windows_mcp/desktop/flash_overlay.py[97-124]
### Suggested fix
Construct the geometry string using signed formatting, e.g.:
- `root.geometry(f"{width}x{height}{left:+d}{top:+d}")`
so negatives become `-N` rather than `+-N`.
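
To make the difference concrete, the two formatting choices produce the strings shown in this small sketch (`geometry_plus` and `geometry_signed` are illustrative names). Note that Tk's `wm geometry` documentation describes a leading `-` as an offset measured from the right or bottom screen edge, so it is worth confirming on a real multi-monitor layout which string places the window as intended.

```python
def geometry_plus(width: int, height: int, left: int, top: int) -> str:
    # Original form: always a "+" separator, sign of the value kept as-is.
    return f"{width}x{height}+{left}+{top}"


def geometry_signed(width: int, height: int, left: int, top: int) -> str:
    # Suggested form: the value's own sign doubles as the separator.
    return f"{width}x{height}{left:+d}{top:+d}"


print(geometry_plus(800, 600, -1920, 0))    # 800x600+-1920+0
print(geometry_signed(800, 600, -1920, 0))  # 800x600-1920+0
```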




Comment thread src/windows_mcp/desktop/window_resolver.py
Comment thread src/windows_mcp/desktop/flash_overlay.py Outdated
Comment thread src/windows_mcp/desktop/service.py Outdated
Replace the single transparent-canvas overlay with stacked thin
Toplevel "strip" windows along each side of the rect (top, bottom,
left, right). Each strip is a solid orange-red Toplevel with its own
-alpha, so per-layer fade is reliable — the previous -alpha +
-transparentcolor combination renders inconsistently on Windows.

Eight layers per side with quadratic alpha falloff: layer 0 sits on
the rect edge at full opacity, outer layers radiate further out (or
inward, for full-screen) with progressively lower alpha to produce a
soft halo. Time-based modulation still gives a quick fade-in, hold,
and slow fade-out for region captures, and a bell-curve fade for
full-screen.

Tests: tests/test_flash_overlay.py adds TestBuildStripDefs covering
strip placement (region outward, full-screen inward), alpha falloff,
and the no-room edge case.
@exalsch exalsch force-pushed the feat/window-targeted-screenshot branch from ba7cc1e to 1ce1f5d Compare May 8, 2026 20:48
exalsch and others added 5 commits May 8, 2026 23:12
The previous strip-Toplevels rewrite hung Tk's mainloop when launched
on a non-main thread (creating ~32 Toplevels with overrideredirect
froze the loop), so the flash never appeared even though the thread
was alive.

Switch back to a single transparent Tk window with a Canvas, but
produce the halo by drawing concentric border rectangles whose RGB
fades from full orange-red toward pure black. The canvas's
transparent-colour key is set to pure black, so dim outer layers
genuinely become transparent — no -alpha + -transparentcolor combo
(which Windows renders unreliably) is needed. The time fade is done
by re-rendering each frame with a scaled intensity, so the glow
fades in/out smoothly.

12 layers, ~4 px outer halo for region captures, inner glow inset by
4 px from the monitor edge for full-screen captures.

Tests: replace TestBuildStripDefs with TestLayerColor covering full
intensity, half-intensity blend, and the zero-intensity safeguard
(never returns pure black, which would punch through the transparent
key).
Two prior rewrites failed in different ways on the user's machine:
1. The single-window transparent canvas approach (-transparentcolor +
   per-frame redraw) rendered nothing — some Windows configurations
   refuse to map the layered window when -transparentcolor is set.
2. The multi-layer Toplevel-strip approach hung Tk's mainloop when
   ~8 or more Toplevels were created back-to-back on a non-main
   thread (overrideredirect + alpha + topmost).

Reduce the halo to a single ring of 4 thick (8 px) opaque Toplevel
strips per rect — the only configuration confirmed to actually render
and complete cleanly. No transparentcolor, no glow gradient — just
a solid orange-red border that fades in (~15% of duration), holds,
and fades out (~35%).
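
The fade-in / hold / fade-out envelope described here can be sketched as a function of normalised progress. The 15% and 35% fractions come from the commit message; the linear ramps and the function name `region_intensity` are assumptions.

```python
def region_intensity(
    progress: float, fade_in: float = 0.15, fade_out: float = 0.35
) -> float:
    """Fade in, hold at 1.0, then fade out; progress runs from 0.0 to 1.0."""
    p = min(max(progress, 0.0), 1.0)  # clamp so late frames cannot go negative
    if p < fade_in:
        return p / fade_in            # linear ramp up (assumed shape)
    if p > 1.0 - fade_out:
        return (1.0 - p) / fade_out   # linear ramp down (assumed shape)
    return 1.0
```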

Drop the test that asserted falling alpha across layers since there
is now only one layer; the remaining strip-placement tests still
exercise the geometry.

Logs the strip count at INFO level so it's possible to verify the
overlay actually fired from server logs.
Drop Tk entirely — neither -transparentcolor (rendered nothing on
some Windows configs) nor multi-Toplevel strips (hangs Tk's mainloop
on a non-main thread once ~6+ Toplevels are created back-to-back)
delivered a usable result.

The new implementation creates a Win32 layered window
(WS_EX_LAYERED | WS_EX_TRANSPARENT | WS_EX_TOPMOST | WS_EX_TOOLWINDOW
| WS_EX_NOACTIVATE) and feeds UpdateLayeredWindow a 32-bit BGRA
top-down DIB section with premultiplied alpha. The halo itself is
rendered with PIL: solid orange-red ring plus a Gaussian-blurred
copy composited underneath, giving a real soft glow that fades from
the rect edge outward. Time fade is a linear scale of the alpha
channel re-pushed each frame (~30 ms cadence), quantised to 32 levels
to skip redundant uploads.

ctypes argtypes/restypes are set explicitly on every Win32 entry
point used because the pointer-sized handle/LRESULT defaults overflow
on x64.

Tests: replace TestBuildStripDefs with TestIntensityCurve (verifies
the bell-curve / hold-then-fade envelopes) and TestPremultipliedBgra
(verifies BGRA byte order, premultiplication, and alpha scaling for
the per-frame intensity multiplier).
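
A pure-Python sketch of the pixel conversion and the quantisation step mentioned above. The function names and the rounding choices are assumptions; the actual implementation presumably operates on PIL image buffers rather than looping over bytes.

```python
def premultiplied_bgra(rgba: bytes, intensity: float = 1.0) -> bytes:
    """Convert straight-alpha RGBA bytes to premultiplied BGRA, scaling alpha."""
    if len(rgba) % 4:
        raise ValueError("expected 4 bytes per pixel")
    scale = min(max(intensity, 0.0), 1.0)
    out = bytearray(len(rgba))
    for i in range(0, len(rgba), 4):
        r, g, b, a = rgba[i : i + 4]
        a = round(a * scale)       # per-frame fade scales the alpha channel
        out[i] = b * a // 255      # blue first: BGRA byte order
        out[i + 1] = g * a // 255
        out[i + 2] = r * a // 255
        out[i + 3] = a
    return bytes(out)


def quantise(intensity: float, levels: int = 32) -> int:
    """Map intensity in [0, 1] to a step index so unchanged frames can be skipped."""
    return round(min(max(intensity, 0.0), 1.0) * (levels - 1))
```

A frame loop would compare `quantise(...)` against the previous frame's value and skip the upload when the step has not changed.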
The orange ring was nested inward inside the captured rect, so for window-
targeted screenshots (especially borderless Tauri / WebView2 apps where the
DWM extended frame matches the content edge) the halo visibly overlapped
the window content instead of reading as a surround.

Switch the ring direction to outward by default — it now grows into the
existing 42px margin around the rect — and keep the inward nesting only
for the full-screen inner halo via outward=False.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lity

User feedback: the glow was easy to miss for window/region captures —
the 4px band against a 14px blur radius wasn't bright enough to register
when the bulk of the halo sits outside the captured rect on desktop
background.

Double the sharp-ring thickness to 8px and bump the total duration to
3.5s so the user has time to spot it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@exalsch exalsch force-pushed the feat/window-targeted-screenshot branch from 1ce1f5d to f4a8ff5 Compare May 11, 2026 07:58
exalsch and others added 6 commits May 11, 2026 15:12
Two review asks on PR CursorTouch#232:

JezaChen — desktop/service.py is already large; move the flash setup
code into show_capture_flash. Done: show_capture_flash now takes a
single uia.Rect (or None for full-desktop) and resolves rects /
enumerates monitors internally. get_screenshot collapses to a one-liner.

Qodo — show_capture_flash overwrote _active_overlay without signalling
the prior one, so a follow-up call would orphan the running overlay
and cancel_active_flash() could no longer reach it. Fixed with an
atomic swap: save prev under _lock, install new, then signal prev
outside the lock so the prior overlay tears down without blocking the
new caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rder

The two new full-screen tests monkeypatched ``sys.modules['windows_mcp.uia']``
but that is bypassed once another test in the suite imports the real uia —
``import windows_mcp.uia as uia`` then resolves via the cached parent-package
attribute rather than sys.modules. Patch ``GetMonitorsRect`` on the real
module instead so the override holds regardless of import order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds optional window_name (fuzzy title match) and window_pid (exact
process id) parameters to the Screenshot and Snapshot tools so an
agent can capture a single window's bounding rectangle without doing
a full-desktop screenshot or guessing a display index. The targeted
window is brought to the foreground first by default; pass
focus_window=False to skip that. window_name/window_pid and display
are mutually exclusive.

Implementation:
- New module desktop/window_resolver.py: enumerate_visible_windows,
  resolve_window (PID/exact + name/fuzzy), and get_window_rect using
  DwmGetWindowAttribute(DWMWA_EXTENDED_FRAME_BOUNDS) with a
  GetWindowRect fallback so DWM drop-shadow doesn't pollute the rect.
- Desktop.resolve_window_capture_rect ties resolver + focus + rect
  read together and reuses the existing bring_window_to_top logic.
- Desktop.get_state now accepts a capture_rect override that takes
  precedence over display_indices (and rejects passing both).
- capture_desktop_state (snapshot helper) plumbs window_name /
  window_pid / focus_window through, validates mutual exclusion with
  display, and surfaces the resolved title in the Screenshot/Snapshot
  metadata as "Target Window: ...".

Tests: tests/test_window_resolver.py — 12 cases covering enumeration
filtering, PID-vs-name precedence, fuzzy cutoff, and the DWM/
GetWindowRect fallback chain.
When window_name/window_pid is supplied with focus_window=False, the
resolver now refuses unless the target is actually the foreground
window. Otherwise the screenshot would just capture whatever happened
to be on top of it, which is misleading.

- focus_window=True (default): after bring_window_to_top, verify the
  window became foreground and log a warning if it didn't (e.g. when
  Windows blocks SetForegroundWindow for an elevated/UIPI target).
- focus_window=False: raise WindowNotFoundError with a clear message
  if the target is not foreground, in addition to the existing
  minimized check.

Adds is_foreground(hwnd) helper in window_resolver.py and three tests
for it.
…ndow is blocked

bring_window_to_top relies on SetForegroundWindow, which Windows
silently rejects when the calling process didn't own the last input
event — even after the AttachThreadInput dance. The targeted window
is raised in z-order but the active foreground window stays put, so
the screenshot grabs whichever app the user happens to be looking
at instead.

When the post-focus is_foreground check fails, retry once with
SwitchToThisWindow (a Win32 API that Microsoft documents as not
intended for general use, and that the shell uses for Alt-Tab), which
bypasses the foreground lock without injecting keyboard input. If that
still doesn't work, log the existing warning.

Tests cover the new force_foreground helper: it calls
SwitchToThisWindow with (hwnd, True) and silently swallows OSError
from misbehaving drivers.
Qodo review on PR CursorTouch#233 pointed out that resolve_window_capture_rect
relied on catching exceptions from bring_window_to_top, but
bring_window_to_top swallows its own exceptions and only logs — so the
try/except was never a reliable failure signal.

f4a8ff5 added an explicit is_foreground post-condition check plus a
SwitchToThisWindow fallback, but on final failure it only warned and
then captured whatever was on top anyway — exactly the silent-wrong-
content case Qodo flagged.

When focus_window=True and both bring_window_to_top + force_foreground
fail to bring the target forward, raise WindowNotFoundError with a
message that tells the user to focus it manually or pass
focus_window=False. focus_window=False still bypasses this path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
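
Taken together, the focus handling described in the last three commits amounts to roughly the following control flow. This is a sketch with the Win32 operations injected as callables (`ensure_capturable` and its parameter names are hypothetical, and `WindowNotFoundError` is a stand-in for the project's error type), not the code in service.py.

```python
from typing import Callable


class WindowNotFoundError(LookupError):
    """Stand-in for the project's error type."""


def ensure_capturable(
    hwnd: int,
    focus: bool,
    bring_to_top: Callable[[int], None],
    force_foreground: Callable[[int], None],
    is_foreground: Callable[[int], bool],
) -> None:
    """Raise unless `hwnd` is the foreground window by the time we capture."""
    if not focus:
        # focus_window=False: refuse rather than capture whatever is on top.
        if not is_foreground(hwnd):
            raise WindowNotFoundError(
                "target window is not in the foreground; pass focus_window=True"
            )
        return
    bring_to_top(hwnd)
    if is_foreground(hwnd):
        return
    force_foreground(hwnd)  # single retry, e.g. via SwitchToThisWindow
    if not is_foreground(hwnd):
        raise WindowNotFoundError(
            "could not bring the target window forward; focus it manually "
            "or pass focus_window=False"
        )
```

Checking the post-condition instead of catching exceptions means the caller does not depend on whether the focus helper swallows its own errors.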
@exalsch exalsch force-pushed the feat/window-targeted-screenshot branch from f4a8ff5 to eba890d Compare May 11, 2026 13:20