feat(screenshot): target a single window via window_name / window_pid by exalsch · Pull Request #233 · CursorTouch/Windows-MCP

exalsch · 2026-05-08T20:15:36Z

Summary

Adds window_name (fuzzy title match) and window_pid (exact process id) parameters to the Screenshot and Snapshot tools. When either is set, the tool resolves that window, brings it to the foreground (unless focus_window=False), reads its bounding rectangle, and captures only that area. window_name / window_pid and display are mutually exclusive.

Stacked on #232. This branch builds on feat/screenshot-flash-overlay; the diff currently shows both PRs' changes. Once #232 lands, this branch will rebase cleanly to show only the window-targeting commits.

Why

Right now there is no way to ask the server for "just this window". The only capture controls are display=[N] (whole monitor) or full virtual screen. Agents that want a specific app's screenshot have to guess the right display, capture the whole thing, and trust the LLM to crop visually. With window_name / window_pid the server does the lookup and returns just the relevant pixels. Combined with #232's flash overlay this also gives the user a clear visual confirmation of which window was just sampled.

Implementation

New module src/windows_mcp/desktop/window_resolver.py
- enumerate_visible_windows() — EnumWindows filtered to visible, valid HWNDs.
- resolve_window(name=..., pid=..., windows=...) — PID match takes precedence; name match uses fuzzywuzzy.process.extractOne with a 70 score cutoff (consistent with Desktop._find_window_by_name). Untitled PID-matched windows are tolerated; titled ones are preferred.
- get_window_rect(hwnd) — DwmGetWindowAttribute(DWMWA_EXTENDED_FRAME_BOUNDS) first so the DWM drop-shadow on Aero windows doesn't inflate the rect; GetWindowRect fallback.
Desktop.resolve_window_capture_rect(name=..., pid=..., focus=True) glues resolver + focus together, reusing the existing bring_window_to_top logic, and returns (uia.Rect, title). Sleeps 50 ms after focus so DWM has time to repaint before the rect is read.
Desktop.get_state(...) gains an optional capture_rect: uia.Rect | None parameter that overrides display_indices and is rejected if both are supplied.
capture_desktop_state plumbs window_name / window_pid / focus_window through, validates the mutual exclusion, and surfaces the resolved title in the response metadata as Target Window: <name>.
Screenshot and Snapshot tool descriptions updated to document the new parameters.

Tests

tests/test_window_resolver.py — 12 cases (enumeration filtering, PID prefers titled, PID falls back to first match, PID-not-found error, fuzzy name match, fuzzy cutoff rejection, untitled-only error, PID-takes-precedence-over-name, DWM success path, DWM fallback to GetWindowRect, DWM-raises fallback).
Integration: existing test_snapshot_display_filter.py continues to pass — the new capture_rect parameter is purely additive and the display_indices-based flow is unchanged.

Test plan

Screenshot(window_name="cotire") returns just that window
Screenshot(window_pid=12345) returns just that PID's titled window
Screenshot(window_name="...", focus_window=False) returns the window without focusing it (errors clearly if minimized)
Screenshot(window_name="...", display=[0]) rejects with a clear error
Snapshot(use_vision=True, window_name="...") returns the window with the UI tree filtered to that region
DWM extended-frame rect is used (no extra drop-shadow padding) for Aero windows
pytest tests/test_window_resolver.py passes (12/12)

🤖 Generated with Claude Code

After every screenshot, draw a glowing orange-red border over the captured area on a transparent always-on-top Tk overlay for ~2.5s. Region captures get a solid border that fades out at the end; full- screen captures show a bell-fading inner border on the union of all monitor rects. The overlay is started *after* capture and cancelled before any subsequent capture so it never appears in the captured image. Set WINDOWS_MCP_DISABLE_FLASH=1/true/yes/on to suppress the effect. The conftest disables the flash for the test suite to avoid Tk threads racing with pytest teardown.

qodo-code-review · 2026-05-08T20:15:55Z

Review Summary by Qodo

Add window-targeted screenshot capture and post-capture flash overlay

✨ Enhancement

Walkthroughs

Description

• Add window-targeted screenshot capture via window_name (fuzzy match) and window_pid (exact
  PID)
• Implement post-capture visual feedback with orange-red flash overlay on transparent always-on-top
  window
• New window_resolver.py module for window enumeration, resolution, and rect retrieval using DWM
  extended bounds
• Plumb window targeting through Screenshot and Snapshot tools with mutual exclusion validation
  against display

Diagram

flowchart LR
  A["Screenshot/Snapshot Tool"] -->|window_name or window_pid| B["capture_desktop_state"]
  B -->|resolve_window_capture_rect| C["window_resolver"]
  C -->|enumerate_visible_windows| D["Win32 EnumWindows"]
  C -->|fuzzy/PID match| E["resolve_window"]
  E -->|DwmGetWindowAttribute| F["get_window_rect"]
  F -->|fallback| G["GetWindowRect"]
  B -->|capture_rect| H["Desktop.get_state"]
  H -->|get_screenshot| I["flash_overlay.show_capture_flash"]
  I -->|Tk daemon thread| J["Visual Confirmation Border"]

File Changes

1. src/windows_mcp/desktop/flash_overlay.py ✨ Enhancement +189/-0

New module for post-screenshot visual feedback overlay

src/windows_mcp/desktop/flash_overlay.py

2. src/windows_mcp/desktop/window_resolver.py ✨ Enhancement +115/-0

New module for window resolution by name or PID

src/windows_mcp/desktop/window_resolver.py

3. src/windows_mcp/desktop/service.py ✨ Enhancement +87/-20

Integrate window resolver and flash overlay into Desktop service

src/windows_mcp/desktop/service.py

View more (7)

4. src/windows_mcp/tools/_snapshot_helpers.py ✨ Enhancement +27/-8

Add window targeting parameters to capture_desktop_state helper

src/windows_mcp/tools/_snapshot_helpers.py

5. src/windows_mcp/tools/snapshot.py ✨ Enhancement +21/-8

Add window_name, window_pid, focus_window parameters to tools

src/windows_mcp/tools/snapshot.py

6. tests/conftest.py 🧪 Tests +8/-0

Disable flash overlay during test suite to prevent Tk race conditions

tests/conftest.py

7. tests/test_flash_overlay.py 🧪 Tests +117/-0

New unit tests for flash overlay lifecycle and env-var gating

tests/test_flash_overlay.py

8. tests/test_window_resolver.py 🧪 Tests +148/-0

New unit tests for window enumeration, resolution, and rect retrieval

tests/test_window_resolver.py

9. CLAUDE.md 📝 Documentation +2/-0

Document window resolver and flash overlay environment variables

CLAUDE.md

10. README.md 📝 Documentation +3/-2

Document window targeting and flash overlay in tool descriptions

README.md

qodo-code-review · 2026-05-08T20:15:57Z

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (1) 📎 Requirement gaps (0)

1. Tests missing type hints 📘 Rule violation ⚙ Maintainability

Description

New test functions/fixtures are added without parameter and return type annotations. This violates
the requirement for fully type-annotated function signatures and reduces static analysis/IDE
support.

Code

tests/test_flash_overlay.py[R17-28]

+@pytest.fixture(autouse=True)
+def _reset_active_overlay():
+    """Each test starts and ends with no overlay registered."""
+    with flash_overlay._lock:
+        flash_overlay._active_overlay = None
+    yield
+    with flash_overlay._lock:
+        ov = flash_overlay._active_overlay
+        flash_overlay._active_overlay = None
+    if ov is not None:
+        ov.stop_event.set()
+

Evidence
PR Compliance ID 4 requires type hints on all added/modified function signatures. The new test
fixture/function signatures are unannotated (no parameter/return types) in the added test modules.
CLAUDE.md
tests/test_flash_overlay.py[17-28]
tests/test_window_resolver.py[18-25]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Newly-added test functions/fixtures do not include type annotations for parameters and return types.
## Issue Context
Compliance requires all added/modified function signatures to be fully type-annotated.
## Fix Focus Areas
- tests/test_flash_overlay.py[17-117]
- tests/test_window_resolver.py[18-148]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. ~~Public APIs lack Google docstrings~~ ✓ Resolved 📘 Rule violation ⚙ Maintainability

Description

New public functions/classes are missing docstrings or use non-Google-style docstrings. This reduces
maintainability and violates the docstring standard for public APIs.

Code

src/windows_mcp/desktop/window_resolver.py[R109-115]

+def is_iconic(hwnd: int) -> bool:
+    return bool(win32gui.IsIconic(hwnd))
+
+
+def restore_if_minimized(hwnd: int) -> None:
+    if is_iconic(hwnd):
+        win32gui.ShowWindow(hwnd, win32con.SW_RESTORE)

Evidence

PR Compliance ID 5 requires Google-style docstrings for public functions/classes. In the new
modules, public APIs such as is_iconic()/restore_if_minimized() have no docstrings, and
show_capture_flash() uses a narrative docstring without Google-style Args/Returns sections.

CLAUDE.md
src/windows_mcp/desktop/window_resolver.py[109-115]
src/windows_mcp/desktop/flash_overlay.py[53-66]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Public functions/classes added in this PR are missing docstrings or are not documented in Google-style format.
## Issue Context
Compliance requires Google-style docstrings for public APIs to keep documentation consistent and easy to maintain.
## Fix Focus Areas
- src/windows_mcp/desktop/window_resolver.py[25-115]
- src/windows_mcp/desktop/flash_overlay.py[37-81]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

3. ~~Daemon Tk thread crash risk~~ ✓ Resolved 🐞 Bug ☼ Reliability

Description

The flash overlay runs a Tk mainloop on a daemon thread and is not joined on shutdown; the test
suite disables it because it can crash the interpreter during teardown, indicating a real lifecycle
reliability hazard.

Code

src/windows_mcp/desktop/flash_overlay.py[R67-80]

+    if _flash_disabled() or not rects:
+        return
+    rects = [tuple(r) for r in rects]
+    overlay = _Overlay()
+    overlay.thread = threading.Thread(
+        target=_run_overlay,
+        args=(rects, full_screen, overlay),
+        name="windows-mcp-flash",
+        daemon=True,
+    )
+    with _lock:
+        global _active_overlay
+        _active_overlay = overlay
+    overlay.thread.start()

Evidence
The overlay is explicitly launched on a daemon thread and runs root.mainloop() in that thread. The
repository’s test configuration disables the overlay globally because this pattern can crash the
interpreter during teardown, demonstrating a known stability issue with the current lifecycle
approach.
src/windows_mcp/desktop/flash_overlay.py[1-6]
src/windows_mcp/desktop/flash_overlay.py[71-76]
src/windows_mcp/desktop/flash_overlay.py[175-177]
tests/conftest.py[5-9]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The flash overlay starts Tk on a **daemon** thread and runs `Tk.mainloop()` there. The test suite disables this feature because it can crash the interpreter during teardown, which strongly suggests the current lifecycle is not shutdown-safe.
### Issue Context
- `show_capture_flash()` spawns a daemon thread.
- `_run_overlay()` creates a `Tk()` root and runs `root.mainloop()`.
- Tests disable the overlay globally due to interpreter crash risk.
### Fix Focus Areas
- src/windows_mcp/desktop/flash_overlay.py[37-80]
- src/windows_mcp/desktop/flash_overlay.py[83-189]
- tests/conftest.py[5-9]
### Suggested fix directions (pick one)
1) **Single long-lived UI thread (recommended):**
- Start one non-daemon UI thread once (with a queue of overlay requests).
- Ensure orderly shutdown via `atexit` (signal + join).
2) **Non-daemon + explicit join on shutdown:**
- Make the thread non-daemon.
- Register an `atexit` hook that calls `cancel_active_flash()` and joins the overlay thread (bounded wait).
3) **Avoid Tk entirely:**
- Use a Win32-native overlay approach so you don’t need Tk’s mainloop in a background thread.
### Acceptance criteria
- No daemon Tk thread remains running at interpreter shutdown.
- Tests no longer need to globally disable the overlay to prevent crashes.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

View more (1)

4. ~~Focus failure not detectable~~ ✓ Resolved 🐞 Bug ≡ Correctness

Description

Desktop.resolve_window_capture_rect relies on catching exceptions from bring_window_to_top to
trigger fallback behavior, but bring_window_to_top catches/logs exceptions internally without
re-raising, so focus failures can silently proceed to capturing the rect/screenshot of potentially
wrong content.

Code

src/windows_mcp/desktop/service.py[R1073-1080]

+        hwnd, title = window_resolver.resolve_window(name=name, pid=pid)
+        if focus:
+            try:
+                self.bring_window_to_top(hwnd)
+            except Exception:
+                logger.debug("bring_window_to_top failed for %s", title, exc_info=True)
+                window_resolver.restore_if_minimized(hwnd)
+            sleep(0.05)

Evidence

The new window-targeted capture path wraps bring_window_to_top in a try/except and uses the except
block for fallback restore. However, bring_window_to_top’s implementation catches exceptions and
logs them without raising, so the new try/except cannot reliably detect failure and will proceed as
if focusing succeeded.

src/windows_mcp/desktop/service.py[1060-1085]
src/windows_mcp/desktop/service.py[538-614]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`resolve_window_capture_rect()` assumes it can detect focus failures by catching exceptions from `bring_window_to_top()`. But `bring_window_to_top()` swallows exceptions internally, so `resolve_window_capture_rect()` can’t reliably know whether the focus/raise succeeded and may capture the wrong pixels.
### Issue Context
- `resolve_window_capture_rect()` uses `try/except` around `bring_window_to_top()` to decide whether to run fallback restore.
- `bring_window_to_top()` catches exceptions and logs, but does not re-raise.
### Fix Focus Areas
- src/windows_mcp/desktop/service.py[1060-1090]
- src/windows_mcp/desktop/service.py[538-614]
### Suggested fix
Choose one:
1) **Return a success boolean** from `bring_window_to_top()` and have `resolve_window_capture_rect()` check it.
2) **Re-raise** inside `bring_window_to_top()` after logging so callers can handle failure.
3) Add a **post-condition check** in `resolve_window_capture_rect()` (e.g., verify `win32gui.GetForegroundWindow() == hwnd` after attempting focus, and raise `WindowNotFoundError`/`ValueError` if not).
Also consider calling `window_resolver.restore_if_minimized(hwnd)` unconditionally when `focus=True` (before focus attempt) to avoid relying on exception-driven fallback.
### Acceptance criteria
- If focus/raise fails, the tool returns a clear error instead of silently capturing whatever is currently on screen.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

5. ~~Flash mispositions on negatives~~ ✓ Resolved 🐞 Bug ≡ Correctness

Description

flash_overlay._run_overlay builds the Tk geometry string as "+{left}+{top}", which becomes "+-N"
when the capture region has negative virtual-screen coordinates, causing the overlay window
placement to fail or be incorrect on common multi-monitor layouts.

Code

src/windows_mcp/desktop/flash_overlay.py[R97-124]

+        left = min(r[0] for r in rects)
+        top = min(r[1] for r in rects)
+        right = max(r[2] for r in rects)
+        bottom = max(r[3] for r in rects)
+        width = right - left
+        height = bottom - top
+        if width <= 0 or height <= 0:
+            return
+
+        root = tk.Tk()
+        root.withdraw()
+        root.overrideredirect(True)
+        root.attributes("-topmost", True)
+        try:
+            root.attributes("-disabled", True)
+        except tk.TclError:
+            pass
+
+        transparent_color = "#010203"
+        try:
+            root.configure(bg=transparent_color)
+            root.attributes("-transparentcolor", transparent_color)
+            canvas_bg = transparent_color
+        except tk.TclError:
+            canvas_bg = root.cget("bg")
+
+        root.geometry(f"{width}x{height}+{left}+{top}")
+

Evidence
The overlay explicitly accepts rects in virtual-screen coordinates (which can be negative on
multi-monitor configurations) and derives left/top from those rects. It then interpolates left/top
directly into a "+{left}+{top}" geometry string, producing malformed geometry tokens like "+-1920"
when left/top are negative.
src/windows_mcp/desktop/flash_overlay.py[53-66]
src/windows_mcp/desktop/flash_overlay.py[97-104]
src/windows_mcp/desktop/flash_overlay.py[123-124]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The flash overlay’s Tk `geometry()` string is constructed as `f"{width}x{height}+{left}+{top}"`. When `left`/`top` are negative (virtual desktop coordinates on multi-monitor setups), this produces `+-` sequences and can break/misplace the overlay.
### Issue Context
`show_capture_flash()` documents that `rects` are in virtual-screen coordinates, so negative coordinates are expected.
### Fix Focus Areas
- src/windows_mcp/desktop/flash_overlay.py[97-124]
### Suggested fix
Construct the geometry string using signed formatting, e.g.:
- `root.geometry(f"{width}x{height}{left:+d}{top:+d}")`
so negatives become `-N` rather than `+-N`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-code-review · 2026-05-08T20:21:55Z

+@pytest.fixture(autouse=True)
+def _reset_active_overlay():
+    """Each test starts and ends with no overlay registered."""
+    with flash_overlay._lock:
+        flash_overlay._active_overlay = None
+    yield
+    with flash_overlay._lock:
+        ov = flash_overlay._active_overlay
+        flash_overlay._active_overlay = None
+    if ov is not None:
+        ov.stop_event.set()
+


1. Tests missing type hints 📘 Rule violation ⚙ Maintainability

New test functions/fixtures are added without parameter and return type annotations. This violates the requirement for fully type-annotated function signatures and reduces static analysis/IDE support.

Agent Prompt

## Issue description Newly-added test functions/fixtures do not include type annotations for parameters and return types. ## Issue Context Compliance requires all added/modified function signatures to be fully type-annotated. ## Fix Focus Areas - tests/test_flash_overlay.py[17-117] - tests/test_window_resolver.py[18-148]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Replace the single transparent-canvas overlay with stacked thin Toplevel "strip" windows along each side of the rect (top, bottom, left, right). Each strip is a solid orange-red Toplevel with its own -alpha, so per-layer fade is reliable — the previous -alpha + -transparentcolor combination renders inconsistently on Windows. Eight layers per side with quadratic alpha falloff: layer 0 sits on the rect edge at full opacity, outer layers radiate further out (or inward, for full-screen) with progressively lower alpha to produce a soft halo. Time-based modulation still gives a quick fade-in, hold, and slow fade-out for region captures, and a bell-curve fade for full-screen. Tests: tests/test_flash_overlay.py adds TestBuildStripDefs covering strip placement (region outward, full-screen inward), alpha falloff, and the no-room edge case.

The previous strip-Toplevels rewrite hung Tk's mainloop when launched on a non-main thread (creating ~32 Toplevels with overrideredirect froze the loop), so the flash never appeared even though the thread was alive. Switch back to a single transparent Tk window with a Canvas, but produce the halo by drawing concentric border rectangles whose RGB fades from full orange-red toward pure black. The canvas's transparent-colour key is set to pure black, so dim outer layers genuinely become transparent — no -alpha + -transparentcolor combo (which Windows renders unreliably) is needed. The time fade is done by re-rendering each frame with a scaled intensity, so the glow fades in/out smoothly. 12 layers, ~4 px outer halo for region captures, inner glow inset by 4 px from the monitor edge for full-screen captures. Tests: replace TestBuildStripDefs with TestLayerColor covering full intensity, half-intensity blend, and the zero-intensity safeguard (never returns pure black, which would punch through the transparent key).

Two prior rewrites failed in different ways on the user's machine: 1. The single-window transparent canvas approach (-transparentcolor + per-frame redraw) rendered nothing — some Windows configurations refuse to map the layered window when -transparentcolor is set. 2. The multi-layer Toplevel-strip approach hung Tk's mainloop when ~8 or more Toplevels were created back-to-back on a non-main thread (overrideredirect + alpha + topmost). Reduce the halo to a single ring of 4 thick (8 px) opaque Toplevel strips per rect — the only configuration confirmed to actually render and complete cleanly. No transparentcolor, no glow gradient — just a solid orange-red border that fades in (~15% of duration), holds, and fades out (~35%). Drop the test that asserted falling alpha across layers since there is now only one layer; the remaining strip-placement tests still exercise the geometry. Logs the strip count at INFO level so it's possible to verify the overlay actually fired from server logs.

Drop Tk entirely — neither -transparentcolor (rendered nothing on some Windows configs) nor multi-Toplevel strips (hangs Tk's mainloop on a non-main thread once ~6+ Toplevels are created back-to-back) delivered a usable result. The new implementation creates a Win32 layered window (WS_EX_LAYERED | WS_EX_TRANSPARENT | WS_EX_TOPMOST | WS_EX_TOOLWINDOW | WS_EX_NOACTIVATE) and feeds UpdateLayeredWindow a 32-bit BGRA top-down DIB section with premultiplied alpha. The halo itself is rendered with PIL: solid orange-red ring plus a Gaussian-blurred copy composited underneath, giving a real soft glow that fades from the rect edge outward. Time fade is a linear scale of the alpha channel re-pushed each frame (~30 ms cadence), quantised to 32 levels to skip redundant uploads. ctypes argtypes/restypes are set explicitly on every Win32 entry point used because the pointer-sized handle/LRESULT defaults overflow on x64. Tests: replace TestBuildStripDefs with TestIntensityCurve (verifies the bell-curve / hold-then-fade envelopes) and TestPremultipliedBgra (verifies BGRA byte order, premultiplication, and alpha scaling for the per-frame intensity multiplier).

The orange ring was nested inward inside the captured rect, so for window- targeted screenshots (especially borderless Tauri / WebView2 apps where the DWM extended frame matches the content edge) the halo visibly overlapped the window content instead of reading as a surround. Switch the ring direction to outward by default — it now grows into the existing 42px margin around the rect — and keep the inward nesting only for the full-screen inner halo via outward=False. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…lity User feedback: the glow was easy to miss for window/region captures — the 4px band against a 14px blur radius wasn't bright enough to register when the bulk of the halo sits outside the captured rect on desktop background. Double the sharp-ring thickness to 8px and bump the total duration to 3.5s so the user has time to spot it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two review asks on PR CursorTouch#232: JezaChen — desktop/service.py is already large; move the flash setup code into show_capture_flash. Done: show_capture_flash now takes a single uia.Rect (or None for full-desktop) and resolves rects / enumerates monitors internally. get_screenshot collapses to a one-liner. Qodo — show_capture_flash overwrote _active_overlay without signalling the prior one, so a follow-up call would orphan the running overlay and cancel_active_flash() could no longer reach it. Fixed with an atomic swap: save prev under _lock, install new, then signal prev outside the lock so the prior overlay tears down without blocking the new caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…rder The two new full-screen tests monkeypatched ``sys.modules['windows_mcp.uia']`` but that is bypassed once another test in the suite imports the real uia — ``import windows_mcp.uia as uia`` then resolves via the cached parent-package attribute rather than sys.modules. Patch ``GetMonitorsRect`` on the real module instead so the override holds regardless of import order. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds optional window_name (fuzzy title match) and window_pid (exact process id) parameters to the Screenshot and Snapshot tools so an agent can capture a single window's bounding rectangle without doing a full-desktop screenshot or guessing a display index. The targeted window is brought to the foreground first by default; pass focus_window=False to skip that. window_name/window_pid and display are mutually exclusive. Implementation: - New module desktop/window_resolver.py: enumerate_visible_windows, resolve_window (PID/exact + name/fuzzy), and get_window_rect using DwmGetWindowAttribute(DWMWA_EXTENDED_FRAME_BOUNDS) with a GetWindowRect fallback so DWM drop-shadow doesn't pollute the rect. - Desktop.resolve_window_capture_rect ties resolver + focus + rect read together and reuses the existing bring_window_to_top logic. - Desktop.get_state now accepts a capture_rect override that takes precedence over display_indices (and rejects passing both). - capture_desktop_state (snapshot helper) plumbs window_name / window_pid / focus_window through, validates mutual exclusion with display, and surfaces the resolved title in the Screenshot/Snapshot metadata as "Target Window: ...". Tests: tests/test_window_resolver.py — 12 cases covering enumeration filtering, PID-vs-name precedence, fuzzy cutoff, and the DWM/ GetWindowRect fallback chain.

When window_name/window_pid is supplied with focus_window=False, the resolver now refuses unless the target is actually the foreground window. Otherwise the screenshot would just capture whatever happened to be on top of it, which is misleading. - focus_window=True (default): after bring_window_to_top, verify the window became foreground and log a warning if it didn't (e.g. when Windows blocks SetForegroundWindow for an elevated/UIPI target). - focus_window=False: raise WindowNotFoundError with a clear message if the target is not foreground, in addition to the existing minimized check. Adds is_foreground(hwnd) helper in window_resolver.py and three tests for it.

…ndow is blocked bring_window_to_top relies on SetForegroundWindow, which Windows silently rejects when the calling process didn't own the last input event — even after the AttachThreadInput dance. The targeted window is raised in z-order but the active foreground window stays put, so the screenshot grabs whichever app the user happens to be looking at instead. When the post-focus is_foreground check fails, retry once with SwitchToThisWindow (the undocumented Win32 API the shell uses for Alt-Tab), which bypasses the foreground lock without injecting keyboard input. If that still doesn't work, log the existing warning. Tests cover the new force_foreground helper: it calls SwitchToThisWindow with (hwnd, True) and silently swallows OSError from misbehaving drivers.

Qodo review on PR CursorTouch#233 pointed out that resolve_window_capture_rect relied on catching exceptions from bring_window_to_top, but bring_window_to_top swallows its own exceptions and only logs — so the try/except was never a reliable failure signal. f4a8ff5 added an explicit is_foreground post-condition check plus a SwitchToThisWindow fallback, but on final failure it only warned and then captured whatever was on top anyway — exactly the silent-wrong- content case Qodo flagged. When focus_window=True and both bring_window_to_top + force_foreground fail to bring the target forward, raise WindowNotFoundError with a message that tells the user to focus it manually or pass focus_window=False. focus_window=False still bypasses this path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

qodo-code-review Bot reviewed May 8, 2026

View reviewed changes

exalsch force-pushed the feat/window-targeted-screenshot branch from ba7cc1e to 1ce1f5d Compare May 8, 2026 20:48

exalsch and others added 5 commits May 8, 2026 23:12

exalsch force-pushed the feat/window-targeted-screenshot branch from 1ce1f5d to f4a8ff5 Compare May 11, 2026 07:58

exalsch and others added 6 commits May 11, 2026 15:12

exalsch force-pushed the feat/window-targeted-screenshot branch from f4a8ff5 to eba890d Compare May 11, 2026 13:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(screenshot): target a single window via window_name / window_pid#233

feat(screenshot): target a single window via window_name / window_pid#233
exalsch wants to merge 13 commits into
CursorTouch:mainfrom
exalsch:feat/window-targeted-screenshot

exalsch commented May 8, 2026

Uh oh!

qodo-code-review Bot commented May 8, 2026

Uh oh!

qodo-code-review Bot commented May 8, 2026 •

edited

Loading

Uh oh!

qodo-code-review Bot May 8, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

exalsch commented May 8, 2026

Summary

Why

Implementation

Tests

Test plan

Uh oh!

qodo-code-review Bot commented May 8, 2026

Review Summary by Qodo

Walkthroughs

File Changes

Uh oh!

qodo-code-review Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Uh oh!

qodo-code-review Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qodo-code-review Bot commented May 8, 2026 •

edited

Loading