Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 21 additions & 1 deletion view.c
Original file line number Diff line number Diff line change
Expand Up @@ -112,17 +112,31 @@ view_position_all(struct cg_server *server)
}
}

static void
handle_surface_destroy(struct wl_listener *listener, void *data)
{
struct cg_view *view = wl_container_of(listener, view, surface_destroy);
/* wlroots is about to free scene_tree via the subsurface tree listener
* (registered before us). Null it out so view_unmap() skips the destroy. */
view->scene_tree = NULL;
}

void
view_unmap(struct cg_view *view)
{
wl_list_remove(&view->link);
/* Always registered in view_map() before any unmap can fire. */
wl_list_remove(&view->surface_destroy.link);

wl_list_remove(&view->request_activate.link);
wl_list_remove(&view->request_close.link);
wlr_foreign_toplevel_handle_v1_destroy(view->foreign_toplevel_handle);
view->foreign_toplevel_handle = NULL;

wlr_scene_node_destroy(&view->scene_tree->node);
if (view->scene_tree) {
wlr_scene_node_destroy(&view->scene_tree->node);
view->scene_tree = NULL;
}

view->wlr_surface->data = NULL;
view->wlr_surface = NULL;
Expand Down Expand Up @@ -152,6 +166,12 @@ view_map(struct cg_view *view, struct wlr_surface *surface)
goto fail;
view->scene_tree->node.data = view;

/* Register after wlr_scene_subsurface_tree_create() so our listener fires
* after wlroots' internal one, which frees scene_tree on surface destroy.
* This lets us null scene_tree before view_unmap() tries to use it. */
view->surface_destroy.notify = handle_surface_destroy;
wl_signal_add(&surface->events.destroy, &view->surface_destroy);
Comment on lines +172 to +173
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand. wlr_surface.events.unmap is always guaranteed to be called before destroy, so I don't know how view->scene_tree can free itself before we call wlr_scene_node_destroy().

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its entirely possible that I got things wrong. I'm not at all confident in my knowledge of the wlroots or cage codebases.

What I know is that in the described test, view_unmap does raise a SIGSEGV (at least, _on my machine_™)

Adding log lines to this patch in view_unmap, and handle_surface_destroy I see (when running with the detached child process):

00:00:02.989 [../view.c:118] surface destroy
00:00:02.989 [../view.c:137] view_unmap: scene_tree is NULL

Which if I understand correctly means either there is a bug elsewhere (in wlroots?) or unmap is not guaranteed to run before destroy?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the surface object is destroyed, this function gets called in wlroots:

https://gitlab.freedesktop.org/wlroots/wlroots/-/blob/334019f839bf0728d958c179aceed67e0e8db66a/types/wlr_compositor.c#L725

The first thing it does is ensure the surface is unmapped:

https://gitlab.freedesktop.org/wlroots/wlroots/-/blob/334019f839bf0728d958c179aceed67e0e8db66a/types/wlr_compositor.c#L934

Are we sure a single surface is involved?

The script linked above doesn't reproduce the bug on my setup. cage doesn't crash, it exits cleanly.

Copy link
Copy Markdown
Author

@sdumetz sdumetz Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script linked above doesn't reproduce the bug on my setup. cage doesn't crash, it exits cleanly.

Oh that's strange. I didn't think it would be setup-dependant since I saw the issue in the wild on a brand new amd64 mini-PC and on a raspberry PI running cage over a DRM backend (admitttedly, running a modified version with a few patches) and was able to reproduce with this script on my machine (debian/KDE, wayland backend) using the master branch and default options.

The attached script seems to segfault 100% of the time, too.

When compiled in debug mode, the logs' tail looks like this:

00:00:00.202 [types/wlr_compositor.c:786] New wlr_surface 0x556959ee6d10 (res 0x556959d49bb0)
00:00:00.202 [xwayland/xwm.c:2055] New xwayland surface: 0x556959ee6d10
00:00:00.202 [xwayland/xwm.c:695] XCB_ATOM_NET_STARTUP_ID: (null)
00:00:00.203 [xwayland/xwm.c:1105] unhandled X11 property 283 (WM_STATE) for window 4194314
00:00:00.203 [backend/wayland/output.c:433] Primary buffer size mismatch
00:00:00.215 [types/scene/wlr_scene.c:2189] Direct scan-out enabled
Sending SIGTERM to cage (pid 65809) only, xclock stays alive...
(EE) failed to read Wayland events: Connection reset by peer
./reproduce_crash.sh : ligne 46 : 65809 Erreur de segmentation  (core dumped)"$CAGE" -- xclock
X connection to :1 broken (explicit kill or server shutdown).

Where pretty much everything looks ok to me, except for the core dump. Could it be a xwayland issue rather than a cage / wlroots issue?


view->wlr_surface = surface;
surface->data = view;

Expand Down
1 change: 1 addition & 0 deletions view.h
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ struct cg_view {
struct wl_list link; // server::views
struct wlr_surface *wlr_surface;
struct wlr_scene_tree *scene_tree;
struct wl_listener surface_destroy; /* nullifies scene_tree when wlroots frees it */

/* The view has a position in layout coordinates. */
int lx, ly;
Expand Down