Skip to content

CORE-14879: Migrate "Roles" Admin API v1 Security Endpoints to v2 ConnectRPC API#29072

Merged
nguyen-andrew merged 7 commits intoredpanda-data:devfrom
nguyen-andrew:admin-api-v2-roles
Jan 7, 2026
Merged

CORE-14879: Migrate "Roles" Admin API v1 Security Endpoints to v2 ConnectRPC API#29072
nguyen-andrew merged 7 commits intoredpanda-data:devfrom
nguyen-andrew:admin-api-v2-roles

Conversation

@nguyen-andrew
Copy link
Member

This PR introduces comprehensive support and testing for Role Management in the Admin API v2.

The main changes include:

  • Implementation of the Role Management endpoints for Admin API v2, enabling administrators to get, list, create, and delete roles, as well as manage their members. Write operations are proxied to the controller leader, while read operations can be served locally, consistent with v1 API behavior.
  • Generation of Ducktape protobuf clients to test the new Admin API v2 Roles security endpoints.
  • Introduction of a new test file for Admin API v2 role management, replicating and extending tests from the original rbac_tests.py to ensure functional coverage.

https://redpandadata.atlassian.net/browse/CORE-14879

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.3.x
  • v25.2.x
  • v25.1.x

Release Notes

Features

  • Added role management in Admin API v2, including create, list, get, and delete roles, and manage role members.

Copilot AI review requested due to automatic review settings December 19, 2025 17:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces comprehensive role management capabilities to the Admin API v2 by implementing new ConnectRPC endpoints for creating, retrieving, listing, updating, and deleting roles, along with managing role members. The implementation includes backend services in C++, Python client wrappers, and extensive Ducktape test coverage to ensure functional parity with the v1 RBAC API.

Key Changes:

  • Implementation of seven new role management RPC methods in Admin API v2's SecurityService
  • Leader-based request proxying for write operations (create, update, delete) while enabling local serving for read operations
  • Comprehensive test suite covering authorization, CRUD operations, member management, ACL integration, and persistence across restarts

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
proto/redpanda/core/admin/v2/security.proto Defines protobuf messages and RPC service methods for role management
proto/redpanda/core/admin/v2/BUILD Adds googleapis dependencies for field_behavior and resource annotations
src/v/redpanda/admin/services/security.h Declares role management method signatures in security_service_impl
src/v/redpanda/admin/services/security.cc Implements role management operations with validation, conversion helpers, and error handling
src/v/redpanda/admin/services/BUILD Adds utils dependency for redirect_to_leader functionality
src/v/redpanda/application_admin.cc Updates security_service_impl instantiation to include metadata_cache reference
tests/rptest/clients/admin/proto/redpanda/core/admin/v2/security_pb2.py Generated Python protobuf code for role management messages
tests/rptest/clients/admin/proto/redpanda/core/admin/v2/security_pb2.pyi Generated Python type stubs for role management
tests/rptest/clients/admin/proto/redpanda/core/admin/v2/security_pb2_connect.py Generated Python ConnectRPC client methods for role operations
tests/rptest/tests/rbac_test_v2.py New comprehensive test suite for Admin API v2 role management
tools/type-checking/type-check-strictness.json Registers new test file for type checking

@nguyen-andrew nguyen-andrew self-assigned this Dec 19, 2025
@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Dec 19, 2025

CI test results

test results on build#78219
test_class test_method test_arguments test_kind job_url test_status passed reason test_history
NodesDecommissioningTest test_decommission_status null integration https://buildkite.com/redpanda/redpanda/builds/78219#019b37dc-7c56-496b-a8cb-ea807285313c FLAKY 36/41 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0417, p0=0.0841, reject_threshold=0.0100. adj_baseline=0.1199, p1=0.4680, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=NodesDecommissioningTest&test_method=test_decommission_status
test results on build#78409
test_class test_method test_arguments test_kind job_url test_status passed reason test_history
MountUnmountIcebergTest test_simple_remount {"cloud_storage_type": 1} integration https://buildkite.com/redpanda/redpanda/builds/78409#019b53d4-0da2-4d4b-8042-fed89174e779 FLAKY 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.1800, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.4486, p1=0.0026, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=MountUnmountIcebergTest&test_method=test_simple_remount
NodesDecommissioningTest test_decommissioning_cancel_ongoing_movements {"cloud_topic": true} integration https://buildkite.com/redpanda/redpanda/builds/78409#019b53d4-0da4-4430-832a-f3c6e5305c41 FLAKY 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=NodesDecommissioningTest&test_method=test_decommissioning_cancel_ongoing_movements
SIPartitionMovementTest test_cross_shard {"cloud_storage_type": 1, "num_to_upgrade": 0, "with_cloud_topics": true} integration https://buildkite.com/redpanda/redpanda/builds/78409#019b53d4-0da4-4430-832a-f3c6e5305c41 FLAKY 8/11 Test FAILS after retries.Significant increase in flaky rate(baseline=0.0058, p0=0.0015, reject_threshold=0.0100) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=SIPartitionMovementTest&test_method=test_cross_shard

Copy link
Contributor

@rockwotj rockwotj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will leave the rest of the review to Mike

Comment on lines 249 to 264
const auto role_opt = _controller->get_role_store().local().get(role_name);

if (!role_opt) {
vlog(securitylog.debug, "Role '{}' does not exist", role_name);
throw serde::pb::rpc::not_found_exception(
ssx::sformat("Role '{}' does not exist", role_name));
}

auto curr_members = role_opt->members();
curr_members.insert(members_to_add.begin(), members_to_add.end());

const security::role role{curr_members};

const auto err
= co_await _controller->get_security_frontend().local().update_role(
role_name, role, model::timeout_clock::now() + role_operation_timeout);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not a big deal, but there is no consistency guarantees between these operations so it's possible other updates are clobbered in the meantime. The "correct" way to fix this is to have a dedicated controller command for adding/removing members from roles.

Since this isn't possible to do safely today, it's probably OK, but if there is time we should consider following up with a dedicated controller command.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I've tried to capture this in a Jira ticket here: https://redpandadata.atlassian.net/browse/CORE-15076

Comment on lines 663 to 665
@parametrize(delete_acls=True)
@parametrize(delete_acls=False)
@parametrize(delete_acls=None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this turns into 3 different clusters getting spun up and down, consider just having a for loop in the test instead.

@nguyen-andrew nguyen-andrew marked this pull request as draft December 19, 2025 23:38
@nguyen-andrew
Copy link
Member Author

Force push: minor adjustments to address PR comments.

@nguyen-andrew nguyen-andrew force-pushed the admin-api-v2-roles branch 2 times, most recently from 861b1be to 802f8d6 Compare December 23, 2025 03:11
@nguyen-andrew
Copy link
Member Author

nguyen-andrew commented Dec 23, 2025

Force pushes:

  • Rebase on latest dev
  • Minor adjustments
    • Renaming role_operation_timeout to security_operation_timeout
    • Originally, role names were only validated in CreateRole, but now I've added validation of role names to other role operations too.

@nguyen-andrew
Copy link
Member Author

Force push:

  • Adding security_test unit tests to validate correctness of the protobuf <-> role converters.
  • Minor refactoring

@nguyen-andrew nguyen-andrew marked this pull request as ready for review December 25, 2025 04:07
@vbotbuildovich
Copy link
Collaborator

Retry command for Build#78409

please wait until all jobs are finished before running the slash command

/ci-repeat 1
skip-redpanda-build
skip-units
skip-rebase
tests/rptest/tests/partition_movement_test.py::SIPartitionMovementTest.test_cross_shard@{"cloud_storage_type":1,"num_to_upgrade":0,"with_cloud_topics":true}

@nguyen-andrew
Copy link
Member Author

Force push to rebase on latest dev.

Copy link
Contributor

@michael-redpanda michael-redpanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good - I'm not sure we need the last two commits - should you just change the tests to use the v2 endpoint rather than duplicating them?

Comment on lines +160 to +161
// The Role to create.
Role role = 1 [(google.api.field_behavior) = REQUIRED];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember where the discussion landed by can a user provide RoleMembers when creating a role?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with this approach you can provide RoleMembers when creating the role. The CreateRoleRequest message includes the entire Role resource, so you'd be able to specify its RoleMembers as well. Here's an example:

curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" --data '{"role": {"name": "role1", "members": [{"user": {"name": "admin"}}, {"user": {"name": "testuser1"}}]}}' 

@nguyen-andrew
Copy link
Member Author

looks good - I'm not sure we need the last two commits - should you just change the tests to use the v2 endpoint rather than duplicating them?

@michael-redpanda
I leaned toward duplicating the ducktape tests so we don’t lose coverage on the existing v1 API while it’s still in use. My thinking was that we can remove the v1 tests once we officially deprecate or remove the v1 API, but until then this keeps coverage intact. Happy to change this if you think replacing them is the better move though.

@nguyen-andrew nguyen-andrew marked this pull request as draft January 6, 2026 01:07
@nguyen-andrew nguyen-andrew force-pushed the admin-api-v2-roles branch 2 times, most recently from 0b73a01 to f2413a1 Compare January 6, 2026 01:14
@nguyen-andrew
Copy link
Member Author

Force pushes:

@nguyen-andrew
Copy link
Member Author

Force push to rebase on latest dev.

@nguyen-andrew nguyen-andrew marked this pull request as ready for review January 6, 2026 01:21
@nguyen-andrew
Copy link
Member Author

Force push again to rebase on actual latest dev (to fix "out-of-date with the base branch" error).

This commit adds the protobuf messages and RPCs for role management for
the Admin API V2, and it extends the security service implementation to
include unimplemented stubs for the new role management methods:
- CreateRole
- GetRole
- ListRoles
- AddRoleMembers
- RemoveRoleMembers
- DeleteRole
- ListCurrentUserRoles
Generate Ducktape protobuf clients to test the new Admin API v2 Roles
security endpoints.
Implemented the Role Management endpoints for the Admin API v2, allowing
administrators to get, list, create, and delete roles as well as manage
their members. All write operations are proxied to the controller leader
while read operations can be served locally. This is consistent with the
existing behavior of the v1 Admin API.
Added a new file to test the Admin API v2 for role management. This file
will be based on the existing rbac_tests.py and will aim to provide the
same level of functional testing and coverage by using the original
rbac_tests.py as a reference. This initial commit will replicate most
of the tests from the RBACTest class for the v2 API (some TODOs remain).

Additionally, updated tools/type-checking/type-check-strictness.json to
include the new rbac_test_v2.py file for type checking.
Replicating the RBACEndToEndTest class from rbac_test.py to ensure
comprehensive testing of RBAC functionalities in the Admin API v2.
Replicating the RolePersistenceTest class from rbac_test.py to ensure
comprehensive testing of RBAC functionalities in the Admin API v2.
Adds unit tests for the Admin API v2 security service role management
functionality, covering:
- Role name validation (valid names, invalid characters, edge cases)
- Conversion between protobuf role types and security subsystem types
- Round-trip conversion correctness for single/multiple members
- Member deduplication behavior in security roles

These tests validate the transformation layer between the Admin API v2
protobuf interface and the internal security subsystem, ensuring role
and member data is correctly converted in both directions.
@nguyen-andrew
Copy link
Member Author

Force push to rebase against dev again... last rebase was on a broken dev.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

with expect_role_error(ConnectErrorCode.INVALID_ARGUMENT):
self.superuser_admin.create_role(
role=self.ROLE_NAME0, members=[invalid_member]
)
Copy link

Copilot AI Jan 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test test_invalid_create_role validates role name rejection for invalid characters, but it doesn't test validation for role names exceeding maximum length limits (if any exist). Consider adding a test case for excessively long role names.

Copilot uses AI. Check for mistakes.
Comment on lines +262 to +264
@ignore
@cluster(num_nodes=3)
def test_list_role_filter(self):
Copy link

Copilot AI Jan 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is marked with @ignore and has no implementation. Since v2 list_roles doesn't support filters yet (as indicated by the TODO on line 232 in security.cc), consider removing this placeholder test or adding a clear TODO comment explaining when this functionality will be implemented.

Copilot uses AI. Check for mistakes.
@nguyen-andrew nguyen-andrew merged commit 22699f8 into redpanda-data:dev Jan 7, 2026
30 checks passed
@nguyen-andrew nguyen-andrew deleted the admin-api-v2-roles branch January 7, 2026 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants