Add serializer-derived field introspection to ValuesViewset by rtibbles · Pull Request #14327 · learningequality/kolibri

rtibbles · 2026-03-04T01:36:23Z

Summary

Enables ValuesViewset to automatically derive values(), field_map, and many=True consolidation from serializer_class — no more manually maintained field tuples that drift out of sync.

Serializer introspection module and 58-test suite for the core derivation logic
Benchmark script for measuring serialization performance and detecting regressions
Migrates FacilityUserViewSet, PublicFacilityUserViewSet, and FacilityUserSignUpViewSet as first validation (all 166 existing auth tests pass unchanged)
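For readers new to the idea, here is a minimal sketch of what deriving a values() tuple and a field_map from declared fields could look like. It is not the PR's implementation: FieldSpec, derive_values_and_field_map and apply_field_map are hypothetical stand-ins written without DRF, and the only assumption carried over is that a dotted source path maps to a double-underscore ORM lookup.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FieldSpec:
    """Hypothetical stand-in for a serializer field (not a DRF class)."""

    source: Optional[str] = None       # dotted path, e.g. "devicepermissions.is_superuser"
    coerce: Optional[Callable] = None  # optional output conversion, e.g. bool
    write_only: bool = False


def derive_values_and_field_map(fields):
    """Return (values, field_map) for a dict of field name -> FieldSpec."""
    values = []
    field_map = {}
    for name, spec in fields.items():
        if spec.write_only:
            continue  # write-only fields never appear in responses
        lookup = (spec.source or name).replace(".", "__")
        values.append(lookup)
        if lookup != name or spec.coerce is not None:
            coerce = spec.coerce or (lambda v: v)
            # Bind loop variables as defaults so each lambda keeps its own.
            field_map[name] = lambda row, k=lookup, c=coerce: c(row.pop(k))
    return tuple(values), field_map


def apply_field_map(row, field_map):
    """Apply the derived renames and conversions to one values() row."""
    row = dict(row)
    for name, transform in field_map.items():
        row[name] = transform(row)
    return row


fields = {
    "id": FieldSpec(),
    "username": FieldSpec(),
    "is_superuser": FieldSpec(source="devicepermissions.is_superuser", coerce=bool),
    "password": FieldSpec(write_only=True),
}
values, field_map = derive_values_and_field_map(fields)
row = apply_field_map(
    {"id": 1, "username": "a", "devicepermissions__is_superuser": None}, field_map
)
```

In this sketch the derived entry for is_superuser plays the same role as a hand-written lambda that pops devicepermissions__is_superuser and wraps it in bool.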

References

Partial work towards #14036

Reviewer guidance

Start with kolibri/core/utils/serializer_introspection.py — the core derivation logic
Then BaseValuesViewset._ensure_initialized in kolibri/core/api.py — where derivation is triggered and cached
The auth migration (kolibri/core/auth/api.py) shows the pattern in practice
Risky areas: _auto_consolidate (groupby dedup correctness), _field_matches_inferred_type (false negatives cause silent type mismatches)

AI usage

Developed collaboratively with Claude Code (Opus 4.6). Used for initial implementation, test suite, and iterative refinement based on benchmark results and review feedback. All code reviewed and verified against the existing auth test suite.

Adds an integration benchmark for comparing ValuesViewset serialization approaches, measuring performance across synthetic and real-world data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Enables ValuesViewset to automatically derive its values() tuple, field_map, and consolidation logic from the serializer_class, removing the need for manually maintained field configuration. Includes serializer introspection utility, comprehensive test suite, and updated documentation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Removes explicit values tuples and field_map dicts from FacilityUserViewSet, PublicFacilityUserViewSet, and FacilityUserSignUpViewSet, relying on the serializer to drive field selection and transformation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-03-04T02:13:11Z

Build Artifacts

Asset type	Download link
PEX file	kolibri-0.19.2rc0.dev0_git.8.ge0e6445e.pex
Windows Installer (EXE)	kolibri-0.19.2rc0.dev0+git.8.ge0e6445e-windows-setup-unsigned.exe
Debian Package	kolibri_0.19.2rc0.dev0+git.8.ge0e6445e-0ubuntu1_all.deb
Mac Installer (DMG)	kolibri-0.19.2rc0.dev0+git.8.ge0e6445e.dmg
Android Package (APK)	kolibri-0.19.2rc0.dev0+git.8.ge0e6445e-0.1.7-debug.apk
Raspberry Pi Image	kolibri-pi-image-0.19.2rc0.dev0+git.8.ge0e6445e.zip
TAR file	kolibri-0.19.2rc0.dev0+git.8.ge0e6445e.tar.gz
WHL file	kolibri-0.19.2rc0.dev0+git.8.ge0e6445e-py2.py3-none-any.whl

rtibblesbot

Well-structured first step toward serializer-derived field introspection for ValuesViewset. The core introspection module is solid, the FacilityUser viewset migrations are clean, and the 58-test suite plus benchmark script provide good coverage and regression tooling.

CI passing. No UI changes.

suggestion: deferred_fields semantics for non-nested-serializer fields (see inline)
suggestion: Thread safety comment precision (see inline)
praise: Ordering preservation in _auto_consolidate (see inline)
praise: Type-inference optimization in _field_matches_inferred_type (see inline)
praise: Removal of FacilityUserConsolidateMixin and its manual re-sort — the new auto-consolidate is more robust

@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Reviewed the pull request diff checking for:

Correctness: bugs, edge cases, undocumented behavior, resource leaks, hardcoded values
Design: unnecessary complexity, naming, readability, comment accuracy, redundant state
Architecture: duplicated concerns, minimal interfaces, composition over inheritance
Testing: behavior-based assertions, mocks only at hard boundaries, accurate coverage
Completeness: missing dependencies, unupdated usages, i18n, accessibility, security
Principles: DRY (same reason to change), SRP, Rule of Three (no premature abstraction)
Checked CI status and linked issue acceptance criteria
For UI changes: inspected screenshots for layout, visual completeness, and consistency

rtibblesbot · 2026-03-06T03:50:19Z

kolibri/core/auth/api.py

-    field_map = {
-        "is_superuser": lambda x: bool(x.pop("devicepermissions__is_superuser")),
-    }
+    deferred_fields = ("roles",)


suggestion: deferred_fields = ("roles",) here works correctly — the field is skipped during introspection and populated manually in consolidate(). However, roles on PublicFacilityUserSerializer is a ListField(child=CharField()), not a nested ModelSerializer. The deferred_fields docstring and documentation describe it as "nested serializer field names that should be fetched separately," so this usage is semantically imprecise.

This won't cause bugs (the introspection code handles it: ListField fails _is_nested_serializer(), hits the field_name in deferred_fields: continue branch, and gets skipped), but it could confuse future developers who expect deferred_fields entries to be usable with serialize_queryset().

Consider either:

Adding a brief comment here explaining that roles is deferred because it's populated manually in consolidate() (not via serialize_queryset()), or

Documenting in api_patterns.rst that deferred_fields can also be used for non-nested fields that are populated manually
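A sketch of the first option, using a hypothetical stand-in class rather than the real viewset (ExampleViewSet, serializer_field_names and roles_by_user are invented for illustration). It shows a deferred name being left out of the derived values and then filled in by hand, with the kind of comment suggested above:

```python
class ExampleViewSet:
    """Hypothetical stand-in, not a real Kolibri or DRF class."""

    serializer_field_names = ("id", "username", "roles")
    # "roles" is a list of plain strings, not a nested serializer. It is
    # deferred only to keep it out of values(); consolidate() fills it in.
    deferred_fields = ("roles",)

    @classmethod
    def derived_values(cls):
        """Names that would be passed to values(), skipping deferred ones."""
        return tuple(
            name
            for name in cls.serializer_field_names
            if name not in cls.deferred_fields
        )

    def consolidate(self, rows, roles_by_user):
        """Populate the deferred key manually for each row."""
        for row in rows:
            row["roles"] = sorted(roles_by_user.get(row["id"], []))
        return rows


rows = ExampleViewSet().consolidate(
    [{"id": 1, "username": "a"}], {1: ["coach", "admin"]}
)
```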

rtibblesbot · 2026-03-06T03:50:19Z

kolibri/core/api.py

+        Deferred from __init_subclass__ to avoid instantiating serializers
+        (which may reference querysets) at class definition / import time.
+
+        No lock needed under CPython's GIL: the work is idempotent —


suggestion: The thread-safety reasoning is correct in outcome but could be more precise. Under CPython's GIL, two threads can both read _initialized = False and execute the initialization block concurrently (the check-then-act at line 238 is not atomic across bytecodes). The code is safe because the work is idempotent — both threads produce identical results, so the final class state is correct regardless. The waste is only redundant computation, not corruption.

Consider rewording to: "Safe without a lock under CPython's GIL: concurrent threads may redundantly execute this block, but the work is idempotent — identical serializer always produces identical values/field_map, so the final state is correct regardless of execution order."
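To make the check-then-act point concrete, here is a small stand-alone sketch. LazyConfig and init_runs are hypothetical and exist only for this illustration. Several threads may run the block, yet the final class state is the same because every run computes the same value and the flag is set last:

```python
import threading


class LazyConfig:
    """Hypothetical class that derives class-level state on first use."""

    _initialized = False
    _values = None
    init_runs = 0  # how many times the block ran (illustration only)

    @classmethod
    def _ensure_initialized(cls):
        if cls._initialized:  # check ...
            return
        # ... then act: another thread may also reach this point before
        # the flag flips, so this block can run more than once.
        cls.init_runs += 1  # not atomic; a demo counter, not real bookkeeping
        cls._values = tuple(sorted(["username", "id", "full_name"]))
        cls._initialized = True  # set last, once the state is complete


threads = [
    threading.Thread(target=LazyConfig._ensure_initialized) for _ in range(8)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Setting the flag only after the derived state is assigned matters here: a thread that sees the flag as true can then rely on the values being present. Whether the same reasoning holds on free-threaded builds is a separate question that this sketch does not settle.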

rtibblesbot · 2026-03-06T03:50:20Z

kolibri/core/api.py

+            self.__class__._pk_getter = pk_getter
+
+        joined_many = self._joined_many
+        original_pk_order = list(dict.fromkeys(pk_getter(item) for item in items))


praise: Capturing original_pk_order before sorting for groupby, then restoring it at line 616, is an elegant solution. This correctly preserves the queryset's ordering (applied by ValuesViewsetOrderingFilter or the default order_by) through the groupby consolidation — a significant improvement over the old FacilityUserConsolidateMixin which had to manually re-sort based on request query params.
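The pattern can be shown in isolation. The function below is an illustrative sketch, not the PR's _auto_consolidate; the row shape and the merging of a "roles" key are assumptions made for the example:

```python
from itertools import groupby
from operator import itemgetter


def consolidate_preserving_order(items, pk_getter=itemgetter("id")):
    """Merge joined rows per pk while keeping the incoming row order."""
    # dict.fromkeys de-duplicates while keeping first-seen order.
    original_pk_order = list(dict.fromkeys(pk_getter(item) for item in items))
    merged = {}
    # groupby only merges adjacent items, so sort by pk first.
    for pk, group in groupby(sorted(items, key=pk_getter), key=pk_getter):
        group = list(group)
        row = dict(group[0])
        row["roles"] = [g["roles"] for g in group if g["roles"] is not None]
        merged[pk] = row
    # Restore the order the queryset delivered, not the sorted pk order.
    return [merged[pk] for pk in original_pk_order]


items = [  # already ordered by username, one row per joined role
    {"id": 3, "username": "ann", "roles": "admin"},
    {"id": 3, "username": "ann", "roles": "coach"},
    {"id": 1, "username": "bob", "roles": None},
    {"id": 2, "username": "cy", "roles": "coach"},
]
result = consolidate_preserving_order(items)
```

Because sorted() is stable, the roles inside each group also keep the order in which their rows arrived.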

rtibblesbot · 2026-03-06T03:50:20Z

kolibri/core/utils/serializer_introspection.py

+        current_model = related_model
+
+
+def _field_matches_inferred_type(


praise: Good design decision to use exact class matching (type(declared_field) is inferred_class) rather than isinstance. This correctly handles the case where a subclass overrides to_representation() — such fields will always go through the transform path, avoiding silent type mismatches. The FK-field handling (lines 92-100) is also well-reasoned: skipping to_representation for PrimaryKeyRelatedField avoids calling it with a raw value instead of a model instance.
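A short sketch of the difference, using hypothetical stand-in classes instead of DRF fields (CharField and UpperCharField here are defined locally for the example):

```python
class CharField:
    """Stand-in for a plain field whose output equals the raw value."""

    def to_representation(self, value):
        return str(value)


class UpperCharField(CharField):
    """Subclass that changes the output, so it must not be skipped."""

    def to_representation(self, value):
        return str(value).upper()


def field_matches_inferred_type(declared_field, inferred_class):
    # Exact match: subclasses are treated as "different" on purpose.
    return type(declared_field) is inferred_class


def serialize(field, raw_value, inferred_class=CharField):
    """Use the raw value only when the field is exactly the inferred class."""
    if field_matches_inferred_type(field, inferred_class):
        return raw_value  # fast path, no transform needed
    return field.to_representation(raw_value)


plain = serialize(CharField(), "ann")
shouted = serialize(UpperCharField(), "ann")
```

With an isinstance check, UpperCharField would have taken the fast path and its uppercase transform would have been skipped without any error.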

rtibblesbot · 2026-03-06T03:50:20Z

kolibri/core/auth/serializers.py

        return attrs


+class FacilityUserRoleSerializer(serializers.ModelSerializer):


praise: Clean separation of FacilityUserRoleSerializer (read-only, excludes user) from the existing RoleSerializer (includes user for write operations). This avoids leaking the parent user ID into nested role objects in API responses.

rtibblesbot

Solid infrastructure for serializer-derived field introspection. The core introspection module, auto-consolidation, and FacilityUser viewset migrations are well-structured.

CI passing. No UI changes.

Delta from prior review:

ACKNOWLEDGED: deferred_fields semantics for non-nested-serializer fields — behavior is correct as-is (ListField in deferred_fields is simply skipped from values(), populated by consolidate())
ACKNOWLEDGED: Thread safety comment precision — the comment at _ensure_initialized accurately describes CPython GIL safety and free-threaded Python considerations
3 prior praise finding(s) stand; not re-posted.

New findings:

blocking: _build_validation_schema crashes on ListField children in DEBUG mode (see inline)
praise: _get_own classmethod pattern (see inline)

@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Compared the current PR state against findings from a prior review:

Retrieved prior bot reviews deterministically via the GitHub API
Classified each prior finding as RESOLVED, UNADDRESSED, ACKNOWLEDGED, or CONTESTED
Only raised NEW findings for newly introduced code
Reviewed the pull request diff checking for correctness, design, architecture, testing, completeness, and adherence to DRY/SRP principles
Checked CI status and linked issue acceptance criteria

rtibblesbot · 2026-03-06T17:33:03Z

kolibri/core/api.py

+            if getattr(field, "write_only", False):
+                continue
+            expected_fields.add(field_name)
+            if hasattr(field, "child"):


blocking: _build_validation_schema crashes in DEBUG mode for serializers with ListField(child=...) fields.

hasattr(field, "child") matches both ListSerializer (intended — its .child is a Serializer with .fields) and ListField (unintended — its .child is a plain Field like CharField, which has no .fields attribute).

Concrete trigger: PublicFacilityUserSerializer.roles is ListField(child=CharField()). When PublicFacilityUserViewSet initializes in DEBUG mode, it calls _build_validation_schema, hits this branch, recurses into CharField, and CharField has no .fields → AttributeError.

This isn't caught by tests because Django test settings typically have DEBUG=False, and the test suite's @override_settings(DEBUG=True) tests use custom serializers that don't include ListField.

Fix: guard the recursion to only enter serializer children, e.g.:

if hasattr(field, "child") and hasattr(field.child, "fields"):

or more explicitly:

from rest_framework.serializers import BaseSerializer
if hasattr(field, "child") and isinstance(field.child, BaseSerializer):

The same pattern should also be checked in _validate_items_against_schema — if a ListField value (list of primitives like ["admin", "coach"]) reaches the nested validation path, it would try to call .keys() on strings.
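The failure mode and the guard can be reproduced without DRF. The classes below are hypothetical stand-ins that only mimic the relevant attributes (.child and .fields), and both build_schema functions are illustrative rather than the PR's _build_validation_schema:

```python
class CharField:
    """Plain field: has neither .child nor .fields."""


class ListField:
    """List of primitives: .child is a plain field."""

    def __init__(self, child):
        self.child = child


class Serializer:
    def __init__(self, fields):
        self.fields = fields


class ListSerializer:
    """List of nested objects: .child is a serializer with .fields."""

    def __init__(self, child):
        self.child = child


def build_schema_unguarded(serializer):
    schema = {}
    for name, field in serializer.fields.items():
        if hasattr(field, "child"):  # also true for ListField
            schema[name] = build_schema_unguarded(field.child)
        else:
            schema[name] = None
    return schema


def build_schema(serializer):
    schema = {}
    for name, field in serializer.fields.items():
        child = getattr(field, "child", None)
        # Only recurse when the child really is serializer-like.
        if child is not None and hasattr(child, "fields"):
            schema[name] = build_schema(child)
        else:
            schema[name] = None
    return schema


serializer = Serializer(
    {
        "id": CharField(),
        "roles": ListField(CharField()),
        "memberships": ListSerializer(Serializer({"collection": CharField()})),
    }
)
schema = build_schema(serializer)
```

Calling build_schema_unguarded(serializer) raises AttributeError at the roles field, which mirrors the crash described above.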

rtibblesbot · 2026-03-06T17:33:03Z

kolibri/core/api.py

+        Prevents dynamically cached class attributes (e.g. serializer_class
+        set by get_serializer_class()) from leaking to child classes.
+        """
+        return cls.__dict__.get(attr, default)


praise: _get_own via cls.__dict__.get() is a clean solution for preventing dynamically-cached class attributes from leaking through the MRO to child classes — a common and subtle pitfall with Django class-based views and DRF viewsets.
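A minimal illustration of the pitfall and the fix, with hypothetical classes (Base, Parent, Child and the _cached attribute are invented for this example):

```python
class Base:
    @classmethod
    def _get_own(cls, attr, default=None):
        # cls.__dict__ holds only attributes set on this exact class,
        # so values cached on a parent are not visible here.
        return cls.__dict__.get(attr, default)


class Parent(Base):
    pass


class Child(Parent):
    pass


# Simulate a value cached on the parent class at runtime.
Parent._cached = "parent"

inherited = getattr(Child, "_cached", None)  # normal lookup walks the MRO
own = Child._get_own("_cached")              # own-attribute lookup does not
```

Here inherited picks up the parent's cached value, while own stays None until Child caches something itself.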

rtibbles and others added 3 commits March 3, 2026 16:28

Add viewset serialization benchmark script

95a13b3

Adds an integration benchmark for comparing ValuesViewset serialization approaches, measuring performance across synthetic and real-world data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions bot added the DEV: backend and SIZE: very large labels Mar 4, 2026

rtibbles requested a review from rtibblesbot March 6, 2026 03:43

rtibblesbot reviewed Mar 6, 2026


MisRob requested a review from rtibblesbot March 6, 2026 17:25

MisRob added community-review and removed community-review labels Mar 6, 2026

rtibblesbot suggested changes Mar 6, 2026


marcellamaki self-requested a review March 10, 2026 17:07

rtibbles wants to merge 3 commits into learningequality:release-v0.19.x from rtibbles:viewset


3 participants
