- Overview & About
- System Architecture
- Tech Stack
- Database & Persistence
- Caching
- Concurrency
- AI Recommendations (Gemini)
- WebSocket & Watch Parties
- Security & Authentication
- Rate Limiting
- Health Check
- Observability & Resilience
- Performance
This is a sophisticated backend service for Riyura that acts as the core orchestration layer for the modern media streaming platform. Built on Spring Boot 4.x with Java 21, it seamlessly serves rich media content, dynamic discovery features, personalized user experiences, and real-time collaborative watch parties. The system integrates with an external content API for media metadata, Supabase for identity/auth, PostgreSQL for persistent storage, and Redis for high-performance caching, ephemeral party state, and distributed rate limiting.
From managing trending banners and multi-provider stream URL resolution to synchronized watch parties with real-time chat, Riyura powers every essential feature of a full-stack entertainment hub.
The application follows a Modular Monolith architecture that emphasizes high cohesion and loose coupling. Internal structure is organized into distinct, feature-driven modules — each owning its own controllers, services, repositories, models, and DTOs — making the platform highly maintainable, testable, and primed for a seamless migration to microservices if scale demands it.
The backend strictly enforces the Dependency Inversion Principle (DIP) through Port Interfaces. Controllers never inject concrete service instances directly; they inject *ServicePort interfaces (e.g., MovieServicePort, RecommendationServicePort). This decoupling makes replacing service implementations and writing isolated unit tests nearly frictionless.
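A minimal sketch of that wiring, assuming hypothetical DTO and method names (they are illustrative, not the project's actual API):

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Port interface following the *ServicePort convention described above
interface MovieServicePort {
    MovieDetailsDto getMovieDetails(long tmdbId);
}

record MovieDetailsDto(long tmdbId, String title) {} // hypothetical DTO

@RestController
@RequestMapping("/api/movies")
class MovieController {

    private final MovieServicePort movieService; // depends on the abstraction, never a concrete service

    MovieController(MovieServicePort movieService) {
        this.movieService = movieService;
    }

    @GetMapping("/{id}")
    MovieDetailsDto details(@PathVariable long id) {
        return movieService.getMovieDetails(id);
    }
}
```

In a unit test, the controller can be exercised with a stub `MovieServicePort` and no Spring context at all, which is the decoupling benefit described above.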
| Module | Responsibility |
|---|---|
| Content | Central discovery engine — processes, filters, caches, and serves movies, TV shows, anime, banners, explore, and search. Orchestrates content API fetches with parallel async calls. |
| Identity | Protected user context — manages watchlists and watch history. All data is bound strictly to the authenticated user's JWT identity. |
| Party | Real-time watch party engine — Redis-backed party state, STOMP WebSocket messaging, host migration, buffering sync, and live chat. |
| Layer | Technology |
|---|---|
| Framework | Spring Boot 4.x |
| Language | Java 21 |
| Database | PostgreSQL with HikariCP connection pooling |
| ORM | Spring Data JPA / Hibernate |
| Caching | Redis (Spring Cache abstraction + custom RedisTemplate) |
| Rate Limiting | Bucket4J + Redis (Lettuce) — distributed, fail-open, tiered |
| Real-time | STOMP over WebSocket (SockJS fallback) |
| Security | Spring Security + OAuth2 Resource Server (Supabase JWT, HS256) |
| HTTP Client | JDK 21 HttpClient with connection pooling and timeouts |
| Async | JDK 21 virtual threads + CompletableFuture (with timeouts) |
| Content API | External REST API (with retry logic) |
| Documentation | SpringDoc OpenAPI (Swagger UI) |
| Build | Maven (Maven Wrapper included) |
| Utilities | Lombok |
Riyura uses PostgreSQL as its primary relational store, managed through Spring Data JPA with Hibernate. The schema covers three core entities: stream providers (configures available streaming providers with per-media-type URL templates, quality, priority, and an active toggle), watchlist (persists user-saved content with a metadata snapshot including title, poster, release date, and vote), and watch history (records full playback events with streaming context — provider used, stream ID, episode info, watch duration, and an anime flag for UI hints).
To prevent accidental destructive changes in production environments, Hibernate is explicitly configured with ddl-auto: validate. Automated database migrations (e.g., via Flyway/Liquibase) or manual DBA overrides handle schema changes safely without allowing the ORM to drop tables automatically.
Both watchlist and watch_history tables enforce unique constraints on (user_id, tmdb_id, media_type) at the database level. This prevents duplicate entries even under concurrent requests (e.g. a user double-tapping "Add to Watchlist"). Services handle DataIntegrityViolationException gracefully — either returning the existing record or ignoring the duplicate — rather than surfacing a 500 error.
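A hedged sketch of that duplicate-handling pattern; `WatchlistRepository` and `WatchlistEntry` below are illustrative stand-ins for the real Spring Data types:

```java
import java.util.Optional;
import org.springframework.dao.DataIntegrityViolationException;

public class WatchlistWriteExample {

    // Illustrative repository shape, not the project's actual interface
    interface WatchlistRepository {
        WatchlistEntry save(WatchlistEntry entry);
        Optional<WatchlistEntry> findByUserIdAndTmdbIdAndMediaType(String userId, long tmdbId, String mediaType);
    }

    record WatchlistEntry(String userId, long tmdbId, String mediaType) {}

    private final WatchlistRepository repository;

    WatchlistWriteExample(WatchlistRepository repository) {
        this.repository = repository;
    }

    public WatchlistEntry add(WatchlistEntry entry) {
        try {
            return repository.save(entry);
        } catch (DataIntegrityViolationException duplicate) {
            // The (user_id, tmdb_id, media_type) unique constraint fired under a concurrent
            // double-tap: return the existing row instead of surfacing a 500.
            return repository
                    .findByUserIdAndTmdbIdAndMediaType(entry.userId(), entry.tmdbId(), entry.mediaType())
                    .orElse(entry);
        }
    }
}
```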
Per-user size limits are enforced in application code:
| Collection | Max Size |
|---|---|
| Watchlist | 500 |
| Watch History | 1000 |
User-specific list endpoints (watchlist, watch history) are paginated using Spring Data's Pageable with a default page size of 10 items. Search results are paginated at 15 items per page. Cache keys include the page number to avoid serving incorrect slices.
Database connections are managed by HikariCP with a maximum pool size of 10, minimum idle of 5, a connection timeout of 5 s, and a max lifetime of 30 min. The pool is intentionally downsized to fit low-spec VPS environments while complementing virtual thread concurrency. Additionally, core tables like watchlist and watch_history are optimized with B-Tree indexes on user_id to ensure O(log N) query performance as the user base grows.
Riyura uses Redis as its distributed cache, combining Spring's @Cacheable / @CacheEvict abstraction with a custom cache stampede guard (CacheStampedeGuard) that implements four complementary protections against thundering-herd effects when popular keys expire.
| Technique | Implementation | Where Applied |
|---|---|---|
| Distributed mutex | Redis SETNX lock — only one node recomputes at a time; others wait and retry | XFetch, SWR, and cold misses (empty cache) |
| Cold-miss wait loop | If the cache is empty and multiple threads arrive, one wins the lock and computes; others `sleep(50)` and retry until the value appears | Both the `xfetch` and `staleWhileRevalidate` paths |
| XFetch (Probabilistic Early Expiration) | Recomputes before TTL expires when `beta × delta × -ln(rand) > remainingTTL` — expensive loaders refresh early while the cache is still warm (sketched after this table) | `MovieService`, `TvService`, `AnimeService`, `MovieDetailService`, `TvDetailsService`, `SearchService` |
| Stale-While-Revalidate | Soft TTL (fresh window) + hard TTL; when stale, serve the value instantly and refresh in the background so users never see latency spikes | `BannerService`, `ExploreService` |
| Proportional TTL jitter | 10–20 % of base TTL added at write time — scales correctly for any TTL (5 min → 30–60 s spread; 7 days → 16–33 h spread) | All caches via `CacheStampedeGuard` and `CacheConfig` |
| `@Cacheable(sync = true)` | Spring's per-JVM mutex for annotation-based caches — single-threaded recompute under concurrent load | `WatchlistService`, `HistoryService` |
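As a rough illustration of the XFetch decision referenced above, here is a minimal sketch; the real `CacheStampedeGuard` combines this check with the Redis SETNX mutex and jittered TTLs, and the class and method names below are illustrative only:

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of the probabilistic early-expiration check, not the project's actual class.
final class XFetchDecision {

    private final double beta; // beta > 1 makes early refresh more aggressive

    XFetchDecision(double beta) {
        this.beta = beta;
    }

    /**
     * @param computeCost  how long the last recomputation took (delta)
     * @param remainingTtl time left before the cached value expires
     * @return true if this request should refresh the value early, while the cache is still warm
     */
    boolean shouldRefreshEarly(Duration computeCost, Duration remainingTtl) {
        double delta = computeCost.toMillis();
        // beta * delta * -ln(rand) grows as the entry nears expiry, so early refreshes
        // become increasingly likely without every caller recomputing at once
        double xfetch = beta * delta * -Math.log(ThreadLocalRandom.current().nextDouble());
        return xfetch > remainingTtl.toMillis();
    }
}
```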
- Serialization: String keys with JSON values. Uses `BasicPolymorphicTypeValidator` to strictly allowlist classes for safe polymorphic deserialization, mitigating RCE vulnerabilities.
- Null caching: Disabled — absent values are never cached, so transient errors don't poison the cache.
- Background refresh pool: A dedicated `cacheRefreshExecutor` (4–16 threads) handles SWR background refreshes, explicitly using virtual threads to prevent carrier thread pinning. `CallerRunsPolicy` provides back-pressure if the queue is full.
Content caches use CacheStampedeGuard and are keyed by their natural discriminator (e.g. limit, query, id). User-specific caches (watchlist, history) use @Cacheable with sync = true and are keyed by userId + ':' + page. Writes trigger targeted @CacheEvict using specific user keys to invalidate only the affected user's data without wiping the entire cache.
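A minimal sketch of that user-keyed annotation caching; the DTO, cache name, and single-page eviction below are illustrative simplifications of the real services:

```java
import java.util.List;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.data.domain.Pageable;
import org.springframework.stereotype.Service;

@Service
public class WatchlistCachingExample {

    record WatchlistItemDto(long tmdbId, String title) {} // hypothetical DTO

    // sync = true: under concurrent misses, only one thread per JVM recomputes a given key
    @Cacheable(cacheNames = "watchlist", key = "#userId + ':' + #pageable.pageNumber", sync = true)
    public List<WatchlistItemDto> getWatchlist(String userId, Pageable pageable) {
        return loadFromDatabase(userId, pageable); // expensive load, executed once per key under contention
    }

    // Evict only this user's cached data on write (a single page key is shown here;
    // the real service evicts the specific user keys it produced)
    @CacheEvict(cacheNames = "watchlist", key = "#userId + ':0'")
    public void addToWatchlist(String userId, WatchlistItemDto item) {
        persist(userId, item);
    }

    private List<WatchlistItemDto> loadFromDatabase(String userId, Pageable pageable) {
        return List.of(); // placeholder for the real repository call
    }

    private void persist(String userId, WatchlistItemDto item) {
        // placeholder for the real repository call
    }
}
```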
Beyond Spring Cache, Redis is also used directly for watch party state (via RedisTemplate). Party objects are serialized as JSON and stored with a fixed party TTL constant defined in RedisConfig. This keeps party state distributed and resilient without requiring an in-memory server-side session.
Internal admin monitoring avoids the blocking O(N) KEYS * command. Key iteration (e.g., in CacheMonitorController) uses SCAN with ScanOptions to walk the keyspace lazily and without blocking, protecting the single-threaded Redis event loop from stalling on instances that hold many keys.
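A sketch of that SCAN-based iteration, assuming a `StringRedisTemplate` is available; the class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import org.springframework.data.redis.core.Cursor;
import org.springframework.data.redis.core.ScanOptions;
import org.springframework.data.redis.core.StringRedisTemplate;

public final class ScanExample {

    // SCAN iterates the keyspace in small batches instead of blocking Redis like KEYS *
    public static List<String> scanKeys(StringRedisTemplate redis, String pattern) {
        List<String> keys = new ArrayList<>();
        ScanOptions options = ScanOptions.scanOptions().match(pattern).count(100).build();
        try (Cursor<String> cursor = redis.scan(options)) {
            cursor.forEachRemaining(keys::add);
        }
        return keys;
    }
}
```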
Virtual threads are enabled globally via:

```yaml
spring:
  threads:
    virtual:
      enabled: true
```

This configures both the Tomcat request handler thread pool and all @Async tasks to run on JDK 21 virtual threads. Virtual threads park instead of blocking during I/O (API calls, DB queries, Redis ops), meaning thousands of concurrent requests can be handled with minimal OS thread overhead — no executor tuning required.
All three STOMP channel executors (broker, inbound, outbound) in WebSocketConfig are explicitly backed by a virtual thread executor:
```java
executor = Executors.newVirtualThreadPerTaskExecutor()
```

This ensures WebSocket message handling inherits the same non-blocking scalability as the HTTP layer.
All services that use CompletableFuture for parallel API calls supply a dedicated virtual thread executor (Executors.newVirtualThreadPerTaskExecutor()) rather than relying on ForkJoinPool.commonPool(). This prevents content API I/O from starving the shared pool used by framework internals and other CompletableFuture operations; manual platform thread pools have been removed entirely. Wherever execution needs to pause (e.g. the wait-loops inside CacheStampedeGuard), the code uses Thread.sleep(Duration), which parks the virtual thread rather than blocking its underlying OS carrier thread.
Several services fire multiple content API calls in parallel using CompletableFuture, combining results after all futures complete. Every future carries an 8-second timeout (orTimeout(8, SECONDS)) to prevent a single slow upstream call from blocking the entire request indefinitely. Because the underlying threads are virtual, parallel calls are cheap even at high fan-out:
| Service | Parallel Operations |
|---|---|
| `BannerService` | Trending movies + trending TV |
| `ExploreService` | Movies + TV |
| `AnimeService` | Anime movies + anime TV |
| `SearchService` | Multi-search + company search |
| `MovieDetailService` | Movie details + credits |
| `TvDetailsService` | TV details + credits |
| `TvPlayerService` | All seasons with episodes |
| `HistoryService` | TV show metadata + episode metadata |
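For illustration, a minimal fan-out sketch in the style of the banner fetch above; the service, method, and DTO names are placeholders rather than the real classes:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public final class ParallelFetchExample {

    // Dedicated virtual thread executor so upstream I/O never starves the common pool
    private final ExecutorService vthreads = Executors.newVirtualThreadPerTaskExecutor();

    record BannerData(String movies, String tv) {} // placeholder result type

    public BannerData loadBanner() {
        CompletableFuture<String> movies = CompletableFuture
                .supplyAsync(this::fetchTrendingMovies, vthreads)
                .orTimeout(8, TimeUnit.SECONDS); // one slow upstream call cannot stall the request

        CompletableFuture<String> tv = CompletableFuture
                .supplyAsync(this::fetchTrendingTv, vthreads)
                .orTimeout(8, TimeUnit.SECONDS);

        // Combine once both futures complete; timeouts surface as exceptional completion
        return movies.thenCombine(tv, BannerData::new).join();
    }

    private String fetchTrendingMovies() { return "movies"; } // placeholder for the real API call
    private String fetchTrendingTv() { return "tv"; }         // placeholder for the real API call
}
```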
The content API client (TmdbClient) uses a built-in retry mechanism — 3 attempts with a 150 ms backoff — to gracefully handle transient API failures without propagating errors to the client. All content services route their external API calls through TmdbClient.fetchWithRetry() to benefit from this resilience.
Riyura features an intelligent, personalized recommendation engine powered by Google's Gemini AI (gemini-2.0-flash). It uses a Retrieval-Augmented Generation (RAG) pipeline — Gemini never invents content; it selects from a pre-fetched candidate pool of real, verified TMDB items.
1. RETRIEVE — Fetch TMDB `/recommendations` for each of the user's 3 seed history items in parallel using a dedicated virtual thread executor, building a ~60-item candidate pool with full metadata (tmdb_id, title, year, etc.)
2. AUGMENT — Build a compact prompt: the user's recent watches + watchlist + the candidate pool
3. GENERATE — Gemini selects exactly 8 items from the pool and returns `[{ tmdb_id, reason }]`; a structured output schema enforces valid integer tmdb_ids — no hallucinations possible
4. ENRICH — Look up the chosen tmdb_ids in the already-fetched pool map (O(1), zero extra calls)
5. SAVE — Persist to PostgreSQL in the background using virtual threads (fire-and-forget)
- TMDB rate limiting: The candidate pool fetch is gated by a `Semaphore(20)` — optimized for high-concurrency fetching on virtual threads. 429 errors retry 3× with exponential backoff.
- Gemini retries: Catches `503`, `429`, and read timeouts — retries 3× with exponential backoff (sketched below). Hard client errors (400, 401, 403) are not retried.
- Hallucination-proof: Gemini can only return `tmdb_id` values present in the pool. Any ID not found in the pool is silently skipped rather than crashing the batch.
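A hedged sketch of the retry-with-backoff shape shared by both paths; the real services additionally distinguish retriable failures (429, 503, read timeouts) from hard client errors before retrying:

```java
import java.time.Duration;
import java.util.function.Supplier;

public final class RetryExample {

    // Generic retry loop: maxAttempts tries, doubling the delay between them.
    // In the real code, only retriable errors reach the sleep; hard client errors rethrow immediately.
    public static <T> T withBackoff(Supplier<T> call, int maxAttempts, Duration initialDelay) {
        RuntimeException last = null;
        Duration delay = initialDelay;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException retriable) {
                last = retriable;
                if (attempt == maxAttempts) break;
                try {
                    Thread.sleep(delay); // parks the virtual thread without pinning its carrier
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw retriable;
                }
                delay = delay.multipliedBy(2); // exponential backoff
            }
        }
        throw last;
    }
}
```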
Each recommendation returns: tmdbId, title, year, mediaType, reason, and seasons/episodes (TV only).
Recommendations are cached in PostgreSQL and served instantly on subsequent calls. The full RAG pipeline only runs when ?refresh=true is explicitly passed.
Riyura supports real-time synchronized watch parties powered by STOMP over WebSocket with a SockJS fallback. The WebSocket endpoint is exposed at /ws; party state lives in Redis and messaging is handled via a simple in-memory broker broadcasting to /topic destinations. Transport limits are set to a 128 KB message size, 512 KB send buffer, and a 20 s send time limit.
WebSocketAuthInterceptor intercepts every CONNECT frame, extracts the JWT from the Authorization: Bearer header, and validates it against the Supabase secret. On success it writes userId and userName into the STOMP session attributes and assigns a unique principal (userId-sessionId) per connection — allowing a user to hold multiple simultaneous connections.
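A condensed sketch of that CONNECT-frame check; the real WebSocketAuthInterceptor also stores `userName` and assigns the `userId-sessionId` principal, and the class name here is illustrative:

```java
import java.util.Map;
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.simp.stomp.StompCommand;
import org.springframework.messaging.simp.stomp.StompHeaderAccessor;
import org.springframework.messaging.support.ChannelInterceptor;
import org.springframework.messaging.support.MessageHeaderAccessor;
import org.springframework.security.oauth2.jwt.Jwt;
import org.springframework.security.oauth2.jwt.JwtDecoder;

public class ConnectAuthInterceptorExample implements ChannelInterceptor {

    private final JwtDecoder jwtDecoder;

    public ConnectAuthInterceptorExample(JwtDecoder jwtDecoder) {
        this.jwtDecoder = jwtDecoder;
    }

    @Override
    public Message<?> preSend(Message<?> message, MessageChannel channel) {
        StompHeaderAccessor accessor = MessageHeaderAccessor.getAccessor(message, StompHeaderAccessor.class);
        if (accessor != null && StompCommand.CONNECT.equals(accessor.getCommand())) {
            String header = accessor.getFirstNativeHeader("Authorization");
            if (header == null || !header.startsWith("Bearer ")) {
                throw new IllegalArgumentException("Missing bearer token on CONNECT");
            }
            // Validated against the Supabase secret by the injected decoder
            Jwt jwt = jwtDecoder.decode(header.substring("Bearer ".length()));
            Map<String, Object> session = accessor.getSessionAttributes();
            if (session != null) {
                session.put("userId", jwt.getSubject());
            }
        }
        return message;
    }
}
```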
PartyWebSocketController handles all inbound STOMP messages under /app/party/{partyId}/. Participants can join a party, send chat messages, report buffering state, and send heartbeats. The host additionally controls playback sync and can toggle strict sync mode. All events are broadcast to /topic/party/{partyId} so every member receives them in real time. Per-user acknowledgments (e.g. heartbeat ACK) are sent to the user's private queue.
- Host migration: When the host disconnects, `WebSocketEventListener` automatically promotes the next participant to host and broadcasts `NEW_HOST_ASSIGNED`
- Participant cap: Parties are limited to a maximum of 20 participants — join attempts beyond this limit receive a 400 error
- Party ID validation: All party IDs are validated against a strict alphanumeric regex (`^[A-Za-z0-9]{1,20}$`) to prevent Redis key injection
- Zombie eviction: On each WebSocket heartbeat, participants inactive for more than 45 seconds are removed from the party
- Auto-Sync on Join: Newly joined participants immediately receive a directed `SYNC` event with the host's current playback position to synchronize instantly
- Participant Sync Pull: A failsafe `/request-sync` button allows any participant to request a manual sync. The server reads the current state from Redis and returns a directed `SYNC` event exclusively to the requester without interrupting the host
- Buffering sync: When a participant reports buffering, `PartyService` tracks all buffering participants. In strict sync mode this triggers a `FORCE_PAUSE` to all members; once all are ready a `RESUME` is broadcast
- Strict sync mode: When enabled (host only), any participant buffering pauses playback for everyone
- Latency compensation: Sync commands carry a `clientTime` timestamp that is validated for plausibility (positive, not in the future, within 30 s of server time). The service applies a compensation offset based on round-trip time before broadcasting the target playback position
- Chat history: The last 50 chat messages are stored in the Redis party state and replayed on join
- Party cleanup: When the last participant leaves or the party TTL expires, the party is removed from Redis
Riyura uses Spring Security configured as an OAuth2 Resource Server with Supabase as the identity provider.
- Algorithm: HS256 (HMAC-SHA256) with a custom `NimbusJwtDecoder`
- Issuer validation: Supabase project URL
- JWT subject: Used as `userId` throughout the system (UUID format)
- CORS: Configured globally via `SecurityConfig` using the `APP_FRONTEND_URL` environment variable — no per-controller `@CrossOrigin` annotations. Allowed headers are narrowed to `Authorization`, `Content-Type`, and `Accept`
Content discovery endpoints (/api/movies/**, /api/tv/**, /api/anime/**, /api/search/**, /api/banner/**, /api/explore/**) are fully public. User-specific endpoints (/api/profile/**, /api/watchlist/**, /api/party/**) require a valid Supabase-issued JWT. Observability and testing endpoints (/api/test/**) are strictly restricted to ROLE_ADMIN in production, with the exception of /api/test/health which remains public for frontend status checks. The WebSocket endpoint (/ws/**) is publicly reachable at the HTTP level, but authentication is enforced at the STOMP CONNECT frame by WebSocketAuthInterceptor.
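A condensed sketch of how this wiring could look, assuming the Supabase secret arrives via a `supabase.jwt.secret` property; the property name and the omission of CORS, admin, and party rules are simplifications, not the project's actual SecurityConfig:

```java
import java.nio.charset.StandardCharsets;
import javax.crypto.spec.SecretKeySpec;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.oauth2.jwt.JwtDecoder;
import org.springframework.security.oauth2.jwt.NimbusJwtDecoder;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class SecurityConfigExample {

    @Bean
    JwtDecoder jwtDecoder(@Value("${supabase.jwt.secret}") String secret) {
        // Symmetric HS256 verification against the Supabase signing secret
        SecretKeySpec key = new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256");
        return NimbusJwtDecoder.withSecretKey(key).build();
    }

    @Bean
    SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        return http
                .csrf(csrf -> csrf.disable())
                .authorizeHttpRequests(auth -> auth
                        .requestMatchers("/api/movies/**", "/api/tv/**", "/api/anime/**",
                                "/api/search/**", "/api/banner/**", "/api/explore/**", "/ws/**").permitAll()
                        .anyRequest().authenticated())
                .oauth2ResourceServer(oauth2 -> oauth2.jwt(jwt -> {}))
                .build();
    }
}
```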
All controllers are annotated with @Validated to enable Jakarta Bean Validation on both path variables and query parameters. Request bodies use @Valid to trigger DTO-level validation. Key validations include:
| Layer | Validation |
|---|---|
| Path variables | TMDB IDs validated as Long (rejects non-numeric input), limits bounded with @Min / @Max |
| Query params | Page numbers bounded (@Min(0) or @Min(1)), limits bounded (@Min(1) @Max(50)) |
| Request DTOs | @Positive on IDs, @Size on strings, @Min(0) on numeric fields, @Pattern on enum-like fields |
| Party IDs | Validated against ^[A-Za-z0-9]{1,20}$ regex to prevent Redis key injection |
| Search queries | URL-encoded via URLEncoder.encode() before being passed to the content API, preventing URL/query injection |
| Language codes | 2-letter passthrough codes validated as strictly alphabetic |
Controllers contain no local try-catch blocks or hand-rolled error JSON. A centralized @RestControllerAdvice (GlobalExceptionHandler) intercepts all exceptions and returns a consistent JSON ApiErrorResponse, preventing stack traces, internal class names, and other sensitive details from leaking to clients while keeping error responses uniform across the entire API (a condensed sketch follows the table below).
| Exception Type | HTTP Status | Behavior |
|---|---|---|
| `MethodArgumentNotValidException` | 400 | Returns field-level validation error messages |
| `ConstraintViolationException` | 400 | Returns parameter-level constraint violations |
| `MethodArgumentTypeMismatchException` | 400 | Returns type conversion error (e.g. "abc" → Long) |
| `ResponseStatusException` | Varies | Forwards the status and reason from the service |
| All other `Exception` | 500 | Generic "An unexpected error occurred" message |
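A condensed sketch of such an advice class; the `ApiErrorResponse` record here is a stand-in for the project's real DTO, and only three of the handlers from the table are shown:

```java
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.MethodArgumentNotValidException;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
import org.springframework.web.server.ResponseStatusException;

@RestControllerAdvice
public class GlobalExceptionHandlerExample {

    record ApiErrorResponse(int status, String message) {} // stand-in for the real DTO

    @ExceptionHandler(MethodArgumentNotValidException.class)
    ResponseEntity<ApiErrorResponse> handleValidation(MethodArgumentNotValidException ex) {
        String message = ex.getBindingResult().getFieldErrors().stream()
                .map(error -> error.getField() + ": " + error.getDefaultMessage())
                .findFirst()
                .orElse("Validation failed");
        return ResponseEntity.badRequest().body(new ApiErrorResponse(400, message));
    }

    @ExceptionHandler(ResponseStatusException.class)
    ResponseEntity<ApiErrorResponse> handleStatus(ResponseStatusException ex) {
        // Forward the status and reason chosen by the service layer
        return ResponseEntity.status(ex.getStatusCode())
                .body(new ApiErrorResponse(ex.getStatusCode().value(), ex.getReason()));
    }

    @ExceptionHandler(Exception.class)
    ResponseEntity<ApiErrorResponse> handleUnexpected(Exception ex) {
        // Never leak stack traces or internal class names to the client
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(new ApiErrorResponse(500, "An unexpected error occurred"));
    }
}
```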
The ClientIdentifierProvider extracts client IP from X-Forwarded-For for rate-limiting keys. The extracted IP is validated against a strict regex ([0-9a-fA-F.:]+) and length check (≤ 45 characters for IPv6) to prevent header injection attacks that could manipulate rate-limit bucket keys.
Riyura uses Bucket4J backed by Redis (Lettuce) for distributed, highly available rate limiting across all HTTP traffic. The implementation is production-ready, fail-open when Redis is unavailable, and compatible with JDK 21 virtual threads.
| Component | Responsibility |
|---|---|
| `LettuceBasedProxyManager` | Distributed bucket store — fetches or creates per-key buckets in Redis via atomic CAS operations |
| `ClientIdentifierProvider` | Resolves caller identity: authenticated user ID (JWT `sub`) or client IP; handles `X-Forwarded-For` with regex sanitization |
| `RateLimitTierService` | Maps request URIs to rate-limit tiers with distinct bucket configurations |
| `RateLimitFilter` | `OncePerRequestFilter` at order 1 — runs after Security so JWT context is available for per-user scoping |
All limits use greedy refill (tokens replenish continuously over the window, not in a burst at reset), which smooths traffic and avoids thundering-herd spikes.
| Tier | Endpoints | Limit |
|---|---|---|
| DEFAULT | All other `/api/**` routes | 100 requests / minute |
| HEAVY | `/api/explore`, `/api/search`, `/api/anime` | 30 requests / minute |
| PARTY | `/api/party`, `/ws/**` | 10 requests / minute |
Heavy endpoints proxy expensive external API calls (TMDB, etc.); party endpoints cover WebSocket handshakes and party creation — both are more resource-intensive and thus throttled more aggressively.
Keys follow the format `rate_limit:{identity}:{tier}`:

- Authenticated → `rate_limit:user:{supabase-uuid}:default`
- Anonymous → `rate_limit:ip:{client-ip}:heavy`
This isolates limits per user (or per IP for anonymous traffic) and per tier, so one client exhausting explore does not affect their movie or party quota.
LettuceBasedProxyManager is configured with an ExpirationAfterWriteStrategy that assigns a Redis TTL to each bucket key based on the refill period plus a buffer. Inactive buckets expire and are evicted automatically — no unbounded key growth in Redis.
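For illustration, a hedged sketch of how the greedy-refill tiers could be declared with Bucket4J; exact builder calls vary between Bucket4J versions, and the real RateLimitTierService may structure this differently:

```java
import java.time.Duration;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.BucketConfiguration;
import io.github.bucket4j.Refill;

public final class RateLimitTiersExample {

    // DEFAULT tier: 100 requests per minute, with tokens trickling in continuously (greedy refill)
    public static BucketConfiguration defaultTier() {
        return BucketConfiguration.builder()
                .addLimit(Bandwidth.classic(100, Refill.greedy(100, Duration.ofMinutes(1))))
                .build();
    }

    // HEAVY tier: 30 requests per minute for endpoints that proxy expensive external calls
    public static BucketConfiguration heavyTier() {
        return BucketConfiguration.builder()
                .addLimit(Bandwidth.classic(30, Refill.greedy(30, Duration.ofMinutes(1))))
                .build();
    }
}
```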
Allowed request: The filter attaches X-RateLimit-Remaining to the response and proceeds. No JSON body is modified.
Blocked request (429 Too Many Requests):
- HTTP status: `429`
- `Retry-After` header: seconds until refill (derived from Bucket4J's `nanosToWaitForRefill`)
- JSON body (strict format):

```json
{
  "status": 429,
  "error": "Too Many Requests",
  "message": "You have been rate limited. Try again after 42 second(s)."
}
```

The filter short-circuits the request; the controller is never reached.
The following prefixes are not rate-limited (to avoid Swagger and health checks consuming quota or failing under load):
- `/swagger-ui`
- `/v3/api-docs`
- `/actuator`
- `/favicon.ico`
The Redis evaluation is strictly isolated in its own try/catch block. The result (allowed, blocked, or fail-open) is stored in local variables, and the try block is closed before chain.doFilter() is called. This two-phase design is intentional:
- If Redis is unreachable or times out, the filter logs a `WARN`, sets a `failOpen` flag, and allows the request through without touching the response.
- Because `chain.doFilter()` is outside the `try` block, any exception thrown by a controller propagates normally up the filter chain. It is never misidentified as a Redis failure and can never trigger a double execution on an already-committed response (which would crash Tomcat with `IllegalStateException: response already committed`). See the structural sketch below.
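A structural sketch of that two-phase layout, with the bucket lookup and 429 body rendering elided; this is not the project's actual RateLimitFilter:

```java
import java.io.IOException;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.web.filter.OncePerRequestFilter;

public class TwoPhaseRateLimitFilterExample extends OncePerRequestFilter {

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        boolean allowed;
        try {
            // Phase 1: consult Redis only. Any Redis failure is contained here.
            allowed = tryConsumeToken(request);
        } catch (RuntimeException redisFailure) {
            logger.warn("Rate limiter unavailable, failing open", redisFailure);
            allowed = true; // fail-open
        }

        if (!allowed) {
            response.setStatus(429); // short-circuit: the controller is never reached
            return;
        }

        // Phase 2: outside the try block, so controller exceptions propagate normally
        // and can never be mistaken for a Redis failure.
        chain.doFilter(request, response);
    }

    private boolean tryConsumeToken(HttpServletRequest request) {
        return true; // placeholder for the Bucket4J/Redis consumption call
    }
}
```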
RateLimitConfig inspects the Lettuce native client with instanceof before opening the dedicated Bucket4J connection, rather than blindly casting to RedisClient:
- `RedisClusterClient` → used in AWS ElastiCache Cluster Mode, Redis Enterprise
- `RedisClient` → used in standalone / Sentinel deployments
If neither matches (e.g. a future Lettuce client type), the application fails fast at startup with a clear IllegalStateException rather than throwing a cryptic ClassCastException at runtime under load. This makes the configuration safe to deploy to any Redis topology without code changes.
Bucket4J's Lettuce integration issues async Redis commands via Lettuce's non-blocking pipeline. The custom filter contains no synchronized blocks, so carrier threads are never pinned during Redis I/O — safe for high-concurrency virtual-thread workloads.
Riyura exposes a lightweight, unauthenticated liveness endpoint at GET /api/test/health. It is designed to be polled by the frontend immediately after a user lands on the login page — if the backend is unreachable or returns 503, a downtime modal is shown before any auth flow is attempted.
Each probe is executed synchronously and timed independently so that latency regressions are visible before they become hard outages.
| Component | Probe | Failure Severity |
|---|---|---|
| Database | `SELECT 1` via JdbcTemplate — validates the HikariCP connection pool and PostgreSQL reachability | Critical → DOWN |
| Redis | `PING` command via StringRedisTemplate — validates the Lettuce connection | Non-critical → DEGRADED |
Redis is deliberately treated as non-critical: if it is unreachable, caching, rate limiting, and party state are degraded but core playback and profile features still function.
- DB = DOWN → aggregate = DOWN
- DB = UP, Redis = DOWN → aggregate = DEGRADED
- DB = UP, Redis = UP → aggregate = UP
| Aggregate Status | HTTP Status | Frontend action |
|---|---|---|
| `UP` | 200 OK | Proceed normally |
| `DEGRADED` | 200 OK | Proceed (degraded UX banner optional) |
| `DOWN` | 503 Service Unavailable | Show downtime modal |
```json
{
  "status": "UP",
  "components": {
    "database": { "status": "UP", "latencyMs": 4 },
    "redis": { "status": "UP", "latencyMs": 1 }
  },
  "checkedAt": "2026-03-04T01:14:07Z"
}
```

When a component is unhealthy, the errorMessage field is included in that component's object:
```json
{
  "status": "DOWN",
  "components": {
    "database": {
      "status": "DOWN",
      "latencyMs": 5002,
      "errorMessage": "Database is unreachable: Connection refused"
    },
    "redis": { "status": "UP", "latencyMs": 1 }
  },
  "checkedAt": "2026-03-04T01:14:07Z"
}
```

The errorMessage field is omitted from JSON when null (healthy components) so that clean responses stay noise-free.
/api/health is listed in the SecurityConfig permit-all block — no Authorization header is required. This endpoint should never return sensitive internal state; error messages are intentionally limited to connection-level failures.
The health endpoint falls under the DEFAULT rate-limit tier (100 req/min per client). It is exempt from the Redis fail-open concern because the health check itself reports Redis availability — a design that avoids a circular dependency where the rate limiter silently passes through while health falsely reports UP.
To avoid inadvertently exposing internal infrastructure information, metrics, or JVM details, the Spring Boot Actuator is heavily restricted. In environments other than local dev, management.endpoints.web.exposure.include is hardcoded to health only.
```typescript
// Run before auth flow on the login page
async function checkHealth(): Promise<boolean> {
  try {
    const res = await fetch("/api/health", {
      signal: AbortSignal.timeout(5000),
    });
    return res.ok; // false for 503
  } catch {
    return false; // network unreachable
  }
}
```

Riyura integrates an enterprise-grade observability and resilience stack to ensure high availability, proactive monitoring, and comprehensive debugging capabilities.
Internal application metrics are automatically exposed via Spring Boot Actuator and scraped by Prometheus. Grafana is used to translate this time-series data into visual dashboards.
- Actuator Endpoint: `/actuator/prometheus` (exposes JVM memory, GC pauses, HTTP request latencies, and thread pool statuses).
- Access Prometheus: Available locally at `http://localhost:9090`.
- Access Grafana: Available locally at `http://localhost:3000` (default login is `admin` / `admin`).
You can configure alerts in Grafana to notify your team when API error rates spike or cache hit rates drop.
Standard application logs are streamed in real-time directly into a local Grafana Loki instance via the loki-logback-appender.
- Direct Push: The backend batches and pushes logs directly to Loki via HTTP without needing a heavy Promtail sidecar.
- Log Querying: Logs can be explored dynamically alongside metrics inside the Grafana UI (`http://localhost:3000`).
Static code analysis and security scans are available on-demand using a containerized SonarQube server.
- Launch SonarQube: Run `docker compose -f docker-compose.sonarqube.yml up -d`.
- Run Scan: Execute `mvn clean verify sonar:sonar -Dsonar.host.url=http://localhost:9000`.
- View Results: Access the SonarQube dashboard at `http://localhost:9000` (default login is `admin` / `admin`).
All external API calls (e.g., fetching TMDB metadata) are wrapped in Resilience4j Circuit Breakers to prevent cascading failures.
- Behavior: If the external API's failure rate exceeds 50% over the last 10 calls, the circuit opens.
- Fail-Fast: While the circuit is open, subsequent calls fail fast, immediately returning a structured `503 Service Unavailable` error instead of exhausting Tomcat worker threads by waiting for external timeouts (sketched below).
- Recovery: The circuit automatically transitions to a half-open state after 10 seconds to test whether the external service has recovered.
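A hedged sketch of the annotation-driven usage; the instance name, fallback wording, and the assumption that the 50 % / 10-call / 10-second thresholds live in application.yaml are illustrative rather than the project's actual configuration:

```java
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ResponseStatusException;

@Component
public class TmdbCircuitBreakerExample {

    @CircuitBreaker(name = "tmdb", fallbackMethod = "tmdbUnavailable")
    public String fetchTrending() {
        return callExternalApi(); // counted as success or failure by the circuit breaker
    }

    // Invoked while the circuit is open (or on failure): fail fast with a structured 503
    private String tmdbUnavailable(Throwable cause) {
        throw new ResponseStatusException(HttpStatus.SERVICE_UNAVAILABLE, "Content API temporarily unavailable");
    }

    private String callExternalApi() {
        return "payload"; // placeholder for the real TmdbClient call
    }
}
```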
The backend includes several production-oriented performance optimizations that improve throughput, latency, and resilience under load.
The RestTemplate bean is backed by JDK 21's built-in java.net.http.HttpClient via JdkClientHttpRequestFactory. This provides automatic HTTP connection pooling (connection reuse across requests) with explicit timeouts:
| Setting | Value |
|---|---|
| Connect timeout | 5 s |
| Read timeout | 10 s |
Without these, a single unresponsive upstream API could stall a request thread indefinitely and cascade into thread pool exhaustion.
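A sketch of how that bean wiring could look with the documented timeouts; the configuration class name is illustrative:

```java
import java.net.http.HttpClient;
import java.time.Duration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

@Configuration
public class HttpClientConfigExample {

    @Bean
    RestTemplate restTemplate() {
        // JDK HttpClient provides connection reuse across requests
        HttpClient httpClient = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5)) // connect timeout
                .build();

        JdkClientHttpRequestFactory factory = new JdkClientHttpRequestFactory(httpClient);
        factory.setReadTimeout(Duration.ofSeconds(10)); // read timeout per request

        return new RestTemplate(factory);
    }
}
```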
Every CompletableFuture used for parallel upstream API calls carries an 8-second timeout via .orTimeout(8, TimeUnit.SECONDS). This acts as a circuit breaker — if any individual API call hangs, the future completes exceptionally rather than blocking forever. This applies to all parallel fetches in BannerService, ExploreService, AnimeService, SearchService, MovieDetailService, TvDetailsService, TvPlayerService, and HistoryService.
Gzip compression is enabled at the server level for JSON, XML, HTML, and plain text responses above 1 KB:
```yaml
server:
  compression:
    enabled: true
    mime-types: application/json,application/xml,text/html,text/plain
    min-response-size: 1024
```

This reduces payload sizes significantly for API responses, especially list-heavy endpoints like explore and search.
Legacy scattered @Value injections have been removed. Configuration is bound from application.yaml exclusively through type-safe @ConfigurationProperties classes (such as TmdbProperties), and URL construction with TMDB parameter concatenation is isolated in dedicated builders such as TmdbUrlBuilder, replacing error-prone String.format() blocks.
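For illustration, a minimal type-safe properties class in the spirit of TmdbProperties; the prefix and field names here are assumptions, not the project's actual keys:

```java
import org.springframework.boot.context.properties.ConfigurationProperties;

// Illustrative shape only; the real TmdbProperties fields may differ.
@ConfigurationProperties(prefix = "tmdb")
public record TmdbPropertiesExample(String baseUrl, String apiKey, String language) {
}
```

Such a record still needs to be registered with the context, e.g. via @ConfigurationPropertiesScan on the application class or @EnableConfigurationProperties on a configuration class.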
Hibernate's show-sql and format_sql are disabled in production configuration. These options log every SQL statement to stdout, which causes measurable overhead from synchronized I/O under concurrent load.
All services use SLF4J (@Slf4j) instead of System.out.println / System.err.println. This avoids the performance penalty of synchronized stdout writes and enables proper log-level filtering, structured output, and integration with log aggregation tools.
CORS is handled exclusively by the global SecurityConfig bean with a centralized CorsConfigurationSource. The allowed origin is read from the APP_FRONTEND_URL environment variable rather than being hardcoded. Allowed headers are explicitly narrowed to Authorization, Content-Type, and Accept instead of using a wildcard. Per-controller @CrossOrigin annotations are not used, avoiding redundant CORS header processing and ensuring consistent origin policy from a single configuration point.