Infinite loop in BandwidthAllocator.allocate() causes CPU 100% under high load (stable-10431)

**Description**
The BandwidthAllocator.allocate() method enters an infinite loop that drives CPU usage to 100% when the bridge handles a large number of participants. The loop occurs when insufficient bandwidth causes improve() to return 0 while the while-loop condition oldRemainingBandwidth != remainingBandwidth remains true, so iteration never ends. The problem is intermittent and shows up when the stress level exceeds 1.0, even with the configured limit of 60 participants per JVB.

**Current behavior**
- CPU usage reaches 100% under high load scenarios
- Infinite loop in bandwidth allocation algorithm
- System becomes unresponsive and load balancer redirects traffic to other JVB instances
- Logs show 3,430+ WARNING messages related to TCC packet processing
- High RTT values (6000ms+) and packet reordering issues
- Intermittent stress level exceeding 1.0 (overloaded state)
- Occurs even with 60 participants per JVB limit configured
- Stress level fluctuates and occasionally spikes above threshold

**Expected Behavior**
- Bandwidth allocation should complete within a reasonable number of iterations
- CPU usage should remain within normal limits even under high load
- System should handle 60 participants per JVB without infinite loops
- Bandwidth allocation should gracefully handle insufficient bandwidth scenarios
- Stress level should remain below 1.0 consistently

**Possible Solution**
Add a loop counter with a maximum iteration limit, plus early termination when a full pass consumes no bandwidth:
```kotlin
var loopCount = 0
val maxLoops = 100
var totalConsumed = 0L

while (oldRemainingBandwidth != remainingBandwidth && loopCount < maxLoops) {
    loopCount++
    oldRemainingBandwidth = remainingBandwidth
    totalConsumed = 0L
    
    for (i in sourceBitrateAllocations.indices) {
        val sourceBitrateAllocation = sourceBitrateAllocations[i]
        if (sourceBitrateAllocation.constraints.isDisabled()) {
            continue
        }

        val consumed = sourceBitrateAllocation.improve(remainingBandwidth, i == 0)
        remainingBandwidth -= consumed
        totalConsumed += consumed
        
        if (remainingBandwidth < 0) {
            oversending = true
        }

        if (sourceBitrateAllocation.isOnStage() && !sourceBitrateAllocation.hasReachedPreferred()) {
            break
        }
    }
    
    // Early termination if no bandwidth was consumed
    if (totalConsumed == 0L) {
        logger.debug("No bandwidth consumed in iteration $loopCount, breaking loop")
        break
    }
}

if (loopCount >= maxLoops) {
    logger.warn("Bandwidth allocation loop exceeded maximum iterations: $loopCount")
}
```
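To exercise the guard outside the bridge, here is a minimal self-contained sketch of the same bounded loop. It is a toy model, not the JVB code: `ToyAllocation`, `Result`, and `allocateBounded` are hypothetical names, the toy `improve()` takes only the remaining budget, and the `-1L` starting value for the previous budget is an assumption made so that the first pass always runs.

```kotlin
// Toy stand-in for a per-source allocation; NOT the real JVB class.
class ToyAllocation(private val steps: List<Long>) {
    private var next = 0

    // Consumes the next step if it fits in the remaining budget, otherwise returns 0.
    fun improve(remaining: Long): Long {
        if (next >= steps.size || steps[next] > remaining) return 0L
        return steps[next++]
    }
}

data class Result(val remaining: Long, val passes: Int, val hitLimit: Boolean)

// Repeats passes over all allocations until the budget stops changing
// or maxPasses is reached, whichever comes first.
fun allocateBounded(budget: Long, allocations: List<ToyAllocation>, maxPasses: Int = 100): Result {
    var remaining = budget
    var previous = -1L // assumed sentinel so the first pass always runs
    var passes = 0
    while (previous != remaining && passes < maxPasses) {
        passes++
        previous = remaining
        for (allocation in allocations) {
            remaining -= allocation.improve(remaining)
        }
    }
    // hitLimit is true only if the budget was still changing when the cap stopped the loop.
    return Result(remaining, passes, previous != remaining && passes >= maxPasses)
}

fun main() {
    val allocations = listOf(
        ToyAllocation(listOf(300L, 400L)),
        ToyAllocation(listOf(200L, 500L))
    )
    println(allocateBounded(1_000L, allocations))
}
```

In this toy model a pass that consumes nothing leaves the budget unchanged, so the while condition alone ends the loop; the pass cap only matters when the budget keeps changing from one pass to the next. Logging `hitLimit` in a real deployment would show which of the two cases is actually being hit.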

**Steps to reproduce**
1. Set up a Jitsi Meet conference with 60 participants per JVB (the configured limit).
2. Have participants turn on video simultaneously or in quick succession.
3. Monitor CPU usage and the JVB logs.
4. Observe the following symptoms:
   - Intermittent stress level spikes above 1.0, even with the participant limit in place
   - CPU usage reaching 100%
   - Repeated WARNING messages in the logs
   - High RTT values and packet processing issues
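One way to confirm that the CPU time is really spent inside allocate() is to take thread dumps while the CPU is pinned and check which threads have that frame on their stack. The sketch below parses jstack-style text; `threadsInFrame` is a hypothetical helper, the parsing assumes a quoted thread name on each header line followed by indented `at ...` frames, and the sample dump in `main` is made up for illustration, not real JVB output.

```kotlin
// Returns the names of threads whose stack contains the given frame text.
// Assumes jstack-style input: a header line starting with a quoted thread name,
// followed by indented "at ..." frame lines.
fun threadsInFrame(dump: String, frame: String): List<String> {
    val hits = mutableListOf<String>()
    var current: String? = null
    for (line in dump.lineSequence()) {
        if (line.startsWith("\"")) {
            // Thread header: the name is the text between the first pair of quotes.
            current = line.substring(1).substringBefore('"')
        } else if (current != null && line.trim().startsWith("at ") && line.contains(frame)) {
            hits.add(current)
            current = null // report each thread at most once
        }
    }
    return hits
}

fun main() {
    // Illustrative input only; thread names and frames are invented.
    val sample = """
        "worker-1" #12 prio=5 runnable
            at org.jitsi.videobridge.cc.allocation.BandwidthAllocator.allocate(BandwidthAllocator.kt)
            at java.lang.Thread.run(Thread.java)
        "worker-2" #13 prio=5 waiting on condition
            at java.lang.Thread.sleep(Native Method)
    """.trimIndent()
    println(threadsInFrame(sample, "BandwidthAllocator.allocate"))
}
```

If the same thread shows up in the same frame across several dumps taken a few seconds apart, that supports the infinite-loop diagnosis; if the frame is absent, the CPU load is likely coming from elsewhere.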

**Environment details**
- Jitsi Videobridge version: stable-10431 (Docker image)
- Deployment: Docker container
- Docker image: jitsi/jvb:stable-10431
- Participants: 60 users per JVB (configured limit)
- Endpoints: 60 (configured limit)
- Configuration: 60 participants per JVB limit is set
- Scenario: Participants turning on video simultaneously or in quick succession
- Stress level: Intermittently exceeds 1.0 (overloaded: true)
- Log evidence: 3,430 WARNING messages, TCC packet processing errors
- File location: jvb/src/main/kotlin/org/jitsi/videobridge/cc/allocation/BandwidthAllocator.kt lines 251-272
- Method: allocate() method in BandwidthAllocator class
- Root cause: The infinite loop occurs when sourceBitrateAllocation.improve() returns 0 due to insufficient bandwidth while the while-loop condition oldRemainingBandwidth != remainingBandwidth remains true, so the loop never exits. This happens intermittently even with the configured participant limit and causes stress levels to spike above 1.0.

Infinite loop in BandwidthAllocator.allocate() causes CPU 100% under high load (stable-10431) #2366
