Skip to content

arch/espressif: serialize Wi-Fi RX queue access.#18652

Open
aviralgarg05 wants to merge 3 commits intoapache:masterfrom
aviralgarg05:fix/issue-16915-esp-wlan-rx-queue-lock
Open

arch/espressif: serialize Wi-Fi RX queue access.#18652
aviralgarg05 wants to merge 3 commits intoapache:masterfrom
aviralgarg05:fix/issue-16915-esp-wlan-rx-queue-lock

Conversation

@aviralgarg05
Copy link
Copy Markdown
Contributor

@aviralgarg05 aviralgarg05 commented Mar 30, 2026

Note: Please adhere to Contributing Guidelines.

Summary

This change fixes a race in the common Espressif Wi-Fi lower-half netdev path used by both the Xtensa and RISC-V drivers.

The Wi-Fi RX callback appends packets to netdev_rx_queue, while the netdev upper-half RX thread removes packets from that same queue. The queue is backed by the IOB queue helpers, and those helpers do not provide their own internal serialization. Under sustained receive traffic, the enqueue and dequeue sides can run concurrently and corrupt the queue state.

When that happens, RX buffers can become stranded in the queue, available IOBs drop over time, and throughput degrades until the interface no longer recovers cleanly. This matches the behavior reported in issue #16915.

To fix that, this patch:

  • adds a dedicated RX spinlock to the shared Espressif Wi-Fi netdev private state
  • initializes the lock during device setup
  • takes the lock before clearing the RX queue during ifup
  • takes the lock before enqueueing packets in the Wi-Fi RX callback
  • takes the lock before dequeueing packets for the upper half

The change is intentionally limited to the shared Espressif Wi-Fi netdev implementation:

  • arch/risc-v/src/common/espressif/esp_wlan_netdev.c
  • arch/xtensa/src/common/espressif/esp_wlan_netdev.c

Issue: #16915

Impact

This is a targeted bug fix for Espressif Wi-Fi devices using the common netdev-based lower-half driver.

Expected impact:

  • prevents RX queue corruption caused by concurrent enqueue/dequeue access
  • avoids leaking or stranding IOB-backed RX packets under heavy traffic
  • improves long-run Wi-Fi stability during sustained throughput tests such as iperf
  • applies consistently to both Xtensa and RISC-V Espressif Wi-Fi targets that use the shared implementation

Compatibility impact:

  • no API changes
  • no Kconfig changes
  • no board configuration changes
  • no documentation changes required

Build and maintenance impact:

  • only two driver source files are touched
  • the fix follows the same locking model already used by older Espressif Wi-Fi code paths when protecting RX/TX queues

Testing

Host machine:

  • macOS on Apple Silicon (CONFIG_HOST_MACOS=y, CONFIG_HOST_ARM64=y)

Repository checks completed locally:

  1. Ran style checks directly on the modified files:

    ./tools/checkpatch.sh -f arch/risc-v/src/common/espressif/esp_wlan_netdev.c \
      arch/xtensa/src/common/espressif/esp_wlan_netdev.c

    Result:

    All checks pass.
    
  2. Re-ran patch-level style checks on the final commit:

    ./tools/checkpatch.sh -g HEAD~1..HEAD

    Result:

    All checks pass.
    
  3. Verified there are no whitespace or patch formatting issues:

    git diff --check

    Result: no output

Build verification attempted in isolated temporary source trees so the working
repository was not disturbed:

  1. Configured esp32c6-devkitc:wifi
  2. Configured esp32-devkitc:wifi

Both configurations completed the NuttX configure step far enough to confirm that the modified sources integrate into the selected board setups, but the actual builds could not be completed on this machine because the required Espressif cross-compilers are not installed in the environment.

Observed build blockers:

  • riscv64-unknown-elf-gcc: command not found for esp32c6-devkitc:wifi
  • xtensa-esp32-elf-gcc: command not found for esp32-devkitc:wifi

Representative logs:

Loaded configuration '.config'
Configuration saved to '.config'
/bin/bash: riscv64-unknown-elf-gcc: command not found
Loaded configuration '.config'
Configuration saved to '.config'
/bin/bash: xtensa-esp32-elf-gcc: command not found

Because the required Espressif toolchains and hardware were not available in this environment, I could not run on-device Wi-Fi traffic tests here. The code path and fix were instead verified by:

  • tracing the RX ownership path in the common Espressif netdev lower half
  • comparing it against the older Espressif Wi-Fi implementation, which already serialized queue access
  • confirming that the only shared mutable structure between the RX callback and upper-half RX thread was the unprotected netdev_rx_queue
  • validating the final patch with repeated style and patch checks

Recommended hardware verification for the PR:

  1. Build and flash esp32-devkitc:wifi and esp32c6-devkitc:wifi
  2. Connect to Wi-Fi and confirm normal connectivity with ping
  3. Run repeated iperf sessions and compare /proc/iobinfo before and after
  4. Confirm IOB counts recover normally and throughput does not collapse after sustained traffic

Protect the common Espressif Wi-Fi netdev RX queue with a spinlock. The Wi-Fi RX callback enqueues packets while the netdev upper-half RX thread dequeues them, and the IOB queue helpers are not internally serialized.

Under sustained receive traffic that race can corrupt the queue state, strand IOBs, and degrade throughput over time as reported in issue apache#16915.

Initialize the lock during device setup and use it when clearing the queue on ifup, enqueueing received packets, and dequeuing them for the upper half.

Signed-off-by: aviralgarg05 <gargaviral99@gmail.com>
@github-actions github-actions bot added Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Arch: xtensa Issues related to the Xtensa architecture Size: S The size of the change in this PR is small labels Mar 30, 2026
Enable the required Xtensa Espressif interrupt options in the standalone uvc defconfig.

The recent common-source Xtensa update requires CONFIG_ARCH_MINIMAL_VECTORTABLE_DYNAMIC, CONFIG_ARCH_IRQ_TO_NDX, and CONFIG_ARCH_NUSER_INTERRUPTS=2. The include-based debug configs inherit those options already, but the standalone uvc defconfig did not, which breaks the Linux (xtensa-03) CI build on lckfb-szpi-esp32s3/uvc.

Set the required options explicitly so the config matches the current Espressif Xtensa IRQ requirements.

Signed-off-by: aviralgarg05 <gargaviral99@gmail.com>
@eren-terzioglu
Copy link
Copy Markdown
Contributor

Hi,

Please do not merge it without testing it.

For required toolchain please refer to nuttx dockerfile and then check related version of that toolchain support for your target. Seems it does for risc-v also for xtensa

@fdcavalcanti
Copy link
Copy Markdown
Contributor

Please provide testing report with sustained IPERF usage.

@fdcavalcanti
Copy link
Copy Markdown
Contributor

Why can't this fix be done on network driver side?

Also, how are you able to do such fix on a deep part of the source without even knowing how to build said architecture? Sorry but it is quite an odd approach.

@aviralgarg05
Copy link
Copy Markdown
Contributor Author

Why can't this fix be done on network driver side?

Also, how are you able to do such fix on a deep part of the source without even knowing how to build said architecture? Sorry but it is quite an odd approach.

My initial reasoning was that the shared mutable object involved in the suspected race is the driver-owned netdev_rx_queue. The Wi-Fi RX callback enqueues packets into that queue, while the netdev upper-half RX thread dequeues them, and the IOB queue helpers themselves do not provide internal serialization. Based on that, I considered the lower-half driver to be the narrowest place to address the problem.

However, your broader concern is also valid. While I was able to trace the code path and compare it with the older Espressif implementation, that is not a substitute for having the proper toolchain setup and real target testing in place. I should have completed that validation before proposing a fix at this level, and I will correct that before taking this further.

acassis
acassis previously approved these changes Apr 1, 2026
This reverts commit f106761.

Signed-off-by: aviralgarg05 <gargaviral99@gmail.com>
@aviralgarg05 aviralgarg05 force-pushed the fix/issue-16915-esp-wlan-rx-queue-lock branch from e176566 to 2359678 Compare April 1, 2026 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Arch: xtensa Issues related to the Xtensa architecture Size: S The size of the change in this PR is small

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants