Labels: #g-orchestration, :release, P1, bug, customer-beatrix, customer-beethoven, customer-blondelet, customer-cervantes, customer-concordia, customer-coppelia, customer-dittebesard, customer-easterwood, customer-eponym, customer-fairbank, customer-fiorella, customer-firenze, customer-flacourtia, customer-fourier, customer-gerti, customer-gispen, customer-granada, customer-hawking, customer-hubble, customer-invernizzi, customer-juliana, customer-kokkola, customer-mozartia, customer-nortia, customer-numa, customer-nuptel, customer-pingali, customer-pingouin, customer-rainer, customer-redqueen, customer-reedtimmer, customer-rembrandt, customer-rialto, customer-rocher, customer-rosner, customer-sarahwu, customer-schur, customer-seidel, customer-stoffregen, customer-thumper, customer-vesta, customer-vincent, customer-vulcano, customer-weerstra, prospect-benzenberg, prospect-culbertson, prospect-onaka, prospect-siena, prospect-veeder, ~help-p1, ~released bug
Description
Fleet version: v4.81.0
Web browser and operating system: N/A
💥 Actual behavior
Since 4.81.0 was applied, we've noticed an increase in cron (`usage_statistics`) failure alerts to help-p1 for customer-numa. Cleaning up the `error:*:json` and `error:*:count` keys clears the alerts until the number of keys grows again.
The `error:*:json` keys have the following body (pretty-printed):

```json
[
  {
    "message": "request body read error: read tcp <redacted_ip>:8080-><redacted_ip>:40940: i/o timeout"
  },
  {
    "message": "missing FleetError in chain",
    "data": {
      "timestamp": "2026-02-26T20:50:55Z"
    },
    "stack": [
      "github.com/fleetdm/fleet/v4/server/platform/endpointer.EncodeError (transport_error.go:78)",
      "github.com/fleetdm/fleet/v4/server/service.fleetErrorEncoder (transport_error.go:122)",
      "github.com/go-kit/kit/transport/http.Server.ServeHTTP (server.go:117)",
      "github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerRequestSize.func2 (instrument_server.go:255)",
      "net/http.HandlerFunc.ServeHTTP (server.go:2322)",
      "github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1 (instrument_server.go:296)",
      "net/http.HandlerFunc.ServeHTTP (server.go:2322)",
      "github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1 (instrument_server.go:147)",
      "net/http.HandlerFunc.ServeHTTP (server.go:2322)",
      "github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2 (instrument_server.go:109)",
      "net/http.HandlerFunc.ServeHTTP (server.go:2322)"
    ]
  }
]
```
In the Fleet logs, these errors look like the following:

```
{"component":"http","err":"request body read error: read tcp <redacted_ip>:8080-><redacted_ip>:62524: i/o timeout","level":"info","path":"/api/v1/osquery/distributed/write","took":"25.000126271s","ts":"2026-02-24T07:45:54.817144386Z","uuid":"<redacted>"}
```
Also in the Fleet logs, the following error messages began to appear after 4.81.0 and are likely related to the errors above:
{"component":"http","err":"Request exceeds the max size limit of 1.049MB. Configure the limit: https://fleetdm.com/docs/configuration/fleet-server-configuration#server-default-max-request-body-size","internal":"Request exceeds the max size limit of 1.049MB, Incoming Content-Length: 105.2MB","level":"info","path":"/api/osquery/log","took":"2.626835901s","ts":"2026-03-02T15:27:16.062914895Z"}
🛠️ To fix
- We need to remove the size limits on the following two osquery endpoints (see `WithRequestBodySizeLimit` in `handler.go`):
  - `/api/{v1/}osquery/distributed/write`
  - `/api/{v1/}osquery/log`
- Fix them both the same way we fixed the `/api/{v1/}osquery/carves/block` endpoint in Authenticate carve block endpoint before parsing the "data" field #39353: perform raw JSON parsing on the request body to extract the node key and authenticate it before reading the rest of the JSON body (to prevent DDoS attacks).
🧑‍💻 Steps to reproduce
These steps:
- Have been confirmed to consistently lead to reproduction in multiple Fleet instances.
- Describe the workflow that led to the error, but have not yet been reproduced in multiple Fleet instances.
- TODO
- TODO
🕯️ More info (optional)