I've ported KDiskMark to the Mac and have been doing some comparisons with the native benchmarking utilities there. It turns out I was getting more than 500 syslog messages per second from the kernel (`aio_queue_async_request(): too many in flight for proc: 16`), causing the fans to spin up. This seems to happen for any test (targeting an SSD connected over USB3) that uses 4K blocks with the posixaio engine. I didn't get any hits searching on the function name, so maybe this is not a known issue, or not an issue with newer versions of the OS?
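For reference, one way to watch the message rate while a benchmark runs (system.log is where the kernel message lands on an older macOS like mine; newer releases would use `log stream` from the unified log instead):

```
# Follow the kernel message as it is emitted (older macOS syslog path)
$ tail -f /var/log/system.log | grep aio_queue_async_request
```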
Replies: 8 comments

---

This isn't a fio issue. Google the warning and it'll explain this OSX oddity.

---

Hello @RJVB: You didn't mention which version of macOS/OSX you are using, and it would help to have the other information requested in https://github.com/axboe/fio/blob/master/REPORTING-BUGS added here too. I'm aware you don't know whether it's a bug in fio, but that information can still help.

A Google search for "aio_queue_async_request(): too many in flight for proc" came back with a Gemini summary that mentioned it's OSX complaining that "a specific process has reached the maximum allowed limit of 16 outstanding asynchronous I/O (AIO) requests." (!) It also said that "This error is historically common on older macOS versions (10.6–10.13) and usually appears in the system log (kernel.log) during heavy I/O operations." The AI summary also linked to a blog post from 2013 which talks about VirtualBox triggering the "too many in flight for proc" kernel message. That post mentions the message, but it also mentions some sysctl tunables, which on my macOS 15.7.3 laptop are still at their default values (queried below).

I then tried a Google search for "macos posix aio maximum 16", and a bunch of the links on the first page mention a default outstanding I/O limit of 16. So, as Jens said, I think this is a limitation you hit when you try and use POSIX AIO on macOS/OSX.
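The tunables can be queried with sysctl; the values shown below are reconstructed from the limits cited in this thread (the per-process limit of 16 from the kernel message itself, and the global aiomax of 90 mentioned further down), while kern.aiothreads is included on the assumption that it is the companion worker-thread tunable:

```
# XNU's POSIX AIO tunables from the kern sysctl tree
$ sysctl kern.aiomax kern.aioprocmax kern.aiothreads
kern.aiomax: 90
kern.aioprocmax: 16
kern.aiothreads: 4
```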

---

On Saturday February 21 2026 09:16:21 Sitsofe Wheeler wrote:
> You didn't mention which version of macOS/OSX you are using and it would help to have the other information requested in https://github.com/axboe/fio/blob/master/REPORTING-BUGS added here too. I'm aware you don't know if it's a bug in fio but that information can still help.
It's an older machine, limited to 10.13 I think, but kept at 10.9.5 - and I see that the 15.7 kernel has the same default limits as mine.
Fio is at v3.41-110-g1a04e5.
> So as Jens said I think this is a macOS/OSX limitation when you try and use POSIX aio.
That was my assumption too. That still doesn't mean fio couldn't (or shouldn't) try not to run into it if that's in any way possible, just as it caps the requested IO depth when using a synchronous IO engine. I probably would never have noticed anything beyond the fans spinning up (in itself not surprising in a benchmarking/stress-testing application) if I didn't always have a view of my system.log open, and it seems pretty likely that the measured results cannot be reliable while this issue is being triggered. And that *does* seem like a fio issue to me, even if it's a case of garbage-in, garbage-out.
This is a bit over my head, but the explanation that this concerns outstanding AIO requests suggests that it might be avoidable by capping the IO depth (or, more likely, the IO depth per requested thread/job). I'll experiment with that; a sketch of what I mean follows.
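Something along these lines - a frontend-side clamp, not anything fio does itself; the fio parameters are copied from the runs further down:

```
# Clamp the requested queue depth to the kernel's per-process AIO limit
# before handing it to fio (workaround sketch, not fio behaviour)
depth=32
aioprocmax=$(sysctl -n kern.aioprocmax)   # 16 on the systems discussed here
[ "$depth" -gt "$aioprocmax" ] && depth=$aioprocmax
fio --ioengine=posixaio --direct=1 --bs=4k --rw=randread --size=1024m \
    --runtime=10 --iodepth="$depth" --numjobs=1 --name=randread \
    --filename=/Volumes/VMs/tmp/kdiskmark-L62363.tmp
```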
FWIW, posixaio doesn't necessarily give much better throughputs than, say, psync, but IIRC latencies are shorter and the testing requires a lot less CPU (especially kernel CPU time, surprisingly - when no issues are being triggered, of course ^^). A comparison command is sketched below.
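For a like-for-like check, this would be the same job through the synchronous psync engine, where the effective depth is 1 and the AIO limits never come into play (illustrative only, no results implied):

```
# Same workload via the synchronous psync engine for comparison
$ time fio --ioengine=psync --direct=1 --bs=4k --rw=randread --size=1024m \
      --runtime=10 --name=randread \
      --filename=/Volumes/VMs/tmp/kdiskmark-L62363.tmp
```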

---

René J.V. Bertin wrote on 20260221::19:32:38 re: "Re: [axboe/fio] [Mac] aio_queue_async_request(): too many in flight for proc: 16. (Discussion #2062)"
> This is a bit over my head, but the explanation that this concerns outstanding open aio async requests suggests that this too might be avoidable by capping the IO depth (or more likely the IO depth per requested thread/job). I'll experiment with that.
Bingo - and as you probably already surmised, numjobs makes no difference here since fio forks processes rather than spawning threads, so each job gets its own per-process allowance (I've tried up to 16 x 7 = 112 to go above the aiomax limit of 90, without effect).
An IO depth of one above the kernel's per-process max, i.e. 17, triggers the issue:
```
$ time fio --ioengine=posixaio --randrepeat=0 --refill_buffers=1 --end_fsync=1 --direct=1 --rwmixread=70 --filename=/Volumes/VMs/tmp/kdiskmark-L62363.tmp --name=randread --size=1024m --zero_buffers=0 --bs=4k --runtime=10 --rw=randread --iodepth=17 --numjobs=1
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=17
fio-3.41-110-g1a04e5
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=42.4MiB/s][r=10.8k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=62409: Sat Feb 21 19:56:21 2026
read: IOPS=10.9k, BW=42.5MiB/s (44.6MB/s)(425MiB/10002msec)
slat (nsec): min=0, max=6753.0k, avg=3245.21, stdev=22744.64
clat (usec): min=85, max=9212, avg=1456.50, stdev=193.38
lat (usec): min=225, max=9346, avg=1459.74, stdev=194.11
clat percentiles (usec):
| 1.00th=[ 1004], 5.00th=[ 1156], 10.00th=[ 1237], 20.00th=[ 1336],
| 30.00th=[ 1385], 40.00th=[ 1418], 50.00th=[ 1450], 60.00th=[ 1483],
| 70.00th=[ 1532], 80.00th=[ 1582], 90.00th=[ 1663], 95.00th=[ 1762],
| 99.00th=[ 1926], 99.50th=[ 2024], 99.90th=[ 2311], 99.95th=[ 2442],
| 99.99th=[ 3490]
bw ( KiB/s): min=42448, max=44328, per=100.00%, avg=43587.79, stdev=425.01, samples=19
iops : min=10612, max=11082, avg=10896.95, stdev=106.25, samples=19
lat (usec) : 100=0.01%, 250=0.01%, 500=0.05%, 750=0.14%, 1000=0.73%
lat (msec) : 2=98.48%, 4=0.57%, 10=0.01%
cpu : usr=8.11%, sys=16.43%, ctx=110646, majf=0, minf=36
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=46.2%, 16=53.7%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=99.1%, 8=0.8%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=108883,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=17
Run status group 0 (all jobs):
READ: bw=42.5MiB/s (44.6MB/s), 42.5MiB/s-42.5MiB/s (44.6MB/s-44.6MB/s), io=425MiB (446MB), run=10002-10002msec
0.894 user_cpu 1.685 kernel_cpu 0:10.34 total_time 24.8%CPU {13320192M 0F 3962R 0I 0O 0k 65817w 46356c}
```
With an IO depth of 16 it doesn't trigger:
```
$ time fio --ioengine=posixaio --randrepeat=0 --refill_buffers=1 --end_fsync=1 --direct=1 --rwmixread=70 --filename=/Volumes/VMs/tmp/kdiskmark-L62363.tmp --name=randread --size=1024m --zero_buffers=0 --bs=4k --runtime=10 --rw=randread --iodepth=16 --numjobs=1
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=16
fio-3.41-110-g1a04e5
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=43.3MiB/s][r=11.1k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=62425: Sat Feb 21 20:00:06 2026
read: IOPS=11.1k, BW=43.4MiB/s (45.5MB/s)(434MiB/10002msec)
slat (nsec): min=0, max=933000, avg=2766.49, stdev=9411.64
clat (usec): min=33, max=5689, avg=1425.03, stdev=180.35
lat (usec): min=250, max=5690, avg=1427.79, stdev=179.60
clat percentiles (usec):
| 1.00th=[ 988], 5.00th=[ 1139], 10.00th=[ 1221], 20.00th=[ 1319],
| 30.00th=[ 1369], 40.00th=[ 1401], 50.00th=[ 1418], 60.00th=[ 1450],
| 70.00th=[ 1483], 80.00th=[ 1532], 90.00th=[ 1631], 95.00th=[ 1729],
| 99.00th=[ 1893], 99.50th=[ 2008], 99.90th=[ 2376], 99.95th=[ 2638],
| 99.99th=[ 4621]
bw ( KiB/s): min=42712, max=44918, per=99.99%, avg=44477.10, stdev=490.35, samples=20
iops : min=10678, max=11229, avg=11119.25, stdev=122.57, samples=20
lat (usec) : 50=0.01%, 250=0.02%, 500=0.07%, 750=0.18%, 1000=0.84%
lat (msec) : 2=98.37%, 4=0.50%, 10=0.02%
cpu : usr=6.53%, sys=12.34%, ctx=117141, majf=0, minf=36
IO depths : 1=0.1%, 2=0.1%, 4=0.2%, 8=49.3%, 16=50.5%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=98.2%, 8=1.7%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=111226,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
READ: bw=43.4MiB/s (45.5MB/s), 43.4MiB/s-43.4MiB/s (45.5MB/s-45.5MB/s), io=434MiB (456MB), run=10002-10002msec
0.736 user_cpu 1.276 kernel_cpu 0:10.34 total_time 19.3%CPU {13320192M 0F 3960R 0I 1O 0k 79909w 38651c}
```
And here's what happens with a depth of 32 (the original benchmark with which I ran into the issue):
```
$ time fio --ioengine=posixaio --randrepeat=0 --refill_buffers=1 --end_fsync=1 --direct=1 --rwmixread=70 --filename=/Volumes/VMs/tmp/kdiskmark-L62363.tmp --name=randread --size=1024m --zero_buffers=0 --bs=4k --runtime=10 --rw=randread --iodepth=32 --numjobs=1
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=32
fio-3.41-110-g1a04e5
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=43.0MiB/s][r=11.0k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=62438: Sat Feb 21 20:04:43 2026
read: IOPS=11.1k, BW=43.3MiB/s (45.4MB/s)(433MiB/10002msec)
slat (nsec): min=0, max=400000, avg=2670.50, stdev=7030.90
clat (usec): min=24, max=4900, avg=1431.37, stdev=182.34
lat (usec): min=247, max=4902, avg=1434.04, stdev=181.50
clat percentiles (usec):
| 1.00th=[ 979], 5.00th=[ 1123], 10.00th=[ 1205], 20.00th=[ 1319],
| 30.00th=[ 1369], 40.00th=[ 1401], 50.00th=[ 1434], 60.00th=[ 1467],
| 70.00th=[ 1500], 80.00th=[ 1549], 90.00th=[ 1647], 95.00th=[ 1729],
| 99.00th=[ 1893], 99.50th=[ 1991], 99.90th=[ 2245], 99.95th=[ 2442],
| 99.99th=[ 3458]
bw ( KiB/s): min=43482, max=44824, per=100.00%, avg=44369.30, stdev=298.01, samples=20
iops : min=10870, max=11206, avg=11092.30, stdev=74.58, samples=20
lat (usec) : 50=0.01%, 250=0.01%, 500=0.07%, 750=0.16%, 1000=0.94%
lat (msec) : 2=98.37%, 4=0.44%, 10=0.01%
cpu : usr=7.58%, sys=15.42%, ctx=113508, majf=0, minf=36
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=45.7%, 16=54.2%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=99.1%, 8=0.9%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=110945,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=43.3MiB/s (45.4MB/s), 43.3MiB/s-43.3MiB/s (45.4MB/s-45.4MB/s), io=433MiB (454MB), run=10002-10002msec
```
Not surprisingly, the results don't seem significantly different, but that's probably because they all represent an effective IO depth of 16 (the "IO depths" histograms above show essentially nothing beyond the 16 bucket). I clearly have no way of telling what numbers to expect for iodepth=32 ...
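The achieved depth distribution can also be pulled out programmatically from fio's JSON output - assuming jq is installed and reading fio's iodepth_level object:

```
# Inspect the achieved depth distribution for the iodepth=32 run
$ fio --ioengine=posixaio --direct=1 --bs=4k --rw=randread --size=1024m \
      --runtime=10 --iodepth=32 --numjobs=1 --name=randread \
      --filename=/Volumes/VMs/tmp/kdiskmark-L62363.tmp \
      --output-format=json | jq '.jobs[0].iodepth_level'
```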

---

When the XNU kernel prints the "too many in flight for proc" message it returns EAGAIN to the caller, so the request is rejected rather than queued.

Something to question is whether it is sensible to log a kernel message every single time the same process hits this limit, as it leads to a lot of time spent just printing messages and effectively becomes a DoS. It looks like Apple came to a similar conclusion around XNU 7195, which according to https://opensource.apple.com/releases/ shipped with macOS 11.

This brings us back to what the Gemini AI said in my original comment ("This error is historically common on older macOS versions (10.6–10.13)"). I think it's a little off in its ending range (I think you would still see the message in 10.15), but ultimately (as Jens stated) this is macOS/OSX kernel behaviour, and furthermore the high kernel CPU usage due to excessive logging is fixed in macOS 11+.
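To check whether a given machine has the throttled logging, the XNU build is visible in the kernel version string (builds at or above xnu-7195 correspond to macOS 11):

```
# Extract the XNU build from the kernel version string
$ uname -v | grep -o 'xnu-[0-9.]*'
```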

---

Another agree to disagree then...
I don't see how this is any different from capping the iodepth when a synchronous engine is in use. The logging issue is just a side-effect which turns out to have little consequence for the benchmark (unless maybe you're on a really old machine with only 2 cores). You're surely aware that not every message gets logged by the syslog daemon. Hurray for people who have a new enough Mac to run a kernel that is fixed in that regard, but they'll still not get the result they asked for, likely enough without being aware of it. In the meantime there must still be plenty of people using Macs capped at 10.15 or earlier, simply because these were good machines of a kind and quality that Apple no longer makes. Mine is from 2011 but it is still fast enough for my current needs.
Either way, I found a workaround that was trivial to add alongside the one already in my KDiskMark fork for the json issue.

---

End of the day, OSX is doing stupid things by exposing a trivial way to trigger a kernel error message. There is ZERO reason to do this, and Apple also realized that and fixed it. EOD, let's not waste any more time on this.

---

Indeed, let's not waste any more time on something that was never the main issue...