Linux block maintainer and IO_uring lead developer Jens Axboe was recently debugging some slowdowns in the AHCI/SCSI code with IO_uring usage. After he turned to Claude AI for help in sorting through the issue, patches were devised that can “literally yield a 50-80x improvement on the io_uring side for idle systems.” The code is on its way to the Linux kernel.
Last week Jens Axboe posted a patch series with these IO_uring fixes, in which he explained:
“Patch 1 here is the real meat of this, patch 2 is just a slight improvement. For patch 1, it can literally yield a 50-80x improvement on the io_uring side for idle systems, where ppoll() ends up sleeping for 500 msec while there’s IO to submit! I noticed this running the io_uring regression tests in a vm, where I use a variety of block devices for some of the tests. They would often randomly time out on AHCI devices, while running them on a virtio-blk or nvme device would finish in one second or so. I then wrote a reproducer to try and grok this and had claude dive into this, which helped me better grasp the various event loops.”
The real kicker beyond the 50-80x improvement for IO_uring is that the main patch is just one line of actual code (plus a few lines of comments). That one line keeps ppoll() from sleeping for up to 499ms before the pending IO is submitted.
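To make the class of problem concrete, here is a minimal Python model of an event loop that picks its poll timeout from the next timer alone, even though IO is queued and waiting to be submitted. This is only a sketch of the behavior described in the quote, not the actual patch: the `EventLoop` class, its fields, and the `check_pending` flag are all hypothetical names invented for illustration, and the sleep is simulated rather than performed.

```python
# Illustrative model only. All names here are hypothetical and are not
# taken from the real patch or from any real event loop implementation.
from dataclasses import dataclass, field


@dataclass
class EventLoop:
    next_timer_ms: int = 499                      # time until the next timer fires
    pending: list = field(default_factory=list)   # IO queued but not yet submitted
    check_pending: bool = False                   # True models a one-line style fix

    def poll_timeout_ms(self) -> int:
        """Return how long the loop is willing to block in its poll call."""
        if self.check_pending and self.pending:
            return 0  # unsubmitted IO exists, so do not sleep at all
        return self.next_timer_ms

    def run_once(self) -> int:
        """Run one iteration and return the simulated time blocked, in ms."""
        slept = self.poll_timeout_ms()  # stands in for ppoll(..., timeout)
        self.pending.clear()            # submission happens only after poll returns
        return slept


buggy = EventLoop(pending=["read"])
fixed = EventLoop(pending=["read"], check_pending=True)
print(buggy.run_once())  # 499
print(fixed.run_once())  # 0
```

In this model the only difference between the two loops is a single extra condition in the timeout calculation, which is why an idle system (no other events to wake the loop early) sees the largest gain.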
Axboe also posted on X about this “60-80x improvement in performance” for IO_uring, as well as his adventures using AI (Claude) to deal with this issue. In the process, Claude ended up partially destroying the virtual disk of the VM used for testing but was then able to recover it.
Axboe commented today that both of these patches are now staged for inclusion in the mainline Linux kernel.
