Last week saw the main set of block and IO_uring feature patches for the Linux 6.19 merge window, but some additional block subsystem material was merged on Monday. Various NVMe updates are now merged, along with a change enabling per-CPU BIO caching by default to help with file-system performance.
This merge is mostly about landing the NVMe feature updates for Linux 6.19, but there are also patches enabling per-CPU BIO caching by default. That turns out to be interesting for helping Linux file-system performance.
A Bytedance engineer is responsible for this default caching change and explained in the patch series:
“For now, per-cpu bio cache was only used in the io_uring + raw block device, filesystem also can use this to improve performance. After discussion, we think it’s better to enable per-cpu bio cache by default.”
The actual patch goes on to provide some hard numbers:
“Since after commit 12e4e8c7ab59 (“io_uring/rw: enable bio caches for IRQ rw”), bio_put is safe for task and irq context, bio_alloc_bioset is safe for task context and no one calls in irq context, so we can enable per cpu bio cache by default.
Benchmarked with t/io_uring and ext4+nvme:
taskset -c 6 /root/fio/t/io_uring -p0 -d128 -b4096 -s1 -c1 -F1 -B1 -R1 -X1 -n1 -P1 /mnt/testfile
base IOPS is 562K, patch IOPS is 574K. The CPU usage of bio_alloc_bioset decrease from 1.42% to 1.22%.
The worst case is allocate bio in CPU A but free in CPU B, still use t/io_uring and ext4+nvme:
base IOPS is 648K, patch IOPS is 647K.
Also use fio test ext4/xfs with libaio/sync/io_uring on null_blk and nvme, no obvious performance regression.”
Going from 562K to 574K IOPS, a gain of roughly 2%, isn’t bad, especially when paired with a slight decrease in CPU usage, and every extra bit of performance adds up.
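To put the quoted figures in relative terms, here is a minimal Python sketch that computes the percentage change for each pair of numbers from the patch description. The helper name `pct_change` and the variable names are illustrative assumptions, not anything from the kernel patch; only the input figures come from the quoted text.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change from base to patched, in percent."""
    return (patched - base) / base * 100.0


# Figures quoted in the patch description
# (IOPS in thousands, CPU usage of bio_alloc_bioset in percent).
same_cpu_iops = pct_change(562, 574)    # allocate and free on the same CPU
cross_cpu_iops = pct_change(648, 647)   # worst case: allocate on CPU A, free on CPU B
alloc_cpu_usage = pct_change(1.42, 1.22)

print(f"same-CPU IOPS change:          {same_cpu_iops:+.2f}%")
print(f"cross-CPU IOPS change:         {cross_cpu_iops:+.2f}%")
print(f"bio_alloc_bioset usage change: {alloc_cpu_usage:+.2f}%")
```

By this arithmetic the common case gains a little over 2% in IOPS, the cross-CPU worst case moves by well under 1% (plausibly within run-to-run noise, though the patch does not state a variance), and the CPU usage attributed to bio_alloc_bioset drops by about 14% in relative terms.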
