Meta’s great Linux engineering team has been working through some fresh performance optimizations recently, from optimizing /proc/interrupts output to renewing its investment in jemalloc. A new Linux kernel patch this week provides another optimization, this time to avoid unnecessarily throttling TCP throughput on Linux systems.
JP Kobryn sent out a mm/vmpressure patch to skip socket pressure for costly-order reclaim. Kobryn explained the situation in the patch description:
“When kswapd reclaims at high order due to fragmentation, vmpressure() can report poor reclaim efficiency even though the system has plenty of free memory. This is because kswapd scans many pages but finds little to reclaim – the pages are actively in use and don’t need to be freed. The resulting scan:reclaim ratio triggers socket pressure, throttling TCP throughput unnecessarily.
Net allocations do not exceed order 3 (PAGE_ALLOC_COSTLY_ORDER), so high order reclaim difficulty should not trigger socket pressure. The kernel already treats this order as the boundary where reclaim is no longer expected to succeed and compaction may take over.
Make vmpressure() order-aware through an additional parameter sourced from scan_control at existing call sites. Socket pressure is now only asserted when order <= PAGE_ALLOC_COSTLY_ORDER.
Memcg reclaim is unaffected since try_to_free_mem_cgroup_pages() always uses order 0, which passes the filter unconditionally. Similarly, vmpressure_prio() now passes order 0 internally when calling vmpressure(), ensuring critical pressure from low reclaim priority is not suppressed by the order filter.”
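To make the described behavior concrete, here is a minimal user-space sketch of the order filter, not the actual kernel code. The function names, the COSTLY_ORDER macro (standing in for PAGE_ALLOC_COSTLY_ORDER, which the patch description gives as 3), and the 60% pressure threshold are all assumptions made for illustration; the real logic lives in mm/vmpressure.c and may differ in detail.

```c
#include <stdbool.h>

/* Stand-in for the kernel's PAGE_ALLOC_COSTLY_ORDER (3 per the patch text). */
#define COSTLY_ORDER 3

/* Assumed illustrative threshold: treat >= 60% "pressure" as inefficient reclaim. */
#define PRESSURE_THRESHOLD 60

/*
 * Hypothetical helper: derive a 0..100 pressure value from the
 * scan:reclaim ratio. Few pages reclaimed per page scanned gives
 * a high value.
 */
static unsigned long reclaim_pressure(unsigned long scanned,
                                      unsigned long reclaimed)
{
    if (scanned == 0 || reclaimed >= scanned)
        return 0;
    return 100 - (reclaimed * 100 / scanned);
}

/*
 * Hypothetical decision function: socket pressure is only considered
 * when the reclaim order is at or below the costly-order boundary.
 * Above that boundary, poor reclaim efficiency is attributed to
 * fragmentation rather than a real memory shortage, so it is ignored.
 */
static bool should_assert_socket_pressure(int order,
                                          unsigned long scanned,
                                          unsigned long reclaimed)
{
    if (order > COSTLY_ORDER)
        return false;
    return reclaim_pressure(scanned, reclaimed) >= PRESSURE_THRESHOLD;
}
```

In this sketch, an order-0 reclaim that scans 1000 pages and frees only 10 would still raise socket pressure, while the same ratio at order 4 would not, which is the distinction the patch description draws between real memory shortage and fragmentation-driven reclaim.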
No specific performance numbers were shared as part of the patch. The patch is out for review on the Linux kernel mailing list.
