The Problem: ByteDance Engineer Identifies Kernel Overhead
ByteDance engineer Fengnan Chang discovered the issue while running 4K random read tests on PCIe Gen5 NVMe SSDs. The bottleneck wasn’t in the hardware itself, but in the Linux kernel’s iomap framework, which handles direct I/O operations.
The root cause was straightforward but significant: the CPU was spending too much time on internal memory allocations and on managing the iomap state machine. That per-request overhead kept the kernel from issuing small reads as fast as these drives can serve them, creating a performance ceiling below what the hardware promised.
The Solution: A Streamlined Direct I/O Path
Chang’s fix introduces a simplified direct I/O path specifically optimized for small I/O operations when using iomap with filesystems like EXT4 and XFS. This streamlined path activates under specific conditions:
- I/O request size is equal to or smaller than the inode blocksize
- The inode itself is not encrypted
By bypassing some traditional IOmap processing for these smaller, direct read requests, the kernel can handle operations far more efficiently. The CPU spends less time on internal management and more time moving data, which translates directly to faster storage performance.
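The two conditions listed above can be modeled as a simple eligibility check. This is a conceptual sketch only, not the kernel code: the `Inode` class and the `takes_fast_path` function are hypothetical names made up for illustration, and the real patch may check more than what the article lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inode:
    """Hypothetical stand-in for the two inode attributes the article mentions."""
    blocksize: int   # filesystem block size in bytes
    encrypted: bool  # True if the inode's data is encrypted


def takes_fast_path(request_size: int, inode: Inode) -> bool:
    """Return True when a request meets both listed conditions.

    Assumption: only the two conditions from the article are modeled here.
    """
    fits_in_one_block = 0 < request_size <= inode.blocksize
    return fits_in_one_block and not inode.encrypted


plain = Inode(blocksize=4096, encrypted=False)
print(takes_fast_path(4096, plain))   # a 4K request on a 4K-block inode
print(takes_fast_path(8192, plain))   # larger than one block
print(takes_fast_path(4096, Inode(blocksize=4096, encrypted=True)))
```

Under this model, only the first call qualifies. The other two would go through the regular iomap direct I/O path.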
Real Performance Gains
Initial benchmarks on PCIe Gen5 NVMe SSDs show measurable improvements. During 4K random read tests, IOPS rose from 1.92 million to 2.19 million, an increase of about 14 percent. Additional testing shows benefits of up to 10 percent on EXT4 and XFS, especially when using io_uring with higher queue depths on modern PCIe Gen5 NVMe storage.
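To put the IOPS figures in perspective, the short calculation below converts them into a relative gain and an approximate throughput. It assumes that "4K" means 4096-byte reads and that 1 GB is 10^9 bytes. The function names are illustrative.

```python
BLOCK_BYTES = 4096  # assumption: each "4K" read transfers 4096 bytes


def relative_gain_percent(before: int, after: int) -> float:
    """Percentage increase from `before` to `after`."""
    return (after - before) / before * 100


def throughput_gb_per_s(iops: int, block_bytes: int = BLOCK_BYTES) -> float:
    """Data rate in decimal gigabytes per second for a given IOPS figure."""
    return iops * block_bytes / 1e9


before_iops = 1_920_000  # figure reported in the article
after_iops = 2_190_000   # figure reported in the article

print(f"{relative_gain_percent(before_iops, after_iops):.1f}% more IOPS")
print(f"{throughput_gb_per_s(before_iops):.2f} GB/s -> "
      f"{throughput_gb_per_s(after_iops):.2f} GB/s")
# prints:
# 14.1% more IOPS
# 7.86 GB/s -> 8.97 GB/s
```

At these rates, the reported change works out to roughly 1.1 GB/s of additional small-read throughput on the tested drive.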
For context, these aren’t trivial gains. A 10 percent boost on already-fast storage can mean quicker database operations and more responsive systems for workloads dominated by small direct reads.
What’s Coming in Linux 7.3
The optimization has been merged into the VFS subsystem’s “vfs-7.3.iomap” Git branch and is queued for the Linux 7.3 cycle, expected later this year. Once it lands, Linux users with PCIe Gen5 NVMe SSDs should get closer to their storage hardware’s raw speed on small direct reads.
This fix is part of a broader wave of kernel improvements. Linux 7.2, which concluded its feature merge window recently, brought SMB2 compression support to KSMBD, bug fixes for the NTFS3 driver, and improvements to F2FS and EROFS. Each update continues the ongoing effort to refine the Linux kernel across multiple subsystems.
Who Benefits
Linux users performing data-intensive tasks will see the most immediate impact. This includes database administrators, content creators working with large files, and anyone running AI workloads or handling massive datasets. For these users, the PCIe Gen5 NVMe bottleneck fix means their storage hardware can perform closer to its rated speed, translating to faster completion times and more responsive systems overall.