msF2FS: Design and Implementation of an NVMe ZNS SSD Optimized F2FS File System

Abstract

The ongoing digitalization of the world, with yearly data generation estimated to reach 200 zettabytes by 2025, is putting increasing pressure on system developers to provide systems capable of scaling with future needs. Of particular importance are data storage systems, which provide the means of storing and retrieving these vast amounts of data. One widely adopted storage technology, predicted to become the leading medium for future data storage, is the flash-based solid state drive (SSD). The complex architecture of flash SSDs, however, introduces several challenges for managing the flash storage, such as the need for garbage collection. To readily integrate flash SSDs into storage systems, these flash management idiosyncrasies are hidden inside the storage device. Hiding them, however, has been identified to have significant performance implications. As a result, numerous efforts have pushed towards more open flash storage interfaces, the most recent addition being Zoned Namespace (ZNS) SSDs. ZNS SSDs present a unique opportunity for the storage software and the flash SSD to coordinate flash management responsibilities.

While the open flash storage interface of ZNS presents a plethora of opportunities for software optimization, current software support is in its early stages, leaving much of its potential yet to be explored. In this work, we present msF2FS (multi-streamed F2FS), a file system with optimized ZNS integration, based on the de facto standard flash file system F2FS. msF2FS enhances ZNS integration by leveraging the parallelism capabilities of ZNS SSDs and by increasing the coordination between the file system and applications in data placement decision-making. This coordination reduces the number of sub-optimal data placement decisions made by the file system. Evaluations of msF2FS show the benefit of coordinating the application and the file system, with RocksDB achieving up to 23.19% higher throughput as a result of optimized data placement. We make all developed code of msF2FS publicly available at https://github.com/nicktehrany/msF2FS.