Publication details
- Small-file Access in Parallel File Systems (Philip Carns, Sam Lang, Robert Ross, Murali Vilayannur, Julian Kunkel, Thomas Ludwig), In IPDPS '09: Proceedings of the 2009 IEEE International Symposium on Parallel and Distributed Processing, pp. 1–11, IEEE Computer Society (Washington, DC, USA), IPDPS-09, University of Rome, Rome, Italy, ISBN: 978-1-4244-3751-1, 2009
Abstract
Today's computational science demands have resulted in ever larger parallel computers, and storage systems have grown to match these demands. Parallel file systems used in this environment are increasingly specialized to extract the highest possible performance for large I/O operations, at the expense of other potential workloads. While some applications have adapted to I/O best practices and can obtain good performance on these systems, the natural I/O patterns of many applications result in generation of many small files. These applications are not well served by current parallel file systems at very large scale. This paper describes five techniques for optimizing small-file access in parallel file systems for very large scale systems. These five techniques are all implemented in a single parallel file system (PVFS) and then systematically assessed on two test platforms. A microbenchmark and the mdtest benchmark are used to evaluate the optimizations at an unprecedented scale. We observe as much as a 905% improvement in small-file create rates, 1,106% improvement in small-file stat rates, and 727% improvement in small-file removal rates, compared to a baseline PVFS configuration on a leadership computing platform using 16,384 cores.
BibTeX
@inproceedings{SAIPFSCLRV09,
  author = {Philip Carns and Sam Lang and Robert Ross and Murali Vilayannur and Julian Kunkel and Thomas Ludwig},
  title = {{Small-file Access in Parallel File Systems}},
  year = {2009},
  booktitle = {{IPDPS '09: Proceedings of the 2009 IEEE International Symposium on Parallel and Distributed Processing}},
  publisher = {IEEE Computer Society},
  address = {Washington, DC, USA},
  pages = {1--11},
  conference = {IPDPS-09},
  organization = {University of Rome},
  location = {Rome, Italy},
  isbn = {978-1-4244-3751-1},
  doi = {http://dx.doi.org/10.1109/IPDPS.2009.5161029},
  abstract = {Today's computational science demands have resulted in ever larger parallel computers, and storage systems have grown to match these demands. Parallel file systems used in this environment are increasingly specialized to extract the highest possible performance for large I/O operations, at the expense of other potential workloads. While some applications have adapted to I/O best practices and can obtain good performance on these systems, the natural I/O patterns of many applications result in generation of many small files. These applications are not well served by current parallel file systems at very large scale. This paper describes five techniques for optimizing small-file access in parallel file systems for very large scale systems. These five techniques are all implemented in a single parallel file system (PVFS) and then systematically assessed on two test platforms. A microbenchmark and the mdtest benchmark are used to evaluate the optimizations at an unprecedented scale. We observe as much as a 905\% improvement in small-file create rates, 1,106\% improvement in small-file stat rates, and 727\% improvement in small-file removal rates, compared to a baseline PVFS configuration on a leadership computing platform using 16,384 cores.},
  url = {http://www.mcs.anl.gov/uploads/cels/papers/P1571.pdf},
}