Analysis of Long-Term File System Activities on Cluster Systems

Authors: Hyeyoung Cho, Sungho Kim, Sik Lee

Abstract:

Understanding the I/O workload is critical for analyzing I/O patterns and maximizing file system performance. However, measuring the I/O workload on a running distributed parallel file system is non-trivial because of collection overhead and the large volume of data. In this paper, we measured and analyzed file system activities on two large-scale cluster systems with TFlops-level high-performance computing resources. By comparing file system activities in 2009 with those in 2006, we analyzed how I/O workloads changed with advances in system performance and high-speed network technology.

Keywords: I/O workload, Lustre, GPFS, Cluster File System

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1077742

