publication

Publication details

Abstract

The ever-present gap between the growth of computational power and the capabilities of storage and network technologies makes I/O the bottleneck of a system. This is especially true for large-scale systems found in HPC. Over the years, a number of different storage devices have emerged, each with its own advantages and disadvantages. Fast memory elements such as RAM are very powerful but come with high acquisition costs. With limited budgets and the requirement for long-term storage over several decades, a different approach is needed. This led to a hierarchical structuring of different technologies on top of one another. Tape systems are capable of reliably preserving large amounts of data over 30 years and are also the most affordable choice for this purpose. They form the bottom layer of the hierarchy, whereas high-throughput and low-latency devices like non-volatile RAM are located at the top. As the upper layers are limited in capacity due to their price, data migration policies are essential for managing the file movement between the different tiers in order to maximise the system's performance. Since data loss and downtime are a concern, these policies have to be evaluated in advance. Simulations of such hierarchical storage systems provide an alternative way of analysing the effects of placement strategies. Although there is consensus that a generic simulator of diverse storage systems able to represent complex infrastructures is indispensable, the existing proposals lack a number of features. In this thesis, an emulator for hierarchical storage systems has been designed and implemented, supporting a wide range of existing and future hardware as well as a flexible topology model. A second library is conceptualised on top of it, offering a file handling interface to the application layer as well as a set of data migration schemes. Only minor changes are required to run an application on the emulated storage system.
The validation shows a maximum performance of both libraries in the range of 7 to 9 GB per second when executed in RAM. Analysing the impact of the block size led to the recommendation to use at least 100 kB in order to maximise the resulting performance.

BibTeX

@mastersthesis{SOSTADMD17,
	author	 = {Kira Duwe},
	title	 = {{Simulation of Storage Tiering and Data Migration}},
	advisors	 = {Michael Kuhn},
	year	 = {2017},
	month	 = {09},
	school	 = {Universität Hamburg},
	howpublished	 = {{Online \url{http://edoc.sub.uni-hamburg.de/informatik/volltexte/2017/233/pdf/master_duwe.pdf}}},
	type	 = {Master's Thesis},
	abstract	 = {The ever-present gap between the growth of computational power and the capabilities of storage and network technologies makes I/O the bottleneck of a system. This is especially true for large-scale systems found in HPC. Over the years, a number of different storage devices have emerged, each with its own advantages and disadvantages. Fast memory elements such as RAM are very powerful but come with high acquisition costs. With limited budgets and the requirement for long-term storage over several decades, a different approach is needed. This led to a hierarchical structuring of different technologies on top of one another. Tape systems are capable of reliably preserving large amounts of data over 30 years and are also the most affordable choice for this purpose. They form the bottom layer of the hierarchy, whereas high-throughput and low-latency devices like non-volatile RAM are located at the top. As the upper layers are limited in capacity due to their price, data migration policies are essential for managing the file movement between the different tiers in order to maximise the system's performance. Since data loss and downtime are a concern, these policies have to be evaluated in advance. Simulations of such hierarchical storage systems provide an alternative way of analysing the effects of placement strategies. Although there is consensus that a generic simulator of diverse storage systems able to represent complex infrastructures is indispensable, the existing proposals lack a number of features. In this thesis, an emulator for hierarchical storage systems has been designed and implemented, supporting a wide range of existing and future hardware as well as a flexible topology model. A second library is conceptualised on top of it, offering a file handling interface to the application layer as well as a set of data migration schemes. Only minor changes are required to run an application on the emulated storage system.
The validation shows a maximum performance of both libraries in the range of 7 to 9 GB per second when executed in RAM. Analysing the impact of the block size led to the recommendation to use at least 100 kB in order to maximise the resulting performance.},
	url	 = {http://edoc.sub.uni-hamburg.de/informatik/volltexte/2017/233/},
}

publication.txt · Last modified: 2019-01-23 10:26 by 127.0.0.1
