Publication details

  • Design and Evaluation of Tool Extensions for Power Consumption Measurement in Parallel Systems (Timo Minartz), PhD Thesis, School: Universität Hamburg, 2013-07-03

Abstract

In an effort to reduce the energy consumption of high-performance computing centers, a number of new approaches have been developed in the last few years. One of these approaches is to switch hardware to lower power states in promising phases of a parallel application. A test cluster is designed with high-performance computing nodes that support multiple power-saving mechanisms comparable to those of mobile devices. Each node is connected to power measurement equipment to investigate the power-saving potential of the specific hardware under different load scenarios. However, statically switching the power-saving mechanisms usually increases the application runtime. As a consequence, no energy can be saved. In contrast to static switching strategies, dynamic switching strategies consider the hardware usage in the application phases and switch between the different modes without increasing the application runtime. Even though the concepts are already quite clear, tools to identify application phases and to determine the impact on performance, power and energy are still rare. This thesis designs and evaluates tool extensions for power consumption measurement in parallel systems, with the final goal of characterizing and identifying energy-efficiency hot spots in scientific applications. Using offline tracing, the metrics are collected in trace files and can be visualized or post-processed after the application run. The timeline-based visualization tools Sunshot and Vampir are used to correlate parallel applications with the energy-related metrics. With these tracing and visualization capabilities, it is possible to evaluate the quality of energy-saving mechanisms, since waiting times in the application can be related to hardware power states. Using the energy-efficiency benchmark eeMark, typical hardware usage patterns are identified to characterize the workload, its impact on the node power consumption and, finally, the potential for energy saving.
To exploit the developed extensions, four scientific applications are analyzed to evaluate the whole approach. Appropriate phases of the parallel applications are manually instrumented to reduce the power consumption, with the final goal of saving energy for the whole application run on the test cluster. This thesis provides a software interface for the efficient management of the power-saving modes per compute node, to be used by application programmers. All analyzed applications consist of several different calculation-intensive compute phases and have a considerable power- and energy-saving potential that cannot be fully exploited by traditional, utilization-based mechanisms implemented in the operating system. Reducing the processor frequency in communication and I/O phases can also yield considerable savings for the presented applications.
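
The abstract's point that a lower power state does not save energy if it lengthens the runtime follows from energy being power integrated over time. The sketch below is not taken from the thesis: the function name, the trace layout and all power and runtime figures are hypothetical, chosen only to make the arithmetic visible. It integrates sampled node power with the trapezoidal rule and compares a baseline run, a statically throttled run that takes longer, and a run that lowers power only during a waiting phase.

```python
def energy_joules(samples):
    """Integrate (time_s, power_w) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


# Hypothetical traces, not measurements from the thesis.
# Baseline: 200 W for 100 s.
baseline = [(0.0, 200.0), (100.0, 200.0)]

# Static switching: power drops to 160 W, but the run takes 130 s.
static_low = [(0.0, 160.0), (130.0, 160.0)]

# Phase-aware switching: 200 W while computing (0 to 60 s), then
# 120 W during a waiting phase (60 to 100 s); runtime is unchanged.
phase_aware = [(0.0, 200.0), (60.0, 200.0), (60.0, 120.0), (100.0, 120.0)]

e_base = energy_joules(baseline)
e_static = energy_joules(static_low)
e_phase = energy_joules(phase_aware)

print(f"baseline:    {e_base:.0f} J")
print(f"static low:  {e_static:.0f} J")
print(f"phase-aware: {e_phase:.0f} J")
```

With these made-up numbers the statically throttled run draws less power yet consumes more energy than the baseline, while the phase-aware run consumes less, which is the distinction between static and dynamic switching that the abstract draws.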

BibTeX

@phdthesis{DAEOTEFPCM13,
	author	 = {Timo Minartz},
	title	 = {{Design and Evaluation of Tool Extensions for Power Consumption Measurement in Parallel Systems}},
	advisors	 = {Thomas Ludwig},
	year	 = {2013},
	month	 = {07},
	school	 = {Universität Hamburg},
	howpublished	 = {{Online \url{http://ediss.sub.uni-hamburg.de/volltexte/2013/6230/pdf/Dissertation.pdf}}},
	type	 = {PhD Thesis},
	abstract	 = {In an effort to reduce the energy consumption of high-performance computing centers, a number of new approaches have been developed in the last few years. One of these approaches is to switch hardware to lower power states in promising phases of a parallel application. A test cluster is designed with high-performance computing nodes that support multiple power-saving mechanisms comparable to those of mobile devices. Each node is connected to power measurement equipment to investigate the power-saving potential of the specific hardware under different load scenarios. However, statically switching the power-saving mechanisms usually increases the application runtime. As a consequence, no energy can be saved. In contrast to static switching strategies, dynamic switching strategies consider the hardware usage in the application phases and switch between the different modes without increasing the application runtime. Even though the concepts are already quite clear, tools to identify application phases and to determine the impact on performance, power and energy are still rare. This thesis designs and evaluates tool extensions for power consumption measurement in parallel systems, with the final goal of characterizing and identifying energy-efficiency hot spots in scientific applications. Using offline tracing, the metrics are collected in trace files and can be visualized or post-processed after the application run. The timeline-based visualization tools Sunshot and Vampir are used to correlate parallel applications with the energy-related metrics. With these tracing and visualization capabilities, it is possible to evaluate the quality of energy-saving mechanisms, since waiting times in the application can be related to hardware power states. Using the energy-efficiency benchmark eeMark, typical hardware usage patterns are identified to characterize the workload, its impact on the node power consumption and, finally, the potential for energy saving.
To exploit the developed extensions, four scientific applications are analyzed to evaluate the whole approach. Appropriate phases of the parallel applications are manually instrumented to reduce the power consumption, with the final goal of saving energy for the whole application run on the test cluster. This thesis provides a software interface for the efficient management of the power-saving modes per compute node, to be used by application programmers. All analyzed applications consist of several different calculation-intensive compute phases and have a considerable power- and energy-saving potential that cannot be fully exploited by traditional, utilization-based mechanisms implemented in the operating system. Reducing the processor frequency in communication and I/O phases can also yield considerable savings for the presented applications.},
	url	 = {http://ediss.sub.uni-hamburg.de/volltexte/2013/6230/},
}
