Publication details

Optimized Force Calculation in Molecular Dynamics Simulations for the Intel Xeon Phi (Nikola Tchipev, Amer Wafai, Colin W. Glass, Wolfgang Eckhardt, Alexander Heinecke, Hans-Joachim Bungartz, Philipp Neumann), In Euro-Par 2015: Parallel Processing Workshops, Lecture Notes in Computer Science (9523), pp. 774–785, Springer (Berlin, Heidelberg), Euro-Par 2015, Vienna, ISBN: 978-3-319-27307-5, 2015

Abstract

We provide details on the shared-memory parallelization of the molecular dynamics framework ls1-mardyn for many-core architectures, including an optimization of the SIMD vectorization for multi-centered molecules. The novel shared-memory parallelization scheme makes it possible to retain the Newton's third law optimization and exhibits very good scaling on many-core devices such as a full Xeon Phi card running 240 threads. The Xeon Phi can thus be exploited and delivers performance comparable to that of IvyBridge nodes in our experiments.

BibTeX

@inproceedings{OFCIMDSFTI15,
	author	 = {Nikola Tchipev and Amer Wafai and Colin W. Glass and Wolfgang Eckhardt and Alexander Heinecke and Hans-Joachim Bungartz and Philipp Neumann},
	title	 = {{Optimized Force Calculation in Molecular Dynamics Simulations for the Intel Xeon Phi}},
	year	 = {2015},
	booktitle	 = {{Euro-Par 2015: Parallel Processing Workshops}},
	publisher	 = {Springer},
	address	 = {Berlin, Heidelberg},
	series	 = {Lecture Notes in Computer Science},
	volume	 = {9523},
	pages	 = {774--785},
	conference	 = {Euro-Par 2015},
	location	 = {Vienna},
	isbn	 = {978-3-319-27307-5},
	doi	 = {10.1007/978-3-319-27308-2_62},
	abstract	 = {We provide details on the shared-memory parallelization of the molecular dynamics framework ls1-mardyn for many-core architectures, including an optimization of the SIMD vectorization for multi-centered molecules. The novel shared-memory parallelization scheme makes it possible to retain the Newton's third law optimization and exhibits very good scaling on many-core devices such as a full Xeon Phi card running 240 threads. The Xeon Phi can thus be exploited and delivers performance comparable to that of IvyBridge nodes in our experiments.},
}