Publication details

Optimising Scientific Software for Heterogeneous Cluster Computers: Evaluation of Machine Learning Methods for Source Code Classification (Ruben Felgenhauer), Master's Thesis, School: Universität Hamburg, 2021-09-23

Abstract

As high-performance computing centres increasingly adopt accelerators such as GPUs, vector processors, and many-core CPUs, HPC programmers are often confronted with a very heterogeneous hardware environment. Different computation units have different requirements for being used most efficiently. Typically, scientific software is optimised for specific target architectures based on decisions made before it is known which hardware composition will be available at run time. This can lead to cluster computers being used under capacity, which wastes computational resources and energy. With the evolution and resulting gain in popularity of automatic parallelisation tools like OpenMP and of sophisticated static code analysis methods, source code can increasingly be written in a more readable fashion with acceptable performance losses. Therefore, the choice between performance and maintainability can increasingly be made in favour of the latter. However, at the time of writing, the programmer still has to decide which sections to parallelise and optimise for which target architecture. Ultimately, to tackle cluster heterogeneity efficiently, the goal should be to automatically find the optimal mapping of sub-programs to computation units and to perform the required parallelisation. Central to this task is source code classification. Barchi et al. have shown that machine learning classification can be used to determine a suitable computation unit for OpenCL source code samples. In this thesis, we evaluate machine learning methods for general-purpose classification of program parts in order to extend this principle to all ahead-of-time compiled programming languages.
First, we combine the ASTNN by Zhang et al., a neural network architecture that can be used for the classification of C source code samples, with RetDec, a retargetable decompiler by Křoustek (Avast Software) based on LLVM, which generates C code from machine instructions. This is compared with a straightforward approach in which general-purpose classifiers from scikit-learn are trained on byte-value data of the object code files' .text section. We show that the modified ASTNN outperforms these methods in all of our benchmarks, but that it comes with several limitations, including high memory consumption and training time, and unresponsiveness of the decompiler on some samples.
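
The byte-value baseline mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the byte-value data is encoded, so the normalised 256-bin histogram below is an assumption, and the function name `byte_histogram` and the sample bytes are hypothetical. The sketch also assumes the raw bytes of the .text section have already been extracted from the object file.

```python
from collections import Counter


def byte_histogram(text_section: bytes) -> list[float]:
    """Return a 256-entry vector of relative byte-value frequencies.

    Assumption: a normalised histogram is one plausible fixed-length
    encoding of "byte-value data"; the thesis may use a different one.
    """
    if not text_section:
        # An empty section has no bytes to count, so return all zeros.
        return [0.0] * 256
    counts = Counter(text_section)  # iterating bytes yields ints 0..255
    total = len(text_section)
    return [counts.get(value, 0) / total for value in range(256)]


# Hypothetical stand-in for the raw bytes of one .text section.
sample = bytes([0x55, 0x48, 0x89, 0xE5, 0x5D, 0xC3, 0xC3, 0xC3])
features = byte_histogram(sample)
```

Each object file then yields one fixed-length feature vector, and a list of such vectors with their labels could be passed to the `fit` method of a general-purpose scikit-learn classifier.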

BibTeX

@mastersthesis{OSSFHCCEOM21,
	author	 = {Ruben Felgenhauer},
	title	 = {{Optimising Scientific Software for Heterogeneous Cluster Computers: Evaluation of Machine Learning Methods for Source Code Classification}},
	advisors	 = {Jannek Squar and Peter Hauschildt},
	year	 = {2021},
	month	 = {09},
	school	 = {Universität Hamburg},
	howpublished	 = {{Online \url{https://wr.informatik.uni-hamburg.de/_media/research:theses:ruben_felgenhauer_optimising_scientific_software_for_heterogeneous_cluster_computers_evaluation_of_machine_learning_methods_for_source_code_classification.pdf}}},
	type	 = {Master's Thesis},
	abstract	 = {As high-performance computing centres increasingly adopt accelerators such as GPUs, vector processors, and many-core CPUs, HPC programmers are often confronted with a very heterogeneous hardware environment. Different computation units have different requirements for being used most efficiently. Typically, scientific software is optimised for specific target architectures based on decisions made before it is known which hardware composition will be available at run time. This can lead to cluster computers being used under capacity, which wastes computational resources and energy. With the evolution and resulting gain in popularity of automatic parallelisation tools like OpenMP and of sophisticated static code analysis methods, source code can increasingly be written in a more readable fashion with acceptable performance losses. Therefore, the choice between performance and maintainability can increasingly be made in favour of the latter. However, at the time of writing, the programmer still has to decide which sections to parallelise and optimise for which target architecture. Ultimately, to tackle cluster heterogeneity efficiently, the goal should be to automatically find the optimal mapping of sub-programs to computation units and to perform the required parallelisation. Central to this task is source code classification. Barchi et al. have shown that machine learning classification can be used to determine a suitable computation unit for OpenCL source code samples. In this thesis, we evaluate machine learning methods for general-purpose classification of program parts in order to extend this principle to all ahead-of-time compiled programming languages.
First, we combine the ASTNN by Zhang et al., a neural network architecture that can be used for the classification of C source code samples, with RetDec, a retargetable decompiler by Křoustek (Avast Software) based on LLVM, which generates C code from machine instructions. This is compared with a straightforward approach in which general-purpose classifiers from scikit-learn are trained on byte-value data of the object code files' .text section. We show that the modified ASTNN outperforms these methods in all of our benchmarks, but that it comes with several limitations, including high memory consumption and training time, and unresponsiveness of the decompiler on some samples.},
}