Publication details
- Dynamic decision-making for efficient compression in parallel distributed file systems (Janosch Hirsch), Master's Thesis, School: Universität Hamburg, 2017-08-12
Abstract
The technology gap between computational speed, storage capacity, and storage speed poses major problems, especially for the HPC field. A promising technique to bridge this gap is data reduction through compression. Compression algorithms like LZ4 can reach compression speeds high enough to be applicable in the HPC field. Consequently, efforts to integrate compression into the Lustre file system are in progress. Client-side compression also has the potential to increase network throughput. But to fully exploit the compression potential, the compression configuration has to be adapted to its environment. The better the configuration is adapted to the data structure and the machine's condition, the more effective the compression will be. The objective of this thesis is to design a decision logic that dynamically adapts the compression configuration to maximize a desired trade-off between application speed and compression. Different compression algorithms and the conditions for compression on the client side of a distributed file system are examined to identify possibilities to apply compression. Finally, an implemented prototype of the decision and adaptation logic is evaluated with different network speeds, and starting points to further improve the concept are given.
BibTeX
@mastersthesis{DDFECIPDFS17,
  author       = {Janosch Hirsch},
  title        = {{Dynamic decision-making for efficient compression in parallel distributed file systems}},
  advisors     = {Michael Kuhn and Anna Fuchs},
  year         = {2017},
  month        = {08},
  school       = {Universität Hamburg},
  howpublished = {{Online \url{https://wr.informatik.uni-hamburg.de/_media/research:theses:janosch_hirsch_dynamic_decision_making_for_efficient_compression_in_parallel_distributed_file_systems.pdf}}},
  type         = {Master's Thesis},
  abstract     = {The technology gap between computational speed, storage capacity, and storage speed poses major problems, especially for the HPC field. A promising technique to bridge this gap is data reduction through compression. Compression algorithms like LZ4 can reach compression speeds high enough to be applicable in the HPC field. Consequently, efforts to integrate compression into the Lustre file system are in progress. Client-side compression also has the potential to increase network throughput. But to fully exploit the compression potential, the compression configuration has to be adapted to its environment. The better the configuration is adapted to the data structure and the machine's condition, the more effective the compression will be. The objective of this thesis is to design a decision logic that dynamically adapts the compression configuration to maximize a desired trade-off between application speed and compression. Different compression algorithms and the conditions for compression on the client side of a distributed file system are examined to identify possibilities to apply compression. Finally, an implemented prototype of the decision and adaptation logic is evaluated with different network speeds, and starting points to further improve the concept are given.},
}