
@misc{EODSBATFMW13,
	author	 = {Johann Weging},
	title	 = {{Evaluation of Different Storage Backends and Technologies for MongoDB}},
	advisors	 = {Michael Kuhn},
	year	 = {2013},
	month	 = {02},
	school	 = {Universität Hamburg},
	type	 = {Bachelor's Thesis},
	abstract	 = {Today's database management systems store their data in conventional file systems. Some of them allocate files several gigabytes in size and handle the data alignment themselves. In theory, these database management systems can work with just a contiguous space of memory for their database files. This thesis attempts to reduce the overhead produced by file operations by implementing an object store back end for a database management system. The reference software used in this thesis is MongoDB as the database management system and JZFS as the object store, which works on top of the ZFS file system. Unfortunately, while developing the new storage back end it was discovered that this implementation is too extensive for a bachelor's thesis. The development is documented and shown up to this point. Further work that has to be done is finishing the storage back end for MongoDB and evaluating it. The main question is whether an object store is really capable of reducing the I/O overhead of MongoDB. This thesis covers two parts. First, the implementation of an object store storage back end for MongoDB based on JZFS and ZFS. It attempts to implement this storage solution, but while developing the storage back end it was discovered that the implementation is too extensive for a bachelor's thesis. The development is documented and shown up to this point. After the implementation was considered too extensive, the focus was moved towards file system benchmarking. The benchmarking is done using the metadata benchmark mdtest. It covers the file systems ext4, XFS, btrfs and ZFS on different hardware setups. Every file system was benchmarked on an HDD and an SSD; in addition, ZFS was benchmarked on an HDD using an SSD for read and write cache. It turns out that ZFS still suffers from some serious metadata performance bottlenecks. Surprisingly, ZFS on the HDD with the SSD cache performs nearly as well as ZFS on top of a pure SSD setup. 
Btrfs performs quite well; an odd thing about btrfs is that in some cases it performs better on the HDD than on the SSD, and when creating files or directories it outperformed the other file systems by far. Ext4 doesn't seem to scale with multiple threads accessing shared data; the performance mostly stays the same or sometimes even drops. Only with two threads does the performance increase for some operations. XFS performed quite well in most of the test cases; there was only one odd case when reading directory stats: one thread on the HDD was faster than one thread on the SSD, but when increasing the thread count on the HDD the performance drops rapidly. Further work at this point would be to identify the bottlenecks of ZFS that slow it down in every case except for file removal and directory removal.},
}