Cacheless NetCDF
What does the cache do?
For files in the classic format, the NetCDF library uses a private cache. It works by dividing the NetCDF file into pages that can be paged in and out separately. The page size depends on the system's preferred I/O block size, or is set to 8 KiB by default. This has a number of effects:
- Each read/write request is split into page-sized requests, so a large request becomes a large number of small ones.
- When data is written to a location that is not currently present in the cache, the operation is converted into a read/modify/write sequence.
- When a new record is initialized with fill values and it is bigger than the amount of cache the library uses, the fill data is actually written to disk. Writing the real output data will trigger a new read/modify/write cycle.
- The cache duplicates the caching done by modern operating systems, which already use all otherwise unused memory as a disk cache.
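To make the cost of the first two effects concrete, here is a minimal sketch in C. It is not code from the NetCDF library. It assumes a cold cache, the 8 KiB default page size mentioned above, and that every touched page is read before it is written back. The names `io_count` and `paged_write_cost` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 8192u  /* 8 KiB default page size, as stated in the text */

/* Number of page-level operations caused by one write request. */
typedef struct {
    size_t page_reads;   /* pages read in before they can be modified */
    size_t page_writes;  /* pages written back afterwards */
} io_count;

/* Simplified model (an assumption of this sketch): none of the touched
 * pages is in the cache, so each one goes through read/modify/write. */
io_count paged_write_cost(size_t offset, size_t length, size_t page_size)
{
    io_count cost = {0, 0};
    assert(page_size > 0);
    if (length == 0)
        return cost;

    size_t first_page = offset / page_size;
    size_t last_page = (offset + length - 1) / page_size;
    size_t touched = last_page - first_page + 1;

    cost.page_reads = touched;
    cost.page_writes = touched;
    return cost;
}
```

Under these assumptions, a 1 MiB write starting at byte offset 100 touches 129 pages, which means 129 page reads and 129 page writes where an uncached write could be issued as a single request.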
What we have done
After we understood the impact of the cache on large input/output operations, we decided to change the NetCDF source code to deactivate the cache completely, and prepared a patch for it. This patch will be submitted to the NetCDF developers, so we hope that there will soon be an official version that allows the user to switch off the cache.
What to do to get a cacheless NetCDF
Applying the patch:
- Get the source code of NetCDF version 4.1.3 and extract it (NetCDF download page). Unfortunately, our patch does not work with newer versions of the library since it conflicts with other changes.
- Apply the patch to the NetCDF source code by running “patch -p1 < netcdf-cacheless.patch” inside the NetCDF directory.
- Run “./configure”, “make” and “make install” as usual.
Changing the caching behavior:
- The patch adds two #defines to the file include/ncio.h; commenting them in or out controls the behavior of the patch.
- The first one switches the cache on or off.
- The second one enables a trace of all reads/writes while the cache is switched off.
- Of course, each change in include/ncio.h necessitates recompiling and reinstalling the library.
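The compile-time switching described above follows a common C preprocessor pattern, sketched below. The macro names `EXAMPLE_CACHELESS` and `EXAMPLE_TRACE_IO` and both functions are hypothetical and exist only in this sketch. The real names are whatever the patch adds to include/ncio.h.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch names, chosen for this sketch only. */
#define EXAMPLE_CACHELESS          /* comment out to keep the page cache */
/* #define EXAMPLE_TRACE_IO */     /* uncomment to log every read/write */

/* Returns 1 if this build still uses the cache, 0 if it was compiled out. */
int example_cache_enabled(void)
{
#ifdef EXAMPLE_CACHELESS
    return 0;
#else
    return 1;
#endif
}

/* Logs one request to stderr. Tracing is only active when the cache is
 * switched off and the trace switch is defined; otherwise this is a no-op.
 * Returns 1 if a trace line was printed, 0 otherwise. */
int example_trace(const char *op, long offset, long length)
{
    assert(op != NULL);
#if defined(EXAMPLE_CACHELESS) && defined(EXAMPLE_TRACE_IO)
    fprintf(stderr, "%s offset=%ld length=%ld\n", op, offset, length);
    return 1;
#else
    (void)op;
    (void)offset;
    (void)length;
    return 0;
#endif
}
```

Because the choice is made by the preprocessor, the unused code path is not compiled into the library at all, which is why every change to the switches requires a rebuild.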
Changes
2012-07-30: Eliminated a bug in the handling of incomplete reads/writes.