
netCDF4 zip format vs netCDF4 classic - speed of cdo

Added by John Young 7 months ago

Seeking advice on processing the netCDF4 zip format.

Running some tests, e.g. cdo remapbil:

on nc4 zip it takes 300 s locally vs 8 s on classic.

Is there a way to process the zip format much quicker? I have about 3000 files to do ... & with multiple cdo operations.


Replies (2)

RE: netCDF4 zip format vs netCDF4 classic - speed of cdo - Added by Ralf Mueller 7 months ago

hi John!

In my experience it is very costly to work with compressed netCDF4 files in data analysis workflows with CDO. I would only use compression before archiving the data. In short: nothing beats classic netCDF.

But you have to keep in mind that CDO has a certain data layout where it works really well: large grids, many timesteps per file, possibly a large number of variables per file. Compression and decompression will always take time on such data sets, so why spend that time at each step of your workflow? Instead, decompress first, run your analysis, and compress before archival if needed.
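A decompress-first pipeline along those lines might look as follows. This is a dry-run sketch that only prints the commands; the file names and the grid description file `target_grid.txt` are hypothetical placeholders:

```shell
# Dry-run sketch of a decompress -> analyse -> re-compress workflow.
# File names and target_grid.txt are hypothetical; the commands are
# printed here rather than executed.

step1="cdo -f nc copy input_zip.nc work_classic.nc"           # decompress once: rewrite as classic netCDF
step2="cdo remapbil,target_grid.txt work_classic.nc out.nc"   # run the analysis on the fast uncompressed copy
step3="cdo -f nc4 -z zip_6 copy out.nc archive.nc"            # re-compress (deflate level 6) only for archival

printf '%s\n' "$step1" "$step2" "$step3"
```

The point is that the expensive decompression happens exactly once, instead of once per operator in the workflow.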
If you work with long time series of a single grid point (or even without any grid point, like a global mean temperature), CDO will be a lot slower than xarray, for example, simply because xarray can load a lot more data at once, whereas CDO reads each timestep separately. The reason for that is CDO's ability to chain operators.
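Chaining means several operators share a single pass over the data, with no intermediate files on disk. A chained call could look like this (again a dry-run sketch; the file and grid names are hypothetical):

```shell
# Dry-run sketch: operators prefixed with '-' are chained, so the remap
# output is piped timestep by timestep into yearmean with no temp file.
# File names and target_grid.txt are hypothetical.

chain="cdo -yearmean -remapbil,target_grid.txt work_classic.nc yearly.nc"

printf '%s\n' "$chain"
```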

Another important aspect when dealing with compressed netCDF is the chunk size. Depending on the data layout of your input, the chunk size can easily slow down (or speed up) the processing by a factor of 5-10. Coming up with the right one is a bit of a science. So if you can afford to decompress, doing that first is my best guess at the moment.
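The netCDF utilities `ncdump` and `nccopy` can show and change the chunking. Another dry-run sketch; the dimension names and chunk sizes below are made-up examples, so check your own files with `ncdump -hs` first:

```shell
# Dry-run sketch: inspect chunking/compression, rechunk, or strip the
# compression entirely. Dimension names and sizes are hypothetical.

inspect="ncdump -hs input_zip.nc"                                   # -s shows chunk sizes and deflate level
rechunk="nccopy -c time/1,lat/180,lon/360 input_zip.nc rechunked.nc"
strip="nccopy -d 0 input_zip.nc uncompressed.nc"                    # -d 0 = write without deflate compression

printf '%s\n' "$inspect" "$rechunk" "$strip"
```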

cheers
ralf
