netCDF4 zip format vs netCDF4 classic - speed of cdo
Added by John Young 7 months ago
Seeking advice on processing netCDF4 zip format.
Running some tests, e.g. cdo remapbil:
on nc4 zip it takes about 300 s locally vs 8 s on classic.
Is there a way to process the zip format much more quickly? I have about 3000 files to do ... each with multiple cdo operations.
Replies (2)
RE: netCDF4 zip format vs netCDF4 classic - speed of cdo - Added by Ralf Mueller 7 months ago
hi John!
In my experience it is very costly to work with compressed netCDF4 files in data analysis workflows with CDO. I would only use compression before archiving the data. In short: nothing beats classic netCDF.
But you have to keep in mind that CDO has a certain data layout where it works really well: large grids, many timesteps per file, possibly a large number of variables per file. Compression and decompression will always take time on such data sets, so why spend time on it at each step of your workflow? Instead, decompress first, run your analysis, and compress before archival if needed.
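A minimal sketch of that decompress-first workflow (file names and the remapping target grid are placeholders; `-f nc`, `-f nc4` and `-z zip_6` are standard CDO output-format switches):

```shell
# decompress once up front: rewrite the nc4/zip file as classic netCDF
cdo -f nc copy input_zip.nc4 work.nc

# run the (possibly chained) analysis on the uncompressed copy
cdo remapbil,targetgrid.txt work.nc remapped.nc

# recompress only at the end, for archival (deflate level 6 here)
cdo -f nc4 -z zip_6 copy remapped.nc archive.nc4
```

With ~3000 files the one-time decompression cost is paid once per file, instead of once per operator in every chain.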
If you work with long time series data of a single grid point (or even without any grid point, like a global mean temperature), CDO will be a lot slower than xarray, for example, simply because xarray can load a lot more data at once, whereas CDO reads each timestep separately. The reason for that is CDO's ability to chain operators.
Another important aspect when dealing with compressed netCDF is the chunk size. Depending on the data layout of your input, the chunk size can easily slow down (or speed up) the processing by a factor of 5-10. Coming up with the right one is a bit of a science. So if you can afford to decompress, doing that first is my best guess atm.
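If you do want to keep the files compressed, rechunking may help. A sketch using the netCDF `nccopy`/`ncdump` utilities (the dimension names and chunk sizes below are assumptions; pick them to match your grid and access pattern):

```shell
# inspect the current chunking: -s shows special attributes like _ChunkSizes
ncdump -hs compressed.nc4 | grep _ChunkSizes

# rewrite with one timestep per chunk, which suits timestep-wise
# readers like CDO (dimension names time/lat/lon assumed)
nccopy -c time/1,lat/192,lon/288 compressed.nc4 rechunked.nc4

# or drop the compression entirely while copying
nccopy -d 0 compressed.nc4 uncompressed.nc
```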
cheers
ralf
RE: netCDF4 zip format vs netCDF4 classic - speed of cdo - Added by Ralf Mueller 7 months ago
Uwe wrote something on this 2 years ago:
https://code.mpimet.mpg.de/boards/2/topics/12598?r=12612#message-12612