Performance of cdo with netCDF4
Added by Arne Kriegsmann about 11 years ago
I noticed that the format of the input files significantly influences the time that cdo needs to perform its actions.
For example, processing a file in netCDF4 format takes more than 4 times as long as processing the same data in netCDF format.
To avoid long computing times, I usually start by converting netCDF4 data into netCDF format. Is there a way to increase the performance of cdo for netCDF4?
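The conversion step described above could be scripted roughly as follows. This is only a sketch: the helper names and the file names are made up for illustration, and it assumes that cdo is on the PATH and that "-f nc" together with the "copy" operator is the conversion being used.

```ruby
# Build the argument list for converting a file to netCDF with cdo.
# "-f nc" selects netCDF as the output format; "copy" copies the data unchanged.
def cdo_to_classic_command(input, output)
  ["cdo", "-f", "nc", "copy", input, output]
end

# Run the conversion. Returns true if cdo exits successfully,
# false on a non-zero exit status, nil if cdo could not be started.
def convert_to_classic(input, output)
  system(*cdo_to_classic_command(input, output))
end

# Hypothetical file names, for illustration only:
# convert_to_classic("data.nc4", "data.nc")
```

Passing the arguments as a list rather than one string avoids shell quoting problems with unusual file names.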
Regards
Arne
PS:
Climate Data Operators version 1.5.9 (http://code.zmaw.de/projects/cdo)
Compiler: gcc -std=gnu99 -fopenmp -pthread -O2
version: gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54)
with: PTHREADS OpenMP NC4 SZ Z JASPER PROJ.4
Compiled: by k202066 on c5comp1.dkrz.de (x86_64-unknown-linux-gnu) Feb 1 2013 15:13:17
CDI library version : 1.5.9 of Feb 1 2013 15:13:11
CGRIBEX library version : 1.5.6 of Dec 17 2012 13:44:05
GRIB_API library version : 1.9.0
netCDF library version : 4.1.3 of Apr 10 2012 16:09:07 $
HDF5 library version : 1.8.7
SERVICE library version : 1.3.1 of Feb 1 2013 15:13:10
EXTRA library version : 1.3.1 of Feb 1 2013 15:13:10
IEG library version : 1.3.1 of Feb 1 2013 15:13:10
FILE library version : 1.8.1 of Feb 1 2013 15:13:10
Replies (4)
RE: Performance of cdo with netCDF4 - Added by Ralf Mueller about 11 years ago
Could you provide any example cdo calls?
RE: Performance of cdo with netCDF4 - Added by Uwe Schulzweida about 11 years ago
Hi Arne,
I observe nearly the same difference in the performance.
A factor of 2 comes from the thread-safe HDF5 version. The thread-safe HDF5 version makes a lot of pthread_canceled(), pthread_setcancelstate_internal(), pthread_mutex_lock() and pthread_mutex_unlock() calls.
The other factor of 2 comes from the netCDF4 layer. The most time-consuming part here is the function nc4_find_dim_len(). It seems that the performance depends on the number of arrays used. I got the same performance with netCDF and netCDF4 when I used only one large 4D array.
Unfortunately there is no way to increase the performance of CDO for reading netCDF4. CDO uses exactly the same netCDF interface for both file formats.
Regards,
Uwe
RE: Performance of cdo with netCDF4 - Added by Ralf Mueller about 11 years ago
I ran this benchmark:
require 'benchmark'
include Benchmark # we need the CAPTION and FORMAT constants
n = 10000
Benchmark.benchmark(CAPTION, 7, FORMAT, ">total:", ">avg:") do |x|
  tf = x.report("plain nc :") { IO.popen("cdo -sinfov -timmean oceLong.nc >/dev/null 2>&1").read }
  tt = x.report("plain nc4:") { IO.popen("cdo -sinfov -timmean oceLong.nc4 >/dev/null 2>&1").read }
  tu = x.report("nc4 zip:") { IO.popen("cdo -sinfov -timmean oceLong.nc4z >/dev/null 2>&1").read }
  [tf+tt+tu, (tf+tt+tu)/3]
end
with this result:
[ram@thingol:~/data/icon]ruby cdoBench.rb [15:43:18|13-10-24]
               user     system      total        real
plain nc :  0.000000   0.000000   0.000000 (  0.202038)
plain nc4:  0.000000   0.000000   0.000000 (  0.297458)
nc4 zip:    0.000000   0.000000   0.000000 (  1.236381)
>total:     0.000000   0.000000   0.000000 (  1.735876)
>avg:       0.000000   0.000000   0.000000 (  0.578625)
on these files (too large to upload):
-rw-rw-r-- 1 ram users 125M Oct 24 15:33 oceLong.nc4
-rw-rw-r-- 1 ram users  40M Oct 24 15:34 oceLong.nc4z
-rw-rw-r-- 1 ram users  524 Oct 24 15:41 cdoBench.rb
-rw-r--r-- 1 ram users 125M Oct 24 15:41 oceLong.nc
cdoBench.rb (524 Bytes)
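To compare the formats more directly, the "real" column can be turned into slowdown factors relative to the plain netCDF run. A small sketch, with the times copied from the output above; the constant and method names are only illustrative, and transform_values needs Ruby 2.4 or newer:

```ruby
# Wall-clock ("real") times in seconds, copied from the benchmark output.
REAL_TIMES = {
  "plain nc"  => 0.202038,
  "plain nc4" => 0.297458,
  "nc4 zip"   => 1.236381
}

# Slowdown of each entry relative to the baseline entry, rounded to 2 digits.
def slowdown_factors(times, baseline = "plain nc")
  base = times.fetch(baseline)
  times.transform_values { |t| (t / base).round(2) }
end

slowdown_factors(REAL_TIMES).each do |label, factor|
  puts format("%-10s %.2fx", label, factor)
end
```

For this single run that gives roughly 1.5x for uncompressed netCDF4 and roughly 6.1x for compressed netCDF4. These are one-shot measurements, so the factors are only a rough indication.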
RE: Performance of cdo with netCDF4 - Added by Arne Kriegsmann about 11 years ago
Hi Uwe,
that's very good to know. Thanks for the answer.
Arne