Performance of cdo with netCDF4

Added by Arne Kriegsmann about 11 years ago

I noticed that the format of the input files significantly influences the time that cdo needs to perform its actions.
For example, a file in netCDF4 format takes more than 4 times longer to process than the same data in netCDF format.

To avoid long computing times, I usually start by converting netCDF4 data into netCDF format. Is there a way to increase the performance of cdo for netCDF4?
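
For readers who want to try the conversion step described above, here is a minimal sketch. It assumes cdo's `-f nc` format option and the `copy` operator; the helper name `cdo_convert_command` and the file names are hypothetical placeholders, not taken from this thread.

```ruby
require 'shellwords'

# Build the cdo command line that copies a file into another format.
# 'nc' selects classic netCDF; the file names below are placeholders.
def cdo_convert_command(input, output, format: 'nc')
  ['cdo', '-f', format, 'copy', input, output].shelljoin
end

cmd = cdo_convert_command('input.nc4', 'output.nc')
puts cmd
# To actually run the conversion (requires cdo on the PATH):
#   system(cmd)
```

The command is only built and printed here, so the sketch can be checked without cdo installed. The converted file is a full copy, so it needs additional disk space.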

Regards
Arne

PS:
Climate Data Operators version 1.5.9 (http://code.zmaw.de/projects/cdo)
Compiler: gcc -std=gnu99 -fopenmp -pthread -O2
version: gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54)
with: PTHREADS OpenMP NC4 SZ Z JASPER PROJ.4
Compiled: by k202066 on c5comp1.dkrz.de (x86_64-unknown-linux-gnu) Feb 1 2013 15:13:17
CDI library version : 1.5.9 of Feb 1 2013 15:13:11
CGRIBEX library version : 1.5.6 of Dec 17 2012 13:44:05
GRIB_API library version : 1.9.0
netCDF library version : 4.1.3 of Apr 10 2012 16:09:07 $
HDF5 library version : 1.8.7
SERVICE library version : 1.3.1 of Feb 1 2013 15:13:10
EXTRA library version : 1.3.1 of Feb 1 2013 15:13:10
IEG library version : 1.3.1 of Feb 1 2013 15:13:10
FILE library version : 1.8.1 of Feb 1 2013 15:13:10


Replies (4)

RE: Performance of cdo with netCDF4 - Added by Ralf Mueller about 11 years ago

Could you provide any example cdo calls?

RE: Performance of cdo with netCDF4 - Added by Uwe Schulzweida about 11 years ago

Hi Arne,

I observe nearly the same difference in performance.
A factor of 2 comes from the thread-safe HDF5 version. The thread-safe HDF5 version uses a lot of pthread_canceled(), pthread_setcancelstate_internal(), pthread_mutex_lock() and pthread_mutex_unlock() calls.
The other factor of 2 comes from the netCDF4 layer. The most time-consuming part here is the function nc4_find_dim_len(). It seems that the performance depends on the number of arrays used. I get the same performance with netCDF and netCDF4 if I use only one large 4D array.
Unfortunately, there is no way to increase the performance of CDO for reading netCDF4. CDO uses exactly the same netCDF interface for both file formats.

Regards,
Uwe

RE: Performance of cdo with netCDF4 - Added by Ralf Mueller about 11 years ago

I ran this benchmark:

  require 'benchmark'
  include Benchmark          # we need the CAPTION and FORMAT constants

  n = 10000
  Benchmark.benchmark(CAPTION, 7, FORMAT, ">total:", ">avg:") do |x|
    tf = x.report("plain nc :")  { IO.popen("cdo -sinfov -timmean oceLong.nc   >/dev/null 2>&1").read }
    tt = x.report("plain nc4:")  { IO.popen("cdo -sinfov -timmean oceLong.nc4  >/dev/null 2>&1").read }
    tu = x.report("nc4   zip:")  { IO.popen("cdo -sinfov -timmean oceLong.nc4z >/dev/null 2>&1").read }
    [tf+tt+tu, (tf+tt+tu)/3]
  end

with this result:
[ram@thingol:~/data/icon]ruby cdoBench.rb                                                    [15:43:18|13-10-24]
              user     system      total        real
plain nc :  0.000000   0.000000   0.000000 (  0.202038)
plain nc4:  0.000000   0.000000   0.000000 (  0.297458)
nc4   zip:  0.000000   0.000000   0.000000 (  1.236381)
>total:   0.000000   0.000000   0.000000 (  1.735876)
>avg:     0.000000   0.000000   0.000000 (  0.578625)
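
To make the comparison easier to read, the wall-clock ("real") column above can be turned into slowdown factors relative to the plain netCDF file. This is only a small sketch: the numbers are copied from the output above, and the variable names are illustrative.

```ruby
# Wall-clock ("real") times in seconds, copied from the benchmark output above.
real = {
  'plain nc'  => 0.202038,
  'plain nc4' => 0.297458,
  'nc4 zip'   => 1.236381
}

# Express every timing as a multiple of the classic netCDF run.
baseline = real['plain nc']
slowdown = real.transform_values { |seconds| (seconds / baseline).round(2) }

slowdown.each do |label, factor|
  puts format('%-10s %.2fx', label, factor)
end
```

For this single run, the uncompressed netCDF4 file is roughly 1.5 times slower than the netCDF file, and the compressed netCDF4 file roughly 6 times slower. Each command was timed only once, so the factors are indicative rather than precise.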

on these files (too large to upload):

-rw-rw-r-- 1 ram users  125M Oct 24 15:33 oceLong.nc4
-rw-rw-r-- 1 ram users   40M Oct 24 15:34 oceLong.nc4z
-rw-rw-r-- 1 ram users   524 Oct 24 15:41 cdoBench.rb
-rw-r--r-- 1 ram users  125M Oct 24 15:41 oceLong.nc

Attachment: cdoBench.rb (524 Bytes)

RE: Performance of cdo with netCDF4 - Added by Arne Kriegsmann about 11 years ago

Hi Uwe,

that's very good to know. Thanks for the answer.

Arne
