CDO metadata extraction is unusually slow

Added by Mark Payne almost 3 years ago

Hi,

I have been using CDO to extract some of the metadata from my NetCDF files, but I am finding it painfully slow. For example:

time cdo showdate so_Omon_ACCESS-ESM1-5_historical_r1i1p1f1_gn_201001-201412.nc

2010-01-16 2010-02-15 2010-03-16 2010-04-16 2010-05-16 2010-06-16 2010-07-16 2010-08-16 2010-09-16 2010-10-16 2010-11-16 2010-12-16 2011-01-16 2011-02-15 2011-03-16 2011-04-16 2011-05-16 2011-06-16 2011-07-16 2011-08-16 2011-09-16 2011-10-16 2011-11-16 2011-12-16 2012-01-16 2012-02-15 2012-03-16 2012-04-16 2012-05-16 2012-06-16 2012-07-16 2012-08-16 2012-09-16 2012-10-16 2012-11-16 2012-12-16 2013-01-16 2013-02-15 2013-03-16 2013-04-16 2013-05-16 2013-06-16 2013-07-16 2013-08-16 2013-09-16 2013-10-16 2013-11-16 2013-12-16 2014-01-16 2014-02-15 2014-03-16 2014-04-16 2014-05-16 2014-06-16 2014-07-16 2014-08-16 2014-09-16 2014-10-16 2014-11-16 2014-12-16
cdo showdate: Processed 1 variable over 60 timesteps [0.12s 65MB]

real 0m7.198s
user 0m0.131s
sys 0m0.058s

Doing the same exercise using ncdump on the same file gives a very different result:

time ncdump -ct so_Omon_ACCESS-ESM1-5_historical_r1i1p1f1_gn_201001-201412.nc

netcdf so_Omon_ACCESS-ESM1-5_historical_r1i1p1f1_gn_201001-201412 {
<<snip>>
data:

time = "2010-01-16 12", "2010-02-15", "2010-03-16 12", "2010-04-16", 
"2010-05-16 12", "2010-06-16", "2010-07-16 12", "2010-08-16 12",
"2010-09-16", "2010-10-16 12", "2010-11-16", "2010-12-16 12",
"2011-01-16 12", "2011-02-15", "2011-03-16 12", "2011-04-16",
"2011-05-16 12", "2011-06-16", "2011-07-16 12", "2011-08-16 12",
"2011-09-16", "2011-10-16 12", "2011-11-16", "2011-12-16 12",
"2012-01-16 12", "2012-02-15 12", "2012-03-16 12", "2012-04-16",
"2012-05-16 12", "2012-06-16", "2012-07-16 12", "2012-08-16 12",
"2012-09-16", "2012-10-16 12", "2012-11-16", "2012-12-16 12",
"2013-01-16 12", "2013-02-15", "2013-03-16 12", "2013-04-16",
"2013-05-16 12", "2013-06-16", "2013-07-16 12", "2013-08-16 12",
"2013-09-16", "2013-10-16 12", "2013-11-16", "2013-12-16 12",
"2014-01-16 12", "2014-02-15", "2014-03-16 12", "2014-04-16",
"2014-05-16 12", "2014-06-16", "2014-07-16 12", "2014-08-16 12",
"2014-09-16", "2014-10-16 12", "2014-11-16", "2014-12-16 12" ;

<<snip>>
real 0m0.195s
user 0m0.032s
sys 0m0.001s

i.e. roughly 37x faster (7.198 s versus 0.195 s of wall-clock time). Now, I'll confess that this is an extreme example to illustrate the point: the file in question is being accessed on a remote server over a VPN. But even when I'm working with files locally on an SSD I still see ncdump outperforming cdo by a factor of 3x or so. To me, it looks like CDO is doing a lot more disk I/O than ncdump, which is slowing the process. This issue also affects many of the other metadata and information operators, e.g. sinfo, and makes them irritatingly slow when trying to gather metadata across thousands of files.

Is there any workaround for this problem or a faster way to extract metadata? Or is it something that we just have to live with?

Thanks!

Mark

PS: This happens with pretty much every file I've tried, so I haven't attached the file from the example.


Replies (2)

RE: CDO metadata extraction is unusually slow - Added by Uwe Schulzweida almost 3 years ago

Hi Mark,

When opening a NetCDF file in CDO, all metadata is read. The CDO metadata also includes all coordinate variables. So the complete grid is read in.
There is no way to speed this up, sorry.

Cheers,
Uwe

RE: CDO metadata extraction is unusually slow - Added by Mark Payne almost 3 years ago

Hi Uwe,

Ok, thanks for letting me know anyway.

Mark
