Project

General

Profile

Using CDO Efficiently for a Perturbation

Added by Matt Thompson almost 7 years ago

Dear CDO Gurus,

I'm hoping you can help me be efficient. I'm trying to figure out how to use CDO to perturb some variables in some NC4 files. The scenario is we have three restarts from three times: state_t0, state_t1, state_t2 where t0 is newest, t1 is one day before, t2 is two days before. We want to perturb, say, U wind where from our states U0 is from t0, U1, from t1, and U2 from t2 with these formulas:

U_plus  = U0 + ((U1-U2)/factor)
U_minus = U0 + ((U1-U2)/factor)

and then create a state_plus which is state_t0 but with U_plus instead of U0, and a state_minus which is state_t0 with U_minus

I can map this process with:

cdo selname,U state_t0.nc4 U0.nc4
cdo selname,U state_t1.nc4 U1.nc4
cdo selname,U state_t2.nc4 U2.nc4

cdo sub U1.nc4 U2.nc4 deltaU.nc4
cdo divc,factor deltaU.nc4 deltaUpert.nc4

cdo add U0.nc4 deltaUpert.nc4 U_plus.nc4
cdo sub U0.nc4 deltaUpert.nc4 U_minus.nc4

cdo replace state_t0.nc4 U_plus.nc4  state_plus.nc4
cdo replace state_t0.nc4 U_minus.nc4 state_minus.nc4

Now, I can see how to chain these with:

cdo -L replace state_t0.nc4 -add -selname,U state_t0.nc4 -divc,factor -sub -selname,U state_t1.nc4 -selname,U state_t2.nc4 state_plus.nc4
cdo -L replace state_t0.nc4 -sub -selname,U state_t0.nc4 -divc,factor -sub -selname,U state_t1.nc4 -selname,U state_t2.nc4 state_minus.nc4

As near as I can tell, these are the same result.

What I was wondering is is there a "better" way of doing this? Maybe with exprf/aexprf? I'm trying to learn those operators/functions of CDO I never touch because I tend to think in simple ways. :)


Replies (6)

RE: Using CDO Efficiently for a Perturbation - Added by Matt Thompson almost 7 years ago

Sigh, I of course meant:

U_plus  = U0 + ((U1-U2)/factor)
U_minus = U0 - ((U1-U2)/factor)

RE: Using CDO Efficiently for a Perturbation - Added by Ralf Mueller almost 7 years ago

factor=3.141529
# collect the data into a single file
cdo -L -merge -chname,U,U0 -selname,U state_t0.nc4 -chname,U,U1 -selname,U state_t1.nc4 -chname,U,U2 -selname,U state_t2.nc4 AllU.nc
#replace old U0 with new U0
cdo -L -replace state_t0.nc4 -expr,"U0=U0+((U1+U2)/$factor)" AllU.nc state_plus.nc

What you get is a shorter final command line, but you have to do an additional one to collect the variables in a single file. Unfortunately merge cannot be chained further.

Personally I like your long call, because you avoid temporal data completely - nothing left to remove. this really pays back when working with larger data sets.

hth
ralf

RE: Using CDO Efficiently for a Perturbation - Added by Matt Thompson almost 7 years ago

Ralf,

Interesting call. I'll test it to see if it is faster.

Next question if you don't mind: CDO seems to have removed a dimension and added another:

Before:

netcdf state_t0 {
dimensions:
        lon = 180 ;
        lat = 1080 ;
        lev = 72 ;
        edge = 73 ;
        time = 1 ;
variables:
        double lon(lon) ;
                lon:units = "degrees_east" ;
                lon:long_name = "Longitude" ;
                lon:_Storage = "contiguous" ;
                lon:_Endianness = "little" ;
        double lat(lat) ;
                lat:units = "degrees_north" ;
                lat:long_name = "Latitude" ;
                lat:_Storage = "contiguous" ;
                lat:_Endianness = "little" ;
        double lev(lev) ;
                lev:units = "layer" ;
                lev:long_name = "sigma at layer midpoints" ;
                lev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
                lev:coordinate = "eta" ;
                lev:positive = "down" ;
                lev:formulaTerms = "ap: ak b: bk ps: ps p0: p00" ;
                lev:_Storage = "contiguous" ;
                lev:_Endianness = "little" ;
        double edge(edge) ;
                edge:units = "level" ;
                edge:long_name = "sigma at layer edges" ;
                edge:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
                edge:coordinate = "eta" ;
                edge:positive = "down" ;
                edge:formulaTerms = "ap: ak b: bk ps: ps p0: p00" ;
                edge:_Storage = "contiguous" ;
                edge:_Endianness = "little" ;
        double time(time) ;
                time:units = "minutes since 2017-07-03 21:00:00" ;
                time:long_name = "time" ;
                time:begin_date = 20170703 ;
                time:begin_time = 210000 ;
                time:_Storage = "contiguous" ;
                time:_Endianness = "little" ;

After:

netcdf state_plus {
dimensions:
        lon = 180 ;
        lat = 1080 ;
        edge = 73 ;
        lev = 72 ;
        bnds = 2 ;
variables:
        double lon(lon) ;
                lon:standard_name = "longitude" ;
                lon:long_name = "longitude" ;
                lon:units = "degrees_east" ;
                lon:axis = "X" ;
                lon:_Storage = "contiguous" ;
                lon:_Endianness = "little" ;
        double lat(lat) ;
                lat:standard_name = "latitude" ;
                lat:long_name = "latitude" ;
                lat:units = "degrees_north" ;
                lat:axis = "Y" ;
                lat:_Storage = "contiguous" ;
                lat:_Endianness = "little" ;
        double edge(edge) ;
                edge:long_name = "sigma at layer edges" ;
                edge:units = "level" ;
                edge:positive = "down" ;
                edge:axis = "Z" ;
                edge:coordinate = "eta" ;
                edge:formulaTerms = "ap: ak b: bk ps: ps p0: p00" ;
                edge:_Storage = "contiguous" ;
                edge:_Endianness = "little" ;
        double lev(lev) ;
                lev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
                lev:axis = "Z" ;
                lev:positive = "down" ;
                lev:long_name = "hybrid sigma pressure coordinate" ;
                lev:units = "layer" ;
                lev:formula_terms = "ap: ap b: b ps: ps" ;
                lev:bounds = "lev_bnds" ;
                lev:coordinate = "eta" ;
                lev:formulaTerms = "ap: ak b: bk ps: ps p0: p00" ;
                lev:_Storage = "contiguous" ;
                lev:_Endianness = "little" ;
        double lev_bnds(lev, bnds) ;
                lev_bnds:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
                lev_bnds:units = "layer" ;
                lev_bnds:formula_terms = "ap: ap_bnds b: b_bnds ps: ps" ;
                lev_bnds:_Storage = "contiguous" ;
                lev_bnds:_Endianness = "little" ;

The new(?) lev_bnds I can remove with ncks easily (maybe with CDO as well, I just know the ncks command by heart). But I'm wondering if there is a good way to bring back time? Now, I can understand perhaps why CDO did this. I was doing math on files from different times, so it probably had no idea what to make the time in the last. I know there is the settime, settaxis, setreftime, etc. operators, but, again, I'm lazy. Is there a way to perhaps have CDO take the time from one file (state_t0) and apply it to a new file (state_plus)? Or if not, is there even a way for CDO to make a time variable that looks like state_t0's? CDO usually handles time a bit different.

RE: Using CDO Efficiently for a Perturbation - Added by Ralf Mueller almost 7 years ago

Could you upload some data? CDOs handling of coordinate variables highly depends on their connection the data fields...

RE: Using CDO Efficiently for a Perturbation - Added by Matt Thompson almost 7 years ago

Ralf,

A colleague of mine uploaded some data:

data link: ftp://gmaoftp.gsfc.nasa.gov/pub/data/dvarier/cdo_question/
org.nc4 - original fvcore_internal data
U0.nc4 - output from the command: "cdo selname,U org.nc4 U0.nc4"
org.txt - ncdump -h output
U0.txt - ncdump -h output

RE: Using CDO Efficiently for a Perturbation - Added by Uwe Schulzweida almost 7 years ago

The variables in the original file are declared as:

    double U(lev, lat, lon) ;
That means none of the variables depends on the time. CDO doesn't process coordinate variables which are not used.
Add the time dimension to the variable definition in order to kept this coordinate variable and dimension:
    double U(time, lev, lat, lon) ;

    (1-6/6)