Is there an operator to create a new variable independently from another?

Added by Jan Griesfeller about 4 years ago

Hi,

I have a quick question:
Is there a cdo operator that can create a new variable? I basically need to recreate the latitude variable, because the modellers did not create it in a CF-compliant way.

In nco I do this:
ncap2 -o ${outfile} -O -s 'lat=array(30d,.1,$lat)' ${outfile}

but it's not working on files that do not fit into memory.

So is there a way to achieve this in cdo?

Thanks for your help!


Replies (5)

RE: Is there an operator to create a new variable independently from another? - Added by Ralf Mueller about 4 years ago

hi!

CDO can create data variables with operators like expr, const, random, topo2, seq and stdatm. Coordinate variables cannot be created that way, because they only occur in relation to a data variable. But you can use something like the output of the griddes operator to describe a target grid, and pass this as a parameter to random or the other operators, or use it for remapping.
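A minimal sketch of the generator operators mentioned above, assuming CDO is installed. The grid name r360x180 is CDO's predefined global regular 1-degree lon-lat grid; a grid description file could be passed instead. The output file names are placeholders, not taken from this thread.

```shell
#!/bin/sh
# Sketch: create data variables from scratch with CDO generator operators.
# const_r360x180.nc and random_r360x180.nc are placeholder output names.
grid=r360x180   # predefined global regular lon-lat grid, 360 x 180 points

if command -v cdo >/dev/null 2>&1; then
    # constant field with the value 0 on the chosen grid
    cdo -f nc const,0,"$grid" const_r360x180.nc
    # field of random numbers on the same grid
    cdo -f nc random,"$grid" random_r360x180.nc
else
    echo "cdo not found; skipping" >&2
fi
```

In both cases the lon/lat coordinate variables in the output come from the grid, not from a separate "create variable" step.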

BTW: I don't see any input file in your example ncap2 call, so the notion of it not working on files that do not fit into memory confuses me a bit. Maybe you can explain that in more detail.

thx in advance
ralf

RE: Is there an operator to create a new variable independently from another? - Added by Jan Griesfeller about 4 years ago

Hi,

since you have asked, here's the whole picture:
The data I got looks like this:

netcdf aerocom3_CHIMERE.cams61.rerun_concpm10_Surface_2018_hourly {
dimensions:
    lat = 421 ;
    lon = 701 ;
    Time = UNLIMITED ; // (8760 currently)
variables:
    double lat(lat, lon) ;
        lat:units = "degrees_north" ;
        lat:long_name = "Latitude" ;
        lat:standard_name = "Latitude" ;
        lat:axis = "Y" ;
    double lon(lat, lon) ;
        lon:units = "degrees_east" ;
        lon:long_name = "Longitude" ;
        lon:standard_name = "Longitude" ;
        lon:axis = "X" ;
    double Times(Time) ;
        Times:long_name = "seconds since 2018-01-01 00:00:00" ;
        Times:axis = "T" ;
    float concpm10(Time, lat, lon) ;
        concpm10:_FillValue = -999.f ;
        concpm10:units = "ug/m3" ;
        concpm10:long_name = "PM10" ;

I need to be able to read it with the Python iris package, so the output needs to be more or less CF compliant. The goal is something like this (not entirely CF, but good enough at this point):

netcdf aerocom3_CHIMERE.cams61.rerun_concpm10_Surface_2018_hourly {
dimensions:
    lat = 421 ;
    lon = 701 ;
    time = UNLIMITED ; // (8760 currently)
variables:
    double lat(lat) ;
        lat:units = "degrees_north" ;
        lat:standard_name = "latitude" ;
        lat:axis = "Y" ;
    double lon(lon) ;
        lon:units = "degrees_east" ;
        lon:standard_name = "longitude" ;
        lon:axis = "X" ;
    double time(time) ;
        time:standard_name = "time" ;
        time:units = "hours since 2018-1-1 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
        time:axis = "T" ;
    float concpm10(time, lat, lon) ;
        concpm10:long_name = "PM10" ;
        concpm10:units = "ug/m3" ;
        concpm10:_FillValue = -999.f ;
        concpm10:missing_value = -999.f ;

I was able to do the conversion entirely with nco (working around some bugs that, e.g., let me remove the current lat/lon vars only when working on a netcdf3 file), but the process took 45 minutes per file.
Since I have had some good experience with cdo lately, I thought I'd give it a shot. I could find a replacement for all the nco steps, except for recreating the lat and lon variables with the right values. Hence the question.
I am now down to 3 minutes with the following commands ($file is the input file):

cdo -r -f nc4 delname,lat,lon,Times ${file} ${outfile}
cdo -r settaxis,2018-01-01,00:00:00,1hour ${outfile} ${nc4outfile}
ncap2 -o ${nc4outfile} -O -s 'lat=array(30d,.1,$lat);lon=array(-25d,.1,$lon)' ${nc4outfile}
ncatted -O -a 'units,lon,o,c,degrees_east' -a 'standard_name,lon,o,c,longitude' -a 'axis,lon,o,c,X' -a 'units,lat,o,c,degrees_north' -a 'standard_name,lat,o,c,latitude' -a 'axis,lat,o,c,Y' ${nc4outfile}

I am aware that I can edit the attributes with cdo as well, but how do I create the axes with the right values?

Thanks for your help!

RE: Is there an operator to create a new variable independently from another? - Added by Ralf Mueller about 4 years ago

hi Jan!

thx for the information. I see some issues with your solution:

  • cdo -r -f nc4 delname,lat,lon,Times ${file} ${outfile}
    This should give a warning and copy your input to the output, because coordinate variables cannot be deleted with the delname operator.
  • Essentially, your initial input has a curvilinear grid, because the lat variable (and also lon) depends on both dimensions, lat AND lon. It could be that the real lon-lat locations of the input are identical to a pure lon-lat grid like the one defined in your output. I hope you are sure this is the case.
  • My best guess is a remapping to the target grid you want:
    1. Create a griddes file. For the lonlat type this is really easy; check the output of
      cdo griddes -topo
    2. Perform an interpolation, maybe with nearest-neighbor in order to keep the original values, but that's up to you because I don't know the details.
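The two steps above could be sketched as follows. The grid values are taken from the ncap2 call earlier in this thread (lat starting at 30, lon starting at -25, both with a 0.1 degree step, 421 x 701 points). The file names grid_lonlat.txt, input.nc and output.nc are placeholders, and whether CDO recognizes the source grid of a given input file is an assumption that needs checking.

```shell
#!/bin/sh
# Sketch: describe the target lon-lat grid, then remap onto it.
# Grid values follow the ncap2 call in this thread; file names are placeholders.
cat > grid_lonlat.txt <<'EOF'
gridtype = lonlat
xsize    = 701
ysize    = 421
xfirst   = -25
xinc     = 0.1
yfirst   = 30
yinc     = 0.1
EOF

infile=${1:-input.nc}     # placeholder input name
outfile=${2:-output.nc}   # placeholder output name

if command -v cdo >/dev/null 2>&1 && [ -f "$infile" ]; then
    # nearest-neighbour remapping copies source values to the target grid
    cdo -f nc4 remapnn,grid_lonlat.txt "$infile" "$outfile"
fi
```

If the source points are already known to sit exactly on this regular grid, replacing the grid description with setgrid,grid_lonlat.txt instead of interpolating may be an alternative worth testing.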

BTW: I believe the only thing missing to make your original input CF-compliant is the coordinates attribute for the data variable

cheers
ralf

RE: Is there an operator to create a new variable independently from another? - Added by Jan Griesfeller about 4 years ago

Hi Ralf,

I was also thinking that the grid is curvilinear, but the lat values don't change with longitude and the lon values don't change with latitude. The original grid of the model is likely curvilinear, but they were asked to submit data on a rectangular grid. In addition, the model group acknowledged that the output of my small script is correct.

The delname operator did not complain btw, but I guess it did not recognise the variables as grid variables either.

So the conclusion is that one can't create new variables based on something else than the time variable (for which the -expr operator is used). Is that correct?

Anyway, thanks for your support!

Jan

RE: Is there an operator to create a new variable independently from another? - Added by Ralf Mueller about 4 years ago

Jan Griesfeller wrote:

Hi Ralf,

I was also thinking that the grid is curvilinear, but the lat values don't change with longitude and the lon values don't change with latitude. The original grid of the model is likely curvilinear, but they were asked to submit data on a rectangular grid. In addition, the model group acknowledged that the output of my small script is correct.

You could check the coordinates easily with

cdo griddes
But if your processing is confirmed to be correct I guess there is nothing to worry about.

The delname operator did not complain btw, but I guess it did not recognize the variables as grid variables either.

So the conclusion is that one can't create new variables based on something else than the time variable (for which the -expr operator is used). Is that correct?

You can create data variables, but NOT coordinate variables like lon/lat/time. For the time axis you need to use settaxis, just like you did in your example.
More examples of expr usage can be found here and here
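As a hedged sketch of that distinction: expr derives a new data variable from an existing one, while the time axis is set with a dedicated operator. The variable concpm10 and the settaxis arguments come from this thread; the name concpm10_mg, the unit conversion, and the file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: derive a new data variable with expr, then set the time axis.
# concpm10 is the variable from this thread; concpm10_mg is a made-up name.
infile=${1:-input.nc}     # placeholder input name
tmpfile=expr_tmp.nc       # intermediate file, removed at the end
outfile=${2:-output.nc}   # placeholder output name

if command -v cdo >/dev/null 2>&1 && [ -f "$infile" ]; then
    # new data variable computed from an existing one (ug/m3 -> mg/m3)
    cdo -f nc4 expr,'concpm10_mg=concpm10/1000' "$infile" "$tmpfile"
    # coordinate variables such as time are handled by dedicated operators
    cdo -r settaxis,2018-01-01,00:00:00,1hour "$tmpfile" "$outfile"
    rm -f "$tmpfile"
fi
```

Note that the output of expr contains only the variables assigned in the expression, so the original concpm10 is not carried over in this sketch.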

Anyway, thanks for your support!

Jan
