Option to select which dimensions to be removed with --reduce_dim.

Added by Brian Højen-Sørensen 2 months ago

Using --reduce_dim removes all dimensions of size 1. Is there an option to select which dimensions to remove (or keep)?

My problem is that we download and process some large datasets in parallel, and the files are provided one per timestep. That means my 'time' dimension is also removed when I use --reduce_dim, and not only the vertical dimensions that I would like to remove.

It would be very nice if one could remove only selected dimensions, for example:
cdo --reduce_dim,height,height_2 .....

Currently we use ncwa and ncks -x to remove the variables and dimensions, but that takes additional resources and requires temporary files (which ends up being quite a lot of IO). Alternatively, we would need to wait until we have merged the individual time steps into one forecast file and then use --reduce_dim, but that still gives us additional IO and takes longer. It would also only work if we actually have multiple time steps, which might not be the case for some other datasets.
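
For readers unfamiliar with the NCO workaround mentioned above, it might look roughly like the following bash sketch. The function name and file names are hypothetical, and the exact commands used in the original job are not shown in this thread; the sketch only assumes the standard ncwa -a (average over a dimension) and ncks -C -x -v (exclude a variable) options.

```shell
# Sketch only: remove one size-1 dimension with NCO in two steps.
# remove_dim_nco is a hypothetical helper, not part of NCO or CDO.
remove_dim_nco() {
    local dim="$1" infile="$2" outfile="$3" tmpfile status
    # The intermediate file is the extra IO the post complains about.
    tmpfile="$(mktemp "${outfile}.XXXXXX")" || return 1
    # Step 1: average over the dimension, which drops it from every variable.
    ncwa -O -a "$dim" "$infile" "$tmpfile"
    status=$?
    if [ "$status" -eq 0 ]; then
        # Step 2: exclude the leftover scalar coordinate variable of the same name.
        ncks -O -C -x -v "$dim" "$tmpfile" "$outfile"
        status=$?
    fi
    rm -f "$tmpfile"
    return "$status"
}

# Hypothetical usage:
# remove_dim_nco height dimension_time_height_test.nc outfile.nc
```

Each dimension to be removed needs its own pass through both tools, which is where the temporary files and extra IO come from.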

Is there some other (maybe undocumented) way of doing it? If not, I would like you to consider a new feature with this use case in mind.

Cheers,
Brian Højen-Sørensen


Replies (9)

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Estanislao Gavilan 2 months ago

Hi Brian,

I have never done it, but did you try the command isosurface (see 2.3.10 selsurface)?

Best regards,

Estanislao

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Brian Højen-Sørensen 2 months ago

Hi Estanislao,

Thanks for the quick reply. I didn't know about the selsurface operators. :-)

But it unfortunately fails with:
cdo isosurface (Abort): No processable variable found!

I thought it was because my variables are 4D (time,height,lat,lon) and not 3D as the documentation states. So I removed the time dimension to test the idea, but it still fails with both bottomvalue and topvalue.

I have uploaded two small sample files that can be used for testing. I can't see why it shouldn't work, at least with the one with only three dimensions (but that wouldn't solve my issue anyway).

cdo bottomvalue dimension_height_test.nc out.nc

cdo    bottomvalue (Abort): No processable variable found!
ncdump -h dimension_height_test.nc 
netcdf dimension_height_test {
dimensions:
    height = 1 ;
    lat = 51 ;
    lon = 51 ;
variables:
    double height(height) ;
        height:standard_name = "height" ;
        height:long_name = "height" ;
        height:units = "m" ;
        height:positive = "up" ;
        height:axis = "Z" ;
    double lat(lat) ;
        lat:standard_name = "latitude" ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
        lat:axis = "Y" ;
    double lon(lon) ;
        lon:standard_name = "longitude" ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
        lon:axis = "X" ;
    float sh(height, lat, lon) ;
        sh:standard_name = "specific_humidity" ;
        sh:long_name = "specific humidity" ;
        sh:units = "kg/kg" ;
        sh:param = "0.1.0" ;
        sh:_FillValue = -9999.f ;
        sh:missing_value = -9999.f ;
        sh:cell_methods = "time: mean" ;

Cheers,
Brian

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Estanislao Gavilan 2 months ago

Hi Brian,
I totally forgot about sellevel. That command should do the job.

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Brian Højen-Sørensen 2 months ago

Hi again,

Unfortunately, that doesn't remove the dimensions; it just selects the levels. It is already part of the command we used to create the file in the first place:

cdo -O -s --pedantic \
      -f nc4 \
      ...
      ...
      ...
      -settunits,seconds \
      -setmissval,-9999 \
      -remap,${GRID},${WEIGHTS} \
      -merge \
      -selltype,102 -selvar,pres ${SRCFILE} \
      -selltype,103 -selvar,2t,q,10u,10v ${SRCFILE} \
      -sellevel,0 -selvar,cc,tp ${SRCFILE} \
      tmp1_${SRCFILE}.nc

/Brian

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Karin Meier-Fleischer 2 months ago

Hi Brian,

I think the best way is to do the following:

# display the time stamp of the file
cdo -s -showtimestamp dimension_time_height_test.nc

# reduce one element dimensions
cdo --reduce_dim -copy dimension_time_height_test.nc tmp.nc

# add the time dimension
cdo -setreftime,2024-02-01,00:00:00,1sec -settaxis,2024-02-02,19:00:00,1sec tmp.nc outfile.nc

Result:

dimensions:
    time = UNLIMITED ; // (1 currently)
    lon = 51 ;
    lat = 51 ;
variables:
    double time(time) ;
        time:standard_name = "time" ;
        time:units = "seconds since 2024-2-1 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
        time:axis = "T" ;
    double lon(lon) ;
        lon:standard_name = "longitude" ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
        lon:axis = "X" ;
    double lat(lat) ;
        lat:standard_name = "latitude" ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
        lat:axis = "Y" ;
    float sh(time, lat, lon) ;
        sh:standard_name = "specific_humidity" ;
        sh:long_name = "specific humidity" ;
        sh:units = "kg/kg" ;
        sh:param = "0.1.0" ;
        sh:_FillValue = -9999.f ;
        sh:missing_value = -9999.f ;

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Brian Højen-Sørensen 2 months ago

Hi Karin,

Thanks for the solution, but it only gets me halfway and increases the complexity (I would have to extract the timestamp, split it with some bash commands, and insert the time axis again).
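
The extra bash step described here could be sketched as follows. This is a minimal illustration, assuming that showtimestamp prints a single timestamp of the form YYYY-MM-DDThh:mm:ss surrounded by whitespace and that the file holds exactly one time step; split_timestamp, ts_date, and ts_time are hypothetical names.

```shell
# Sketch only: split a timestamp such as "2024-02-02T19:00:00" into
# a date part and a time part for use with settaxis.
split_timestamp() {
    local ts
    # Strip the whitespace around the showtimestamp output (assumed format).
    ts="$(printf '%s' "$1" | tr -d '[:space:]')"
    ts_date="${ts%%T*}"   # everything before the first 'T'
    ts_time="${ts#*T}"    # everything after the first 'T'
}

# Hypothetical usage with the commands from the previous reply
# (requires cdo and the sample file):
# split_timestamp "$(cdo -s -showtimestamp dimension_time_height_test.nc)"
# cdo --reduce_dim -copy dimension_time_height_test.nc tmp.nc
# cdo -setreftime,2024-02-01,00:00:00,1sec \
#     -settaxis,"${ts_date}","${ts_time}",1sec tmp.nc outfile.nc
```

The reference time is copied from the reply above and would need to match the actual dataset.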

The current NCO commands do the job just fine, but I would like to be able to do it without using temporary files to reduce both CPU load and disk IO.
With the above commands I only get one temporary file instead of two, which is of course better than nothing. I haven't tested the performance difference yet.

I still think the option to just name the dimensions to reduce would be a nice feature.

But thanks for the quick replies and the different solutions.

/Brian

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Uwe Schulzweida 2 months ago

You can remove all vertical dimensions with size=1 with the CDO function setzaxis,surface.
surface is a predefined (undocumented) zaxis description for:

zaxistype = surface
size      = 1
levels    = 0 
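
A minimal usage sketch of this suggestion, wrapped in a hypothetical bash helper (drop_single_levels is not a CDO name, and the file names are taken from the sample files in this thread):

```shell
# Sketch only: replace every size-1 vertical axis with the predefined
# "surface" zaxis, so no vertical dimension is written to the output.
drop_single_levels() {
    cdo -setzaxis,surface "$1" "$2"
}

# Hypothetical usage (requires cdo and the sample file):
# drop_single_levels dimension_time_height_test.nc outfile.nc
```

Because setzaxis is an ordinary operator, it can presumably be chained into an existing CDO pipeline like the one posted earlier, which would avoid the temporary files; that placement is an assumption and has not been tested here.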

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Brian Højen-Sørensen 2 months ago

Hi Uwe,

Awesome, it works exactly as I wanted. :-)

I just went down a rabbit hole to try and optimize the job as much as possible.

Thanks for the amazing support and great tool in general.
Brian

RE: Option to select which dimensions to be removed with --reduce_dim. - Added by Karin Meier-Fleischer 2 months ago

Ah, that is really great. The undocumented things are always cool. 👍
