Regrid multiple netcdf files

Added by Samantha Andrews over 6 years ago

Hello

I have a few hundred NetCDF files that I need to regrid. I can successfully use the following command in Cygwin to regrid a single NetCDF

cdo remapnn,grid2.txt inputfile.nc outputfile.nc

However, for obvious reasons, I'd like something a little more automated.

I have tried the following

for i in $(ls); do cdo remapnn,grid2.txt $(i) $(i)wgs.nc; done

But received this error:
cdo (Abort): Too few streams specified! Operator remapnn,grid2.txt needs 1 input and 1 output streams.
-bash: i: command not found
-bash: i: command not found

Amongst other things, I also tried

for i in $(ls); do cdo remapnn,grid2.txt $(ls) $(ls)wgs.nc; done

This just gave me a NetCDF file named after the grid2.txt file rather than after the input NetCDF files.

If anyone has any suggestions, I would be very grateful.


Replies (10)

RE: Regrid multiple netcdf files - Added by Samantha Andrews over 6 years ago

So of course, in true "I have no idea what I'm doing" style, as soon as I post a question I figure out the answer 40 minutes later...

So for those who might have the same problem, here is the code that works:

for i in $(ls); do cdo remapnn,grid2.txt ${i} regrid/${i}wgs.nc; done

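Note the curly braces rather than round brackets: ${i} expands the loop variable i, whereas $(i) tries to run a command called i (hence the "command not found" errors in my first post). Also note the addition of an output directory in the output file name, which prevents cdo trying to regrid already regridded files.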
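
For anyone adapting the loop above, here is a minimal sketch of a variant that uses a shell glob instead of $(ls), so that only files ending in .nc are picked up and file names containing spaces survive. The function name regrid_all and the _wgs output suffix are made up for illustration; only the cdo remapnn call comes from this thread.

```shell
# Regrid every *.nc file in the current directory into an output directory.
# regrid_all is a hypothetical helper name; the cdo call is the one from the thread.
regrid_all() {
    local grid="$1" outdir="$2" f
    mkdir -p "$outdir"
    for f in *.nc; do
        # If no file matches, the pattern stays literal; skip that case.
        [ -e "$f" ] || continue
        # ${f%.nc} strips the extension, so a.nc becomes regrid/a_wgs.nc
        cdo remapnn,"$grid" "$f" "$outdir/${f%.nc}_wgs.nc"
    done
}
```

It would be called as: regrid_all grid2.txt regrid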

RE: Regrid multiple netcdf files - Added by Ralf Mueller over 6 years ago

hi Samantha!

the question of optimization has multiple options. I'll try to come up with what I can guess:

Pre-compute interpolation weights

The remap*** operators actually do two things in one step

  1. compute interpolation weights
  2. apply these weights to input data

if your input files split into groups of identical grids, you can generate the weights for each grid once with gennn and then apply them with the remap operator in your loop. this should speed up your loop. The amount of speedup depends on the size of your target and source grids.
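
A rough sketch of that idea, assuming all input files in the directory share one source grid. The function name, the weights file name weights_nn.nc and the _wgs suffix are placeholders chosen for this example; gennn and remap are the CDO operators mentioned above.

```shell
# Compute nearest-neighbour weights once, then reuse them for every file.
# Assumes every *.nc file in the current directory is on the same source grid.
regrid_with_weights() {
    local grid="$1" outdir="$2" f
    local weights="$outdir/weights_nn.nc"   # placeholder name, kept outside the input glob
    mkdir -p "$outdir"
    for f in *.nc; do
        [ -e "$f" ] || continue
        # Step 1: generate the weights only if the weights file does not exist yet.
        # A leftover weights file from an earlier run would be reused as-is.
        [ -e "$weights" ] || cdo gennn,"$grid" "$f" "$weights"
        # Step 2: apply the precomputed weights to the input data
        cdo remap,"$grid","$weights" "$f" "$outdir/${f%.nc}_wgs.nc"
    done
}
```

It would be called as: regrid_with_weights grid2.txt regrid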

OpenMP

You can use the -P <num> switch with a <num> being set to the number of your logical CPUs. you can find that out using the TaskManager (available with a right-click on your taskbar). this will speed up the computation of the weights - not the application of the weight. if you take a number higher than the theoretical maxval, CDO will reset to the max.
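
As a small sketch, the thread count could also be picked up on the command line instead of from the TaskManager. This assumes the nproc utility from GNU coreutils is installed (it falls back to 1 if it is not); the function name is made up for the example.

```shell
# Run one regridding call with as many OpenMP threads as logical CPUs.
# nproc is an assumption (GNU coreutils); without it the sketch uses 1 thread.
regrid_openmp() {
    local grid="$1" infile="$2" outfile="$3" ncpu
    ncpu=$(nproc 2>/dev/null || echo 1)
    cdo -P "$ncpu" remapnn,"$grid" "$infile" "$outfile"
}
```

It would be called as: regrid_openmp grid2.txt inputfile.nc outputfile.nc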

Process-based parallelization

This technique can be applied to any call on the command line - not only CDO. Instead of running the iterations of your for loop one after another, you can run chunks of these calls in parallel using GNU parallel. I am pretty sure that there is a cygwin package for this. What you have to do is create a list of all your commands, one per line, and run parallel on it with a given number of processes to spawn (-j option)

for i in $(ls); do echo "cdo remapnn,grid2.txt ${i} regrid/${i}wgs.nc"; done | parallel -v -j 4

If you want to combine this technique with the OpenMP switch -P be careful - it can slow down your system if you take too much. Usually taking a higher number for parallel works well with IO-related calls (like yours).

Finally depending on the number of your input files, their source grid(s), number of timesteps and your target grid these steps can be combined for best speedup.

hth
ralf

RE: Regrid multiple netcdf files - Added by Samantha Andrews over 6 years ago

Hi Ralf

Thanks for your reply and explanation - this is very helpful.

RE: Regrid multiple netcdf files - Added by Kizje Marif over 4 years ago

Is this also possible for multiple grb2 files? I am trying to convert the latest ICON data with CDO with the following line:

for i in $(ls); do cdo -f grb2 remapnn,${TARGET_GRID_DESCRIPTION} ${i} regrid/${i}wgs.grb2; done

The error it gives me is that all the files are empty.

cdo remapnn: Open failed on >icon_global_icosahedral_single-level_2020062400_051_T_2M.grib2<
File is empty

cdo remapnn: Open failed on >icon_global_icosahedral_single-level_2020062400_052_T_2M.grib2<
File is empty

cdo remapnn: Open failed on >icon_global_icosahedral_single-level_2020062400_053_T_2M.grib2<
File is empty

cdo remapnn: Open failed on >icon_global_icosahedral_single-level_2020062400_054_T_2M.grib2<
File is empty

So,
- How can I convert multiple grb2 files?
- How can I indicate that I only want the files that end with grib2?
- Is it also possible that the converted files are saved in a new folder?

Cheers,
Kizje

RE: Regrid multiple netcdf files - Added by Kizje Marif over 4 years ago

I redownloaded everything; now I get the following error:
cdo remapnn (Abort): Reference to source grid not found!
cdo remapnn (Warning): Reference to horizontal grid not available!

RE: Regrid multiple netcdf files - Added by Ralf Mueller over 4 years ago

could you post a download link? maybe you have to download the grid files, too.

RE: Regrid multiple netcdf files - Added by Kizje Marif over 4 years ago

What do you mean by download link? I am currently downloading the grib2 files from https://opendata.dwd.de/weather/nwp/icon/grib/

RE: Regrid multiple netcdf files - Added by Ralf Mueller over 4 years ago

I took something like this: https://opendata.dwd.de/weather/nwp/icon/grib/06/t/icon_global_icosahedral_model-level_2020070706_000_31_T.grib2.bz2

when you check the unpacked file with cdo sinfov, you will notice that the coordinates cannot be found. this is due to the fact that the ICON grid cannot be stored in grib2 directly. But there is a reference number stored in the grib file:

grib_dump icon_global_icosahedral_model-level_2020070706_000_33_T.grib2 | grep numberOfGrid
  numberOfGridUsed = 26;
  numberOfGridInReference = 1;

So your input is based on grid number 26. Then you need to download the gridfile from icon-downloads.zmaw.de (http://icon-downloads.mpimet.mpg.de/grids/public/icon_grid_0026_R03B07_G.nc)

After this you can attach these coordinates to the file and do all coordinate-related analysis afterwards:

cdo -sinfov -setgrid,icon_grid_0026_R03B07_G.nc icon_global_icosahedral_model-level_2020070706_000_33_T.grib2
cdo(1) setgrid: Process started
   File format : GRIB2
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : DWD      unknown  v instant       1   1   2949120   1  P16  : t             
   Grid coordinates :
     1 : unstructured             : points=2949120  nvertex=3
                             grid : number=26  position=0
                              uri : http://icon-downloads.mpimet.mpg.de/grids/public/icon_grid_0026_R03B07_G.nc
                             clon : -3.141593 to 3.141593 radian
                             clat : -1.56928 to 1.56928 radian
                        available : cellbounds
                             uuid : a27b8de6-18c4-11e4-820a-b5b098c6a5c0
   Vertical coordinates :
     1 : generalized_height       : levels=1
                           height : 33.5 
                           bounds : 33-34 
                            zaxis : number=4
                             uuid : 60075e14-1b8f-9e93-dc54-b937c25d2f00
   Time coordinate :  1 step
     RefTime =  2020-07-07 06:00:00  Units = minutes  Calendar = proleptic_gregorian
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2020-07-07 06:00:00
cdo(1) setgrid: Processed 1 variable over 1 timestep.
cdo    sinfon: Processed 1 variable over 1 timestep [0.94s 815MB].

when you use remapnn instead of sinfov I am sure it will work
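
For a whole folder of files, a loop along the following lines might do. This is only a sketch: the function name and the _wgs output suffix are invented for the example, it assumes every input file is on the same ICON grid, and it skips zero-byte downloads. The cdo call itself combines remapnn with -setgrid as described above.

```shell
# Regrid every *.grib2 file in the current directory and write the
# results into a separate output folder.
# Assumes all files share the ICON grid given as the first argument.
regrid_icon_grib2() {
    local icon_grid="$1" target_grid="$2" outdir="$3" f
    mkdir -p "$outdir"
    for f in *.grib2; do              # only files ending in .grib2
        [ -e "$f" ] || continue       # pattern matched nothing
        if [ ! -s "$f" ]; then        # zero-byte file, e.g. a failed download
            echo "skipping empty file: $f" >&2
            continue
        fi
        cdo -f grb2 remapnn,"$target_grid" -setgrid,"$icon_grid" \
            "$f" "$outdir/${f%.grib2}_wgs.grb2"
    done
}
```

It would be called from the folder holding the unpacked files, for example: regrid_icon_grib2 icon_grid_0026_R03B07_G.nc target_grid.txt regrid (target_grid.txt stands for your own target grid description file).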

hth
ralf

RE: Regrid multiple netcdf files - Added by Kizje Marif over 4 years ago

I don't get how this answers my questions. It is already possible to do the conversion for one file, where I do use -setgrid to set the grid to the ICON grid file:

export WORKDIR="C:\\Users\\kizje.marif\\Transform"
export ICON_GRID_FILE=${WORKDIR}/icon_grid_0026_R03B07_G.nc
export TARGET_GRID_DESCRIPTION=${WORKDIR}/target_grid_world_0125.txt
export WEIGHTS_FILE=${WORKDIR}/weights_icogl2world_0125.nc

time cdo -f grb2 remapnn,${TARGET_GRID_DESCRIPTION} -setgrid,${ICON_GRID_FILE} ${in_file} ${out_file}

But I made a new folder in Transform where the latest T_2M data files are downloaded and unzipped.

I want to know how I can convert all of these in a more automated way.

RE: Regrid multiple netcdf files - Added by Ralf Mueller over 4 years ago

so automated means to do this for a couple of files?
