Regrid multiple netcdf files
Added by Samantha Andrews over 6 years ago
Hello
I have a few hundred NetCDF files that I need to regrid. I can successfully use the following command in Cygwin to regrid a single NetCDF
cdo remapnn,grid2.txt inputfile.nc outputfile.nc
However, for obvious reasons, I'd like something a little more automated.
I have tried the following
for i in $(ls); do cdo remapnn,grid2.txt $(i) $(i)wgs.nc; done
But received this error:
cdo (Abort): Too few streams specified! Operator remapnn,grid2.txt needs 1 input and 1 output streams.
-bash: i: command not found
-bash: i: command not found
Amongst other things, I also tried
for i in $(ls); do cdo remapnn,grid2.txt $(ls) $(ls)wgs.nc; done
This just gave me a NetCDF named based on the grid2.txt file rather than the input NetCDF files.
If anyone has any suggestions, I would be very grateful.
Replies (10)
RE: Regrid multiple netcdf files - Added by Samantha Andrews over 6 years ago
So of course, in true "I have no idea what I'm doing" style, as soon as I post a question I figure out the answer 40 minutes later...
So for those who might have the same problem, here is the code that works:
for i in $(ls); do cdo remapnn,grid2.txt ${i} regrid/${i}wgs.nc; done
Note the curly braces rather than parentheses: ${i} expands the variable i, whereas $(i) tries to run a command named i, hence the "command not found" errors. The output directory in the output file name prevents cdo from trying to regrid already regridded files.
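For anyone copying the loop above, here is a slightly more defensive variant as a rough sketch. It uses a glob instead of $(ls), so only *.nc files are picked up and file names with spaces survive. The function name regrid_all, the CDO_BIN override (a hypothetical hook for dry runs with echo) and the _wgs.nc naming are my assumptions, not part of the original post.

```shell
# Sketch: regrid every *.nc file in the current directory into an output directory.
# CDO_BIN is a hypothetical override so the loop can be dry-run with "echo".
regrid_all() {
    local grid="$1" outdir="${2:-regrid}" cdo_bin="${CDO_BIN:-cdo}" f
    mkdir -p "$outdir"                      # create the output directory if needed
    for f in *.nc; do
        [ -e "$f" ] || continue             # the glob matched nothing
        # ${f%.nc} strips the extension, so a.nc becomes regrid/a_wgs.nc
        "$cdo_bin" remapnn,"$grid" "$f" "$outdir/${f%.nc}_wgs.nc"
    done
}
```

Calling regrid_all grid2.txt runs the real conversion; CDO_BIN=echo regrid_all grid2.txt only prints the arguments each cdo call would receive.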
RE: Regrid multiple netcdf files - Added by Ralf Mueller over 6 years ago
hi Samantha!
the question of optimization has multiple options. I'll try to come up with what I can guess:
Pre-compute interpolation weights¶
The remap*** operators actually do two things in one step:
- compute interpolation weights
- apply these weights to input data
If your input files split into groups of identical grids, you can generate the weights for each grid once with gennn and then apply them with the remap operator in your loop. This should speed up your loop. The amount of speedup depends on the size of your target and source grids.
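As a rough sketch of the weights idea: compute the nearest-neighbour weights from the first file with gennn, then reuse them with remap for every file. This assumes all *.nc files in the directory share one source grid. The function name, the default weights path and the CDO_BIN dry-run override are illustrative assumptions.

```shell
# Sketch: compute remapnn weights once, then apply them to every input file.
# Assumes all inputs share the same source grid; CDO_BIN is a hypothetical dry-run hook.
regrid_with_weights() {
    local grid="$1" weights="${2:-regrid/nn_weights.nc}" cdo_bin="${CDO_BIN:-cdo}" f first=""
    mkdir -p regrid
    for f in *.nc; do
        [ -e "$f" ] || continue
        if [ -z "$first" ]; then
            first="$f"
            # step 1: generate the interpolation weights from the first file only
            "$cdo_bin" gennn,"$grid" "$f" "$weights"
        fi
        # step 2: apply the precomputed weights
        "$cdo_bin" remap,"$grid","$weights" "$f" "regrid/${f%.nc}_wgs.nc"
    done
}
```

If the inputs fall into several groups with different grids, one weights file per group would be needed.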
OpenMP¶
You can use the -P <num> switch, with <num> set to the number of your logical CPUs. You can find that out using the Task Manager (available with a right-click on your taskbar). This will speed up the computation of the weights, not the application of the weights. If you pick a number higher than the theoretical maximum, CDO will reset it to the maximum.
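From the shell, the CPU count can also be queried directly. A minimal sketch, assuming nproc from GNU coreutils is installed (it should be available under Cygwin as well); cdo_threads is just an illustrative helper name.

```shell
# Sketch: report the number of logical CPUs for use with CDO's -P switch.
# Falls back to 1 if nproc is not available.
cdo_threads() {
    nproc 2>/dev/null || echo 1
}
# Example call (file names are placeholders):
#   cdo -P "$(cdo_threads)" remapnn,grid2.txt inputfile.nc outputfile.nc
```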
Process-based parallelization¶
This technique can be applied to any call on the command line, not only CDO. Instead of running your for loop step by step, you can run chunks of these calls in parallel using GNU parallel. I am pretty sure that there is a Cygwin package for this. What you have to do is create a file with all your commands line by line and run parallel on it with a given number of processes to spawn (-j option):
for i in $(ls); do echo "cdo remapnn,grid2.txt ${i} regrid/${i}wgs.nc"; done | parallel -v -j 4
If you want to combine this technique with the OpenMP switch -P, be careful: it can slow down your system if you take too much. Usually taking a higher number for parallel works well with IO-related calls (like yours).
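The command list for parallel can also be generated by a small function, sketched here. It prints one cdo call per *.nc file with the file names in single quotes, so names containing spaces survive (names containing single quotes would still break). The function name and the _wgs.nc naming are assumptions.

```shell
# Sketch: print one cdo command line per input file, ready to be piped to GNU parallel.
print_regrid_commands() {
    local grid="$1" f
    for f in *.nc; do
        [ -e "$f" ] || continue             # the glob matched nothing
        printf "cdo remapnn,%s '%s' 'regrid/%s_wgs.nc'\n" "$grid" "$f" "${f%.nc}"
    done
}
```

A possible call would be mkdir -p regrid followed by print_regrid_commands grid2.txt | parallel -v -j 4. Running the function on its own first lets you inspect the commands before anything is executed.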
Finally, depending on the number of your input files, their source grid(s), the number of timesteps and your target grid, these steps can be combined for best speedup.
hth
ralf
RE: Regrid multiple netcdf files - Added by Samantha Andrews over 6 years ago
Hi Ralf
Thanks for your reply and explanation - this is very helpful.
RE: Regrid multiple netcdf files - Added by Kizje Marif over 4 years ago
Is this also possible for multiple grb2 files? I am trying to convert the latest ICON data with CDO with the following line;
for i in $(ls); do cdo -f grb2 remapnn,${TARGET_GRID_DESCRIPTION} ${i} regrid/${i}wgs.grb2; done
The error it gives me is that all the files are empty.
cdo remapnn: Open failed on >icon_global_icosahedral_single-level_2020062400_051_T_2M.grib2<
File is empty
cdo remapnn: Open failed on >icon_global_icosahedral_single-level_2020062400_052_T_2M.grib2<
File is empty
cdo remapnn: Open failed on >icon_global_icosahedral_single-level_2020062400_053_T_2M.grib2<
File is empty
cdo remapnn: Open failed on >icon_global_icosahedral_single-level_2020062400_054_T_2M.grib2<
File is empty
So,
- How can I convert multiple grb2 files?
- How can I indicate that I only want the files that end with grib2?
- Is it also possible that the converted files are saved in a new folder?
Cheers,
Kizje
RE: Regrid multiple netcdf files - Added by Kizje Marif over 4 years ago
I redownloaded everything; now it gives the following error:
cdo remapnn (Abort): Reference to source grid not found!
cdo remapnn (Warning): Reference to horizontal grid not available!
RE: Regrid multiple netcdf files - Added by Ralf Mueller over 4 years ago
Could you post a download link? Maybe you have to download the grid files, too.
RE: Regrid multiple netcdf files - Added by Kizje Marif over 4 years ago
What do you mean by download link? I am currently downloading the grib2 files from https://opendata.dwd.de/weather/nwp/icon/grib/
RE: Regrid multiple netcdf files - Added by Ralf Mueller over 4 years ago
I took something like this: https://opendata.dwd.de/weather/nwp/icon/grib/06/t/icon_global_icosahedral_model-level_2020070706_000_31_T.grib2.bz2
When you check the unpacked file with cdo sinfov, you will notice that the coordinates cannot be found. This is because the ICON grid cannot be stored in GRIB2 directly. But there is a reference number stored in the GRIB file:
grib_dump icon_global_icosahedral_model-level_2020070706_000_33_T.grib2 | grep numberOfGrid
numberOfGridUsed = 26;
numberOfGridInReference = 1;
So your input is based on grid number 26. Then you need to download the gridfile from icon-downloads.zmaw.de (http://icon-downloads.mpimet.mpg.de/grids/public/icon_grid_0026_R03B07_G.nc)
After this you can attach these coordinates to the file and do all coordinate-related analysis afterwards:
cdo -sinfov -setgrid,icon_grid_0026_R03B07_G.nc icon_global_icosahedral_model-level_2020070706_000_33_T.grib2
cdo(1) setgrid: Process started
   File format : GRIB2
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : DWD      unknown  v instant       1   1   2949120   1  P16  : t
   Grid coordinates :
     1 : unstructured             : points=2949120  nvertex=3
                             grid : number=26  position=0
                              uri : http://icon-downloads.mpimet.mpg.de/grids/public/icon_grid_0026_R03B07_G.nc
                             clon : -3.141593 to 3.141593 radian
                             clat : -1.56928 to 1.56928 radian
                        available : cellbounds
                             uuid : a27b8de6-18c4-11e4-820a-b5b098c6a5c0
   Vertical coordinates :
     1 : generalized_height       : levels=1
                           height : 33.5
                           bounds : 33-34
                            zaxis : number=4
                             uuid : 60075e14-1b8f-9e93-dc54-b937c25d2f00
   Time coordinate :  1 step
     RefTime =  2020-07-07 06:00:00  Units = minutes  Calendar = proleptic_gregorian
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2020-07-07 06:00:00
cdo(1) setgrid: Processed 1 variable over 1 timestep.
cdo sinfon: Processed 1 variable over 1 timestep [0.94s 815MB].
When you use remapnn instead of sinfov, I am sure it will work.
hth
ralf
RE: Regrid multiple netcdf files - Added by Kizje Marif over 4 years ago
I don't get how this answers my questions. It is already possible to do the conversion for one file, where I indeed use -setgrid to set the grid to the ICON grid file:
export WORKDIR="C:\\Users\\kizje.marif\\Transform"
export ICON_GRID_FILE=${WORKDIR}/icon_grid_0026_R03B07_G.nc
export TARGET_GRID_DESCRIPTION=${WORKDIR}/target_grid_world_0125.txt
export WEIGHTS_FILE=${WORKDIR}/weights_icogl2world_0125.nc
time cdo -f grb2 remapnn,${TARGET_GRID_DESCRIPTION} -setgrid,${ICON_GRID_FILE} ${in_file} ${out_file}
But I made a new folder in Transform where the latest T_2M data files are downloaded and unzipped.
I want to know how I can convert all of these in a more automated way.
RE: Regrid multiple netcdf files - Added by Ralf Mueller over 4 years ago
So "automated" means doing this for a couple of files?
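If that is the goal, one possible sketch is to wrap the single-file command from Kizje's post in a loop over all *.grib2 files of a folder, writing the results into a separate output folder. It relies on TARGET_GRID_DESCRIPTION and ICON_GRID_FILE being set as in that post. The function name, the _wgs.grib2 suffix and the CDO_BIN dry-run override are assumptions. Empty files are skipped, since cdo refuses to open them.

```shell
# Sketch: regrid every *.grib2 file in a directory, attaching the ICON grid first.
# Expects TARGET_GRID_DESCRIPTION and ICON_GRID_FILE to be set in the environment.
# CDO_BIN is a hypothetical override so the loop can be dry-run with "echo".
regrid_icon_dir() {
    local indir="$1" outdir="$2" cdo_bin="${CDO_BIN:-cdo}" f base
    mkdir -p "$outdir"                      # converted files go into a new folder
    for f in "$indir"/*.grib2; do           # only files ending in .grib2
        [ -s "$f" ] || continue             # skip missing or empty files
        base=$(basename "$f" .grib2)
        "$cdo_bin" -f grb2 remapnn,"${TARGET_GRID_DESCRIPTION}" \
            -setgrid,"${ICON_GRID_FILE}" "$f" "$outdir/${base}_wgs.grib2"
    done
}
```

For example, regrid_icon_dir T_2M regrid would convert everything in a folder named T_2M (a placeholder name) into regrid/.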