Optimizing remapping of batch files from/to the same grid.

Added by Andre D. L. Zanchetta over 3 years ago

Hi all,

I have nearly 2000 large NetCDF files (if0001.nc, if0002.nc, ...) on a rotated polar grid (all of them on the same grid) that I would like to remap onto a WGS-84 grid.

I've set up a NetCDF file with my desired WGS-84 grid (target.nc) and I am currently doing my remapping with:

$ cdo remapbil,target.nc if0001.nc of0001.nc

The output files I am getting (of0001.nc, of0002.nc, ...) are exactly as I wanted. Clean and neat! Thank you CDO for existing! :)

However, my current script runs this command on all my input files using 8 parallel processes. My machine is not overloaded (about 66% CPU usage and 20% memory usage), yet each file takes about 50 minutes to remap. At this rate, it may take almost 10 continuous days to get all my files done.
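For reference, the estimate above can be reproduced with plain shell arithmetic. This is only a back-of-the-envelope sketch: the function name estimate_runtime is made up for illustration, and it assumes that every file takes the same time and that all 8 processes stay busy the whole way through.

```shell
# estimate_runtime FILES JOBS MINUTES_PER_FILE
# Prints the wall-clock time as "<days>d <hours>h", assuming JOBS files
# are processed at a time and each one takes MINUTES_PER_FILE minutes.
estimate_runtime() {
    files=$1; jobs=$2; minutes=$3
    batches=$(( (files + jobs - 1) / jobs ))   # round up to whole batches
    total=$(( batches * minutes ))             # total minutes of wall time
    echo "$(( total / 1440 ))d $(( (total % 1440) / 60 ))h"
}

estimate_runtime 2000 8 50   # prints "8d 16h"
```

With roughly 2000 files this comes out a bit under 9 days, in line with the rough figure above.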

It takes the same time to remap a file regardless of whether it has 4 or 24 layers, which makes me think that the most expensive part is defining the remap function f(input_grid, target_grid), while applying f() to each layer is fairly cheap.

The way I am doing it now, f(input_grid, target_grid) is recalculated for every single file. As all input files share the same grid and are regridded to the same target grid, I suppose the process could be optimized if f() were calculated once and then reused.

What is your opinion?
Is it possible to do something like that?
Suggested alternatives for batch remapping?


Replies (2)

RE: Optimizing remapping of batch files from/to the same grid. - Added by Robert Wilson over 3 years ago

Hi Andre

I think what you need to do is pre-generate the weights first using genbil.

So something like

cdo -genbil,target.nc infile1.nc weights_nc

to generate your weights. And then something like

cdo -remap,target.nc,weights_nc infile.nc outfile.nc

for the rest of the files. You might also want to look into parallelizing it within CDO using the "-P" option, which takes the number of threads (e.g. "-P 4").

Robert
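The two steps above can be sketched as a small batch loop. This is an illustration under assumptions, not a tested recipe: the function name remap_batch and the CDO and WEIGHTS variables are hypothetical, the file naming (if0001.nc to of0001.nc) follows the original question, and the input files are assumed to sit in the current directory and to share one grid.

```shell
# Sketch: compute the bilinear weights once, then reuse them for every file.
CDO="${CDO:-cdo}"                  # command to run; override for a dry run
WEIGHTS="${WEIGHTS:-weights_nc}"   # weights file, named as in the reply above

# remap_batch TARGET_GRID INFILE...
remap_batch() {
    target=$1; shift
    # Step 1: generate the weights from the first input file only.
    # An existing weights file is reused, so delete it if the grids change.
    [ -f "$WEIGHTS" ] || "$CDO" genbil,"$target" "$1" "$WEIGHTS" || return 1
    # Step 2: apply the stored weights to each input file.
    for infile in "$@"; do
        outfile="of${infile#if}"   # if0001.nc -> of0001.nc
        "$CDO" remap,"$target","$WEIGHTS" "$infile" "$outfile" || return 1
    done
}
```

It would be called as "remap_batch target.nc if*.nc". The loop is sequential on purpose; it can be combined with whatever mechanism already runs the 8 parallel processes, as long as the weights file is generated before those processes start.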

RE: Optimizing remapping of batch files from/to the same grid. - Added by Andre D. L. Zanchetta over 3 years ago

Hi Robert,

Thank you very much for your suggestions. It is a pretty nice solution and it is exactly what I was looking for.

I changed my code to replace my previous approach with your solution using the pre-calculated weights.
I got the same outputs, which is good, but unfortunately the processing time barely changed.
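To see where the time actually goes, each step could be timed on its own. The helper below is a minimal sketch (elapsed is a made-up name, and "date +%s" gives only whole-second resolution, which is enough for runs that take minutes).

```shell
# elapsed CMD [ARGS...]
# Runs the command, prints its wall-clock time in seconds on stderr,
# and returns the command's own exit status.
elapsed() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "$(( end - start )) s: $*" >&2
    return "$status"
}
```

For example, "elapsed cdo genbil,target.nc if0001.nc weights_nc" followed by "elapsed cdo remap,target.nc,weights_nc if0001.nc of0001.nc". If the second command still takes close to 50 minutes, the weight generation was probably not the bottleneck, and reading or writing the large files may be worth a look instead.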

When I tried with the argument "-P", strangely my code got slower. I guess I need to set up my MPI properly before going that way.
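One possible explanation, offered as a guess rather than a diagnosis: the "-P" option sets the number of OpenMP threads inside a single CDO process, so MPI should not be involved, and combining it with 8 parallel processes can request more threads than the machine has cores. A minimal sketch for picking a thread count per process (threads_per_job is a hypothetical helper, not part of CDO):

```shell
# threads_per_job CORES JOBS
# Prints how many threads each job can use so that JOBS * threads
# does not exceed CORES (never less than 1).
threads_per_job() {
    cores=$1; jobs=$2
    n=$(( cores / jobs ))
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}
```

On a 16-core machine with 4 processes this gives 4, which would be passed as "cdo -P 4 remap,target.nc,weights_nc infile.nc outfile.nc". With 8 processes on 8 cores it gives 1, meaning "-P" has nothing left to use.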
