Why is it so slow to process ERA5 data with cdo after ERA5 update?
Added by wu yy about 2 months ago
Hi!
I recently downloaded a few ERA5 datasets and had no problem viewing them with commands such as cdo infos.
However, data processing commands such as daymean, shifttime, etc. print: "Warning (cdfInqContents): Coordinates variable number can't be assigned! Warning (cdfInqContents): Coordinates variable expver can't be assigned!".
Although the warnings do not stop the process, it is very slow and can take several hours, which I have never experienced with ERA5 data before. Is there a solution to this problem?
Attached are the two ERA5 files I downloaded, without any processing.
Best wishes,
Rebecca
Replies (9)
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by Uwe Schulzweida about 2 months ago
Hi Rebecca,
At least CDO version 2.4.1 is required to read/decompress this data much faster.
Cheers,
Uwe
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by wu yy about 2 months ago
Thanks!
I found another way to solve this problem
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by Jiawei Bao about 2 months ago
Hi Uwe,
I used the latest cdo/2.4.3-gcc-11.2.0 on Levante, and it is still extremely slow. It took almost 8 hours to remap 1 year of hourly data (2D surface temperature) from the original grid (0.25 deg) to 1 deg. Before the ERA5 update, it took around 20 minutes.
Below is the command that I used:
cdo remapbil,grid_target.txt era5_tropical_sp_2020.nc era5_tropical_sp_1x1_2020.nc
Is there any way to solve this issue? Thanks in advance.
Best,
Jiawei
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by Uwe Schulzweida about 2 months ago
Hi Jiawei,
Could you please send a link to the datafile?
Cheers,
Uwe
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by Jiawei Bao about 2 months ago
Hi Uwe,
Thanks for the fast reply.
The data is on Levante: /work/mh0066/m300752/OBS/ERA5/hourly/era5_tropical_Td_2020.nc
Cheers,
Jiawei
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by Uwe Schulzweida about 2 months ago
Hi Jiawei,
The variable CDI_CHUNK_CACHE=1gb is set in the module environment of cdo/2.4.3-gcc-11.2.0. Unfortunately, this value is not sufficient for this ERA5 data.
If you use "unset CDI_CHUNK_CACHE" before the cdo call, it should run much faster:
unset CDI_CHUNK_CACHE
cdo remapbil,global_1 /work/mh0066/m300752/OBS/ERA5/hourly/era5_tropical_Td_2020.nc result
cdo remapbil: Processed 3048399360 values from 1 variable over 8784 timesteps [51.60s 1567MB]
Cheers,
Uwe
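A note on the unset workaround: a plain unset removes the variable from the current shell for the rest of the session. The following is a minimal sketch, assuming a POSIX shell, that removes CDI_CHUNK_CACHE for a single command only and leaves the module's setting in place otherwise. The function name run_without_chunk_cache is a hypothetical helper for illustration, not part of CDO.

```shell
# Hypothetical helper (not part of CDO): run one command with
# CDI_CHUNK_CACHE removed from its environment. The unset happens in a
# subshell, so the calling shell keeps whatever value the module set.
run_without_chunk_cache() {
    ( unset CDI_CHUNK_CACHE; "$@" )
}

# Show what the current shell has (prints "not set" if the variable is absent).
echo "CDI_CHUNK_CACHE in this shell: ${CDI_CHUNK_CACHE:-not set}"

# Example call, using the command from the post above:
# run_without_chunk_cache cdo remapbil,global_1 /work/mh0066/m300752/OBS/ERA5/hourly/era5_tropical_Td_2020.nc result
```

The subshell keeps the change local, which may be convenient in job scripts where other steps still rely on the module default.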
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by Jiawei Bao about 2 months ago
Hi Uwe,
It's working and much faster now. Thanks a lot!
Cheers,
Jiawei
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by Fernand Mouassom 12 days ago
Hi Uwe,
I downloaded hourly ERA5 data containing specific humidity and the two horizontal wind components, for August and September 2022, on 13 pressure levels between 1000 and 300 hPa. Running cdo daymean on it takes more than 10 hours, and I am planning to work on 40 years of data.
I followed the instructions above and installed CDO 2.4.3, but the problem persists.
Any help or pointers that could help me solve this problem and move forward would be most welcome.
Best,
Fernand
RE: Why is it so slow to process ERA5 data with cdo after ERA5 update? - Added by Uwe Schulzweida 3 days ago
Dear Fernand,
For some NetCDF4 files, the chunk cache size is not calculated correctly in CDO. We will fix this problem in the next CDO version 2.5.1.
A workaround is to set the chunk cache size with the environment variable CDI_CHUNK_CACHE to a large value. However, this value should be smaller than the size of the main memory. Here is an example with 8GB:
CDI_CHUNK_CACHE=8gb cdo ... infile outfile
Best,
Uwe
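To illustrate the advice that the value should stay below the size of main memory, here is a rough sketch that derives a candidate CDI_CHUNK_CACHE value from the machine's total memory. Everything in it is an assumption for illustration: the helper name suggest_chunk_cache is hypothetical, the choice of half the physical memory is an arbitrary safety margin and not a CDO recommendation, and reading /proc/meminfo only works on Linux. Only the "gb" suffix seen in this thread is used.

```shell
# Hypothetical helper (not part of CDO): suggest a chunk cache size of
# half the physical memory, in whole gigabytes, with a floor of 1gb.
# Argument: total memory in kB, as reported by MemTotal in /proc/meminfo.
# The factor of one half is an assumption, not an official recommendation.
suggest_chunk_cache() {
    total_kb=$1
    half_gb=$(( total_kb / 1024 / 1024 / 2 ))
    if [ "$half_gb" -lt 1 ]; then
        half_gb=1
    fi
    echo "${half_gb}gb"
}

# Example use on Linux (infile and outfile are placeholders):
# mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
# CDI_CHUNK_CACHE=$(suggest_chunk_cache "$mem_kb") cdo daymean infile outfile
```

On a shared node it may be safer to base the value on the memory actually allocated to the job rather than on the node total.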