Merge ERA5-Land and ERA5 data

Added by Nikos Alexandris about 2 years ago

ERA5-Land data lack some pixels along the coastlines because the fractional land coverage map is applied to them (see https://confluence.ecmwf.int/pages/viewpage.action?pageId=151519725; https://confluence.ecmwf.int/display/CKB/ERA5-Land%3A+data+documentation; download link: https://confluence.ecmwf.int/download/attachments/140385202/lsm_1279l4_0.1x0.1.grb_v4_unpack.nc?version=1&modificationDate=1591979822208&api=v2).

To fill in the cells along the land-sea transition (essentially, cells that are not sea but are NULL in the ERA5-Land data), I set up the following workflow using GDAL (see: https://gis.stackexchange.com/a/426391/5256):

1. apply scale and offset factors and extract individual ERA5 and ERA5-Land maps using `gdal_translate`
2. merge corresponding maps into one netCDF file by copying ERA5-Land over ERA5 data while resampling to the spatial resolution of the ERA5-Land data, using `gdal_merge`
3. concatenate all maps of interest in one netCDF, while adding a time dimension using `ncecat`
4. re-create the time variable in the time dimension using `ncap2`
5. rename the variable (note: GDAL names maps with `Band1`) using `ncrename`
6. update the long_name attribute using `ncatted`
7. update the units attribute, again using `ncatted`

This works, but steps 4 and 5 seem to be very slow. Would it be faster or better to do this workflow entirely with CDO/NCO tools?
I suspect that adding the timestamp to each map separately, at or after step 2, would be much faster. How can I assign a timestamp to a single map without having to duplicate files? (`cdo` seems to require both an input file and an output file.)

Any other recommendations for speeding this up?

Thank you, Nikos


Replies (3)

RE: Merge ERA5-Land and ERA5 data - Added by Estanislao Gavilan about 2 years ago

Hi Nikos,
What about making a bash script with a loop? It would be a single line.
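
A loop of the kind suggested here could be sketched as follows. This is only a dry run that prints one `cdo` call per map instead of executing it. The file naming scheme `merged_YYYYMMDD_HH.nc` and the `_time.nc` output suffix are assumptions made for illustration, not something from this thread. `settaxis` and `setreftime` are existing CDO operators, but whether `cdo` adds a time axis to a GDAL-written file that has none should be checked on a single file first.

```shell
#!/usr/bin/env bash
# Hypothetical naming scheme (not from the thread): merged_YYYYMMDD_HH.nc

# Print the cdo call that would stamp one single-map file with its own time.
stamp_command() {
    local in="$1"
    local base="${in##*/}"        # strip any directory part
    base="${base%.nc}"            # strip the extension
    local ymd="${base#merged_}"   # leaves YYYYMMDD_HH
    local hh="${ymd#*_}"          # HH
    ymd="${ymd%_*}"               # YYYYMMDD
    local date="${ymd:0:4}-${ymd:4:2}-${ymd:6:2}"
    local out="${in%.nc}_time.nc" # cdo needs a separate output file
    printf 'cdo -setreftime,1900-01-01,00:00:00,hours -settaxis,%s,%s:00:00 %s %s\n' \
        "$date" "$hh" "$in" "$out"
}

# Dry run over all matching files; pipe the output to `bash` to execute it.
for f in merged_*.nc; do
    [ -e "$f" ] || continue       # skip when the glob matches nothing
    stamp_command "$f"
done
```

Printing the commands first makes it easy to inspect the derived dates before any file is written.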

RE: Merge ERA5-Land and ERA5 data - Added by Nikos Alexandris about 2 years ago

Dear Estanislao,

Thank you for your reply. Previously, the workflow concatenated many netCDF files into one with `ncecat`, adding the time dimension in the same step, and then filled in the "hours since" timestamps with `ncap2`. This is slow for yearly stacks, which consist of either 8760 or 8784 maps.

I have updated my workflow and now do the following for the time-stamping part, for each map separately:

# variables used below are defined in the script at run time
ncecat -u time "$IN" "$IN_TIME"  # add time as a record dimension
# set the time value to $HOURS and declare its units
ncap2 \
    -h \
    -s "time[time]=array($HOURS,1,\$time)" \
    -s 'time@units="hours since 1900-01-01 00:00:00.0"' \
    -O "$IN_TIME" "$OUT"
# GDAL names the data variable Band1; rename it
ncrename -v "Band1,${VARIABLE}" "$OUT"
# quote the long name so that embedded spaces survive
ncatted -a long_name,"${VARIABLE}",o,c,"$VARIABLE_LONG_NAME" "$OUT"
ncatted -a units,"${VARIABLE}",o,c,K "$OUT"
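
The post does not show how `$HOURS` is obtained. One possible way, sketched here under the assumption that GNU `date` is available (BSD/macOS `date` uses different options), is to take the difference between the map's UTC timestamp and the reference time 1900-01-01 00:00:00. The function name `hours_since_1900` is made up for this example.

```shell
#!/usr/bin/env bash
# Assumes GNU date (the -d option); this is not part of the original script.

# Print the whole hours elapsed between 1900-01-01 00:00:00 UTC
# and the given UTC timestamp.
hours_since_1900() {
    local stamp="$1"
    local ref now
    ref=$(date -u -d '1900-01-01 00:00:00' +%s)   # negative epoch seconds
    now=$(date -u -d "$stamp" +%s)
    echo $(( (now - ref) / 3600 ))
}

HOURS=$(hours_since_1900 '2020-01-01 00:00:00')
echo "$HOURS"   # prints 1051896
```

The result matches the `hours since 1900-01-01 00:00:00.0` units that the `ncap2` call writes to the time variable.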

I haven't figured out a more elegant way to do this.

Kind regards

RE: Merge ERA5-Land and ERA5 data - Added by Estanislao Gavilan about 2 years ago

Hi Nikos,
There are things you can try, such as compressing the netCDF files (I am not sure whether cdo can do it). Or have you considered not merging the files? If you know the time step, you can write a loop over the single files: select, remap and save each one to its own file, named by year, month and day, or add another variable in which the time is recorded. Finally, the last line of the script could concatenate them.
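
On the two points raised here: CDO can write compressed netCDF4 output with its `-f nc4 -z zip_N` options, and a final concatenation step could look like the dry-run sketch below. The file names are hypothetical, and `cat` is used on the assumption that the date-named files are already passed in chronological order (a shell glob over zero-padded dates sorts them that way).

```shell
#!/usr/bin/env bash
# Hypothetical inputs: time-stamped single-map files, already in time order.

# Print a cdo call that concatenates the given files into one
# compressed netCDF4 file. First argument: output file; rest: inputs.
concat_command() {
    local out="$1"
    shift
    printf 'cdo -f nc4 -z zip_4 cat'
    printf ' %s' "$@"             # one " file" per input argument
    printf ' %s\n' "$out"
}

concat_command era5_merged_2020.nc \
    merged_20200101_00_time.nc merged_20200101_01_time.nc
```

Note that `cdo cat` appends to the output file if it already exists, so the output should be removed before a rerun.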

Best regards,

Estanislao
