CDO computes different results on two different machines from same file
Added by Luca Lelli about 2 years ago
Hello forum,
I have a possibly worrisome problem. I have created a netCDF stack with
cdo -b F32 -z zip4 setgrid,modis.des -copy ./MOD08_D3_netcdf/*.nc modis_gome2a_20070123_20211115.nc
The content is thus a time series of lon-lat grids:
netcdf modis_gome2a_20070123_20211115 {
dimensions:
    time = UNLIMITED ; // (2902 currently)
    lon = 241 ;
    lat = 121 ;
variables:
    double time(time) ;
        time:standard_name = "time" ;
        time:units = "days since 2007-01-01 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
        time:axis = "T" ;
    double lon(lon) ;
        lon:standard_name = "longitude" ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
        lon:axis = "X" ;
    double lat(lat) ;
        lat:standard_name = "latitude" ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
        lat:axis = "Y" ;
    float Water_Vapor_Near_Infrared_Clear_Mean(time, lat, lon) ;
        Water_Vapor_Near_Infrared_Clear_Mean:long_name = "Water vapor near infrared - clear column (bright land and ocean sunglint only): Mean" ;
        Water_Vapor_Near_Infrared_Clear_Mean:units = "cm" ;
        Water_Vapor_Near_Infrared_Clear_Mean:_FillValue = -9999.f ;
        Water_Vapor_Near_Infrared_Clear_Mean:missing_value = -9999.f ;
When I issue the following command, I get different results on different machines. The same happens with timmean and other operators, but not with infon.
For instance, on machine 1
cdo outputf,%5.3f,1 -fldmean -seltimestep,1,2,3 modis_gome2a_20070123_20211115.nc
cdo(1) fldmean: Process started
cdo(2) seltimestep: Process started
20.802
14.097
15.187
cdo(2) seltimestep: Processed 87483 values from 1 variable over 4 timesteps
cdo(1) fldmean: Processed 87483 values from 1 variable over 3 timesteps
cdo outputf: Processed 3 values from 1 variable over 3 timesteps [0.17s 40MB]
while on machine 2
cdo(1) fldmean: Process started
cdo(2) seltimestep: Process started
nan
nan
nan
cdo(2) seltimestep: Processed 87483 values from 1 variable over 4 timesteps.
cdo(1) fldmean: Processed 87483 values from 1 variable over 3 timesteps.
cdo outputf: Processed 3 values from 1 variable over 3 timesteps [0.01s 37MB].
Operators like setrtomiss and setvrange make no difference to the results.
I am puzzled: why should CDO compute different results from exactly the same file, or treat NaNs differently on different machines?
On machine 1 I have the following CDO installation
Climate Data Operators version 2.1.0 (https://mpimet.mpg.de/cdo)
System: x86_64-conda-linux-gnu
CDI data types: SizeType=size_t
CDI file types: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5 nczarr
CDI library version : 2.1.0
cgribex library version : 2.0.2
ecCodes library version : 2.27.0
NetCDF library version : 4.8.1 of Aug 13 2022 00:35:58 $
HDF5 library version : 1.12.2 threadsafe
exse library version : 1.4.2
FILE library version : 1.9.1
On machine 2 I have the following CDO installation
Climate Data Operators version 2.0.5 (https://mpimet.mpg.de/cdo)
System: x86_64-conda-linux-gnu
CDI data types: SizeType=size_t DateType=int64_t
CDI file types: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5
CDI library version : 2.0.5
cgribex library version : 2.0.1
ecCodes library version : 2.26.0
NetCDF library version : 4.8.1 of Apr 25 2022 17:43:42 $
hdf5 library version : 1.12.1 threadsafe
exse library version : 1.4.2
FILE library version : 1.9.1
If any good Samaritan wants to take a look, I attach the first three time steps of the stack. For the time being I run my computations on the machine that gives me real results rather than NaNs, but it is not a good feeling when the numerics behave differently across platforms.
Thanks and cheers
Luca
modis-t123.nc (374 KB): First three timesteps of a bigger stack
Replies (3)
RE: CDO computes different results on two different machines from same file - Added by Uwe Schulzweida about 2 years ago
Hello Luca
The missing value in the data is -9999. In addition, some values in the data are NaN. Such undefined values lead to undefined results in calculations. CDO handles NaNs correctly only if NaN is the missing value.
A workaround for this case is to either convert all NaNs to missing values
cdo outputf,%5.3f,1 -fldmean -seltimestep,1,2,3 -setctomiss,nan modis_gome2a_20070123_20211115.nc
or set the missing value to NaN
cdo outputf,%5.3f,1 -fldmean -seltimestep,1,2,3 -setmissval,nan modis_gome2a_20070123_20211115.nc
Cheers, Uwe
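As a rough illustration of the explanation and the two workarounds in this reply, here is a minimal pure-Python sketch. It is not CDO code: the function names and the four sample values are made up, and the two helpers only approximate what setctomiss,nan and setmissval,nan are assumed to do. It shows that a mean which excludes only -9999 is poisoned by a single NaN, while both workarounds give a finite result.

```python
import math

MISSVAL = -9999.0  # _FillValue / missing_value from the file header


def field_mean(values, missval):
    """Mean over all values that are not the missing value."""
    missval_is_nan = math.isnan(missval)
    kept = [
        v for v in values
        # NaN never compares equal to anything, so it needs its own test.
        if not ((missval_is_nan and math.isnan(v)) or v == missval)
    ]
    return sum(kept) / len(kept) if kept else missval


def set_nan_to_miss(values, missval):
    """Assumed analogue of setctomiss,nan: replace each NaN by the missing value."""
    return [missval if math.isnan(v) else v for v in values]


def set_missval_to_nan(values, missval):
    """Assumed analogue of setmissval,nan: recode missing values as NaN."""
    return [math.nan if v == missval else v for v in values], math.nan


field = [20.0, MISSVAL, math.nan, 22.0]  # made-up sample values

raw = field_mean(field, MISSVAL)                              # NaN is summed
fix1 = field_mean(set_nan_to_miss(field, MISSVAL), MISSVAL)   # NaN masked as -9999
recoded, nan_missval = set_missval_to_nan(field, MISSVAL)
fix2 = field_mean(recoded, nan_missval)                       # NaN is the missing value

print(raw, fix1, fix2)
```

In this toy case the unmodified field yields nan, and both workarounds yield the mean of the two valid values.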
RE: CDO computes different results on two different machines from same file - Added by Luca Lelli about 2 years ago
Hello Uwe,
thanks. I confirm that both approaches deliver the correct and expected outcome.
Just out of curiosity, I wonder why nan is interpreted as missing_value on machine 1 and not on machine 2.
Cheers
Luca
RE: CDO computes different results on two different machines from same file - Added by Uwe Schulzweida about 2 years ago
I was just wondering that myself. For performance reasons, we have two branches for calculations: one for data with missing values and one for data without. Since CDO version 2.0.6 there is an additional branch for data with missing values where the missing value is not NaN. This branch seems to give correct results for the present case, but I would see that more as a coincidence.
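To make the idea of separate calculation branches concrete, here is a small hypothetical Python sketch of such a dispatch. It is not taken from the CDO source, and all function names are invented for illustration. It models only the two original branches (with and without missing values), not the branch added in 2.0.6, so it does not explain the result on machine 1.

```python
import math


def mean_no_missing(values):
    """Fast path: no per-value test, valid only if nothing is missing."""
    return sum(values) / len(values)


def mean_with_missing(values, missval):
    """Slower path: every value is compared against the missing value."""
    total, count = 0.0, 0
    for v in values:
        if v != missval:  # a NaN passes this test and enters the sum
            total += v
            count += 1
    return total / count if count else missval


def field_mean(values, missval):
    """Hypothetical dispatch: choose a branch from the missing-value count."""
    n_miss = sum(1 for v in values if v == missval)
    if n_miss == 0:
        return mean_no_missing(values)
    return mean_with_missing(values, missval)


clean = field_mean([1.0, 2.0, 3.0], -9999.0)        # fast path
masked = field_mean([1.0, -9999.0, 3.0], -9999.0)   # masked path
with_nan = field_mean([1.0, -9999.0, math.nan], -9999.0)

print(clean, masked, with_nan)
```

In this sketch a NaN that is not the missing value ends up in the sum in either branch, which matches the nan output reported for machine 2.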