Need explanation GRB2 -> NC1 -> GRB2

Added by Romain LE LAMER over 3 years ago

Hi,
I have almost completely finished my script (Python).
I tested the output GRIB in my routing software.
Result: not readable. I assumed I had made a mistake somewhere, so I started looking for the error(s).

I tried something simple ...
A grb2 downloaded from NOAA -> Converted to nc1 without changing anything -> Converted back to grb2.

## OFFICIAL GRIB NOAA : (159 KB)
$ cdo sinfon GFS1_20200817_12_006.grb2 
   File format : GRIB2
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : NCEP     unknown  v instant       1   1     65160   1  P13  : 10u           
     2 : NCEP     unknown  v instant       1   1     65160   1  P14  : 10v           
   Grid coordinates :
     1 : lonlat                   : points=65160 (360x181)
                              lon : 0 to 359 by 1 degrees_east  circular
                              lat : 90 to -90 by -1 degrees_north
   Vertical coordinates :
     1 : height                   : levels=1
                           height : 10 m
   Time coordinate :  1 step
     RefTime =  2020-08-17 12:00:00  Units = hours  Calendar = proleptic_gregorian
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2020-08-17 18:00:00
cdo    sinfon: Processed 2 variables over 1 timestep [0.06s 20MB].
## CONVERT GRB2 to NC1 :
$  cdo -f nc1 copy GFS1_20200817_12_006.grb2 GFS1_20200817_12_006.nc1
## NC1 : (527 KB)
$ cdo sinfon GFS1_20200817_12_006.nc1 
   File format : NetCDF
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : NCEP     unknown  v instant       1   1     65160   1  F32  : 10u           
     2 : NCEP     unknown  v instant       1   1     65160   1  F32  : 10v           
   Grid coordinates :
     1 : lonlat                   : points=65160 (360x181)
                              lon : 0 to 359 by 1 degrees_east  circular
                              lat : 90 to -90 by -1 degrees_north
   Vertical coordinates :
     1 : height                   : levels=1
                           height : 10 m
   Time coordinate :  1 step
     RefTime =  2020-08-17 12:00:00  Units = hours  Calendar = proleptic_gregorian
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2020-08-17 18:00:00
cdo    sinfon: Processed 2 variables over 1 timestep [0.00s 7556KB].
## CONVERT NC1 TO GRB2 :
$ cdo -f grb2 copy GFS1_20200817_12_006.nc1 GFS1_20200817_12_006_2.grb2
## 2nd GRB2 : (522 KB)
$ cdo sinfon GFS1_20200817_12_006_2.grb2 
   File format : GRIB2
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : NCEP     unknown  v instant       1   1     65160   1  F32  : 10u           
     2 : NCEP     unknown  v instant       1   1     65160   1  F32  : 10v           
   Grid coordinates :
     1 : lonlat                   : points=65160 (360x181)
                              lon : 0 to 359 by 1 degrees_east  circular
                              lat : 90 to -90 by -1 degrees_north
   Vertical coordinates :
     1 : height                   : levels=1
                           height : 10 m
   Time coordinate :  1 step
     RefTime =  2020-08-17 18:00:00  Units = hours  Calendar = proleptic_gregorian
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2020-08-17 18:00:00
cdo    sinfon: Processed 2 variables over 1 timestep [0.11s 20MB].

The first GRB2 is readable under my routing software, the second is not ...

I have the ecCodes tools installed and if I look with grib_compare this is what I get

$ grib_compare GFS1_20200817_12_006.grb2 GFS1_20200817_12_006_2.grb2 

-- GRIB #1 -- shortName=10u paramId=165 stepRange=6 levelType=sfc level=10 packingType=grid_complex_spatial_differencing gridType=regular_ll --
long [totalLength]: [79336] != [260810]
long [tablesVersion]: [2] != [4]
long [localTablesVersion]: [1] != [0]
long [hour]: [12] != [18]
string [typeOfProcessedData]: [fc] != [af]
long [shapeOfTheEarth]: [6] != [0]
scaleFactorOfRadiusOfSphericalEarth is set to missing in 2nd field is not missing in 1st field
scaledValueOfRadiusOfSphericalEarth is set to missing in 2nd field is not missing in 1st field
scaleFactorOfEarthMajorAxis is set to missing in 2nd field is not missing in 1st field
scaledValueOfEarthMajorAxis is set to missing in 2nd field is not missing in 1st field
scaleFactorOfEarthMinorAxis is set to missing in 2nd field is not missing in 1st field
scaledValueOfEarthMinorAxis is set to missing in 2nd field is not missing in 1st field
long [typeOfGeneratingProcess]: [2] != [0]
long [generatingProcessIdentifier]: [96] != [128]
long [forecastTime]: [6] != [0]
scaleFactorOfSecondFixedSurface is set to missing in 2nd field is not missing in 1st field
scaledValueOfSecondFixedSurface is set to missing in 2nd field is not missing in 1st field
long [section5Length]: [49] != [12]
long [dataRepresentationTemplateNumber]: [3] != [4]
double [referenceValue]: [-3.65532153320312500000e+03] != [0.00000000000000000000e+00]
    absolute diff. = 3655.32, relative diff. = 3655.32
    tolerance=0.000244141
long [decimalScaleFactor]: [2] != [0]
long [bitsPerValue]: [13] != [0]
[typeOfOriginalFieldValues] not found in 2nd field
[groupSplittingMethodUsed] not found in 2nd field
[missingValueManagementUsed] not found in 2nd field
[primaryMissingValueSubstitute] not found in 2nd field
[secondaryMissingValueSubstitute] not found in 2nd field
[numberOfGroupsOfDataValues] not found in 2nd field
[referenceForGroupWidths] not found in 2nd field
[numberOfBitsUsedForTheGroupWidths] not found in 2nd field
[referenceForGroupLengths] not found in 2nd field
[lengthIncrementForTheGroupLengths] not found in 2nd field
[trueLengthOfLastGroup] not found in 2nd field
[numberOfBitsForScaledGroupLengths] not found in 2nd field
[orderOfSpatialDifferencing] not found in 2nd field
[numberOfOctetsExtraDescriptors] not found in 2nd field
long [section7Length]: [79134] != [260645]
double [codedValues]: 62449 out of 65160 different
 max absolute diff. = 9.1552734460265128e-07, relative diff. = 5.64207e-08
    max diff. element 40253: 1.62267846679687508527e+01 1.62267837524414062500e+01
    tolerance=0.0000000000000000e+00
    values max= [26.2068]  [26.2068]         min= [-36.5532] [-36.5532]

What am I doing wrong, or what am I forgetting? :(
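For readers who want to isolate the header differences from a long grib_compare listing, here is a minimal Python sketch. It only handles lines of the form `type [key]: [a] != [b]` as they appear in the output above; the names `DIFF_LINE` and `parse_grib_compare` are hypothetical helpers, not part of ecCodes.

```python
import re

# Matches lines such as: long [hour]: [12] != [18]
DIFF_LINE = re.compile(r"^\s*(\w+) \[(\w+)\]: \[(.*?)\] != \[(.*?)\]\s*$")


def parse_grib_compare(text):
    """Return {key: (value_in_file1, value_in_file2)} for 'a != b' lines.

    Other line types ("not found in 2nd field", codedValues statistics)
    are ignored by this sketch.
    """
    diffs = {}
    for line in text.splitlines():
        match = DIFF_LINE.match(line)
        if match:
            _type, key, first, second = match.groups()
            diffs[key] = (first, second)
    return diffs


# A few lines copied from the grib_compare output in this thread.
sample = """\
long [hour]: [12] != [18]
string [typeOfProcessedData]: [fc] != [af]
long [forecastTime]: [6] != [0]
double [codedValues]: 62449 out of 65160 different
"""

diffs = parse_grib_compare(sample)
for key, (noaa, cdo) in diffs.items():
    print(f"{key}: NOAA={noaa} CDO={cdo}")
```

Separating metadata keys from the `codedValues` statistics makes it easier to see that the values are nearly identical while several header keys are not.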


Replies (12)

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Karin Meier-Fleischer over 3 years ago

Hi Romain,

why do you use nc1? Try to use nc2 or nc4 for GRIB2.

-Karin

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Romain LE LAMER over 3 years ago

Hi Karin,
Simply because I read this

<quote>
The classic format was the only format for netCDF data created between 1989 and 2004 by the reference software from Unidata. It is still the default format for new netCDF data files, and the form in which most netCDF data is stored. This format is also referred as CDF-1 format.

In 2004, the 64-bit offset format variant was added. Nearly identical to netCDF classic format, it allows users to create and access far larger datasets than were possible with the original format. (A 64-bit platform is not required to write or read 64-bit offset netCDF files.) This format is also referred as CDF-2 format.

In 2008, the netCDF-4 format was added to support per-variable compression, multiple unlimited dimensions, more complex data types, and better performance, by layering an enhanced netCDF access interface on top of the HDF5 format.

At the same time, a fourth format variant, netCDF-4 classic model format, was added for users who needed the performance benefits of the new format (such as compression) without the complexity of a new programming interface or enhanced data model.

In 2016, the 64-bit data format variant was added. To support large variables with more than 4-billion array elements, it replaces most of the 32-bit integers used in the format specification with 64-bit integers. It also adds support for several new data types including unsigned byte, unsigned short, unsigned int, signed 64-bit int and unsigned 64-bit int. A 64-bit platform is required to write or read 64-bit data netCDF files. This format is also referred as CDF-5 format.

With each additional format variant, the C-based reference software from Unidata has continued to support access to data stored in previous formats transparently, and to also support programs written using previous programming interfaces.

Although strictly speaking, there is no single "netCDF-3 format", that phrase is sometimes used instead of the more cumbersome but correct "netCDF classic CDF-1, 64-bit offset CDF-2, or 64-bit data CDF-5 format" to describe files created by the netCDF-3 (or netCDF-1 or netCDF-2) libraries. Similarly "netCDF-4 format" is sometimes used informally to mean "either the general netCDF-4 format or the restricted netCDF-4 classic model format". We will use these shorter phrases in FAQs below when no confusion is likely.
</quote>

So I went with the default format ...

I will run the test with the nc2 and nc4 formats.
Thanks

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Romain LE LAMER over 3 years ago

I repeated the same procedure with the nc2 and nc4 formats.
The result is the same as above and grib_compare produces exactly the same output for the differences.

cdo -f grb2 copy infile.nc (1, 2 or 4) outfile.grb2

does produce a grib (no error) but this file cannot be used ...

grib_compare produces:

-- GRIB #1 -- shortName=10u paramId=165 stepRange=6 levelType=sfc level=10 packingType=grid_complex_spatial_differencing gridType=regular_ll --
long [totalLength]: [79336] != [260810]
long [tablesVersion]: [2] != [4]
long [localTablesVersion]: [1] != [0]
long [hour]: [12] != [18]
string [typeOfProcessedData]: [fc] != [af]
long [shapeOfTheEarth]: [6] != [0]
scaleFactorOfRadiusOfSphericalEarth is set to missing in 2nd field is not missing in 1st field
scaledValueOfRadiusOfSphericalEarth is set to missing in 2nd field is not missing in 1st field
scaleFactorOfEarthMajorAxis is set to missing in 2nd field is not missing in 1st field
scaledValueOfEarthMajorAxis is set to missing in 2nd field is not missing in 1st field
scaleFactorOfEarthMinorAxis is set to missing in 2nd field is not missing in 1st field
scaledValueOfEarthMinorAxis is set to missing in 2nd field is not missing in 1st field
long [typeOfGeneratingProcess]: [2] != [0]
long [generatingProcessIdentifier]: [96] != [128]
long [forecastTime]: [6] != [0]
scaleFactorOfSecondFixedSurface is set to missing in 2nd field is not missing in 1st field
scaledValueOfSecondFixedSurface is set to missing in 2nd field is not missing in 1st field
long [section5Length]: [49] != [12]
long [dataRepresentationTemplateNumber]: [3] != [4]
double [referenceValue]: [-3.65532153320312500000e+03] != [0.00000000000000000000e+00]
    absolute diff. = 3655.32, relative diff. = 3655.32
    tolerance=0.000244141
long [decimalScaleFactor]: [2] != [0]
long [bitsPerValue]: [13] != [0]
[typeOfOriginalFieldValues] not found in 2nd field
[groupSplittingMethodUsed] not found in 2nd field
[missingValueManagementUsed] not found in 2nd field
[primaryMissingValueSubstitute] not found in 2nd field
[secondaryMissingValueSubstitute] not found in 2nd field
[numberOfGroupsOfDataValues] not found in 2nd field
[referenceForGroupWidths] not found in 2nd field
[numberOfBitsUsedForTheGroupWidths] not found in 2nd field
[referenceForGroupLengths] not found in 2nd field
[lengthIncrementForTheGroupLengths] not found in 2nd field
[trueLengthOfLastGroup] not found in 2nd field
[numberOfBitsForScaledGroupLengths] not found in 2nd field
[orderOfSpatialDifferencing] not found in 2nd field
[numberOfOctetsExtraDescriptors] not found in 2nd field
long [section7Length]: [79134] != [260645]
double [codedValues]: 62449 out of 65160 different
 max absolute diff. = 9.1552734460265128e-07, relative diff. = 5.64207e-08
    max diff. element 40253: 1.62267846679687508527e+01 1.62267837524414062500e+01
    tolerance=0.0000000000000000e+00
    values max= [26.2068]  [26.2068]         min= [-36.5532] [-36.5532]

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Karin Meier-Fleischer over 3 years ago

Can you upload the file?

I think it is a precision problem when converting the file from grib2 to nc and back.
You can try to use the '-b F32' option, but it's only a guess and would not explain why
there are some fields set to missing.
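The precision guess can be checked against the numbers grib_compare printed. The sketch below is a plain-Python illustration, not CDO or ecCodes code: it applies the GRIB2 unpacking relation Y = (R + X * 2^E) / 10^D with the NOAA file's `referenceValue` and `decimalScaleFactor`, then rounds the result to a 32-bit float. The binary scale factor `E = 0` and the packed integer `X = 5278` are assumptions chosen to reproduce element 40253; neither appears in the thread.

```python
import struct


def float32_roundtrip(value):
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def decode_packed(reference_value, packed_int, binary_scale, decimal_scale):
    """GRIB2 unpacking relation: Y = (R + X * 2**E) / 10**D."""
    scaled = reference_value + packed_int * 2.0 ** binary_scale
    return scaled / 10.0 ** decimal_scale


# From grib_compare: referenceValue and decimalScaleFactor of the NOAA field.
R = -3655.321533203125
D = 2
E = 0      # assumption: binaryScaleFactor is not shown in the thread
X = 5278   # assumption: packed integer chosen to match element 40253

original = decode_packed(R, X, E, D)      # value as decoded from the NOAA file
restored = float32_roundtrip(original)    # value after being stored as F32
print(original, restored, original - restored)

# Section 7 of the CDO file: 5 header octets plus 4 bytes per F32 value.
print(5 + 4 * 65160)
```

Under these assumptions the two printed values correspond to the pair reported for element 40253, and the last line matches the `section7Length` of the second file. That would account for both the tiny `codedValues` differences and the larger file size, but not for the changed header keys.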

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Romain LE LAMER over 3 years ago

Hi Karin,
I am sure that -b F32 is not the cause, because it is already present in my script.

I used grib_dump and wgrib2 to find the beginning of an answer (it may be possible with cdo, but I don't know how to do it).

grib_dump infile_noaa.grb2 > noaa.txt

vs
grib_dump infile_cdo.grb2 > cdo.txt

The two dumps can be compared with a text diff tool (I am using Beyond Compare).

We can see several differences, and I think the problem lies there.
What stands out to me the most:
• dataTime 12 for noaa / 18 for cdo
• radius - 6,371,229.0 m noaa / 6,367,470.0 m cdo (a difference of 3759 m)
• shapeOfTheEarth 6 noaa / 0 cdo
• Forecast noaa / Analysis cdo
• forecastTime 6 noaa / 0 cdo
• stepRange 6 noaa / 0 cdo

After some reading on the ECMWF site, it seems that all this is related to the Tables, which indeed are also different in the noaa vs cdo files

If I understand correctly, the noaa file is of type forecast but the cdo output file is of type analysis; wgrib2 seems to confirm it:

$ wgrib2 GFS1_20200817_12_006.grb2 
1:0:d=2020081712:UGRD:10 m above ground:6 hour fcst:
2:79336:d=2020081712:VGRD:10 m above ground:6 hour fcst:
$ wgrib2 GFS1_20200817_12_006_2.grb2 
1:0:d=2020081718:UGRD:10 m above ground:anl:
2:260810:d=2020081718:VGRD:10 m above ground:anl:
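A small Python sketch of the time arithmetic behind these two inventories, using only the reference times and forecast steps shown above (`valid_time` is a hypothetical helper name). It suggests why cdo sinfon lists the same timestep for both files even though the headers differ.

```python
from datetime import datetime, timedelta


def valid_time(reference_time, forecast_hours):
    """Valid time = reference time + forecast step."""
    return reference_time + timedelta(hours=forecast_hours)


# NOAA file: d=2020081712, "6 hour fcst"
noaa = valid_time(datetime(2020, 8, 17, 12), 6)
# CDO file: d=2020081718, "anl" (forecastTime 0)
cdo = valid_time(datetime(2020, 8, 17, 18), 0)

print(noaa, cdo, noaa == cdo)
```

Both files describe the same valid time; what was lost in the round trip is the split between reference time and forecast step, which is what software expecting a forecast product would look at.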

My config

$ cdo -V
Climate Data Operators version 1.9.8 (https://mpimet.mpg.de/cdo)
System: x86_64-apple-darwin19.4.0
CXX Compiler: clang++ -std=gnu++11 -g -O2  -D_THREAD_SAFE -pthread
CXX version : Apple clang version 11.0.3 (clang-1103.0.32.59)
C Compiler: clang -g -O2  -D_THREAD_SAFE -pthread -D_THREAD_SAFE -D_THREAD_SAFE -pthread
C version : Apple clang version 11.0.3 (clang-1103.0.32.59)
F77 Compiler: gfortran -g -O2
F77 version : GNU Fortran (Homebrew GCC 9.3.0_1) 9.3.0
Features: 8GB 4threads C++11 Fortran DATA PTHREADS HDF5 NC4/HDF5 OPeNDAP SZ SSE4_2
Libraries: HDF5/1.12.0
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5 
     CDI library version : 1.9.8
 cgribex library version : 1.9.4
 ecCodes library version : 2.18.0
  NetCDF library version : 4.7.4 of Jul  2 2020 22:40:24 $
    hdf5 library version : 1.12.0
    exse library version : 1.4.1
    FILE library version : 1.8.3

$ codes_info
ecCodes Version 2.18.0
[...]

$ wgrib2 -version
v0.2.0.8 2/2019
[...]

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Karin Meier-Fleischer over 3 years ago

Close, but no cigar. :) You should use -b F64.

cdo -f nc copy GFS1_20200817_12_006.grb2 GFS1_20200817_12_006.nc
cdo -b F64 -f nc -copy GFS1_20200817_12_006.grb2 GFS1_20200817_12_006.nc
cdo diff GFS1_20200817_12_006.grb2 GFS1_20200817_12_006.nc
cdo    diff: Processed 4 variables over 2 timesteps [0.06s 25MB].
cdo -f grb2 -copy GFS1_20200817_12_006.nc GFS1_20200817_12_006_cdo.grb2
cdo diff GFS1_20200817_12_006_cdo.grb2 GFS1_20200817_12_006.grb2
cdo    diff: Processed 4 variables over 2 timesteps [0.08s 28MB].

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Romain LE LAMER over 3 years ago

Hi Karin,
The concern does not come from F32 or F64.

cdo diff does not show as much as grib_compare does.

grib_ls shows that the NOAA GRIB is of type "forecast":
string [typeOfProcessedData]: [fc]
Still with grib_ls, the CDO GRIB is of type "analysis":
string [typeOfProcessedData]: [af]

So the first problem comes from there.
I receive a NOAA GRIB (forecast) and convert it to netCDF to apply formulas / calculations. After the netCDF -> grb2 conversion I get an analysis GRIB, which I cannot use with any software that only reads the forecast type :(

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Uwe Schulzweida over 3 years ago

The GRIB to netCDF conversion only considers those attributes that are also present in the NetCDF CF-convention. Some GRIB attributes are therefore lost. These can be added back with the ecCodes tool grib_set if necessary. Here is an example to set the attribute typeOfProcessedData to fc:

grib_set -s typeOfProcessedData=fc infile outfile
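Since the author's script is in Python, the same grib_set call could be assembled there for several keys at once. This is only a sketch: the key values are the NOAA-side values reported by grib_compare in this thread, the output name `restored.grb2` is hypothetical, and whether this particular set of keys (or their order) is enough for a given routing program is not confirmed anywhere in the thread.

```python
import shlex


def grib_set_command(keys, infile, outfile):
    """Build the argv list for: grib_set -s k1=v1,k2=v2 infile outfile."""
    assignments = ",".join(f"{key}={value}" for key, value in keys.items())
    return ["grib_set", "-s", assignments, infile, outfile]


# NOAA-side values as reported by grib_compare in this thread.
noaa_keys = {
    "typeOfProcessedData": "fc",
    "typeOfGeneratingProcess": 2,
    "hour": 12,
    "forecastTime": 6,
}

argv = grib_set_command(
    noaa_keys,
    "GFS1_20200817_12_006_2.grb2",
    "restored.grb2",  # hypothetical output name
)
print(shlex.join(argv))
# To actually run it (requires ecCodes): subprocess.run(argv, check=True)
```

Building the command as a list avoids shell quoting problems; note that the `key=value` pairs must not contain spaces.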

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Romain LE LAMER over 3 years ago

Hi Uwe & Karin,
Thanks for confirming what I was thinking. Yes, I discovered grib_set during my research into this problem.
Which brings me to one question:
• Is it possible to modify the sources so that the parameters are set correctly when converting a netCDF file to a GRIB2 file? If so, in which file(s) does this happen?

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Uwe Schulzweida over 3 years ago

Unfortunately it is not possible to modify the sources for this task. The real problem starts with the conversion from GRIB to NetCDF. Reading all GRIB attributes is not provided in the code. Mapping all GRIB attributes to NetCDF attributes and back would be extremely difficult. All in all, almost the whole code would have to be changed for this task.

RE: Need explanation GRB2 -> NC1 -> GRB2 - Added by Romain LE LAMER over 3 years ago

Hi Uwe and Karin,
I haven't given up on my grb2 -> nc1 -> grb2 story.
I found a method that seems to work, but it requires switching from the grb2 format to the grb format.
It boils down to this:
• NOAA (file.grb2)

• With ecCodes:

grib_set -r -s bitmapPresent=1,packingType=grid_simple infile.grb2 outfile.grb2


• With cnvgrib:

 
cnvgrib2to1 infile.grb2 outfile.grb
 

• With cdo:

 
cdo -f nc1 -b F32 copy file.grb file.nc1
 

• With cdo:

 
cdo -f grb -b F32 copy file.nc1 file.grb
 

This last outfile.grb is readable in my software

From what I understand (with cdo):
grb2 = NOAA
• grb2 to grb => NOK
• grb2 to nc1, nc1 to grb2 => NOK
grb = NOAA.grb2 + grib_set + cnvgrib2to1
• grb to grb2 => NOK
• grb to nc1, nc1 to grb => OK
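The four-step workaround described in this post could be scripted roughly as follows. This is a dry-run sketch that only builds and prints the commands: the intermediate file names are hypothetical, the flags are copied from the post, and the `copy` operator in the two cdo calls is an assumption added here because cdo expects an operator name before the file names.

```python
import shlex


def workaround_commands(noaa_grb2, workdir="."):
    """Return the four steps as argv lists; nothing is executed here."""
    simple = f"{workdir}/simple.grb2"   # hypothetical intermediate names
    grib1 = f"{workdir}/step.grb"
    netcdf = f"{workdir}/step.nc1"
    final = f"{workdir}/final.grb"
    return [
        # 1. Repack with simple packing and a bitmap (ecCodes).
        ["grib_set", "-r", "-s", "bitmapPresent=1,packingType=grid_simple",
         noaa_grb2, simple],
        # 2. Convert GRIB2 to GRIB1 (command name as given in the post).
        ["cnvgrib2to1", simple, grib1],
        # 3. GRIB1 to netCDF classic.
        ["cdo", "-f", "nc1", "-b", "F32", "copy", grib1, netcdf],
        # 4. netCDF classic back to GRIB1.
        ["cdo", "-f", "grb", "-b", "F32", "copy", netcdf, final],
    ]


commands = workaround_commands("GFS1_20200817_12_006.grb2")
for argv in commands:
    print(shlex.join(argv))
# To execute: run each argv with subprocess.run(argv, check=True)
```

Each step consumes the output of the previous one, so the chain stops making sense if any intermediate file is missing; in a real script the calculations would happen on the netCDF file between steps 3 and 4.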
