Converting GRIB2 to NC4: Should I expect zero-diff?

Added by Matt Thompson about 10 years ago

All,

I recently tried converting a GRIB2 file (grabbed at random from the internet) to NC4 using CDO, and all went well. However, when I run diffn on the two files I see:

(1531) $ cdo -f nc4 copy CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2 CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
cdo copy: Processed 1368360 values from 1 variable over 21 timesteps ( 1.13s )
(1532) $ cdo diffn CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2 CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
               Date     Time   Level Gridsize    Miss : S Z  Max_Absdiff Max_Reldiff : Parameter name
     1 : 2014-09-03 06:00:00       0    65160       0 : F T   2.1362e-06  5.7140e-08 : rprate     
     2 : 2014-09-03 06:00:00       0    65160       0 : F T   4.8828e-06  5.7078e-08 : rprate     
     3 : 2014-09-03 06:00:00       0    65160       0 : F T   7.3242e-06  5.7140e-08 : rprate     
     4 : 2014-09-03 06:00:00       0    65160       0 : F T   7.3242e-06  5.7078e-08 : rprate     
     5 : 2014-09-03 06:00:00       0    65160       0 : F T   3.6621e-06  5.7189e-08 : rprate     
     6 : 2014-09-03 06:00:00       0    65160       0 : F T   6.7139e-06  5.7078e-08 : rprate     
     7 : 2014-09-03 06:00:00       0    65160       0 : F T   4.8828e-06  5.7140e-08 : rprate     
     8 : 2014-09-03 06:00:00       0    65160       0 : F T   1.0986e-05  5.7078e-08 : rprate     
     9 : 2014-09-03 06:00:00       0    65160       0 : F T   6.7139e-06  5.7189e-08 : rprate     
    10 : 2014-09-03 06:00:00       0    65160       0 : F T   6.7139e-06  5.7140e-08 : rprate     
    11 : 2014-09-03 06:00:00       0    65160       0 : F T   3.6621e-06  5.7078e-08 : rprate     
    12 : 2014-09-03 06:00:00       0    65160       0 : F T   3.6621e-06  5.7078e-08 : rprate     
    13 : 2014-09-03 06:00:00       0    65160       0 : F T   4.2725e-06  5.7084e-08 : rprate     
    14 : 2014-09-03 06:00:00       0    65160       0 : F T   3.6621e-06  5.7140e-08 : rprate     
    15 : 2014-09-03 06:00:00       0    65160       0 : F T   3.6621e-06  5.7189e-08 : rprate     
    16 : 2014-09-03 06:00:00       0    65160       0 : F T   3.6621e-06  5.7084e-08 : rprate     
    17 : 2014-09-03 06:00:00       0    65160       0 : F T   6.7139e-06  5.7078e-08 : rprate     
    18 : 2014-09-03 06:00:00       0    65160       0 : F T   3.6621e-06  5.7189e-08 : rprate     
    19 : 2014-09-03 06:00:00       0    65160       0 : F T   3.6621e-06  5.7140e-08 : rprate     
    20 : 2014-09-03 06:00:00       0    65160       0 : F T   6.7139e-06  5.7140e-08 : rprate     
    21 : 2014-09-03 06:00:00       0    65160       0 : F T   7.3242e-06  5.7140e-08 : rprate     
  21 of 21 records differ
  0 of 21 records differ more than 0.001
cdo diffn: Processed 2736720 values from 2 variables over 42 timesteps ( 1.18s )

I just wanted to make sure this is to be expected.

A sinfon showed that the grib2 seemed to be, maybe, 16-bit (??):

(1533) $ cdo sinfon CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2
   File format: GRIB2 JPEG
    -1 : Institut Source   Ttype    Levels Num  Gridsize Num Dtype : Parameter name
     1 : unknown  unknown  accum         1   1     65160   1  P16z : rprate        
...snip...

while the NC4 was 32-bit:

(1534) $ cdo sinfon CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
   File format: netCDF4
    -1 : Institut Source   Ttype    Levels Num  Gridsize Num Dtype : Parameter name
     1 : unknown  unknown  instant       1   1     65160   1  F32  : rprate        
...snip...

I'm fairly certain F32 is 32-bit float, but I'm not too sure what P16z means. Well, the z seems to be JPEG compression; it's the P16 I'm not sure of. I'm guessing that if it isn't 32-bit but some shaved representation for compression efficiency, one could see small differences upon conversion.

Matt


Replies (3)

RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Uwe Schulzweida about 10 years ago

CDO uses 64-bit floats internally. That means the 16-bit packed, JPEG-compressed GRIB record is converted to 64-bit floats. Some information can be lost if you then write it to netCDF as 32-bit floats (this is the default). You can use the CDO option -b 64 to write 64-bit floats. Then the result is 100% the same:

cdo -f nc4 -b 64 copy CMC.grib2 CMC.nc4
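
To illustrate the rounding described above, here is a minimal Python sketch of what happens when a 64-bit value is stored as a 32-bit float and read back. The sample values are made up for illustration and are not taken from the file in this thread; the helper name to_float32 is also just an assumption for this example. A 32-bit float has a 24-bit significand, so the relative rounding error is at most 2**-24 (about 5.96e-08), which is consistent with the Max_Reldiff values of roughly 5.7e-08 reported by diffn.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it again."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Hypothetical unpacked field values (illustrative only, not from the GRIB2 file above).
values = [0.1, 12.345, 37.4, 128.0]

for v in values:
    v32 = to_float32(v)
    rel = abs(v - v32) / abs(v)
    print(f"{v} -> {v32!r}  relative diff = {rel:.4e}")

# Round-to-nearest into a 24-bit significand bounds the relative error by 2**-24.
bound = 2.0 ** -24
max_rel = max(abs(v - to_float32(v)) / abs(v) for v in values)
print(f"largest relative diff = {max_rel:.4e}, bound = {bound:.4e}")
```

Values that are exactly representable in 32 bits, such as 128.0, survive unchanged; the others pick up a small error. Writing with -b 64 avoids this step entirely.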

RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Matt Thompson about 10 years ago

Uwe Schulzweida wrote:

CDO uses 64-bit floats internally. That means the 16-bit packed, JPEG-compressed GRIB record is converted to 64-bit floats. Some information can be lost if you then write it to netCDF as 32-bit floats (this is the default). You can use the CDO option -b 64 to write 64-bit floats. Then the result is 100% the same:
[...]

Uwe,

Right you are!

One other GRIB-related question. I also tried converting a GRIB1 file, and I noticed that the record order changed in the conversion, so a cdo diffn between the two files reports large differences:

(1677) $ cdo -f nc4 copy sample.grib sample.nc4
cdo copy: Processed 1563840 values from 2 variables over 3 timesteps ( 0.04s )
(1678) $ cdo diffn sample.nc4 sample.grib
               Date     Time   Level Gridsize    Miss : S Z  Max_Absdiff Max_Reldiff : Parameter name
     1 : 2004-04-02 00:00:00  100000    65160       0 : F T   7.6294e-07  4.7095e-08 : var33      
     2 : 2004-04-02 00:00:00  100000    65160       0 : T T       49.000     0.99664 : var33      
     3 : 2004-04-02 00:00:00   85000    65160       0 : T T       42.000     0.99711 : var33      
     4 : 2004-04-02 00:00:00   85000    65160       0 : T T       82.400     0.99855 : var33      
     5 : 2004-04-02 00:00:00   50000    65160       0 : T T       63.900     0.99826 : var34      
     6 : 2004-04-02 00:00:00   50000    65160       0 : T T       44.900     0.99699 : var34      
     7 : 2004-04-02 00:00:00   20000    65160       0 : T T       92.300     0.99858 : var34      
     8 : 2004-04-02 00:00:00   20000    65160       0 : F T   1.5259e-06  4.7535e-08 : var34      
     9 : 2004-04-02 06:00:00  100000    65160       0 : F T   7.6294e-07  4.7095e-08 : var33      
    10 : 2004-04-02 06:00:00  100000    65160       0 : T T       49.700     0.99678 : var33      
    11 : 2004-04-02 06:00:00   85000    65160       0 : T T       44.500     0.99676 : var33      
    12 : 2004-04-02 06:00:00   85000    65160       0 : T T       81.000     0.99828 : var33      
    13 : 2004-04-02 06:00:00   50000    65160       0 : T T       58.200     0.99823 : var34      
    14 : 2004-04-02 06:00:00   50000    65160       0 : T T       42.200     0.99742 : var34      
    15 : 2004-04-02 06:00:00   20000    65160       0 : T T       95.600     0.99842 : var34      
    16 : 2004-04-02 06:00:00   20000    65160       0 : F T   1.5259e-06  4.7535e-08 : var34      
    17 : 2004-04-02 12:00:00  100000    65160       0 : F T   7.6294e-07  4.7095e-08 : var33      
    18 : 2004-04-02 12:00:00  100000    65160       0 : T T       48.800     0.99628 : var33      
    19 : 2004-04-02 12:00:00   85000    65160       0 : T T       46.200     0.99760 : var33      
    20 : 2004-04-02 12:00:00   85000    65160       0 : T T       88.900     0.99858 : var33      
    21 : 2004-04-02 12:00:00   50000    65160       0 : T T       60.000     0.99796 : var34      
    22 : 2004-04-02 12:00:00   50000    65160       0 : T T       40.700     0.99696 : var34      
    23 : 2004-04-02 12:00:00   20000    65160       0 : T T       95.300     0.99853 : var34      
    24 : 2004-04-02 12:00:00   20000    65160       0 : F T   1.5259e-06  4.7535e-08 : var34      
  24 of 24 records differ
  18 of 24 records differ more than 0.001
cdo diffn: Processed 3127680 values from 4 variables over 6 timesteps ( 0.10s )
(1679) $ cdo infon sample.grib
    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter name
     1 : 2004-04-02 00:00:00  100000    65160       0 :     -24.800    -0.15758      29.300 : var33         
     2 : 2004-04-02 00:00:00  100000    65160       0 :     -21.600   -0.016963      23.900 : var34         
     3 : 2004-04-02 00:00:00   85000    65160       0 :     -29.200      1.3656      36.900 : var33         
     4 : 2004-04-02 00:00:00   85000    65160       0 :     -24.300     0.13997      30.200 : var34         
     5 : 2004-04-02 00:00:00   50000    65160       0 :     -31.900      6.6042      60.800 : var33         
     6 : 2004-04-02 00:00:00   50000    65160       0 :     -49.900    0.020106      56.300 : var34         
     7 : 2004-04-02 00:00:00   20000    65160       0 :     -33.300      15.183      81.100 : var33         
     8 : 2004-04-02 00:00:00   20000    65160       0 :     -49.900    0.064019      52.200 : var34         
     9 : 2004-04-02 06:00:00  100000    65160       0 :     -27.500    -0.10051      28.000 : var33         
    10 : 2004-04-02 06:00:00  100000    65160       0 :     -19.800   -0.019998      21.500 : var34         
    11 : 2004-04-02 06:00:00   85000    65160       0 :     -31.600      1.2970      36.800 : var33         
    12 : 2004-04-02 06:00:00   85000    65160       0 :     -27.700     0.10693      30.800 : var34         
    13 : 2004-04-02 06:00:00   50000    65160       0 :     -28.800      7.0169      60.900 : var33         
    14 : 2004-04-02 06:00:00   50000    65160       0 :     -47.400   -0.017706      53.500 : var34         
    15 : 2004-04-02 06:00:00   20000    65160       0 :     -31.200      15.194      79.500 : var33         
    16 : 2004-04-02 06:00:00   20000    65160       0 :     -52.100     0.11962      48.800 : var34         
    17 : 2004-04-02 12:00:00  100000    65160       0 :     -27.400   -0.037144      28.900 : var33         
    18 : 2004-04-02 12:00:00  100000    65160       0 :     -20.800    0.042563      22.000 : var34         
    19 : 2004-04-02 12:00:00   85000    65160       0 :     -31.200      1.4964      36.300 : var33         
    20 : 2004-04-02 12:00:00   85000    65160       0 :     -28.700    0.073415      38.000 : var34         
    21 : 2004-04-02 12:00:00   50000    65160       0 :     -26.000      7.1112      58.500 : var33         
    22 : 2004-04-02 12:00:00   50000    65160       0 :     -44.900    0.099064      54.200 : var34         
    23 : 2004-04-02 12:00:00   20000    65160       0 :     -31.900      15.193      78.400 : var33         
    24 : 2004-04-02 12:00:00   20000    65160       0 :     -55.000    0.053691      49.400 : var34         
cdo infon: Processed 1563840 values from 2 variables over 3 timesteps ( 0.04s )
(1680) $ cdo infon sample.nc4
    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter name
     1 : 2004-04-02 00:00:00  100000    65160       0 :     -24.800    -0.15758      29.300 : var33         
     2 : 2004-04-02 00:00:00   85000    65160       0 :     -29.200      1.3656      36.900 : var33         
     3 : 2004-04-02 00:00:00   50000    65160       0 :     -31.900      6.6042      60.800 : var33         
     4 : 2004-04-02 00:00:00   20000    65160       0 :     -33.300      15.183      81.100 : var33         
     5 : 2004-04-02 00:00:00  100000    65160       0 :     -21.600   -0.016963      23.900 : var34         
     6 : 2004-04-02 00:00:00   85000    65160       0 :     -24.300     0.13997      30.200 : var34         
     7 : 2004-04-02 00:00:00   50000    65160       0 :     -49.900    0.020106      56.300 : var34         
     8 : 2004-04-02 00:00:00   20000    65160       0 :     -49.900    0.064019      52.200 : var34         
     9 : 2004-04-02 06:00:00  100000    65160       0 :     -27.500    -0.10051      28.000 : var33         
    10 : 2004-04-02 06:00:00   85000    65160       0 :     -31.600      1.2970      36.800 : var33         
    11 : 2004-04-02 06:00:00   50000    65160       0 :     -28.800      7.0169      60.900 : var33         
    12 : 2004-04-02 06:00:00   20000    65160       0 :     -31.200      15.194      79.500 : var33         
    13 : 2004-04-02 06:00:00  100000    65160       0 :     -19.800   -0.019998      21.500 : var34         
    14 : 2004-04-02 06:00:00   85000    65160       0 :     -27.700     0.10693      30.800 : var34         
    15 : 2004-04-02 06:00:00   50000    65160       0 :     -47.400   -0.017706      53.500 : var34         
    16 : 2004-04-02 06:00:00   20000    65160       0 :     -52.100     0.11962      48.800 : var34         
    17 : 2004-04-02 12:00:00  100000    65160       0 :     -27.400   -0.037144      28.900 : var33         
    18 : 2004-04-02 12:00:00   85000    65160       0 :     -31.200      1.4964      36.300 : var33         
    19 : 2004-04-02 12:00:00   50000    65160       0 :     -26.000      7.1112      58.500 : var33         
    20 : 2004-04-02 12:00:00   20000    65160       0 :     -31.900      15.193      78.400 : var33         
    21 : 2004-04-02 12:00:00  100000    65160       0 :     -20.800    0.042563      22.000 : var34         
    22 : 2004-04-02 12:00:00   85000    65160       0 :     -28.700    0.073415      38.000 : var34         
    23 : 2004-04-02 12:00:00   50000    65160       0 :     -44.900    0.099064      54.200 : var34         
    24 : 2004-04-02 12:00:00   20000    65160       0 :     -55.000    0.053691      49.400 : var34         
cdo infon: Processed 1563840 values from 2 variables over 3 timesteps ( 0.02s )

Now, obviously I can select a certain time/level/name combo and diff, but I was wondering if you had a slick way to "enforce" an ordering that isn't alphabetical?

Matt

RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Uwe Schulzweida about 10 years ago

The records in the GRIB file are sorted by levels. In netCDF these records are sorted by variable. You can use the undocumented CDO operator sortlevel to compare these files with diff:

cdo diffn -sortlevel sample.nc4 -sortlevel sample.grib
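
The effect of the two record orders can be sketched in Python. This is only an illustration of why a record-by-record comparison goes wrong; it does not reproduce how CDO or sortlevel work internally. The times, levels, and variable names are copied from the infon listings above, and the helper common_order is a hypothetical name for this example.

```python
times = ["00:00", "06:00", "12:00"]
levels = [100000, 85000, 50000, 20000]
names = ["var33", "var34"]

# GRIB listing: within each timestep, records run level by level, both variables per level.
grib_order = [(t, n, lev) for t in times for lev in levels for n in names]

# netCDF listing: within each timestep, records run variable by variable, all levels per variable.
nc_order = [(t, n, lev) for t in times for n in names for lev in levels]

# A positional diff pairs record i of one file with record i of the other.
matching = [i + 1 for i, (a, b) in enumerate(zip(grib_order, nc_order)) if a == b]
print(matching)  # [1, 8, 9, 16, 17, 24]


def common_order(records):
    """Sort records by time, then variable, then descending level (an arbitrary common order)."""
    return sorted(records, key=lambda r: (r[0], r[1], -r[2]))


# Once both lists share one ordering, every record lines up with its counterpart.
print(common_order(grib_order) == common_order(nc_order))  # True
```

Only 6 of the 24 positions pair a record with its true counterpart, which matches the diffn output above: records 1, 8, 9, 16, 17, and 24 show only tiny rounding differences, while the other 18 differ by more than 0.001.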
