Converting GRIB2 to NC4: Should I expect zero-diff?
Added by Matt Thompson about 10 years ago
All,
I recently tried converting a GRIB2 file (grabbed at random from the internet) to NC4 using CDO, and all went well. However, when I run diffn I see:
(1531) $ cdo -f nc4 copy CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2 CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
cdo copy: Processed 1368360 values from 1 variable over 21 timesteps ( 1.13s )
(1532) $ cdo diffn CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2 CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
      Date Time Level Gridsize Miss : S Z Max_Absdiff Max_Reldiff : Parameter name
   1 : 2014-09-03 06:00:00 0 65160 0 : F T 2.1362e-06 5.7140e-08 : rprate
   2 : 2014-09-03 06:00:00 0 65160 0 : F T 4.8828e-06 5.7078e-08 : rprate
   3 : 2014-09-03 06:00:00 0 65160 0 : F T 7.3242e-06 5.7140e-08 : rprate
   4 : 2014-09-03 06:00:00 0 65160 0 : F T 7.3242e-06 5.7078e-08 : rprate
   5 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7189e-08 : rprate
   6 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7078e-08 : rprate
   7 : 2014-09-03 06:00:00 0 65160 0 : F T 4.8828e-06 5.7140e-08 : rprate
   8 : 2014-09-03 06:00:00 0 65160 0 : F T 1.0986e-05 5.7078e-08 : rprate
   9 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7189e-08 : rprate
  10 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7140e-08 : rprate
  11 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7078e-08 : rprate
  12 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7078e-08 : rprate
  13 : 2014-09-03 06:00:00 0 65160 0 : F T 4.2725e-06 5.7084e-08 : rprate
  14 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7140e-08 : rprate
  15 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7189e-08 : rprate
  16 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7084e-08 : rprate
  17 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7078e-08 : rprate
  18 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7189e-08 : rprate
  19 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7140e-08 : rprate
  20 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7140e-08 : rprate
  21 : 2014-09-03 06:00:00 0 65160 0 : F T 7.3242e-06 5.7140e-08 : rprate
  21 of 21 records differ
  0 of 21 records differ more than 0.001
cdo diffn: Processed 2736720 values from 2 variables over 42 timesteps ( 1.18s )
I just wanted to make sure this is to be expected.
A sinfon showed that the grib2 seemed to be, maybe, 16-bit (??):
(1533) $ cdo sinfon CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2
   File format: GRIB2 JPEG
  -1 : Institut Source Ttype Levels Num Gridsize Num Dtype : Parameter name
   1 : unknown unknown accum 1 1 65160 1 P16z : rprate
...snip...
while the NC4 was 32-bit:
(1534) $ cdo sinfon CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
   File format: netCDF4
  -1 : Institut Source Ttype Levels Num Gridsize Num Dtype : Parameter name
   1 : unknown unknown instant 1 1 65160 1 F32 : rprate
...snip...
I'm fairly certain F32 is 32-bit float, but I'm not too sure what P16z means. The z seems to be JPEG compression; it's the P16 I'm not sure of. I'm guessing that if it isn't 32-bit but some shaved representation for compression efficiency, one could see small differences upon conversion.
Matt
Replies (3)
RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Uwe Schulzweida about 10 years ago
CDO uses 64-bit floats internally. That means the 16-bit packed and JPEG-compressed GRIB record is converted to 64-bit float. Some information can be lost if you convert it to 32-bit float netCDF (this is the default). You can use the CDO option -b 64 to write 64-bit floats. Then the result is 100% the same:
cdo -f nc4 -b 64 copy CMC.grib2 CMC.nc4
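To make the size of the 32-bit rounding error concrete, here is a minimal Python sketch (not CDO code). It rounds a few 64-bit values to 32-bit floats and checks the relative difference against 2^-24 (about 5.96e-08), the largest relative rounding error for a normal 32-bit float. The sample values are hypothetical and are not taken from the GRIB file above. The Max_Reldiff values of about 5.7e-08 reported by diffn sit just under that bound, which is consistent with a pure 64-bit to 32-bit rounding effect.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Hypothetical sample values, NOT read from the GRIB file in this thread.
values = [37.3871, 85.5469, 128.3203, 0.1, 64.0]

# Half the float32 machine epsilon: the largest relative error that
# round-to-nearest can introduce for a normal (non-denormal) value.
bound = 2.0 ** -24

for v in values:
    v32 = to_float32(v)
    rel = abs(v - v32) / abs(v)
    print(f"{v} -> {v32}  rel. diff = {rel:.4e}")
    assert rel <= bound

print(f"bound = {bound:.4e}")
```

Values that are exactly representable in 32 bits (such as 64.0) round-trip with zero difference; everything else picks up a relative error no larger than the bound.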
RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Matt Thompson about 10 years ago
Uwe Schulzweida wrote:
CDO uses 64-bit floats internally. That means the 16-bit packed and JPEG-compressed GRIB record is converted to 64-bit float. Some information can be lost if you convert it to 32-bit float netCDF (this is the default). You can use the CDO option -b 64 to write 64-bit floats. Then the result is 100% the same:
[...]
Uwe,
Right you are!
One other GRIB-related question. I also tried converting a GRIB1 file, and when I ran cdo diffn I got large differences and noticed that the record order had changed:
(1677) $ cdo -f nc4 copy sample.grib sample.nc4
cdo copy: Processed 1563840 values from 2 variables over 3 timesteps ( 0.04s )
(1678) $ cdo diffn sample.nc4 sample.grib
      Date Time Level Gridsize Miss : S Z Max_Absdiff Max_Reldiff : Parameter name
   1 : 2004-04-02 00:00:00 100000 65160 0 : F T 7.6294e-07 4.7095e-08 : var33
   2 : 2004-04-02 00:00:00 100000 65160 0 : T T 49.000 0.99664 : var33
   3 : 2004-04-02 00:00:00 85000 65160 0 : T T 42.000 0.99711 : var33
   4 : 2004-04-02 00:00:00 85000 65160 0 : T T 82.400 0.99855 : var33
   5 : 2004-04-02 00:00:00 50000 65160 0 : T T 63.900 0.99826 : var34
   6 : 2004-04-02 00:00:00 50000 65160 0 : T T 44.900 0.99699 : var34
   7 : 2004-04-02 00:00:00 20000 65160 0 : T T 92.300 0.99858 : var34
   8 : 2004-04-02 00:00:00 20000 65160 0 : F T 1.5259e-06 4.7535e-08 : var34
   9 : 2004-04-02 06:00:00 100000 65160 0 : F T 7.6294e-07 4.7095e-08 : var33
  10 : 2004-04-02 06:00:00 100000 65160 0 : T T 49.700 0.99678 : var33
  11 : 2004-04-02 06:00:00 85000 65160 0 : T T 44.500 0.99676 : var33
  12 : 2004-04-02 06:00:00 85000 65160 0 : T T 81.000 0.99828 : var33
  13 : 2004-04-02 06:00:00 50000 65160 0 : T T 58.200 0.99823 : var34
  14 : 2004-04-02 06:00:00 50000 65160 0 : T T 42.200 0.99742 : var34
  15 : 2004-04-02 06:00:00 20000 65160 0 : T T 95.600 0.99842 : var34
  16 : 2004-04-02 06:00:00 20000 65160 0 : F T 1.5259e-06 4.7535e-08 : var34
  17 : 2004-04-02 12:00:00 100000 65160 0 : F T 7.6294e-07 4.7095e-08 : var33
  18 : 2004-04-02 12:00:00 100000 65160 0 : T T 48.800 0.99628 : var33
  19 : 2004-04-02 12:00:00 85000 65160 0 : T T 46.200 0.99760 : var33
  20 : 2004-04-02 12:00:00 85000 65160 0 : T T 88.900 0.99858 : var33
  21 : 2004-04-02 12:00:00 50000 65160 0 : T T 60.000 0.99796 : var34
  22 : 2004-04-02 12:00:00 50000 65160 0 : T T 40.700 0.99696 : var34
  23 : 2004-04-02 12:00:00 20000 65160 0 : T T 95.300 0.99853 : var34
  24 : 2004-04-02 12:00:00 20000 65160 0 : F T 1.5259e-06 4.7535e-08 : var34
  24 of 24 records differ
  18 of 24 records differ more than 0.001
cdo diffn: Processed 3127680 values from 4 variables over 6 timesteps ( 0.10s )
(1679) $ cdo infon sample.grib
  -1 : Date Time Level Gridsize Miss : Minimum Mean Maximum : Parameter name
   1 : 2004-04-02 00:00:00 100000 65160 0 : -24.800 -0.15758 29.300 : var33
   2 : 2004-04-02 00:00:00 100000 65160 0 : -21.600 -0.016963 23.900 : var34
   3 : 2004-04-02 00:00:00 85000 65160 0 : -29.200 1.3656 36.900 : var33
   4 : 2004-04-02 00:00:00 85000 65160 0 : -24.300 0.13997 30.200 : var34
   5 : 2004-04-02 00:00:00 50000 65160 0 : -31.900 6.6042 60.800 : var33
   6 : 2004-04-02 00:00:00 50000 65160 0 : -49.900 0.020106 56.300 : var34
   7 : 2004-04-02 00:00:00 20000 65160 0 : -33.300 15.183 81.100 : var33
   8 : 2004-04-02 00:00:00 20000 65160 0 : -49.900 0.064019 52.200 : var34
   9 : 2004-04-02 06:00:00 100000 65160 0 : -27.500 -0.10051 28.000 : var33
  10 : 2004-04-02 06:00:00 100000 65160 0 : -19.800 -0.019998 21.500 : var34
  11 : 2004-04-02 06:00:00 85000 65160 0 : -31.600 1.2970 36.800 : var33
  12 : 2004-04-02 06:00:00 85000 65160 0 : -27.700 0.10693 30.800 : var34
  13 : 2004-04-02 06:00:00 50000 65160 0 : -28.800 7.0169 60.900 : var33
  14 : 2004-04-02 06:00:00 50000 65160 0 : -47.400 -0.017706 53.500 : var34
  15 : 2004-04-02 06:00:00 20000 65160 0 : -31.200 15.194 79.500 : var33
  16 : 2004-04-02 06:00:00 20000 65160 0 : -52.100 0.11962 48.800 : var34
  17 : 2004-04-02 12:00:00 100000 65160 0 : -27.400 -0.037144 28.900 : var33
  18 : 2004-04-02 12:00:00 100000 65160 0 : -20.800 0.042563 22.000 : var34
  19 : 2004-04-02 12:00:00 85000 65160 0 : -31.200 1.4964 36.300 : var33
  20 : 2004-04-02 12:00:00 85000 65160 0 : -28.700 0.073415 38.000 : var34
  21 : 2004-04-02 12:00:00 50000 65160 0 : -26.000 7.1112 58.500 : var33
  22 : 2004-04-02 12:00:00 50000 65160 0 : -44.900 0.099064 54.200 : var34
  23 : 2004-04-02 12:00:00 20000 65160 0 : -31.900 15.193 78.400 : var33
  24 : 2004-04-02 12:00:00 20000 65160 0 : -55.000 0.053691 49.400 : var34
cdo infon: Processed 1563840 values from 2 variables over 3 timesteps ( 0.04s )
(1680) $ cdo infon sample.nc4
  -1 : Date Time Level Gridsize Miss : Minimum Mean Maximum : Parameter name
   1 : 2004-04-02 00:00:00 100000 65160 0 : -24.800 -0.15758 29.300 : var33
   2 : 2004-04-02 00:00:00 85000 65160 0 : -29.200 1.3656 36.900 : var33
   3 : 2004-04-02 00:00:00 50000 65160 0 : -31.900 6.6042 60.800 : var33
   4 : 2004-04-02 00:00:00 20000 65160 0 : -33.300 15.183 81.100 : var33
   5 : 2004-04-02 00:00:00 100000 65160 0 : -21.600 -0.016963 23.900 : var34
   6 : 2004-04-02 00:00:00 85000 65160 0 : -24.300 0.13997 30.200 : var34
   7 : 2004-04-02 00:00:00 50000 65160 0 : -49.900 0.020106 56.300 : var34
   8 : 2004-04-02 00:00:00 20000 65160 0 : -49.900 0.064019 52.200 : var34
   9 : 2004-04-02 06:00:00 100000 65160 0 : -27.500 -0.10051 28.000 : var33
  10 : 2004-04-02 06:00:00 85000 65160 0 : -31.600 1.2970 36.800 : var33
  11 : 2004-04-02 06:00:00 50000 65160 0 : -28.800 7.0169 60.900 : var33
  12 : 2004-04-02 06:00:00 20000 65160 0 : -31.200 15.194 79.500 : var33
  13 : 2004-04-02 06:00:00 100000 65160 0 : -19.800 -0.019998 21.500 : var34
  14 : 2004-04-02 06:00:00 85000 65160 0 : -27.700 0.10693 30.800 : var34
  15 : 2004-04-02 06:00:00 50000 65160 0 : -47.400 -0.017706 53.500 : var34
  16 : 2004-04-02 06:00:00 20000 65160 0 : -52.100 0.11962 48.800 : var34
  17 : 2004-04-02 12:00:00 100000 65160 0 : -27.400 -0.037144 28.900 : var33
  18 : 2004-04-02 12:00:00 85000 65160 0 : -31.200 1.4964 36.300 : var33
  19 : 2004-04-02 12:00:00 50000 65160 0 : -26.000 7.1112 58.500 : var33
  20 : 2004-04-02 12:00:00 20000 65160 0 : -31.900 15.193 78.400 : var33
  21 : 2004-04-02 12:00:00 100000 65160 0 : -20.800 0.042563 22.000 : var34
  22 : 2004-04-02 12:00:00 85000 65160 0 : -28.700 0.073415 38.000 : var34
  23 : 2004-04-02 12:00:00 50000 65160 0 : -44.900 0.099064 54.200 : var34
  24 : 2004-04-02 12:00:00 20000 65160 0 : -55.000 0.053691 49.400 : var34
cdo infon: Processed 1563840 values from 2 variables over 3 timesteps ( 0.02s )
Now, obviously I can select a certain time/level/name combo and diff, but I was wondering if you had a slick way to "enforce" an ordering that isn't alphabetical?
Matt
RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Uwe Schulzweida about 10 years ago
The records in the GRIB file are sorted by levels. In netCDF these records are sorted by variable. You can use the undocumented CDO operator sortlevel to compare these files with diff:
cdo diffn -sortlevel sample.nc4 -sortlevel sample.grib
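As an illustration of why a record-by-record comparison reported 18 large differences, here is a minimal Python sketch. It is not how CDO implements sortlevel; it only models the two record orders seen in the infon listings (level-major in the GRIB file, variable-major in the netCDF file) using the variable names, levels, and times from this thread. It counts how many positions hold different records, then repeats the count after both lists are brought into a common order.

```python
# Times, variable names, and levels as they appear in the infon output.
times = ["00:00:00", "06:00:00", "12:00:00"]
names = ["var33", "var34"]
levels = [100000, 85000, 50000, 20000]

# GRIB listing: within each time step, records run level by level,
# with both variables at each level.
grib_order = [(t, n, l) for t in times for l in levels for n in names]

# netCDF listing: within each time step, records run variable by
# variable, with all levels of one variable before the next.
nc_order = [(t, n, l) for t in times for n in names for l in levels]


def positional_mismatches(a, b):
    """Count positions where the two lists hold different records."""
    return sum(1 for ra, rb in zip(a, b) if ra != rb)


# Comparing position by position pairs up unrelated fields.
print(positional_mismatches(grib_order, nc_order))  # prints 18

# Bringing both lists into the same order pairs like with like.
aligned_grib = sorted(grib_order)
aligned_nc = sorted(nc_order)
print(positional_mismatches(aligned_grib, aligned_nc))  # prints 0
```

Only the first and last record of each time step coincide in both orders, which matches the six records in the diffn output that show tiny differences (flag F) while the other 18 compare unrelated fields.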