Converting GRIB2 to NC4: Should I expect zero-diff?
Added by Matt Thompson over 11 years ago
All,
I recently tried converting a GRIB2 file (grabbed at random from the internet) to NC4 using CDO, and all went well. However, when I run diffn I see:
(1531) $ cdo -f nc4 copy CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2 CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
cdo copy: Processed 1368360 values from 1 variable over 21 timesteps ( 1.13s )
(1532) $ cdo diffn CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2 CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
Date Time Level Gridsize Miss : S Z Max_Absdiff Max_Reldiff : Parameter name
1 : 2014-09-03 06:00:00 0 65160 0 : F T 2.1362e-06 5.7140e-08 : rprate
2 : 2014-09-03 06:00:00 0 65160 0 : F T 4.8828e-06 5.7078e-08 : rprate
3 : 2014-09-03 06:00:00 0 65160 0 : F T 7.3242e-06 5.7140e-08 : rprate
4 : 2014-09-03 06:00:00 0 65160 0 : F T 7.3242e-06 5.7078e-08 : rprate
5 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7189e-08 : rprate
6 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7078e-08 : rprate
7 : 2014-09-03 06:00:00 0 65160 0 : F T 4.8828e-06 5.7140e-08 : rprate
8 : 2014-09-03 06:00:00 0 65160 0 : F T 1.0986e-05 5.7078e-08 : rprate
9 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7189e-08 : rprate
10 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7140e-08 : rprate
11 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7078e-08 : rprate
12 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7078e-08 : rprate
13 : 2014-09-03 06:00:00 0 65160 0 : F T 4.2725e-06 5.7084e-08 : rprate
14 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7140e-08 : rprate
15 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7189e-08 : rprate
16 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7084e-08 : rprate
17 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7078e-08 : rprate
18 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7189e-08 : rprate
19 : 2014-09-03 06:00:00 0 65160 0 : F T 3.6621e-06 5.7140e-08 : rprate
20 : 2014-09-03 06:00:00 0 65160 0 : F T 6.7139e-06 5.7140e-08 : rprate
21 : 2014-09-03 06:00:00 0 65160 0 : F T 7.3242e-06 5.7140e-08 : rprate
21 of 21 records differ
0 of 21 records differ more than 0.001
cdo diffn: Processed 2736720 values from 2 variables over 42 timesteps ( 1.18s )
I just wanted to make sure this is to be expected.
A sinfon showed that the grib2 seemed to be, maybe, 16-bit (??):
(1533) $ cdo sinfon CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.grib2
File format: GRIB2 JPEG
-1 : Institut Source Ttype Levels Num Gridsize Num Dtype : Parameter name
1 : unknown unknown accum 1 1 65160 1 P16z : rprate
...snip...
while the NC4 was 32-bit:
(1534) $ cdo sinfon CMC_naefs-geps-raw_ARAIN_SFC_0_latlon1p0x1p0_2014090212_P018_allmbrs.nc4
File format: netCDF4
-1 : Institut Source Ttype Levels Num Gridsize Num Dtype : Parameter name
1 : unknown unknown instant 1 1 65160 1 F32 : rprate
...snip...
I'm fairly certain F32 is 32-bit float, but I'm not too sure what P16z means. Well, the z seems to be JPEG compression; it's the P16 I'm not sure of. I'm guessing that if it isn't 32-bit but some shaved representation for compression efficiency, one could see small differences upon conversion.
Matt
Replies (3)
RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Uwe Schulzweida over 11 years ago
CDO uses 64-bit floats internally. That means the 16-bit packed and JPEG-compressed GRIB record is converted to 64-bit float. Some information can be lost if you then write it to netCDF as 32-bit float (this is the default). You can use the CDO option -b 64 to write 64-bit floats. Then the result is 100% the same:
cdo -f nc4 -b 64 copy CMC.grib2 CMC.nc4
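To illustrate the size of the effect, here is a minimal Python sketch that unpacks a few 16-bit integers to 64-bit floats and then rounds them to 32-bit floats. The packing parameters (`reference`, `binary_scale`, `decimal_scale`) and the packed integers are hypothetical, not read from the file above, and the sketch is not how CDO itself is implemented. Rounding to 32-bit float gives a relative error of at most 2^-24 (about 5.96e-08), which is consistent with the Max_Reldiff values of roughly 5.7e-08 reported by diffn.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_float64(x: float) -> float:
    """Store a Python float as 8 bytes and read it back (lossless)."""
    return struct.unpack("<d", struct.pack("<d", x))[0]


# Hypothetical packing parameters, chosen only for illustration.
reference = -249.0   # hypothetical reference value
binary_scale = 0     # hypothetical binary scale factor E
decimal_scale = 1    # hypothetical decimal scale factor D

# Hypothetical 16-bit packed integers (each fits in 0..65535).
packed = [1, 542, 1000, 40000, 65535]

# Unpack to 64-bit floats: (R + X * 2^E) / 10^D
values = [(reference + p * 2.0 ** binary_scale) / 10.0 ** decimal_scale
          for p in packed]

# Relative error introduced by storing each value as a 32-bit float.
rel_errors = [abs(to_float32(v) - v) / abs(v) for v in values]

# Writing 64-bit floats instead keeps every value bit-for-bit identical.
exact_in_float64 = all(to_float64(v) == v for v in values)

print("max relative error in float32:", max(rel_errors))
print("float32 bound 2**-24         :", 2.0 ** -24)
print("exact when stored as float64 :", exact_in_float64)
```

The differences are tiny but nonzero because the unpacked values are generally not exactly representable with a 24-bit mantissa, while a 64-bit output stores the unpacked value unchanged.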
RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Matt Thompson over 11 years ago
Uwe,
Right you are!
One other GRIB-related question: I also tried converting a GRIB1 file, and I noticed the record order had changed after I ran cdo diffn and got different results:
(1677) $ cdo -f nc4 copy sample.grib sample.nc4
cdo copy: Processed 1563840 values from 2 variables over 3 timesteps ( 0.04s )
(1678) $ cdo diffn sample.nc4 sample.grib
Date Time Level Gridsize Miss : S Z Max_Absdiff Max_Reldiff : Parameter name
1 : 2004-04-02 00:00:00 100000 65160 0 : F T 7.6294e-07 4.7095e-08 : var33
2 : 2004-04-02 00:00:00 100000 65160 0 : T T 49.000 0.99664 : var33
3 : 2004-04-02 00:00:00 85000 65160 0 : T T 42.000 0.99711 : var33
4 : 2004-04-02 00:00:00 85000 65160 0 : T T 82.400 0.99855 : var33
5 : 2004-04-02 00:00:00 50000 65160 0 : T T 63.900 0.99826 : var34
6 : 2004-04-02 00:00:00 50000 65160 0 : T T 44.900 0.99699 : var34
7 : 2004-04-02 00:00:00 20000 65160 0 : T T 92.300 0.99858 : var34
8 : 2004-04-02 00:00:00 20000 65160 0 : F T 1.5259e-06 4.7535e-08 : var34
9 : 2004-04-02 06:00:00 100000 65160 0 : F T 7.6294e-07 4.7095e-08 : var33
10 : 2004-04-02 06:00:00 100000 65160 0 : T T 49.700 0.99678 : var33
11 : 2004-04-02 06:00:00 85000 65160 0 : T T 44.500 0.99676 : var33
12 : 2004-04-02 06:00:00 85000 65160 0 : T T 81.000 0.99828 : var33
13 : 2004-04-02 06:00:00 50000 65160 0 : T T 58.200 0.99823 : var34
14 : 2004-04-02 06:00:00 50000 65160 0 : T T 42.200 0.99742 : var34
15 : 2004-04-02 06:00:00 20000 65160 0 : T T 95.600 0.99842 : var34
16 : 2004-04-02 06:00:00 20000 65160 0 : F T 1.5259e-06 4.7535e-08 : var34
17 : 2004-04-02 12:00:00 100000 65160 0 : F T 7.6294e-07 4.7095e-08 : var33
18 : 2004-04-02 12:00:00 100000 65160 0 : T T 48.800 0.99628 : var33
19 : 2004-04-02 12:00:00 85000 65160 0 : T T 46.200 0.99760 : var33
20 : 2004-04-02 12:00:00 85000 65160 0 : T T 88.900 0.99858 : var33
21 : 2004-04-02 12:00:00 50000 65160 0 : T T 60.000 0.99796 : var34
22 : 2004-04-02 12:00:00 50000 65160 0 : T T 40.700 0.99696 : var34
23 : 2004-04-02 12:00:00 20000 65160 0 : T T 95.300 0.99853 : var34
24 : 2004-04-02 12:00:00 20000 65160 0 : F T 1.5259e-06 4.7535e-08 : var34
24 of 24 records differ
18 of 24 records differ more than 0.001
cdo diffn: Processed 3127680 values from 4 variables over 6 timesteps ( 0.10s )
(1679) $ cdo infon sample.grib
-1 : Date Time Level Gridsize Miss : Minimum Mean Maximum : Parameter name
1 : 2004-04-02 00:00:00 100000 65160 0 : -24.800 -0.15758 29.300 : var33
2 : 2004-04-02 00:00:00 100000 65160 0 : -21.600 -0.016963 23.900 : var34
3 : 2004-04-02 00:00:00 85000 65160 0 : -29.200 1.3656 36.900 : var33
4 : 2004-04-02 00:00:00 85000 65160 0 : -24.300 0.13997 30.200 : var34
5 : 2004-04-02 00:00:00 50000 65160 0 : -31.900 6.6042 60.800 : var33
6 : 2004-04-02 00:00:00 50000 65160 0 : -49.900 0.020106 56.300 : var34
7 : 2004-04-02 00:00:00 20000 65160 0 : -33.300 15.183 81.100 : var33
8 : 2004-04-02 00:00:00 20000 65160 0 : -49.900 0.064019 52.200 : var34
9 : 2004-04-02 06:00:00 100000 65160 0 : -27.500 -0.10051 28.000 : var33
10 : 2004-04-02 06:00:00 100000 65160 0 : -19.800 -0.019998 21.500 : var34
11 : 2004-04-02 06:00:00 85000 65160 0 : -31.600 1.2970 36.800 : var33
12 : 2004-04-02 06:00:00 85000 65160 0 : -27.700 0.10693 30.800 : var34
13 : 2004-04-02 06:00:00 50000 65160 0 : -28.800 7.0169 60.900 : var33
14 : 2004-04-02 06:00:00 50000 65160 0 : -47.400 -0.017706 53.500 : var34
15 : 2004-04-02 06:00:00 20000 65160 0 : -31.200 15.194 79.500 : var33
16 : 2004-04-02 06:00:00 20000 65160 0 : -52.100 0.11962 48.800 : var34
17 : 2004-04-02 12:00:00 100000 65160 0 : -27.400 -0.037144 28.900 : var33
18 : 2004-04-02 12:00:00 100000 65160 0 : -20.800 0.042563 22.000 : var34
19 : 2004-04-02 12:00:00 85000 65160 0 : -31.200 1.4964 36.300 : var33
20 : 2004-04-02 12:00:00 85000 65160 0 : -28.700 0.073415 38.000 : var34
21 : 2004-04-02 12:00:00 50000 65160 0 : -26.000 7.1112 58.500 : var33
22 : 2004-04-02 12:00:00 50000 65160 0 : -44.900 0.099064 54.200 : var34
23 : 2004-04-02 12:00:00 20000 65160 0 : -31.900 15.193 78.400 : var33
24 : 2004-04-02 12:00:00 20000 65160 0 : -55.000 0.053691 49.400 : var34
cdo infon: Processed 1563840 values from 2 variables over 3 timesteps ( 0.04s )
(1680) $ cdo infon sample.nc4
-1 : Date Time Level Gridsize Miss : Minimum Mean Maximum : Parameter name
1 : 2004-04-02 00:00:00 100000 65160 0 : -24.800 -0.15758 29.300 : var33
2 : 2004-04-02 00:00:00 85000 65160 0 : -29.200 1.3656 36.900 : var33
3 : 2004-04-02 00:00:00 50000 65160 0 : -31.900 6.6042 60.800 : var33
4 : 2004-04-02 00:00:00 20000 65160 0 : -33.300 15.183 81.100 : var33
5 : 2004-04-02 00:00:00 100000 65160 0 : -21.600 -0.016963 23.900 : var34
6 : 2004-04-02 00:00:00 85000 65160 0 : -24.300 0.13997 30.200 : var34
7 : 2004-04-02 00:00:00 50000 65160 0 : -49.900 0.020106 56.300 : var34
8 : 2004-04-02 00:00:00 20000 65160 0 : -49.900 0.064019 52.200 : var34
9 : 2004-04-02 06:00:00 100000 65160 0 : -27.500 -0.10051 28.000 : var33
10 : 2004-04-02 06:00:00 85000 65160 0 : -31.600 1.2970 36.800 : var33
11 : 2004-04-02 06:00:00 50000 65160 0 : -28.800 7.0169 60.900 : var33
12 : 2004-04-02 06:00:00 20000 65160 0 : -31.200 15.194 79.500 : var33
13 : 2004-04-02 06:00:00 100000 65160 0 : -19.800 -0.019998 21.500 : var34
14 : 2004-04-02 06:00:00 85000 65160 0 : -27.700 0.10693 30.800 : var34
15 : 2004-04-02 06:00:00 50000 65160 0 : -47.400 -0.017706 53.500 : var34
16 : 2004-04-02 06:00:00 20000 65160 0 : -52.100 0.11962 48.800 : var34
17 : 2004-04-02 12:00:00 100000 65160 0 : -27.400 -0.037144 28.900 : var33
18 : 2004-04-02 12:00:00 85000 65160 0 : -31.200 1.4964 36.300 : var33
19 : 2004-04-02 12:00:00 50000 65160 0 : -26.000 7.1112 58.500 : var33
20 : 2004-04-02 12:00:00 20000 65160 0 : -31.900 15.193 78.400 : var33
21 : 2004-04-02 12:00:00 100000 65160 0 : -20.800 0.042563 22.000 : var34
22 : 2004-04-02 12:00:00 85000 65160 0 : -28.700 0.073415 38.000 : var34
23 : 2004-04-02 12:00:00 50000 65160 0 : -44.900 0.099064 54.200 : var34
24 : 2004-04-02 12:00:00 20000 65160 0 : -55.000 0.053691 49.400 : var34
cdo infon: Processed 1563840 values from 2 variables over 3 timesteps ( 0.02s )
Now, obviously I can select a certain time/level/name combo and diff, but I was wondering if you had a slick way to "enforce" an ordering that isn't alphabetical?
Matt
RE: Converting GRIB2 to NC4: Should I expect zero-diff? - Added by Uwe Schulzweida over 11 years ago
The records in the GRIB file are sorted by levels. In netCDF these records are sorted by variable. You can use the undocumented CDO operator sortlevel to compare these files with diff:
cdo diffn -sortlevel sample.nc4 -sortlevel sample.grib
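The effect of the two orderings can be sketched in a few lines of Python. This only models the record order visible in the infon listings above (times, levels, and variable names are taken from that output); it is not CDO's implementation of diff or sortlevel, and `common_key` is a hypothetical helper. Comparing records by position pairs different fields in 18 of the 24 positions, which matches the "18 of 24 records differ more than 0.001" line, and bringing both lists into one common order removes the mismatch.

```python
from itertools import product

times = ["00:00:00", "06:00:00", "12:00:00"]
levels = [100000, 85000, 50000, 20000]
names = ["var33", "var34"]

# Order seen in `cdo infon sample.grib`: within a timestep the level
# changes slowest and the variable changes fastest.
grib_order = [(t, lev, name) for t, lev, name in product(times, levels, names)]

# Order seen in `cdo infon sample.nc4`: within a timestep the variable
# changes slowest and the level changes fastest.
nc_order = [(t, lev, name) for t, name, lev in product(times, names, levels)]

# A positional comparison pairs record i of one file with record i of
# the other, even when they hold different fields.
mismatched = sum(g != n for g, n in zip(grib_order, nc_order))
print(mismatched, "of", len(grib_order), "positions pair different fields")


def common_key(record):
    """Hypothetical sort key: time, then variable, then level (descending)."""
    t, lev, name = record
    return (t, name, -lev)


# After sorting both lists with the same key, positions line up again.
aligned = sorted(grib_order, key=common_key) == sorted(nc_order, key=common_key)
print("aligned after common sort:", aligned)
```

Only the first and last record of each timestep (100000 var33 and 20000 var34) sit at the same position in both files, which is why those rows show differences at the 1e-06 level while the others show differences of tens of units.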