File Sizes

Added by Bjorn Stevens about 5 years ago

Is it easy to understand in which flavor of netCDF to write files? I have the same file (zonal mean, time-varying) in two formats, but they differ by more than a factor of ten in size. Consider:

ls /work/mh0492/m219063/DYAMOND/PostProc/ZonalMeanOn0.1degGrid/GEOS-3.3km*
-rw-r--r-- 1 m219063 423986717 Jan 25 22:14 GEOS-3.3km_PRECTOT_0.10deg_zonmean.nc
-rw-r--r-- 1 m219063  28314968 Feb 22 11:08 GEOS-3.3km_PRECTOT_0.10deg_zonmean.nc2

where the latter was created from the former (netCDF4) via

cdo -P 4 -f nc2 copy GEOS-3.3km_PRECTOT_0.10deg_zonmean.nc GEOS-3.3km_PRECTOT_0.10deg_zonmean.nc2

Replies (2)

RE: File Sizes - Added by Ralf Mueller about 5 years ago

netcdf2 and netcdf4 are quite different in their implementation. With your data having dimensions

time = 3925 ;
lon = 1 ;
lat = 1800 ;

you hit an extreme case for netcdf4, I guess. The internal data layout of netcdf4 is obviously not designed for such data. In terms of processing speed, my impression is that netcdf2 is very hard to beat. But there might be cases (i.e. certain combinations of dimension sizes) where netcdf2 shows its weaknesses.

hth
ralf
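
As a rough cross-check of the sizes in the listing, the raw payload implied by these dimensions can be estimated. This is only a sketch: it assumes the variable is stored as 4-byte floats, which the thread does not state, and it ignores headers and coordinate variables.

```python
# Dimensions quoted in the reply above.
ntime, nlon, nlat = 3925, 1, 1800

# Assumption: 4-byte floats (the thread does not state the variable's type).
bytes_per_value = 4

# Raw data payload, ignoring headers and coordinate variables.
raw_bytes = ntime * nlon * nlat * bytes_per_value

# File sizes taken from the ls listing in the original post.
size_nc4_original = 423_986_717
size_nc2 = 28_314_968

print(f"raw payload:        {raw_bytes:,d} bytes")
print(f"nc2 file / payload: {size_nc2 / raw_bytes:.3f}")
print(f"nc4 file / payload: {size_nc4_original / raw_bytes:.1f}")
```

Under that assumption the nc2 file is almost exactly the raw payload, while the original netCDF4 file is roughly fifteen times larger.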

RE: File Sizes - Added by Uwe Schulzweida about 5 years ago

The size of a netCDF4 file depends, among other things, on the chunksize. The chunksize in your netCDF4 file is 1, which leads to large files.
A simple copy of the netCDF4 file with CDO sets the chunksize to the grid size, in this case 1800. This leads to a much smaller file:

-rw-r----- 1 m214003 mh0287 423986717 Feb 22 11:57 GEOS-3.3km_PRECTOT_0.10deg_zonmean.nc
-rw-r----- 1 m214003 mh0287  28314968 Feb 22 11:57 GEOS-3.3km_PRECTOT_0.10deg_zonmean.nc2
-rw-r----- 1 m214003 mh0287  28567261 Feb 22 14:35 GEOS-3.3km_PRECTOT_0.10deg_zonmean.nc4

Unfortunately, I can't reproduce how the chunksize of 1 got into the file. With CDO this is only possible with the option '-k lines'.
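
To illustrate why the chunksize matters so much here, one can count the chunks and divide the file sizes from the listing by the number of stored values. This is a hedged sketch: it assumes a chunksize of 1 means one value per chunk and a chunksize of 1800 means one full latitude row per time step. The helper `n_chunks` is hypothetical, and the calculation does not say what the extra bytes consist of.

```python
# Dimensions of the data variable (lon = 1 is omitted).
ntime, nlat = 3925, 1800
n_values = ntime * nlat


def n_chunks(chunk_len):
    """Chunks needed if each time step is split into pieces of chunk_len values."""
    chunks_per_row = -(-nlat // chunk_len)  # ceiling division
    return ntime * chunks_per_row


# File sizes from the listing above.
size_chunk1 = 423_986_717      # original netCDF4 file, chunksize 1
size_chunk1800 = 28_567_261    # netCDF4 copy written by CDO, chunksize 1800

chunks_1 = n_chunks(1)
chunks_1800 = n_chunks(1800)

print(f"chunksize 1:    {chunks_1:,d} chunks, "
      f"{size_chunk1 / n_values:.1f} bytes per value")
print(f"chunksize 1800: {chunks_1800:,d} chunks, "
      f"{size_chunk1800 / n_values:.2f} bytes per value")
```

With one value per chunk the file spends about 60 bytes on every stored value, whereas with one row per chunk it needs only slightly more than 4 bytes per value.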
