Cannot allocate memory

Added by Russell Glazer over 10 years ago

Hello,

I am trying to convert a NetCDF file to GRIB1, but only for one variable in the NetCDF file.

Here is the command with the error:
cdo -f grb copy -selname,analysed_sst -setcode,11 -setltype,1 20100912-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc G1SST-2010-09-12.nc

cdo copy: Started child process "selname,analysed_sst -setcode,11 -setltype,1 20100912-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc (pipe1.1)".
cdo(2) selname: Started child process "setcode,11 -setltype,1 20100912-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc (pipe2.1)".
cdo(3) setcode: Started child process "setltype,1 20100912-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc (pipe3.1)".

Error (Set) : Allocation of 4608000000 bytes failed. [ line 181 file Set.c ]
System error message : Cannot allocate memory

Error (Selvar) : Allocation of 4608000000 bytes failed. [ line 410 file Selvar.c ]
System error message : Cannot allocate memory
Segmentation fault (core dumped)

The NetCDF file I am trying to convert is about 3 GB, but I am on a large server system, so I don't think memory should be an issue. Is there a way to increase the amount of memory allocated for cdo?

Here is my cdo -V

Climate Data Operators version 1.6.1 (http://code.zmaw.de/projects/cdo)
Compiler: gcc -std=gnu99 -O2 -m64 -mtune=generic -fPIC -pthread
version: gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)
Compiled: by paulvdm on build (x86_64-unknown-linux-gnu) Sep 5 2013 16:39:30
with: PTHREADS NC4 OPeNDAP Z UDUNITS2 PROJ.4 XML2
filetype: srv ext ieg grb grb2 nc nc2 nc4 nc4c
CDI library version : 1.6.1 of Sep 5 2013 16:39:24
CGRIBEX library version : 1.6.1 of Jun 27 2013 15:38:33
GRIB_API library version : 1.11.0
netCDF library version : 4.1.3 of Nov 20 2012 11:37:38 $
HDF5 library version : 1.8.7
SERVICE library version : 1.3.1 of Sep 5 2013 16:39:22
EXTRA library version : 1.3.1 of Sep 5 2013 16:39:22
IEG library version : 1.3.1 of Sep 5 2013 16:39:22
FILE library version : 1.8.2 of Sep 5 2013 16:39:22


Replies (9)

RE: Cannot allocate memory - Added by Uwe Schulzweida over 10 years ago

You can reduce the amount of allocated memory by reducing the number of chained CDO operators:

cdo        selname,analysed_sst file1.nc file2.nc
cdo -f grb setcode,11  file2.nc file2.grb
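
If you also need the level type from your original chain, it can be set in the same second call (a sketch reusing the placeholder file names from above):

cdo -f grb setltype,1 -setcode,11 file2.nc file2.grb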

RE: Cannot allocate memory - Added by Russell Glazer over 10 years ago

Even running just the first operator gave me the same error:

cdo selname,analysed_sst 20100912-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc G1SST-2010-09-12.nc

Error (cdf_write_var_data) : Allocation of 4608000000 bytes failed. [ line 3311 file stream_cdf.c ]
System error message : Cannot allocate memory

RE: Cannot allocate memory - Added by Uwe Schulzweida over 10 years ago

The number of allocated bytes is the horizontal grid size multiplied by 8.
Does the horizontal grid really have 576000000 cells? You can check it with 'cdo sinfo'.
Maybe you are not allowed to allocate 4.3 GB of memory. You can check this with the command 'limit'
in the csh.
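
For example, to check both (a sketch; 'ulimit -a' is the bash equivalent of the csh 'limit'):

cdo sinfo 20100912-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc
limit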

RE: Cannot allocate memory - Added by Russell Glazer over 10 years ago

The horizontal grid does indeed have 576000000 grid cells; the dataset resolution is 1 km and it is global (except for 10 degrees at the poles, I think), so this makes sense.

my limits are:
cputime 1:00:00
filesize unlimited
datasize unlimited
stacksize 10240 kbytes
coredumpsize 0 kbytes
memoryuse unlimited
vmemoryuse 6000000 kbytes
descriptors 1024
memorylocked 6000000 kbytes
maxproc 200

So memoryuse is unlimited and vmemoryuse is 6 GB. I don't see what the problem is. Is the data file just too large for these cdo operators?

RE: Cannot allocate memory - Added by Uwe Schulzweida over 10 years ago

Unfortunately, I also don't see what the problem is. The minimum memory requirement for this grid is 4.3 GB. It seems that the system can't allocate one contiguous block of this size. CDO has no problem processing such a large data file.
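
If the hard limit allows it, one thing to try is raising the virtual memory limit for the session before calling cdo (a sketch in csh, using the vmemoryuse resource from your listing; 'ulimit -v unlimited' is the bash equivalent):

limit vmemoryuse unlimited
cdo selname,analysed_sst 20100912-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc G1SST-2010-09-12.nc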

RE: Cannot allocate memory - Added by Alexander Robinson over 3 years ago

Hello,

I am also seeing a memory exhaustion issue while trying to conservatively interpolate a very high resolution dataset (dx=150 m with 10218x18346=187459428 grid points) down to a relatively low resolution (dx=16 km with 106x181=19186 grid points).

The command I run is the following:

cdo remapcon,cdogrid_GRL-16KM.txt -setgrid,cdogrid_TOPO-M17.txt BedMachineGreenland-2017-09-20.nc outfile.nc

I have uploaded the grid definition files here. The dataset itself is ~2 GB, publicly available here:

I have been using the following cdo version:

cdo -V
Climate Data Operators version 1.9.6 (http://mpimet.mpg.de/cdo)
System: x86_64-pc-linux-gnu
CXX Compiler: g++  -march=native -std=c++14 -fopenmp 
CXX version : g++ (GCC) 7.3.0
C Compiler: gcc -g -O2 -fopenmp  
C version : gcc (GCC) 7.3.0
F77 Compiler: gfortran -g -O2
F77 version : GNU Fortran (GCC) 7.3.0
Features: 249GB 48threads C++14 Fortran DATA PTHREADS OpenMP45 HDF5 NC4/HDF5/threadsafe OPeNDAP SZ UDUNITS2 PROJ.4 CURL AVX2
Libraries: HDF5/1.10.4 proj/5.2 curl/7.52.1
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5 
     CDI library version : 1.9.6
 cgribex library version : 1.9.2
 ecCodes library version : 2.1.0
  NetCDF library version : 4.6.2 of Feb 19 2019 13:41:23 $
    hdf5 library version : 1.10.4 threadsafe
    exse library version : 1.4.1
    FILE library version : 1.8.3

However, our cluster support also tried some tests with v1.9.8, without success. He passed on the following information: "I've done a few test runs and monitored the memory use on the nodes. It looks like cdo is exhausting all available memory (64 GB on Haswell and 128 GB on Broadwell), so fast by the end of the run that neither Slurm nor the operating system is able to log the failure before cdo segfaults."

I would be grateful if you have any idea of what I could be doing wrong. I have found that if I first run 'cdo samplegrid,3 ...' on the original dataset, then I can perform the interpolation without problems, but it would be ideal to be able to work with the original dataset directly.
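
Concretely, that workaround looks like this (a sketch; 'subsampled.nc' is a placeholder name, with the setgrid step folded into the first call):

cdo samplegrid,3 -setgrid,cdogrid_TOPO-M17.txt BedMachineGreenland-2017-09-20.nc subsampled.nc
cdo remapcon,cdogrid_GRL-16KM.txt subsampled.nc outfile.nc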

Many thanks!
Alex

RE: Cannot allocate memory - Added by Ralf Mueller over 3 years ago

I think this is pretty normal: you have roughly 200 million grid points. Conservative remapping needs the cell bounds, which means (in the case of square cells) an additional 800 million values. This alone means 7.5 GB of memory. Now consider the whole grid-weight computation on top of that.
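
Spelled out: 187459428 cell centers + 4 x 187459428 corner values = 937297140 values, and 937297140 x 8 bytes is roughly 7.5 GB before any of the remapping weights are computed.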

I ran this on a node with 1TB of memory:

cdo -P 32 -f nc -remapcon,cdogrid_GRL-16KM.txt -topo,cdogrid_TOPO-M17.txt t.nc
cdo(1) topo: Process started
cdo    remapcon: YAC first order conservative weights from curvilinear (10218x18346) to projection (106x181) grid
cdo(1) topo:         
cdo    remapcon: Processed 187459428 values from 1 variable over 1 timestep [1265.16s 104GB].

I did this with cdo-1.9.8. So I think you might retry this on the 128 GB Broadwell node with cdo-1.9.8.
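
If you go through the batch system, it also helps to request the memory explicitly so the job lands on a large enough node (a sketch, assuming Slurm and a 'broadwell' feature tag defined on your cluster):

sbatch --mem=120G --constraint=broadwell --wrap \
  "cdo remapcon,cdogrid_GRL-16KM.txt -setgrid,cdogrid_TOPO-M17.txt BedMachineGreenland-2017-09-20.nc outfile.nc"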

hth
ralf

RE: Cannot allocate memory - Added by Alexander Robinson over 3 years ago

Great, thanks for the tip. Indeed it worked for us on the 128GB Broadwell node, even with cdo-1.9.6.

Thanks for the help!

Alex
