CDO self-test crash

Added by simon michnowicz about 4 years ago

Dear Group,
I am installing CDO for a user on our cluster. It crashes when running 'make test' (see below).
Could you please advise me whether this is significant and, if so, how to resolve it?

The code was configured with
$ ./configure --prefix=/usr/local/cdo/1.9.8 --with-hdf5=/usr/local/hdf5/1.10.0-patch1 --with-netcdf=/usr/local/netcdf/4.7.0
and built with the following modules loaded:

1) gcc/4.9.3                 2) openmpi/1.10.3-gcc4-mlx   3) hdf5/1.10.0-patch1        4) netcdf/4.7.0

thanks
Simon Michnowicz

/usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo -fldmin -timmean -select,code=130 ifile29034 thread8_res
cdo(1) timmean: Process started
cdo(2) select: Process started
*** Error in `/usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo': double free or corruption (fasttop): 0x0000000002adfa10 ***
======= Backtrace: =========
    /lib64/libc.so.6(+0x81679)[0x7f5e6e947679]
    /usr/local/hdf5/1.10.0-patch1/lib/libhdf5.so.100(H5MM_xfree+0xe)[0x7f5e6f75780e]
    /usr/local/hdf5/1.10.0-patch1/lib/libhdf5.so.100(+0xd3276)[0x7f5e6f6b3276]
    /usr/local/hdf5/1.10.0-patch1/lib/libhdf5.so.100(H5E_clear_stack+0x67)[0x7f5e6f6b3417]
    /usr/local/hdf5/1.10.0-patch1/lib/libhdf5.so.100(H5Sclose+0x45)[0x7f5e6f7cf525]
    /usr/local/netcdf/4.7.0/lib/libnetcdf.so.15(NC4_get_vars+0xd04)[0x7f5e704b9804]
    /usr/local/netcdf/4.7.0/lib/libnetcdf.so.15(NC4_get_vara+0x48)[0x7f5e704b7f54]
    /usr/local/netcdf/4.7.0/lib/libnetcdf.so.15(NC_get_vara+0xa3)[0x7f5e70459daa]
    /usr/local/netcdf/4.7.0/lib/libnetcdf.so.15(nc_get_vara_float+0x3c)[0x7f5e7045b152]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x7bdc84]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x797418]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x79777d]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x797b38]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x797e43]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x7bb315]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x7bb370]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x671993]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x736acd]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x6f54a8]
    /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo[0x5994fb]
    /lib64/libpthread.so.0(+0x7e65)[0x7f5e6ec9be65]
    /lib64/libc.so.6(clone+0x6d)[0x7f5e6e9c488d]
======= Memory map: ========
    00400000-00a6a000 r-xp 00000000 00:2c 78222080 /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo
    00c6a000-00c78000 r--p 0066a000 00:2c 78222080 /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo
    00c78000-00c7b000 rw-p 00678000 00:2c 78222080 /usr/local/src/CDO/1.9.8/cdo-1.9.8/src/cdo
    00c7b000-00c91000 rw-p 00000000 00:00 0
    02a3d000-02b24000 rw-p 00000000 00:00 0 [heap]
    7f5e5c000000-7f5e5c0ea000 rw-p 00000000 00:00 0
    7f5e5c0ea000-7f5e60000000 ---p 00000000 00:00 0
    7f5e64000000-7f5e64021000 rw-p 00000000 00:00 0
    7f5e64021000-7f5e68000000 ---p 00000000 00:00 0
    7f5e69c82000-7f5e69c83000 ---p 00000000 00:00 0
    7f5e69c83000-7f5e6a483000 rw-p 00000000 00:00 0
    7f5e6a483000-7f5e6a484000 ---p 00000000 00:00 0
    7f5e6a484000-7f5e6ac84000 rw-p 00000000 00:00 0
    7f5e6ac84000-7f5e6ac86000 r-xp 00000000 fd:01 134015 /usr/lib64/libfreebl3.so
    7f5e6ac86000-7f5e6ae85000 ---p 00002000 fd:01 134015 /usr/lib64/libfreebl3.so
    7f5e6ae85000-7f5e6ae86000 r--p 00001000 fd:01 134015 /usr/lib64/libfreebl3.so
    7f5e6ae86000-7f5e6ae87000 rw-p 00002000 fd:01 134015 /usr/lib64/libfreebl3.so
    7f5e6ae87000-7f5e6aee7000 r-xp 00000000 fd:01 135214 /usr/lib64/libpcre.so.1.2.0
    7f5e6aee7000-7f5e6b0e7000 ---p 00060000 fd:01 135214 /usr/lib64/libpcre.so.1.2.0
    7f5e6b0e7000-7f5e6b0e8000 r--p 00060000 fd:01 135214 /usr/lib64/libpcre.so.1.2.0
    7f5e6b0e8000-7f5e6b0e9000 rw-p 00061000 fd:01 135214 /usr/lib64/libpcre.so.1.2.0
    7f5e6b0e9000-7f5e6b0f1000 r-xp 00000000 fd:01 134770 /usr/lib64/libcrypt-2.17.so
    7f5e6b0f1000-7f5e6b2f0000 ---p 00008000 fd:01 134770 /usr/lib64/libcrypt-2.17.so
    7f5e6b2f0000-7f5e6b2f1000 r--p 00007000 fd:01 134770 /usr/lib64/libcrypt-2.17.so
    7f5e6b2f1000-7f5e6b2f2000 rw-p 00008000 fd:01 134770 /usr/lib64/libcrypt-2.17.so
    7f5e6b2f2000-7f5e6b320000 rw-p 00000000 00:00 0
    7f5e6b320000-7f5e6b344000 r-xp 00000000 fd:01 135225 /usr/lib64/libselinux.so.1
    7f5e6b344000-7f5e6b543000 ---p 00024000 fd:01 135225 /usr/lib64/libselinux.so.1
    7f5e6b543000-7f5e6b544000 r--p 00023000 fd:01 135225 /usr/lib64/libselinux.so.1
    7f5e6b544000-7f5e6b545000 rw-p 00024000 fd:01 135225 /usr/lib64/libselinux.so.1
    7f5e6b545000-7f5e6b547000 rw-p 00000000 00:00 0
    7f5e6b547000-7f5e6b563000 r-xp 00000000 fd:01 136425 /usr/lib64/libsasl2.so.3.0.0
    7f5e6b563000-7f5e6b762000 ---p 0001c000 fd:01 136425 /usr/lib64/libsasl2.so.3.0.0
    7f5e6b762000-7f5e6b763000 r--p 0001b000 fd:01 136425 /usr/lib64/libsasl2.so.3.0.0
    7f5e6b763000-7f5e6b764000 rw-p 0001c000 fd:01 136425 /usr/lib64/libsasl2.so.3.0.0
    7f5e6b764000-7f5e6b77a000 r-xp 00000000 fd:01 151260 /usr/lib64/libresolv-2.17.so
    7f5e6b77a000-7f5e6b979000 ---p 00016000 fd:01 151260 /usr/lib64/libresolv-2.17.so
    7f5e6b979000-7f5e6b97a000 r--p 00015000 fd:01 151260 /usr/lib64/libresolv-2.17.so
    7f5e6b97a000-7f5e6b97b000 rw-p 00016000 fd:01 151260 /usr/lib64/libresolv-2.17.so
    7f5e6b97b000-7f5e6b97d000 rw-p 00000000 00:00 0
    7f5e6b97d000-7f5e6b980000 r-xp 00000000 fd:01 135328 /usr/lib64/libkeyutils.so.1.5
    7f5e6b980000-7f5e6bb7f000 ---p 00003000 fd:01 135328 /usr/lib64/libkeyutils.so.1.5
    7f5e6bb7f000-7f5e6bb80000 r--p 00002000 fd:01 135328 /usr/lib64/libkeyutils.so.1.5
    7f5e6bb80000-7f5e6bb81000 rw-p 00003000 fd:01 135328 /usr/lib64/libkeyutils.so.1.5
    7f5e6bb81000-7f5e6bb8f000 r-xp 00000000 fd:01 151267 /usr/lib64/libkrb5support.so.0.1
    7f5e6bb8f000-7f5e6bd8f000 ---p 0000e000 fd:01 151267 /usr/lib64/libkrb5support.so.0.1
    7f5e6bd8f000-7f5e6bd90000 r--p 0000e000 fd:01 151267 /usr/lib64/libkrb5support.so.0.1
    7f5e6bd90000-7f5e6bd91000 rw-p 0000f000 fd:01 151267 /usr/lib64/libkrb5support.so.0.1
    7f5e6bd91000-7f5e6bd98000 r-xp 00000000 fd:01 151261 /usr/lib64/librt-2.17.so
    7f5e6bd98000-7f5e6bf97000 ---p 00007000 fd:01 151261 /usr/lib64/librt-2.17.so
    7f5e6bf97000-7f5e6bf98000 r--p 00006000 fd:01 151261 /usr/lib64/librt-2.17.so
    7f5e6bf98000-7f5e6bf99000 rw-p 00007000 fd:01 151261 /usr/lib64/librt-2.17.so
    7f5e6bf99000-7f5e6c1cf000 r-xp 00000000 fd:01 135431 /usr/lib64/libcrypto.so.1.0.2k
    7f5e6c1cf000-7f5e6c3cf000 ---p 00236000 fd:01 135431 /usr/lib64/libcrypto.so.1.0.2k
    7f5e6c3cf000-7f5e6c3eb000 r--p 00236000 fd:01 135431 /usr/lib64/libcrypto.so.1.0.2k
    7f5e6c3eb000-7f5e6c3f8000 rw-p 00252000 fd:01 135431 /usr/lib64/libcrypto.so.1.0.2k
    7f5e6c3f8000-7f5e6c3fc000 rw-p 00000000 00:00 0
    7f5e6c3fc000-7f5e6c463000 r-xp 00000000 fd:01 135433 /usr/lib64/libssl.so.1.0.2k

Replies (2)

RE: CDO self-test crash - Added by Ralf Mueller about 4 years ago

Hi Simon!

I am pretty sure that your gcc is too old; I would use nothing older than 6.4. On the other hand, the hdf5 library could have been built without thread safety, which can lead to issues like this, too. You can add the option '-L' (lock I/O, i.e. sequential access) to the cdo command line, which fixes those issues.
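As a rough sketch of how to check the thread-safety point: HDF5 normally installs a build summary file named libhdf5.settings under its lib directory, and that summary contains a thread-safety line. The helper name `hdf5_is_threadsafe` is made up for illustration, and the exact wording of the settings line may differ between HDF5 versions, so the pattern below is deliberately loose.

```shell
# Hypothetical helper: succeeds (exit 0) if the given libhdf5.settings
# file reports a thread-safe build, fails otherwise.
# Assumption: the summary has a line such as "Threadsafety: yes" or
# "Threadsafety: no"; the wording can vary with the HDF5 version.
hdf5_is_threadsafe() {
    grep -Eiq 'thread-?safe[a-z]*:[[:space:]]*(yes|on)' "$1" 2>/dev/null
}
```

With the install prefix from the original post, the settings file would presumably be /usr/local/hdf5/1.10.0-patch1/lib/libhdf5.settings (an assumption about the layout). If the check fails, the crashing test command can be retried with I/O locking, e.g. `cdo -L -fldmin -timmean -select,code=130 ifile29034 thread8_res`.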

If you install packages on a cluster, have you thought about using a package manager? There are pre-built packages for Debian, Fedora and other systems. CDO can be installed via conda, too. And if you want to compile things yourself, Spack is a very good option IMO. I would try to avoid handling all the dependencies (and especially the inter-dependencies) of software on my own.
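For orientation, the usual install commands for these routes look roughly as follows. This is only a sketch: the function name `cdo_install_cmd` is hypothetical, it prints the command instead of running it, and the package name `cdo` and the conda-forge channel are stated from memory and should be checked against the respective package index.

```shell
# Hypothetical helper: print (do not run) a typical CDO install command
# for the given package manager.
cdo_install_cmd() {
    case "$1" in
        conda) echo "conda install -c conda-forge cdo" ;;
        spack) echo "spack install cdo" ;;
        apt)   echo "sudo apt-get install cdo" ;;   # Debian and derivatives
        dnf)   echo "sudo dnf install cdo" ;;       # Fedora
        *)     echo "unknown package manager: $1" >&2; return 1 ;;
    esac
}
```

For example, `cdo_install_cmd spack` prints the Spack command, which builds CDO together with matching HDF5 and netCDF versions rather than relying on hand-built modules.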

cheers
ralf

RE: CDO self-test crash - Added by simon michnowicz about 4 years ago

Ralf,
Thanks for the hints. I will look at making a Singularity container of this for our system.
Regards,
Simon
