Issue Building CDO 1.6.4 with PGI 14.7

Added by Matt Thompson over 9 years ago

All,

Today I tried to build CDO 1.6.4 with PGI 14.7 and it threw this error:

Making install in src
make[3]: Entering directory `/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/src/cdo/src'
source='Eof3d.c' object='cdo-Eof3d.o' libtool=no \
    DEPDIR=.deps depmode=pgcc /bin/sh ../config/depcomp \
    mpicc -DHAVE_CONFIG_H -I.  -I../libcdi/src -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include -DpgiFortran  -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/zlib    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/szlib    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/jpeg    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/hdf5    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/hdf    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/uuid    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/netcdf    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/udunits2   -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include  -fPIC  -mp  -c -o cdo-Eof3d.o `test -f 'Eof3d.c' || echo './'`Eof3d.c
PGC-F-0155-Illegal context for omp  (Eof3d.c: 486)
PGC/x86-64 Linux 14.7-0: compilation aborted
make[3]: *** [cdo-Eof3d.o] Error 2
make[3]: Leaving directory `/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/src/cdo/src'
make[2]: *** [install-recursive] Error 1
make[2]: Leaving directory `/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/src/cdo'
make[1]: *** [cdo.install] Error 2
make[1]: Leaving directory `/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/src'
make: *** [install] Error 2

Now, first, I'm using mpicc only because netCDF and HDF5 were built with that so I have to build and link with mpicc. That isn't the issue, but I thought I'd head that off at the pass. The version:

$ mpicc -V
pgcc 14.7-0 64-bit target on x86-64 Linux -tp nehalem 
The Portland Group - PGI Compilers and Tools
Copyright (c) 2014, NVIDIA CORPORATION.  All rights reserved.

Looking at the code:

    475           if ( sum > 0 ) {
    476             sum = sqrt(sum);
    477 #if defined(_OPENMP)
    478 #pragma omp parallel for private(i) default(none) \
    479   shared(sum,npack,eigenvec,pack)
    480 #endif
    481             for( i = 0; i < npack; i++ )
    482               eigenvec[pack[i]] /= sum;
    483           }
    484           else
    485 #if defined(_OPENMP)
    486 #pragma omp parallel for private(i) default(none) \
    487   shared(eigenvec,pack,missval,npack)
    488 #endif
    489             for( i = 0; i < npack; i++ )
    490               eigenvec[pack[i]] = missval;
    491         }     /* for ( eofID = 0; eofID < n_eig; eofID++ )     */

There is nothing really wrong here that I see. I was able to "fix" the code and allow PGI to compile by doing this:

    475           if ( sum > 0 ) {
    476             sum = sqrt(sum);
    477 #if defined(_OPENMP)
    478 #pragma omp parallel for private(i) default(none) \
    479   shared(sum,npack,eigenvec,pack)
    480 #endif
    481             for( i = 0; i < npack; i++ )
    482               eigenvec[pack[i]] /= sum;
    483           }
    484           else
    485      {
    486 #if defined(_OPENMP)
    487 #pragma omp parallel for private(i) default(none) \
    488   shared(eigenvec,pack,missval,npack)
    489 #endif
    490             for( i = 0; i < npack; i++ )
    491               eigenvec[pack[i]] = missval;
    492      }
    493         }     /* for ( eofID = 0; eofID < n_eig; eofID++ )     */

where I inserted braces around the "else" branch. Now, I'm mainly a Fortran programmer, but I don't see why that was necessary since the else branch only had one statement. (Note my braces at 485 and 492 are out of line with the indentation because my Vim uses spaces, not tabs.) Perhaps PGI reads the standard a bit differently than I think? Not sure what to do with this, but I wanted you (okay, Uwe) to be aware of this.
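For reference, here is a minimal self-contained sketch of the braced form, independent of the CDO sources. The function name scale_or_fill and its argument list are hypothetical and exist only for illustration; the sqrt step from Eof3d.c is left out so the sketch does not need libm. It is meant as a small reproducer to try against a given compiler, not as the actual CDO fix.

```c
/* Hypothetical helper, not part of CDO: divide the packed entries of
   eigenvec by sum, or fill them with missval when sum is not positive. */
static void scale_or_fill(double *eigenvec, int *pack, int npack,
                          double sum, double missval)
{
  int i;

  if ( sum > 0 )
    {
#if defined(_OPENMP)
#pragma omp parallel for private(i) default(none) \
  shared(sum,npack,eigenvec,pack)
#endif
      for ( i = 0; i < npack; i++ )
        eigenvec[pack[i]] /= sum;
    }
  else
    {
      /* The braces make the pragma and its loop part of a compound
         statement instead of the bare substatement of "else". */
#if defined(_OPENMP)
#pragma omp parallel for private(i) default(none) \
  shared(eigenvec,pack,missval,npack)
#endif
      for ( i = 0; i < npack; i++ )
        eigenvec[pack[i]] = missval;
    }
}
```

Built without OpenMP, the pragmas are skipped by the preprocessor and both branches are plain loops, so the result should be the same with or without -mp.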

After that the compile continued and threw another error:

    mpicc -DHAVE_CONFIG_H -I.  -I../libcdi/src -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include -DpgiFortran  -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/zlib    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/szlib    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/jpeg    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/hdf5    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/hdf    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/uuid    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/netcdf    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include/udunits2   -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/pgfortran_14.7-openmpi_1.8.1/Linux/include  -fPIC  -mp  -c -o cdo-remap_distwgt_scrip.o `test -f 'remap_distwgt_scrip.c' || echo './'`remap_distwgt_scrip.c
PGC-S-0061-Sizeof dimensionless array  required (remap_distwgt_scrip.c: 540)
PGC-S-0061-Sizeof dimensionless array  required (remap_distwgt_scrip.c: 540)
PGC-S-0061-Sizeof dimensionless array  required (remap_distwgt_scrip.c: 553)
PGC/x86-64 Linux 14.7-0: compilation completed with severe errors
make[3]: *** [cdo-remap_distwgt_scrip.o] Error 2
make[3]: Leaving directory `/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/src/cdo/src'
make[2]: *** [install-recursive] Error 1
make[2]: Leaving directory `/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/src/cdo'
make[1]: *** [cdo.install] Error 2
make[1]: Leaving directory `/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/src'
make: *** [install] Error 2

at this code:

    534       /* Find nearest grid points on source grid and distances to each point */
    535       if ( remap_grid_type == REMAP_GRID_TYPE_REG2D )
    536         grid_search_nbr_reg2d(num_neighbors, src_grid, nbr_add, nbr_dist, 
    537                               plat, plon, src_grid->dims,
    538                               coslat_dst, coslon_dst, sinlat_dst, sinlon_dst,
    539                               sinlat, coslat, sinlon, coslon,
    540                               src_grid->reg2d_center_lat, src_grid->reg2d_center_lon);
    541       else
    542         grid_search_nbr(num_neighbors, src_grid, nbr_add, nbr_dist, 
    543                         plat, plon, src_grid->bin_addr,
    544                         coslat_dst, coslon_dst, sinlat_dst, sinlon_dst,
    545                         sinlat, coslat, sinlon, coslon);
    546 
    547       /* Compute weights based on inverse distance if mask is false, eliminate those points */
    548 
    549       dist_tot = 0.;
    550       for ( n = 0; n < num_neighbors; ++n )
    551         {
    552           // printf("dst_add %ld %ld %d %g\n", dst_add, n, nbr_add[n], nbr_dist[n]);
    553           nbr_mask[n] = FALSE;
    554 
    555           /* Uwe Schulzweida: check if nbr_add is valid */

Again, I'm a Fortran coder and in this case, I have no freaking idea what's going on here. Possible compiler bug?

Thanks,
Matt


Replies (4)

RE: Issue Building CDO 1.6.4 with PGI 14.7 - Added by Matt Thompson over 9 years ago

Okay, some additional testing showed that PGI 13.10 threw the same errors as PGI 14.7. For my next attempt I decided to build with --disable-openmp and, yup, everything compiled just fine.

So, what does this mean? For the first error (Eof3d.c), I guess this is a case where PGI doesn't see the scope of the parallel for correctly?

The second one I'm still a bit befuddled by. Since removing -mp worked, one of the OMP pragmas must somehow be interfering. I'm also not too sure that those line numbers (540 and 553) are actually what I think they are. Perhaps there's an interaction with the preprocessor, and the line numbers being reported are shifted a bit from the source? I'm really confused by that 553 if it's right. "nbr_mask[n] = FALSE;" isn't exactly exciting code...and nbr_mask seems to be dimensioned num_neighbors just fine.

Finally, I suppose I'll just build with --disable-openmp for the moment. In truth we don't use many of the compute-bound operators in practice and I can always steer someone to one built with gcc or Intel compilers until this is resolved.

Still, any ideas?

Matt

RE: Issue Building CDO 1.6.4 with PGI 14.7 and GCC 4.9.1 - Added by Matt Thompson over 9 years ago

More to add here. I just tried building CDO 1.6.4 with gcc 4.9.1 (OpenMPI 1.8.1) and it does this:

mpicc -std=gnu99 -DHAVE_CONFIG_H -I.  -I../libcdi/src -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include -DgFortran  -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/zlib    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/szlib    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/jpeg    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/hdf5    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/hdf    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/uuid    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/netcdf    -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include/udunits2   -I/ford1/share/gmao_SIteam/Baselibs/GMAO-Baselibs-4_1_0/x86_64-unknown-linux-gnu/gfortran_4.9.1-openmpi_1.8.1/Linux/include  -fPIC  -fopenmp  -MT cdo-remap_distwgt_scrip.o -MD -MP -MF .deps/cdo-remap_distwgt_scrip.Tpo -c -o 
cdo-remap_distwgt_scrip.o `test -f 'remap_distwgt_scrip.c' || echo './'`remap_distwgt_scrip.c
remap_distwgt_scrip.c: In function ‘grid_search_nbr’:
remap_distwgt_scrip.c:303:14: error: expected ‘;’ before ‘,’ token
   for ( j = 0, i = 0; i < ndist; ++i )
              ^
remap_distwgt_scrip.c:303:23: error: invalid controlling predicate
   for ( j = 0, i = 0; i < ndist; ++i )
                       ^
remap_distwgt_scrip.c:303:34: error: invalid increment expression
   for ( j = 0, i = 0; i < ndist; ++i )
                                  ^

If I remove -fopenmp, then it does compile.
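One possible reading of these diagnostics, offered as an assumption because the lines around 303 are not shown in this thread: "invalid controlling predicate" and "invalid increment expression" are the messages GCC gives for loops attached to an OpenMP loop directive, and OpenMP requires such a loop to be in canonical form, where the init part is a single assignment to the loop variable. A comma expression like "j = 0, i = 0" does not fit that form. The sketch below shows the usual workaround of hoisting the extra initializer out of the for header. The function count_valid and the reduction clause are hypothetical and only illustrate the pattern; they are not taken from remap_distwgt_scrip.c.

```c
/* Hypothetical example, not CDO code: count the non-negative entries of
   dist. The second counter j is initialized before the directive so the
   for header only initializes the loop variable i. */
static int count_valid(const double *dist, int ndist)
{
  int i, j;

  j = 0;  /* hoisted out of "for ( j = 0, i = 0; ... )" */
#if defined(_OPENMP)
#pragma omp parallel for reduction(+:j)
#endif
  for ( i = 0; i < ndist; ++i )
    if ( dist[i] >= 0 ) j++;

  return j;
}
```

Without -fopenmp the pragma is ignored and either spelling of the for header is ordinary C, which would be consistent with the build succeeding once the flag is removed.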

RE: Issue Building CDO 1.6.4 with PGI 14.7 - Added by Jaison-Thomas Ambadan over 9 years ago

just to report ...

remap_distwgt_scrip.c:303:14: error: expected ‘;’ before ‘,’ token
for ( j = 0, i = 0; i < ndist; ++i )

I had exactly the same problem compiling CDO v1.6.4 and v1.6.5rc2 with gcc version 4.9.1 (Debian 4.9.1-4), without Open MPI. Both CDO versions compiled without problems with gcc version 4.8.3 (Debian 4.8.3-7).
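A quick way to compare what the two compilers advertise is the standard _OPENMP macro, which holds the release date (yyyymm) of the OpenMP specification the compiler implements when OpenMP is enabled. The sketch below is only a diagnostic aid; the helper names are hypothetical, and whether CDO actually switches code paths on this value is an assumption, not something stated in this thread.

```c
/* Hypothetical helpers, not part of CDO. */

/* Value of _OPENMP, or 0 when the file is compiled without OpenMP. */
static long openmp_macro(void)
{
#if defined(_OPENMP)
  return (long) _OPENMP;
#else
  return 0L;
#endif
}

/* Map a _OPENMP value to the specification version it corresponds to. */
static const char *openmp_spec_name(long v)
{
  if ( v == 0L )      return "none";
  if ( v >= 201307L ) return "4.0 or newer";
  if ( v >= 201107L ) return "3.1";
  if ( v >= 200805L ) return "3.0";
  return "older than 3.0";
}
```

Calling openmp_spec_name(openmp_macro()) from a small main built once with each compiler and -fopenmp would show whether the two gcc versions report different specification levels.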

RE: Issue Building CDO 1.6.4 with PGI 14.7 - Added by Uwe Schulzweida over 9 years ago

Thanks for this report! I have now fixed all CDO problems with the PGI compiler. A pre-release CDO version is available in the download area.
