Compiling CDO for parallel computing

Added by Clement Tisseuil almost 14 years ago

Dear all,

I am using supercomputers to perform parallel computing. I am really not an expert on that, but I wondered whether there is a way to explicitly compile CDO for parallel computing (maybe by linking CDO to MPT or MPI libraries...).

Thanks in advance

Cheers

Clement Tisseuil (Ph.D)
UMR BOREA 7208 « Biologie des Organismes et Ecosystèmes Aquatiques »
(CNRS-MNHN-UPMC-IRD), Equipe « Biodiversité et Macroécologie »,
43 rue Cuvier, 75005 Paris, France


Replies (20)

RE: Compiling CDO for parallel computing - Added by Ralf Mueller almost 14 years ago

Hi Clement!

CDO uses OpenMP for parallelism, so you can pass an additional CFLAGS setting during configuration. Watch out for the correct flag for your compiler. For GCC, for example, it is CFLAGS='-fopenmp'.

regards
ralf

RE: Compiling CDO for parallel computing - Added by Clement Tisseuil almost 14 years ago

Hi Ralf!

Good news for the possibility of parallel computing.

I am new to parallelism, so maybe we can look at the following example. Let us say I have 20 available processors. If I run a CDO command (e.g. mpirun cdo mergetime file1.nc file2.nc file.nc), is CDO going to use all the available processors "by itself", or should I write some specific command lines? Maybe you have an example using parallelism with CDO?

Thanks in advance.

Regards

Clement Tisseuil (Ph.D)
UMR BOREA 7208 « Biologie des Organismes et Ecosystèmes Aquatiques »
(CNRS-MNHN-UPMC-IRD), Equipe « Biodiversité et Macroécologie »,
43 rue Cuvier, 75005 Paris, France

RE: Compiling CDO for parallel computing - Added by Ralf Mueller almost 14 years ago

To use cdo with multiple OpenMP threads, you have to set the number of threads with the '-P' option. cdo does not use MPI, so I guess you don't have to start cdo with 'mpirun'. If it's possible for you, check the CPU usage when running cdo with multiple threads (e.g. with top).

regards
ralf
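A minimal sketch of the '-P' usage described above. The helper name cdo_parallel, the placeholder file names, and the choice of remapbil as an OpenMP-enabled operator are assumptions for illustration and are not taken from this thread.

```shell
# Hypothetical helper: request N OpenMP threads through cdo's -P option.
# Usage: cdo_parallel N operator[,args] infile outfile
cdo_parallel() {
    nthreads=$1
    shift
    cdo -P "$nthreads" "$@"
}

# Example call (in.nc and out.nc are placeholder file names):
# cdo_parallel 4 remapbil,r360x180 in.nc out.nc
```

While such a command runs, top would typically report more than 100% CPU for the cdo process if the chosen operator is actually parallelized.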

RE: Compiling CDO for parallel computing - Added by Uwe Schulzweida almost 14 years ago

Hi Clement,

Not all of the CDO operators are parallelized with OpenMP. Here is a list of all CDO operators with OpenMP support.

Regards,
Uwe

RE: Compiling CDO for parallel computing - Added by George Miloshevich over 3 years ago

Hello all,

I am new to CDO and was trying to figure out the following from the documentation. The issue is that the administrator of our network is complaining that the command I am executing on one of their machines runs in parallel on multiple cores. Indeed, that seems to be the case when I look at the task manager. The command in question is the following one:

cdo -select,name=tas source/tc.tc /destination/tc.tc

I don't understand why it would run in parallel, since the documentation doesn't say anything about its parallelisation.

Regards,
George

RE: Compiling CDO for parallel computing - Added by Ralf Mueller over 3 years ago

hi George!
can you upload the input file for further investigation?

although this looks more like a bot ...

RE: Compiling CDO for parallel computing - Added by George Miloshevich over 3 years ago

Hi Ralf,

Here's the file. I was extracting with the command
cdo -select,name=tas /.../CONTROL_BATCH0001_misc.05.0020.nc /local/.../CONTROL_BATCH0001_tas.05.0020.nc
There is a command "htop" on the machine I am running this on, according to which, for a brief period of time (about 1 to 30 sec), there are 24 copies of the process running simultaneously (24 being the number of threads on the machine). The administrator's complaint is that the processes run concurrently and slow down the performance of the entire remote storage, which is also used by other independent scientific teams. Could it be that cdo is somehow configured improperly on that machine?

The reason I am doing preprocessing prior to copying the useful data to my workspace is because we are also discouraged from copying large amounts of data directly from that remote storage. I hope I have explained the situation clearly enough, don't hesitate if you have further questions.

Kind Regards,
George
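htop lists the threads of a process as separate rows by default, so 24 rows with the same command line may be 24 threads of one cdo process rather than 24 processes. A minimal sketch for checking this on Linux follows. It assumes a /proc filesystem, and the helper name thread_count is hypothetical.

```shell
# Number of threads of a PID, read from the Linux /proc filesystem.
thread_count() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Example: thread count of the current shell.
thread_count "$$"

# While cdo is running in another terminal, its PID(s) can be looked up with pgrep:
# for pid in $(pgrep -x cdo); do echo "$pid: $(thread_count "$pid") threads"; done
```

One PID reporting many threads would point to multi-threading inside a single process, not to several independent copies of the command.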

RE: Compiling CDO for parallel computing - Added by Ralf Mueller over 3 years ago

thx for proving me wrong about being a bot, George.

I ran this, but it's so fast that I cannot reproduce this behaviour

$ cdo -select,name=tas CONTROL_BATCH0001_misc.05.0020.nc tas.nc
cdo    select: Processed 1966080 values from 11 variables over 240 timesteps [0.09s 43MB].
cdo -select,name=tas CONTROL_BATCH0001_misc.05.0020.nc tas.nc  0.06s user 0.04s system 58% cpu 0.170 total

RE: Compiling CDO for parallel computing - Added by Ralf Mueller over 3 years ago

your paths look a bit strange

 /.../CONTROL_BATCH0001_misc.05.0020.nc /local/.../CONTROL_BATCH0001_tas.05.0020.nc
Seems like you are working in the root directory.

CDO uses standard POSIX IO, for netcdf it is handled by the netcdf library. could you upload the output of

cdo -V
and tell me something about the remote file system. I can run CDO on a lustre system with your input without problems.

cheers
ralf

RE: Compiling CDO for parallel computing - Added by George Miloshevich over 3 years ago

So the output of cdo -V looks like this:

Climate Data Operators version 1.9.6 (http://mpimet.mpg.de/cdo)
System: x86_64-pc-linux-gnu
CXX Compiler: g++ -g -O2 -fdebug-prefix-map=/build/cdo-1.9.6=. -fstack-protector-strong -Wformat -Werror=format-security -fopenmp
CXX version : g++ (Debian 8.2.0-15) 8.2.0
C Compiler: gcc -g -O2 -fdebug-prefix-map=/build/cdo-1.9.6=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -pedantic -fPIC -fopenmp
C version : gcc (Debian 8.2.0-15) 8.2.0
F77 Compiler: f77 -g -O2 -fdebug-prefix-map=/build/cdo-1.9.6=. -fstack-protector-strong
F77 version : unknown
Features: 23GB 24threads C++14 Fortran DATA PTHREADS OpenMP45 HDF5 NC4/HDF5/threadsafe OPeNDAP SZ UDUNITS2 PROJ.4 MAGICS CURL FFTW3 SSE2
Libraries: HDF5/1.10.4 proj/5.2 curl/7.64.0(h7.63.0)
Filetypes: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5
CDI library version : 1.9.6
ecCodes library version : 2.12.0
NetCDF library version : 4.6.2 of Nov 20 2018 06:04:35 $
hdf5 library version : library undefined
exse library version : 1.4.1
FILE library version : 1.8.3
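The "Features:" line in the output above is where OpenMP and thread-safe HDF5 support show up (here as OpenMP45 and NC4/HDF5/threadsafe). A small sketch for checking this automatically follows. The helper name has_cdo_feature is hypothetical, and merging stderr into stdout is a precaution, not a documented requirement.

```shell
# Succeeds if the "Features:" line of `cdo -V` style text on stdin
# contains the given fixed string.
has_cdo_feature() {
    grep '^Features:' | grep -qF -- "$1"
}

# Only query cdo when it is installed.
if command -v cdo >/dev/null 2>&1; then
    for feature in OpenMP threadsafe; do
        if cdo -V 2>&1 | has_cdo_feature "$feature"; then
            echo "$feature: yes"
        else
            echo "$feature: no"
        fi
    done
fi
```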

RE: Compiling CDO for parallel computing - Added by George Miloshevich over 3 years ago

Concerning the paths,

Well, we have a space called /distonet, which is where the data are stored. It is a storage space distributed across multiple servers. /local is more or less the hard drive of a machine.
So the command I mentioned starts by referencing this distonet and transferring the output to the local storage.

cdo -select,name=tas /distonet/.../CONTROL_BATCH0001_misc.05.0020.nc /local/.../CONTROL_BATCH0001_tas.05.0020.nc

But I suppose the detailed address is not relevant. We are also advised against executing commands which involve reading from and writing to the same volume simultaneously. That's why I am not performing

cdo -select,name=tas /distonet/.../CONTROL_BATCH0001_misc.05.0020.nc /distonet/.../CONTROL_BATCH0001_tas.05.0020.nc

RE: Compiling CDO for parallel computing - Added by Ralf Mueller over 3 years ago

but with

.../
it doesn't make sense to me. What is this?

RE: Compiling CDO for parallel computing - Added by George Miloshevich over 3 years ago

Well the full address is

cdo -select,name=tas /distonet/projects/ClimateLearningFR/Plasim/misc/CONTROL_BATCH0001_misc.05/CONTROL_BATCH0001_misc.05.0020.nc /local/gmiloshe/CONTROL_BATCH0001_tas.05/CONTROL_BATCH0001_tas.05.0020.nc

RE: Compiling CDO for parallel computing - Added by Ralf Mueller over 3 years ago

ah ok - my bad then ;-)

I see no reason why CDO should create multiple processes for this operation. can you retry this with the current 1.9.9 release?

RE: Compiling CDO for parallel computing - Added by George Miloshevich over 3 years ago

You mean to update the CDO version that is installed?

G

RE: Compiling CDO for parallel computing - Added by Ralf Mueller over 3 years ago

yes, could be that a strange bug causes this. I don't know - never observed something like this before. it's just a guess. and maybe not even a smart one ;-)

RE: Compiling CDO for parallel computing - Added by George Miloshevich over 3 years ago

Ok. I'll ask our admin to do that. Meanwhile I read the following sentences in the manual:
" CDO is a multi-threaded application. Therefore all the above libraries should be compiled thread safe. Using non-threadsafe libraries could cause unexpected errors!"
Is it possible that some libraries displayed with the "cdo -V" command are "non-threadsafe", which caused this error ?

RE: Compiling CDO for parallel computing - Added by Ralf Mueller over 3 years ago

sure. The most likely candidate is the hdf5 library, on which the netCDF-4 file format of the netcdf library is based. For that case you can use the '-L' option. This should not be a problem in your call, since it is not multi-threaded.

cdo -V indicates that your hdf5 installation in fact IS thread-safe.

cheers
ralf

RE: Compiling CDO for parallel computing - Added by George Miloshevich over 3 years ago

Hi,

We found a simple, brute force solution that solves the problem, although we still do not know the origin of the issue.
We just set the environment variable OMP_NUM_THREADS to 1 with the command export OMP_NUM_THREADS=1. It seems to force OpenMP to use only one thread.
Our cluster admin didn't update cdo, so we don't know if it may solve the problem.

Cheers,
George
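The export above affects every later command in the same shell session. If only individual commands should be limited, a per-command variant can be sketched as below. The helper name single_threaded and the placeholder file names are assumptions.

```shell
# Run one command with OMP_NUM_THREADS=1 without changing the
# environment of the calling shell.
single_threaded() {
    env OMP_NUM_THREADS=1 "$@"
}

# Example call (in.nc and out.nc are placeholder file names):
# single_threaded cdo -select,name=tas in.nc out.nc
```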

RE: Compiling CDO for parallel computing - Added by Ralf Mueller over 3 years ago

Using OMP_NUM_THREADS is a normal way to set the number of OpenMP threads in an application (it is part of the standard). 1 is the default.

As long as it works :-)

happy new year!
