python wrapper

Added by Jay Su over 3 years ago

I happened to check out this page:
https://gitlab.dkrz.de/k204210/cdo_cei/-/blob/master/cdo-cei.ipynb
and learnt that we can install CDO via conda while installing the Python CDO bindings via pip.

How exactly? Thanks!


Replies (24)

RE: python wrapper - Added by Karin Meier-Fleischer over 3 years ago

Hi Jay,

as mentioned in the README part of the repository you can download the environment.yml file and use it with conda:

conda env create -f environment.yml

Conda then tells you how to activate the environment after installing the software.

-Karin

RE: python wrapper - Added by Jay Su over 3 years ago

Thank you Karin,

Just to check, this is a third party wrapper, not developed by your group, right?

RE: python wrapper - Added by Karin Meier-Fleischer over 3 years ago

It is from our group :)

RE: python wrapper - Added by Jay Su over 3 years ago

I tried
pip install cdo
and it downloaded cdo-1.5.3.tar.gz

So this wrapper is version 1.5.3, but once the environment is set up it calls the CDO version installed by conda, right?

Thanks,

RE: python wrapper - Added by Jay Su over 3 years ago

Q: So this wrapper is version 1.5.3, but once the environment is set up it calls the CDO version installed by conda, right?
A: Yes.

Another question: how do I chain operators in the Python wrapper? Is there a manual, or are there more examples?

Thanks.

RE: python wrapper - Added by Ralf Mueller over 3 years ago

I think some clarification is needed:

  • in pip there is just one package called cdo. This is the Python wrapper.
  • in conda there is also just one package called cdo, BUT this is the official CDO executable.
  • in conda the package for the Python wrapper for CDO is called python-cdo.

Any Python wrapper will call the CDO binary that comes first in your PATH, no matter whether it comes from conda, spack, your OS or a manual installation.
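
To see which CDO executable a wrapper would pick up, it can help to look at PATH from Python. The following is a minimal sketch using only the standard library; `find_cdo_binary` is a name chosen here for illustration and is not part of the wrapper.

```python
import shutil


def find_cdo_binary(name="cdo"):
    """Return the full path of the first `name` executable on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_cdo_binary()
    if path is None:
        print("no cdo executable found on PATH")
    else:
        print("a wrapper would call:", path)
```

Inside an activated conda environment the printed path should point into that environment; otherwise it shows the system-wide or manually installed binary.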

RE: python wrapper - Added by Jay Su over 3 years ago

Thank you Ralf,

I notice your group is also testing Ruby and Julia wrappers, which will make CDO more popular and convenient!

RE: python wrapper - Added by Ralf Mueller over 3 years ago

The Ruby wrapper is working; it is in fact older than the Python wrapper. You can use it.

The Julia wrapper for CDO is basically a few lines of PyCall code. No extra wrapper is needed because Julia's Python integration is so powerful.

RE: python wrapper - Added by Jay Su about 3 years ago

I am stuck with chaining operators, i.e. https://github.com/Try2Code/cdo-bindings/blob/master/python/test/test_cdo.py#L181

ofile = cdo.setname("veloc", input=" -copy -random,r1x1",options = "-f nc")
self.assertEqual(["veloc"],cdo.showname(input = ofile))

Where is "assertEqual" defined?
How do I execute chained operators?

Thanks,

RE: python wrapper - Added by Jay Su about 3 years ago

Also, after an operation with CDO the result is written to a file on disk. Is it possible not to write a file, but to keep the result as an object in memory for the Python script to use?

RE: python wrapper - Added by Ralf Mueller about 3 years ago

Hi!

Instead of files you can get numpy arrays, masked arrays, xarray DataArrays or xarray Datasets as return values. See here for more

cheers
ralf
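
As far as I know, the wrapper selects the return type through keyword arguments such as `returnArray`, `returnMaArray`, `returnXArray`, `returnXDataset` and `returnCdf`; please check these names against the cdo.py README of the installed version. Below is a sketch under that assumption. `read_field` is a helper name made up for this example, and `in.nc` and `tsurf` are placeholders.

```python
def read_field(cdo, infile, varname, masked=True):
    """Run a CDO call and return one variable as an array instead of a file.

    `cdo` is expected to be a cdo.Cdo() instance. The keyword names
    returnMaArray / returnArray are assumed from the cdo.py README.
    """
    key = "returnMaArray" if masked else "returnArray"
    # No `output=` is passed, so no result file name has to be chosen here;
    # the wrapper is asked to hand back the named variable instead.
    return cdo.copy(input=infile, **{key: varname})


# Usage (needs the cdo package and a CDO binary on PATH):
#   from cdo import Cdo
#   tsurf = read_field(Cdo(), "in.nc", "tsurf")
```

The `input` string may itself contain an operator chain, so the array can be the result of several CDO steps.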

RE: python wrapper - Added by Jay Su about 3 years ago

Thank you Ralf,

Could you also answer my question about the chain operators?

ofile = cdo.setname("veloc", input=" -copy -random,r1x1",options = "-f nc")
self.assertEqual(["veloc"],cdo.showname(input = ofile))

How do we execute chain operators?

What goes into the input and what goes into the options?

RE: python wrapper - Added by Jay Su about 3 years ago

Also

cdo.remap(gridfile,input = ifile, output = ofile)

=>

Error in calling operator remap with:

cdo -O -remap,-remapbil,gridfile ifile ofile<<<

STDOUT:
STDERR:
cdo remap (Abort): Open failed on -remapbil!

"-remap," should not be there, right?

RE: python wrapper - Added by Karin Meier-Fleischer about 3 years ago

Hi Jay,

if you want to use remapbil then do something like:

cdo.remapbil('r360x180', input='-seltimestep,1 -selvar,tsurf '+infile, output=outfile)

or

cdo.remapbil('gridfile.txt', input=infile, output=outfile)

-Karin
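
The pattern in the calls above is that the leftmost operator of the chain becomes the method called on `cdo`, and everything to its right goes into the `input` string. A small sketch of building that string; `chain` is a hypothetical helper, not part of the wrapper, and `in.nc` is a placeholder file name.

```python
def chain(operators, infile):
    """Build the `input` string for a CDO operator chain.

    `operators` are given in command-line order (left to right). CDO applies
    the rightmost one, next to the file name, first.
    """
    return " ".join(["-" + op.lstrip("-") for op in operators] + [infile])


# Same chain as in the first call above: select tsurf, then take the first
# time step; remapbil is applied last and is therefore the method called.
chained_input = chain(["seltimestep,1", "selvar,tsurf"], "in.nc")
print(chained_input)

# With the wrapper installed, this string would be passed on as:
#   cdo.remapbil("r360x180", input=chained_input, output="out.nc")
```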

RE: python wrapper - Added by Jay Su about 3 years ago

Thank you Karin,

If I understand it correctly, for chained operators, the last operator is the function I need to call, e.g. cdo.remapbil in the above case. Right?

It's really inconvenient for scripts to check and pick the right method. Is there a universal command, like "cdo" on the command line?

RE: python wrapper - Added by Ralf Mueller about 3 years ago

Jay Su wrote:

Thank you Karin,

If I understand it correctly, for chained operators, the last operator is the function I need to call, e.g. cdo.remapbil in the above case. Right?

right

It's really inconvenient for scripts to check and pick the right method. Is there a universal command, like "cdo" on the command line?

you can set

cdo.debug = True
to see the generated command lines during runtime

RE: python wrapper - Added by Jay Su about 3 years ago

Thanks Ralf,

I know the commands for the job, but it is annoying to have to tell the script which method to call: it could be cdo.remapbil, cdo.remapnn, etc. Could a future version move this part into the "input" of a universal method such as cdo.run?

Speaking of speed, the Python wrapper is much slower than the command line. Is there any way to speed it up?
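
One way to avoid hard-coding the method name in a script is Python's built-in `getattr`. This is a sketch, assuming the wrapper exposes each operator as an attribute of the `Cdo` object (as the `cdo.remapbil` calls earlier in the thread suggest). `run_operator` is a hypothetical helper, not part of cdo.py.

```python
def run_operator(cdo, name, *params, **kwargs):
    """Call the CDO operator called `name` on a wrapper object.

    Equivalent to cdo.<name>(*params, **kwargs), but `name` can come from
    a variable, a config file or a command-line argument.
    """
    operator = getattr(cdo, name)
    return operator(*params, **kwargs)


# Usage (needs the cdo package and a CDO binary on PATH):
#   from cdo import Cdo
#   cdo = Cdo()
#   for method in ("remapbil", "remapnn"):
#       run_operator(cdo, method, "r360x180", input="in.nc", output=method + ".nc")
```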

RE: python wrapper - Added by Ralf Mueller about 3 years ago

Hi Jay!

The Python wrapper is designed to work with different versions of CDO. On HPC systems there are usually several versions available, not all with the same feature set. Besides, data sets like CMIP5 are published with respect to a certain CDO version.

In general, that is how modules and classes are used in Python. Compare it to numpy: you import numpy, maybe with the shortcut 'np', and then use it as a namespace, as in 'np.ndarray(...)'.

Technically, using objects here gives you thread safety for free with regard to what is done under the hood in the Python module.

In the development version of cdo.py there is something like that: https://github.com/Try2Code/cdo-bindings/blob/master/python/test/test_cdo.py#L188

But the interface is not ready for a release yet.

cheers
ralf

Regarding performance: please provide a script and show how you call it, so that I can run it for comparison.

RE: python wrapper - Added by Pauline Millet about 3 years ago

Hello all,

Since this seems to be a general discussion about the CDO Python bindings, I am joining with some questions:

I am currently working with the CDO Python bindings and was wondering whether there is a way to format the output returned as numpy arrays/xarrays/xdatasets/..., as we can with the -outputtab operator?
For example, when I compute yearly stats, I only save the reference year in the results:

cdo -outputtab,name,year,lon,lat,value -yearsum -gec,30 -select,month=6/9 -sellonlatbox,1.5,1.61,43.43,43.54 tasmaxAdjust.nc>nbsummerhotdays.txt

In Python, the time reference in the output array is the number of days since the given reference date, which is not very convenient.

Also, I read on the GitHub repo that a new version of the package is under construction. Any idea when the release is planned?

Cheers,
Pauline

RE: python wrapper - Added by Ralf Mueller about 3 years ago

Pauline Millet wrote:

Hello all,

Since this seems to be a general discussion about the CDO Python bindings, I am joining with some questions:

I am currently working with the CDO Python bindings and was wondering whether there is a way to format the output returned as numpy arrays/xarrays/xdatasets/..., as we can with the -outputtab operator?
For example, when I compute yearly stats, I only save the reference year in the results:
[...]

In Python, the time reference in the output array is the number of days since the given reference date, which is not very convenient.

I don't fully understand your question: do you want to output arrays/xarrays/xdatasets, or do you want to perform certain operations on these outputs? In other words: what exactly is the output you would like to have?

Also, I read on the GitHub repo that a new version of the package is under construction. Any idea when the release is planned?

No, I cannot say at the moment. You can use the development version from the current master branch if you like. I try to commit only code that passes the tests, but I cannot guarantee that.

The biggest step will be the introduction of chaining at the Python level as the new default way of using cdo.py. In my mind there are still some corner cases that need more attention, but I don't have enough time at the moment.
You can come up with proposals for the syntax if you like. My original plan was to release new versions for Python and Ruby at the same time.

Cheers,
Pauline

best wishes
ralf

RE: python wrapper - Added by Pauline Millet about 3 years ago

Thanks Ralf for your quick answer. I will let you know if I have syntax proposals.

About my first question: I do want to get arrays as output, but I would like to be able to set their content, and in particular the format of the time variable.

Here is the description of one of the outputs I obtained:

>>> myarray.variables

{'time': <class 'netCDF4._netCDF4.Variable'>
float64 time(time)
    standard_name: time
    long_name: time
    bounds: time_bnds
    units: days since 1949-12-01 00:00:00
    calendar: proleptic_gregorian
    axis: T
unlimited dimensions: time
current shape = (46,)
filling on, default _FillValue of 9.969209968386869e+36 used, 'time_bnds': <class 'netCDF4._netCDF4.Variable'>
float64 time_bnds(time, bnds)
unlimited dimensions: time
[...]

As you can notice, the unit of the time variable is 'days since 1949-12-01 00:00:00', and looking at its values:

>>> myarray.variables["time"][:]

masked_array(data=[15051. , 15416.5, 15782. , 16147. , 16512. , 16877.5,
                   17243. , 17608. , 17973. , 18338.5, 18704. , 19069. ,
                   19434. , 19799.5, 20165. , 25643.5, 26009. , 26374. ,
                   26739. , 27104.5, 27470. , 27835. , 28200. , 28565.5,
                   28931. , 29296. , 29661. , 30026.5, 30392. , 30757. ,
                   31122. , 31487.5, 31853. , 32218. , 32583. , 32948.5,
                   33314. , 33679. , 34044. , 34409.5, 34775. , 35140. ,
                   35505. , 35870.5, 36236. , 36601. ],
             mask=False,
       fill_value=1e+20)

Is there a way to choose another time unit for the output? Or is the only solution to post-process my output? I would actually like to keep only the year as the time reference.

I hope that explains a little better,

Pauline

RE: python wrapper - Added by Ralf Mueller about 3 years ago

Ah, OK. I think what you are looking for is an absolute time axis. This is easier to read because the unit looks like this:

time:units = "day as %Y%m%d.%f" ;

2012-12-08T12:00:00 looks like 20121208.5 in this unit. You can always generate an absolute time axis with CDO by adding -a to the options.

xarray offers different functionality when it comes to time/date data: http://xarray.pydata.org/en/stable/time-series.html#. The CF convention, though, is a bit more limited compared to that: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#time-coordinate

CDO sticks to CF, but I think if you choose to output an xarray object instead of a numpy array, you can make use of the date/time features of xarray.

did I get you right?

hth
ralf
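
If only the calendar year is needed, a relative time axis like the one printed earlier can also be decoded after the fact. Here is a minimal sketch with the standard library, assuming the units are `days since 1949-12-01 00:00:00` and the calendar is `proleptic_gregorian` (which is the calendar Python's `datetime` implements). `days_to_years` is a name chosen for this example.

```python
from datetime import datetime, timedelta

# Reference date taken from the `units` attribute of the time variable.
REFERENCE = datetime(1949, 12, 1)


def days_to_years(values, reference=REFERENCE):
    """Convert 'days since <reference>' offsets to calendar years."""
    return [(reference + timedelta(days=float(v))).year for v in values]


# First three time values from the masked array shown earlier in the thread.
years = days_to_years([15051.0, 15416.5, 15782.0])
print(years)
```

For other calendars (for example `360_day` or `noleap`) plain `datetime` arithmetic is not valid; a CF-aware decoder such as the one in xarray would be the safer choice there.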

RE: python wrapper - Added by Pauline Millet about 3 years ago

Yes, thanks! I was not aware of the functionality that xarray DataArrays/Datasets offer for dealing with time data. I'll have a look at what fits my application best.
