Return numpy array with CDO piping in Python

Added by Oliver Angelil almost 10 years ago

In Python I typically use system commands to generate new netCDF files, and then read them in again. For example:

import os
from netCDF4 import Dataset

string = 'cdo -yearmean -fldmean file.nc file_out.nc'
os.system(string)
f = Dataset('file_out.nc')

Can one do this in Python with the cdo python module?

Ideally I would like to do all this in Python without the intermediate file I/O. I see on this page (https://code.zmaw.de/projects/cdo/wiki/Cdo%7Brbpy%7D) that it's possible to do:
# Compute the field mean value timeseries and return it as a numpy array
vals = cdo.fldmean(input=ifile,returnCdf=True).variables['tsurf'][:]

However, I also want to apply cdo -yearmean, i.e. chain multiple cdo commands (piping).

Thanks,
Oliver


Replies (21)

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 10 years ago

Use this to get a handle to the netCDF file:

cdf = cdo.yearmean(input="-fldmean file.nc",returnCdf=True)

or the following to get 'tsurf' as a numpy array:
tsurf = cdo.yearmean(input="-fldmean file.nc",returnArray='tsurf')

If you want to use cdo pipes, just put them in the input parameter as you would write them on the standard command line.

hth
ralf
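
A minimal sketch of how the input string for a longer chain could be assembled. The helper name pipe_input is hypothetical and not part of the cdo module; it only builds the string that goes into the input parameter, in the same order you would write the operators on the command line.

```python
def pipe_input(operators, infile):
    """Build the `input` string for a chain of CDO operators.

    `operators` is given in command-line order. Each entry is either an
    operator name ("fldmean") or a tuple of name plus arguments
    (("runmean", 5)). This helper is illustrative, not part of cdo.py.
    """
    parts = []
    for op in operators:
        if isinstance(op, str):
            parts.append("-" + op)
        else:
            name, *args = op
            # CDO separates an operator from its arguments with commas.
            parts.append("-" + ",".join([name, *map(str, args)]))
    parts.append(infile)
    return " ".join(parts)


chain = pipe_input(["fldmean"], "file.nc")

# With the cdo package and the CDO binary installed, the string would be
# used like this (not executed here):
# from cdo import Cdo
# cdo = Cdo()
# tsurf = cdo.yearmean(input=chain, returnArray="tsurf")
```

Here chain is the same "-fldmean file.nc" string used in the reply above, so the call is equivalent to cdo -yearmean -fldmean file.nc on the command line.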

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

Thanks, I'll be using the second one. So the method called on "cdo." (e.g. "yearmean" in your example) is simply the first operator I would write in the standard cdo command line usage.

Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 10 years ago

you could set

cdo.debug = True 
to get some insight into the commands that are actually executed.

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

Thanks Ralf,

A follow-up question:

How would one use cdo.showyear to return all years in the netCDF file as a numpy array?

Regards,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

I just found out:

x = cdo.showyear(input = infile)

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 10 years ago

All operators that write to stdout when CDO is run on the command line return a list of strings in Python: https://github.com/Try2Code/cdo-bindings/blob/master/python/test/test_cdo.py#L125

In your context it should look like this:

x = ['1990 1991 1992 1993']

in order to do something with it:

  • as a list of strings
    listOfYears = x[0].split(' ')
  • as a list of integers
    listOfYears = [int(v) for v in x[0].split()]
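
As a small sketch of the conversion above, the function below turns showyear-style output into a list of integers. The name parse_years is hypothetical, and the sample list stands in for real cdo.showyear output. Splitting on any whitespace also tolerates leading or repeated spaces.

```python
def parse_years(showyear_output):
    """Convert a list of whitespace-separated year strings into ints.

    `showyear_output` is assumed to look like ['1990 1991 1992 1993'].
    """
    years = []
    for line in showyear_output:
        years.extend(int(token) for token in line.split())
    return years


x = ["1990 1991 1992 1993"]  # stand-in for cdo.showyear(input=infile)
years = parse_years(x)
```

If a numpy array is needed, as in the original question, numpy.asarray(years) converts the resulting list.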

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

I'm getting a "segmentation error" on one of my netCDF files. When I use the standard command line approach, adding "-L" prevents this error. Is it possible to do this within Python?

Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 10 years ago

I guess this happens because the HDF5 backend of your netCDF installation is not thread-safe. You can pass the flag through the options parameter in Python:

x = cdo.showyear(input = infile,options="-L")

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

I'm trying to do the following:

cdo.runmean,5(input = infile, output=outfile). However Python then throws an error:
"'int' object is not callable"

This is because of the ",5"

How can I do a running mean across 5 timesteps?

Thanks,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 10 years ago

Try

cdo.runmean(5,input = infile, output=outfile)
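
The error message can be reproduced without CDO. Python reads cdo.runmean,5(...) as a tuple whose second element tries to call the integer 5. In the sketch below, runmean is a hypothetical stand-in function that only echoes its arguments; it is not the real binding.

```python
def runmean(*args, **kwargs):
    """Hypothetical stand-in for cdo.runmean; it only echoes its arguments."""
    return args, kwargs


n = 5
try:
    # Same shape as `cdo.runmean,5(...)`: a tuple (runmean, n(...)),
    # so Python tries to call the integer.
    result = runmean, n(input="in.nc", output="out.nc")
except TypeError as err:
    message = str(err)

# The working form passes the operator argument positionally.
args, kwargs = runmean(5, input="in.nc", output="out.nc")
```

With the real module, the positional argument presumably ends up after the operator name, as in runmean,5 on the command line; setting cdo.debug = True is a way to check the command that is actually run.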

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

Thanks. This is an incredibly handy wrapper around an impressively powerful piece of software!

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

Hi Ralf,

How can I apply "SKIP_SAME_TIME=1" for the cdo.mergetime command?

Regards,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 10 years ago

You can use the dict notation for the env key:

cdo.mergetime(input='foo.nc',env={"SKIP_SAME_TIME": 1})

some more examples here

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

Thank you. I believe you made a typo. The line should be:

cdo.mergetime(input='foo.nc',env={"SKIP_SAME_TIME": "1"})

Regards,
Oliver
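
Environment variable values are always strings, which is presumably why the quoted "1" is needed. A minimal sketch of a helper that converts arbitrary settings before they are passed on; the name as_env is hypothetical and not part of the cdo module.

```python
def as_env(settings):
    """Return a copy of `settings` with every key and value as a string."""
    return {str(name): str(value) for name, value in settings.items()}


env = as_env({"SKIP_SAME_TIME": 1})

# With the cdo package installed, the dict would be passed as in the post
# above (not executed here):
# cdo.mergetime(input='foo.nc', env=env)
```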

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 10 years ago

Hi Ralf,

I have another query, thanks for all your help so far, I have found it extremely handy.

The following takes about 20 minutes to run (although I repeat this command with a different mask ~100 times so it takes more than a day to run):

cdo.fldmean(input = '-subc,273.15 -div data.nc -gec,0.5 mask.nc', returnArray='tas',  options = '-L')

Where "data.nc" has dimensions 37800 x 324 x 432 (time, lat, lon), and "mask.nc" consists of values between 0 and 1 (being the fraction of land) and has dimensions 324 x 432.

Is there a trick to speed this up? Is the "splitsel" operator advised for such a task?

Thanks,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 10 years ago

splitsel is a good idea, because fldmean processes each timestep independently. Depending on your I/O speed and the number of variables, splitname or splitcode could be a little easier to use, because you probably don't need to merge anything together at the end (the full timeseries stays in a single file).

The next step is using a ramdisk.

hth
ralf
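
A sketch of one way to cut the time axis into chunks, under the assumption that seltimestep with a first/last range is used in place of splitsel, so that no intermediate files are written. The helper name timestep_ranges and the chunk size of 10000 are illustrative choices.

```python
def timestep_ranges(n_steps, chunk_size):
    """Return 'first/last' ranges (1-based, inclusive) covering n_steps."""
    ranges = []
    for first in range(1, n_steps + 1, chunk_size):
        last = min(first + chunk_size - 1, n_steps)
        ranges.append(f"{first}/{last}")
    return ranges


# 37800 timesteps as in the question above; the chunk size is an assumption.
ranges = timestep_ranges(37800, 10000)

# One input string per chunk, following the operator chain from the question.
inputs = [
    f"-subc,273.15 -div -seltimestep,{r} data.nc -gec,0.5 mask.nc"
    for r in ranges
]

# Each string could then be passed to cdo.fldmean, for example from
# separate worker processes (not executed here):
# cdo.fldmean(input=inputs[0], returnArray='tas', options='-L')
```

Because the chunks are independent, their results can be concatenated in order to rebuild the full timeseries.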

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil over 8 years ago

Hi Ralf,

Do you know how to return multiple files with the "returnArray" argument in Python? For example, the cdo -eoftime operator returns 2 files.

The idea would be something like (which does not work):
x1, x2 = cdo.eoftime(2, input = 'ifile.nc', returnArray='variable_name')

Thanks,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 8 years ago

sorry, I missed your post ;-)
Currently there is no way to handle multiple output files in that way.

You could open a ticket here or on GitHub, because I guess it will take some time. Some changes to CDO itself might be needed to implement that in a consistent way.

anyway - thx for the report ;-)

best wishes
ralf
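
Until then, one possible workaround is the file-based approach from the first post: run the operator on the command line with two explicit output files and open both afterwards. The sketch below only builds the command; the helper name and the output file names are hypothetical.

```python
def eoftime_command(neof, infile, eigenvalue_file, eof_file):
    """Build the command line for `cdo eoftime`, which writes two files."""
    return ["cdo", f"eoftime,{neof}", infile, eigenvalue_file, eof_file]


cmd = eoftime_command(2, "ifile.nc", "eigenvalues.nc", "eofs.nc")

# With CDO and netCDF4 installed, the command could be run and both
# outputs read back (not executed here):
# import subprocess
# from netCDF4 import Dataset
# subprocess.run(cmd, check=True)
# x1 = Dataset("eigenvalues.nc").variables["variable_name"][:]
# x2 = Dataset("eofs.nc").variables["variable_name"][:]
```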

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 8 years ago

I added a ticket myself: https://github.com/Try2Code/cdo-bindings/issues/16

cdo-1.9.2 will provide the infrastructure needed for that feature. It should be released in about two weeks.

cheers
ralf
