Return numpy array with CDO piping in Python

Added by Oliver Angelil almost 8 years ago

In Python I typically use system commands to generate new netCDF files, and then read them in again. For example:

import os
from netCDF4 import Dataset

string = 'cdo -yearmean -fldmean file.nc file_out.nc'
os.system(string)
f = Dataset('file_out.nc')

Can one do this in Python with the cdo python module?

Ideally I would like to do all of this in Python without the intermediate file I/O. According to https://code.zmaw.de/projects/cdo/wiki/Cdo%7Brbpy%7D it's possible to do:
# Compute the field mean value timeseries and return it as a numpy array
vals = cdo.fldmean(input=ifile,returnCdf=True).variables['tsurf'][:]

However, I also want to apply cdo -yearmean, i.e. chain multiple cdo operators (piping).

Thanks,
Oliver


Replies (21)

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 8 years ago

use this for a handle to the netcdf file

cdf = cdo.yearmean(input="-fldmean file.nc",returnCdf=True)

or the following for getting 'tsurf' as numpy array
tsurf = cdo.yearmean(input="-fldmean file.nc",returnArray='tsurf')

If you want to use the cdo pipes, just put them in the input parameter as you would write them on the standard command line.
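
As a rough illustration of that mapping (a sketch only: split_cdo_chain is a hypothetical helper, not part of the cdo module, and it assumes a single output file and no global options such as -L), the first operator becomes the method name, any comma-separated operator arguments become positional arguments, and the rest of the chain goes into input:

```python
import shlex


def split_cdo_chain(command_line):
    """Split a CDO command line into (operator, args, input string, output).

    Hypothetical helper for illustration; not part of the cdo module.
    Assumes the command starts with 'cdo', has no global options,
    and ends with exactly one output file.
    """
    tokens = shlex.split(command_line)
    if len(tokens) < 4 or tokens[0] != "cdo":
        raise ValueError("expected: cdo -operator ... infile outfile")
    first, chain, output = tokens[1], tokens[2:-1], tokens[-1]
    # '-runmean,5' -> name 'runmean', args ['5']
    name, *args = first.lstrip("-").split(",")
    return name, args, " ".join(chain), output


name, args, input_string, output = split_cdo_chain(
    "cdo -yearmean -fldmean file.nc file_out.nc"
)
print(name)          # yearmean
print(input_string)  # -fldmean file.nc
print(output)        # file_out.nc
```

With these parts, the command line from the first post corresponds to cdo.yearmean(input='-fldmean file.nc', output='file_out.nc').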

hth
ralf

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

Thanks, I'll be using the second one. So the first command after "cdo." (e.g. "yearmean" in your example) is basically the first command I usually use in the standard cdo command line usage.

Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 8 years ago

you could set

cdo.debug = True 
to get some insight into the commands that are actually executed.

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

Thanks Ralf,

A follow-up question:

How would one use cdo.showyear to return all years in the netCDF file as a numpy array?

Regards,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

I just found out:

x = cdo.showyear(input = infile)

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 8 years ago

All operators that write to stdout on the CDO command line return a list of strings in Python: https://github.com/Try2Code/cdo-bindings/blob/master/python/test/test_cdo.py#L125

In your context it should look like this

x = ['1990 1991 1992 1993']

in order to do something with it:

  • as a list of strings
    listOfYears = x[0].split(' ')
  • as a list of integers
    listOfYears = [int(v) for v in x[0].split(' ')]
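
A slightly more defensive variant of the conversion above, as a sketch: parse_years is a hypothetical helper name, and it assumes the result may contain several strings or irregular whitespace, which is not confirmed by the thread.

```python
def parse_years(lines):
    """Flatten a list of whitespace-separated year strings into ints.

    Hypothetical helper for illustration; 'lines' is assumed to have the
    shape returned by cdo.showyear, e.g. ['1990 1991 1992 1993'].
    """
    return [int(token) for line in lines for token in line.split()]


x = ['1990 1991 1992 1993']  # shape of the result shown above
years = parse_years(x)
print(years)  # [1990, 1991, 1992, 1993]
```

If numpy is installed, numpy.array(years) then gives the numpy array asked for in the question.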

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

I'm getting a "segmentation error" on one of my netCDF files. When I use the standard command line approach, adding "-L" prevents this error. Is it possible to do this within Python?

Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 8 years ago

I guess this happens because of a non-thread-safe installation of the HDF5 backend for netCDF. You can pass -L through the options parameter within Python:

x = cdo.showyear(input = infile,options="-L")

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

I'm trying to do the following:

cdo.runmean,5(input = infile, output=outfile). However Python then throws an error:
"'int' object is not callable"

This is because of the ",5"

How can I do a running mean across 5 timesteps?

Thanks,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 8 years ago

Try

cdo.runmean(5,input = infile, output=outfile)
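
For background on the error message, here is a small sketch using only Python's standard ast module (it assumes Python 3.8 or newer). It shows that Python reads the original line as a tuple whose second element tries to call the integer 5, which is why the operator argument has to be passed as a normal function argument instead:

```python
import ast

# The line from the question, parsed but not executed.
source = "cdo.runmean,5(input=infile, output=outfile)"
expr = ast.parse(source, mode="eval").body

# The comma makes this a tuple: (cdo.runmean, 5(...))
print(type(expr).__name__)    # Tuple

first, second = expr.elts
print(first.attr)             # runmean
print(type(second).__name__)  # Call
print(second.func.value)      # 5, the object Python tries to call
```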

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

Thanks. This is an incredibly handy wrapper around an impressively powerful piece of software!

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

Hi Ralf,

How can I apply "SKIP_SAME_TIME=1" for the cdo.mergetime command?

Regards,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 8 years ago

You can use the dict notation for the env key:

cdo.mergetime(input='foo.nc',env={"SKIP_SAME_TIME": 1})

some more examples here

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

Thank you. I believe you made a typo. The line should be:

cdo.mergetime(input='foo.nc',env={"SKIP_SAME_TIME": "1"})
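
The quoted "1" is presumably needed because environment variable values have to be strings. How the cdo module handles other types internally is not confirmed here. A minimal sketch with a hypothetical helper (stringify_env is not part of the cdo module) that converts the values before they are passed on:

```python
import os


def stringify_env(settings):
    """Return a copy of 'settings' with all keys and values as str.

    Hypothetical helper for illustration only.
    """
    return {str(key): str(value) for key, value in settings.items()}


env = stringify_env({"SKIP_SAME_TIME": 1})
print(env)  # {'SKIP_SAME_TIME': '1'}

# Python's own os.environ also refuses non-string values:
try:
    os.environ["SKIP_SAME_TIME"] = 1
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

The resulting dict can be passed as the env argument in the call above.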

Regards,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 8 years ago

Hi Ralf,

I have another query. Thanks for all your help so far; I have found it extremely handy.

The following takes about 20 minutes to run (although I repeat this command with a different mask ~100 times so it takes more than a day to run):

cdo.fldmean(input='-subc,273.15 -div data.nc -gec,0.5 mask.nc', returnArray='tas', options='-L')

Where "data.nc" has dimensions 37800 x 324 x 432 (time, lat, lon), and "mask.nc" consists of values between 0 and 1 (being the fraction of land) and has dimensions 324 x 432.

Is there a trick to speed this up? Is the "splitsel" operator advised for such a task?

Thanks,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 8 years ago

splitsel is a good idea, because fldmean is time independent. Depending on your IO speed and the number of variables, splitname or splitcode could be a little bit easier to use, because you probably don't need to merge anything together at the end (full timeseries in a single file).

The next step is using a ramdisk.
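
Because fldmean treats each timestep independently, the time axis can be processed in chunks. The sketch below only builds one input string per chunk for the command from the question. It uses -seltimestep ranges rather than splitsel output files, which is a swapped-in variant; the chunk size of 5000 and both helper names are assumptions, and whether this is actually faster depends on the I/O and was not measured.

```python
def chunk_ranges(n_steps, chunk_size):
    """Yield 1-based, inclusive (first, last) timestep ranges."""
    for first in range(1, n_steps + 1, chunk_size):
        yield first, min(first + chunk_size - 1, n_steps)


def chunk_inputs(n_steps, chunk_size):
    """Build one CDO operator chain per chunk (hypothetical helper)."""
    template = "-subc,273.15 -div -seltimestep,{}/{} data.nc -gec,0.5 mask.nc"
    return [
        template.format(first, last)
        for first, last in chunk_ranges(n_steps, chunk_size)
    ]


inputs = chunk_inputs(37800, 5000)  # 37800 timesteps, as in the question
print(len(inputs))   # 8
print(inputs[0])
print(inputs[-1])
```

Each string could then be used as the input argument of cdo.fldmean as in the question, and the returned arrays concatenated along the time axis.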

hth
ralf

RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil over 6 years ago

Hi Ralf,

Do you know how to return multiple files with the "returnArray" argument in Python? For example, the cdo -eoftime operator returns 2 files.

The idea would be something like (which does not work):
x1, x2 = cdo.eoftime(2, input = 'ifile.nc', returnArray='variable_name')

Thanks,
Oliver

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 6 years ago

sorry, I missed your post ;-)
Currently there is no way to handle multiple output files in that way.

You could open a ticket here or on GitHub, because I guess it will take some time. Some changes to CDO itself might be needed to implement that in a consistent way.

anyway - thx for the report ;-)

best wishes
ralf

RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 6 years ago

I added a ticket myself: https://github.com/Try2Code/cdo-bindings/issues/16

cdo-1.9.2 will provide the infrastructure needed for that feature - it should be released in about two weeks

cheers
ralf
