Return numpy array with CDO piping in Python
Added by Oliver Angelil almost 9 years ago
In Python I typically use system commands to generate new netCDF files, and then read them in again. For example:
import os
from netCDF4 import Dataset
string = 'cdo -yearmean -fldmean file.nc file_out.nc'
os.system(string)
f = Dataset('file_out.nc')
Can one do this in Python with the cdo python module?
Ideally I would like to do all this in Python without the I/O. I see on this page (https://code.zmaw.de/projects/cdo/wiki/Cdo%7Brbpy%7D) that it's possible to do:
# Compute the field mean value timeseries and return it as a numpy array
vals = cdo.fldmean(input=ifile,returnCdf=True).variables['tsurf'][:]
However, I also want to use cdo -yearmean, i.e. do it with multiple cdo commands / piping.
Thanks,
Oliver
Replies (21)
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 9 years ago
use this for a handle to the netcdf file
cdf = cdo.yearmean(input="-fldmean file.nc",returnCdf=True)
or the following for getting 'tsurf' as numpy array
tsurf = cdo.yearmean(input="-fldmean file.nc",returnArray='tsurf')
If you want to use the cdo pipes, just put them in the input parameter as you would write them on the standard command line.
hth
ralf
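To make the piping rule above concrete, here is a minimal sketch. The helper names pipe_input and yearly_field_mean are made up for illustration and are not part of the cdo module; the sketch assumes the cdo Python package and the CDO binary are installed.

```python
def pipe_input(infile, *operators):
    """Return e.g. '-fldmean file.nc' for use as the input= argument.

    Operators are listed in command-line order, left to right.
    """
    return " ".join([f"-{op}" for op in operators] + [infile])


def yearly_field_mean(infile, varname):
    """Yearly mean of the field mean, returned as a numpy array."""
    # Imported lazily so pipe_input() can be used without CDO installed.
    from cdo import Cdo

    cdo = Cdo()
    # Corresponds to: cdo -yearmean -fldmean infile <temporary output file>
    return cdo.yearmean(input=pipe_input(infile, "fldmean"),
                        returnArray=varname)
```

For example, yearly_field_mean('file.nc', 'tsurf') would correspond to the command line from the original question, with the output file handled by the wrapper.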
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 9 years ago
Thanks, I'll be using the second one. So the first command after "cdo." (e.g. "yearmean" in your example) is basically the first command I usually use in the standard cdo command line usage.
Oliver
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 9 years ago
you could set
cdo.debug = True
to get some insight into the commands that are really executed.
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 9 years ago
Thanks Ralf,
A follow-up question:
How would one use cdo.showyear to return all years in the netCDF file as a numpy array?
Regards,
Oliver
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 9 years ago
I just found out:
x = cdo.showyear(input = infile)
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 9 years ago
All operators which write to stdout with CDO on the command line create an array of strings within python: https://github.com/Try2Code/cdo-bindings/blob/master/python/test/test_cdo.py#L125
in your context it should look like this
x = ['1990 1991 1992 1993']
in order to do something with it:
- as a list of strings
listOfYears = x[0].split(' ')
- as a list of integers
listOfYears = list(map(lambda v: int(v), x[0].split(' ')))
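A slightly more defensive variant of the conversion above, as a sketch: it splits on any whitespace and also copes with output spread over several strings. The function name parse_years is hypothetical.

```python
def parse_years(showyear_output):
    """Flatten the list of strings returned by cdo.showyear into a list of ints."""
    return [int(token)
            for line in showyear_output
            for token in line.split()]  # split() ignores repeated/leading blanks


years = parse_years(['1990 1991 1992 1993'])
print(years)
```

To get the numpy array asked for originally, the result can be passed to numpy.array, e.g. numpy.array(parse_years(cdo.showyear(input=infile))).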
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 9 years ago
I'm getting a "segmentation error" on one of my netcdf files. When I use the standard command line approach, adding an "-L" prevents this error. Is it possible to do this within python?
Oliver
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 9 years ago
I guess this happens due to a non-thread-safe installation of the hdf5 backend for netcdf. You can give it as the value of the options parameter within python
x = cdo.showyear(input = infile,options="-L")
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 9 years ago
I'm trying to do the following:
cdo.runmean,5(input = infile, output=outfile). However Python then throws an error:
"'int' object is not callable"
This is because of the ",5"
How can I do a running mean across 5 timesteps?
Thanks,
Oliver
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller almost 9 years ago
Try
cdo.runmean(5,input = infile, output=outfile)
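As a sketch of the two spellings: operator arguments are positional when the operator is called as a method, but keep their command-line form ("runmean,5") when the operator sits inside the input string. The helper names cli_operator and smoothed_field_mean are made up for illustration, and the example assumes the cdo package and the CDO binary are installed.

```python
def cli_operator(name, *args):
    """Format an operator and its arguments as on the CDO command line."""
    return ",".join([name, *map(str, args)])


def smoothed_field_mean(infile, varname, window=5):
    """Field mean of a running mean over `window` timesteps, as a numpy array."""
    # Imported lazily so cli_operator() can be used without CDO installed.
    from cdo import Cdo

    cdo = Cdo()
    # Inner operator uses the command-line spelling: '-runmean,5 infile'
    chained = f"-{cli_operator('runmean', window)} {infile}"
    return cdo.fldmean(input=chained, returnArray=varname)
```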
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil almost 9 years ago
Thanks. This is an incredibly handy wrapper around an impressively powerful piece of software!
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil over 8 years ago
Hi Ralf,
How can I apply "SKIP_SAME_TIME=1" for the cdo.mergetime command?
Regards,
Oliver
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 8 years ago
You can use the dict notation for the env key:
cdo.mergetime(input='foo.nc',env={"SKIP_SAME_TIME": 1})
some more examples here
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil over 8 years ago
Thank you. I believe you made a typo. The line should be:
cdo.mergetime(input='foo.nc',env={"SKIP_SAME_TIME": "1"})
Regards,
Oliver
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 8 years ago
Yeah, you're right ;-)
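Environment variable values are always strings at the operating-system level, which is presumably why the integer did not work. A minimal sketch that converts the values before passing them on; string_env and merge_time are hypothetical helper names, and the cdo package plus the CDO binary are assumed to be installed.

```python
def string_env(env):
    """Return a copy of env with every key and value converted to str."""
    return {str(key): str(value) for key, value in env.items()}


def merge_time(infiles, outfile, skip_same_time=True):
    """Run mergetime on a list of files with SKIP_SAME_TIME set accordingly."""
    # Imported lazily so string_env() can be used without CDO installed.
    from cdo import Cdo

    cdo = Cdo()
    env = string_env({"SKIP_SAME_TIME": int(skip_same_time)})
    return cdo.mergetime(input=" ".join(infiles), output=outfile, env=env)
```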
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil over 8 years ago
Hi Ralf,
I have another query, thanks for all your help so far, I have found it extremely handy.
The following takes about 20 minutes to run (although I repeat this command with a different mask ~100 times so it takes more than a day to run):
cdo.fldmean(input = '-subc,273.15 -div data.nc -gec,0.5 mask.nc', returnArray='tas', options = '-L')
Where "data.nc" has dimensions 37800 x 324 x 432 (time, lat, lon), and "mask.nc" consists of values between 0 and 1 (being the fraction of land) and has dimensions 324 x 432.
Is there a trick to speed this up? Is the "splitsel" operator advised for such a task?
Thanks,
Oliver
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 8 years ago
splitsel is a good idea, because fldmean is time independent. Depending on your IO speed and the number of variables, splitname or splitcode could be a little bit easier to use, because you probably don't need to merge anything together at the end (full timeseries in a single file).
The next step is using a ramdisk.
hth
ralf
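One way the splitsel suggestion could look in Python, as a rough sketch only: split the time axis once, run the masked fldmean on each chunk in parallel, and concatenate the results. All function names, the chunk size, the "chunk_" prefix and the worker count are assumptions for illustration, as is the expectation that splitsel writes files matching "chunk_*.nc". Whether this is actually faster depends on the IO speed, as noted above.

```python
import glob
from concurrent.futures import ThreadPoolExecutor


def masked_input(data_file, mask_file, threshold=0.5, offset=273.15):
    """Input string for cdo.fldmean: mask the data, then subtract the offset."""
    return f"-subc,{offset} -div {data_file} -gec,{threshold} {mask_file}"


def fldmean_chunk(chunk_file, mask_file, varname):
    # Imported lazily so masked_input() can be used without CDO installed.
    from cdo import Cdo

    cdo = Cdo()  # one instance per task, so temporary files are not shared
    return cdo.fldmean(input=masked_input(chunk_file, mask_file),
                       returnArray=varname, options="-L")


def chunked_fldmean(data_file, mask_file, varname,
                    steps_per_chunk=1000, prefix="chunk_", workers=4):
    import numpy as np
    from cdo import Cdo

    # Split the time axis once; the chunks can be reused for every mask.
    Cdo().splitsel(steps_per_chunk, input=data_file, output=prefix)
    chunks = sorted(glob.glob(prefix + "*.nc"))  # assumed output naming

    # Each task runs its own CDO process, so threads are sufficient here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(
            lambda chunk: fldmean_chunk(chunk, mask_file, varname), chunks))
    return np.concatenate(parts, axis=0)
```

Since the split does not depend on the mask, it only needs to be done once for all ~100 masks; only the per-chunk fldmean calls have to be repeated.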
RE: Return numpy array with CDO piping in Python - Added by Oliver Angelil over 7 years ago
Hi Ralf,
Do you know how to return multiple files with the "returnArray" argument in Python? For example, the cdo -eoftime operator returns 2 files.
The idea would be something like (which does not work):
x1, x2 = cdo.eoftime(2, input = 'ifile.nc', returnArray='variable_name')
Thanks,
Oliver
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 7 years ago
sorry, I missed your post ;-)
Currently there is no way to handle multiple output files in that way.
You could open a ticket here or on github because I guess it will take some time. Some changes to CDO itself might be needed to implement that in a consistent way.
anyway - thx for the report ;-)
best wishes
ralf
RE: Return numpy array with CDO piping in Python - Added by Ralf Mueller over 7 years ago
I added a ticket myself: https://github.com/Try2Code/cdo-bindings/issues/16
cdo-1.9.2 will provide the infrastructure needed for that feature - it should be released in about two weeks
cheers
ralf