How would I write this using CDO?
Added by Eleanor Percy-Rouhaud almost 2 years ago
I'm relatively new to CDO, but am loving the robustness of it so far. There is something I am struggling to do though: is there a way of counting how many non-zero elements I have in a particular dimension of my data set? It's a netCDF data set (written in the inline code as ds). I've done it in Python; the part that counts how many non-missing data points there are is:
import numpy as np
# ds is the already opened netCDF data set
data = ds['variable']
num_non_missing = np.count_nonzero(~np.isnan(data))
I would then need to store all these values somehow, either in another netCDF file or, ideally, tagged on to a new dimension in the netCDF file they belong to (ds in this case). But that's getting ahead of myself. My main question: is there a way to count the non-missing values in CDO?
RE: How would I write this using CDO? - Added by Ralf Mueller almost 2 years ago
hi Eleanor!
First thing popping in my head is
- transform the data into a 0-1-mask where 1 is the value you need (be it zero, non-zero, missing values or whatever)
- sum up the field with the fldsum operator
This will work on individual levels or in full 3D.
If (like in your case) the masked cells are normal missing values, fldsum will ignore them anyway. You can check the number of missing values with
cdo infov
Does this help?
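The two steps above (build a 0-1 mask, then sum it) can be sketched in plain Python without CDO or a netCDF file. This is only an imitation of the idea: the helper names to_mask and field_sum and the sample field are hypothetical, and NaN stands in for a declared missing value.

```python
import math

def to_mask(values, wanted):
    """Return 1.0 where wanted(v) is true, 0.0 elsewhere; missing stays missing."""
    return [v if math.isnan(v) else (1.0 if wanted(v) else 0.0) for v in values]

def field_sum(values):
    """Sum a field while skipping missing values, as described above for fldsum."""
    return sum(v for v in values if not math.isnan(v))

# Hypothetical one-dimensional "field"; NaN plays the role of a missing value.
field = [0.0, 2.5, float("nan"), 7.0, 0.0, float("nan"), 1.2]

# Count of non-zero, non-missing cells.
non_zero = field_sum(to_mask(field, lambda v: v != 0.0))
# Count of all non-missing cells (every valid cell is flagged with 1).
non_missing = field_sum(to_mask(field, lambda v: True))

print(non_zero, non_missing)
```

Because the mask only contains 0 and 1 in valid cells, the sum is the count; which cells get a 1 is decided entirely by the condition passed to the mask step.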
RE: How would I write this using CDO? - Added by Eleanor Percy-Rouhaud almost 2 years ago
Thanks for your quick reply! It does help, yes. The fldsum idea also came to me, but what's tripping me up a bit is transforming the data into a 0-1 mask. I forgot to mention that I would loop this over a directory with about 20 files, which have different masks. Is there a way of doing this 'if it's not 0' thing in CDO? Or would it be better to somehow loop over the ranges for each file (they are well defined)?
RE: How would I write this using CDO? - Added by Ralf Mueller almost 2 years ago
So, is it based on missing values or on normal data values? If missing values are represented in the file in a CF-conforming way, the procedure does not depend on the exact numeric value of the missing value.
RE: How would I write this using CDO? - Added by Karin Meier-Fleischer almost 2 years ago
maybe this is what you are looking for
num_non_missing = cdo.fldsum(input='-expr,"tos=((tos > 0. ? 1 : 0))" -seltimestep,1 '+infile, options='--reduce_dim', returnXArray='tos')
print(num_non_missing.values)
RE: How would I write this using CDO? - Added by Karin Meier-Fleischer almost 2 years ago
For completeness: the variable name here in the example is 'tos' and only the first time step is selected
from cdo import Cdo
cdo = Cdo()
num_non_missing = cdo.fldsum(input='-expr,"tos=((tos > 0. ? 1 : 0))" -seltimestep,1 '+infile, options='--reduce_dim', returnXArray='tos')
print(num_non_missing.values)
RE: How would I write this using CDO? - Added by Ralf Mueller almost 2 years ago
this looks like the gtc operator (greater-than-constant).
What about
cdo.fldsum(input=' -gtc,0 '+infile, options='--reduce_dim', returnXArray='tos')
RE: How would I write this using CDO? - Added by Karin Meier-Fleischer almost 2 years ago
Out of curiosity, which of our suggestions is what you were looking for?
RE: How would I write this using CDO? - Added by Eleanor Percy-Rouhaud almost 2 years ago
thanks again for all your responses
I think the first answer with fldsum is the most like what I was looking for, since it was closest to what I tried before. But so far nothing is quite fully working for me, and I think it's creating the 0-1 mask that is tripping me up. Would I do this using setrtoc, for example cdo setrtoc,10,19.99999,1 '+mask+' '+temp_file? But wouldn't this set all data points in that range equal to 1, even if they are NaN or 0?
To summarise, fldsum I think is exactly what I need, I'm just a bit confused about how I can create a 0-1 mask.
RE: How would I write this using CDO? - Added by Eleanor Percy-Rouhaud almost 2 years ago
To add to this: in pure Python, I did this using the non-missing count I wrote above in my original question. In CDO, I am unsure how to implement the same idea, and can't think of another way that achieves the same thing.
RE: How would I write this using CDO? - Added by Ralf Mueller almost 2 years ago
setrtoc will not touch NaNs. If NaN is not declared as the missing value, you will get problems working with these cells. Usually you want to avoid these points anyway, so setting them to be missing values is my recommendation. CDO ignores declared missing values (how to declare them is part of the CF convention) unless you ask for them.
RE: How would I write this using CDO? - Added by Ralf Mueller almost 2 years ago
See cdo -h setrtomiss for operators to change or set missing values
RE: How would I write this using CDO? - Added by Eleanor Percy-Rouhaud almost 2 years ago
Ah okay, good to know that missing values are automatically ignored. I set them to missing values, and then setrtoc,10,19.99999,1 '+mask+' '+temp_file and fldsum '+temp_file+' '+output_file should work. I do get a single-valued array in my netCDF files out of it, but the values are incorrect: way too big. This happens when I try it for a single file, or in a loop over a directory containing all the files I need to do this operation on. It's really strange. I'll think more on it; maybe it isn't wrong, or maybe I'll think of something!
thanks for all the replies, i really appreciate it
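One possible explanation for sums that are too big, not confirmed anywhere in this thread: setrtoc replaces only the values inside the range and leaves every other value as it was, so the field that reaches fldsum is not a pure 0-1 mask. The plain-Python sketch below imitates that behaviour on a hypothetical field. The function bodies are simplified imitations, not CDO code, and the assumption that setrtoc2 leaves missing values untouched should be checked against the CDO documentation.

```python
import math

NAN = float("nan")

def setrtoc(values, rmin, rmax, c):
    """Imitation of setrtoc: in-range values become c, everything else is kept."""
    return [c if rmin <= v <= rmax else v for v in values]

def setrtoc2(values, rmin, rmax, c, c2):
    """Imitation of setrtoc2: in-range values become c, all other valid values c2.
    Assumption: missing values (NaN here) are passed through unchanged."""
    return [v if math.isnan(v) else (c if rmin <= v <= rmax else c2) for v in values]

def field_sum(values):
    """Sum a field while skipping missing values."""
    return sum(v for v in values if not math.isnan(v))

# Hypothetical field; three values lie inside the range 10 to 19.99999.
field = [3.0, 12.0, NAN, 15.5, 42.0, 19.0, 0.0]

# Out-of-range values (3.0 and 42.0) survive and inflate the sum.
too_big = field_sum(setrtoc(field, 10, 19.99999, 1.0))
# With a second constant of 0 the field is a pure 0-1 mask and the sum is a count.
count = field_sum(setrtoc2(field, 10, 19.99999, 1.0, 0.0))

print(too_big, count)
```

If this is what happens with the real data, a chain along the lines of cdo fldsum -setrtoc2,10,19.99999,1,0 infile outfile might give the expected count; that command is untested here and is offered only as a starting point.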
RE: How would I write this using CDO? - Added by Ralf Mueller almost 2 years ago
Eleanor, you could upload a sample of the data (gzipped/zipped if needed). We are too close to a solution to stop now ;-)
RE: How would I write this using CDO? - Added by Eleanor Percy-Rouhaud almost 2 years ago
So sorry it took so long to reply! I agree, and I still haven't found a solution. I could upload a sample, but I guess it could also be any spatial data where a particular range is set to 1 and summed up over the whole area, like a land cover product for example...this would be good to test the command. In fact, I might try this later!
You'll hear from me again with some zipped data soon