histogram + percentile
Added by Marston Ward over 12 years ago
I'm trying to use CDO to find the 90th percentile threshold of all rain rates above a minimum value: 0.1.
I would like to do this using 2 years of TRMM data, which is very large at 0.25x0.25 degree resolution. It seems like CDO can do this, but the manual is not too clear on what is done. I used the command:
cdo timpctl,90 ifile -timmin ifile -timmax ifile ofile
but the results have negative values.
I'm not sure what CDO is doing and so I would like some more info. The original data does not have any negative values.
1.) Is CDO doing any interpolation?
2.) Are the results all the values that are above the 90th percentile of the data?
3.) Since there are lots of zeros in the original data, are these excluded?
/Marston
Replies (6)
RE: histogram + percentile - Added by Jaison-Thomas Ambadan over 12 years ago
Hi Marston,
Can you please upload a sample of your TRMM data? It would help the CDO guys here identify the problem a bit faster. I assume the data is global and huge, so it may be better to cut the global data down to a region (using "sellonlatbox", say Africa/tropics/Europe etc.) with 3 or 4 time steps. The file should then be small enough to upload here.
Cheers,
J.
RE: histogram + percentile - Added by Marston Ward over 12 years ago
Hi J,
I attached the original TRMM file 3B42 and the file with 2 years of data for which I would like to find the 90th percentile: trmm_2yrs_sample.nc.
3B42.070102.9.6.nc (4.4 MB): Original TRMM data for one hour
trmm_2yrs_sample.nc (42.9 MB): Merged TRMM data over 2 years, sample sellonlatbox,160,180,3,0
RE: histogram + percentile - Added by Jaison-Thomas Ambadan over 12 years ago
Hi,
Sorry for taking such a long time to get back to you. Your command sequence is right, BUT CDO is also including the missing values "-9999.9" from your input file in the calculation (in timmin, timmax). If you try:
cdo -timpctl,90 -setmissval,-9999.9 trmm_2yrs_sample.nc -timmin -setmissval,-9999.9 trmm_2yrs_sample.nc -timmax -setmissval,-9999.9 trmm_2yrs_sample.nc 90_Percentile.nc
you will get what you want. Please let us know if this works!
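To see why the unmasked sentinel can produce negative percentiles: the CDO manual describes timpctl as building histograms between the minimum and maximum bounds supplied by the second and third inputs. The Python sketch below is NOT CDO's code, only a simplified histogram estimator under assumptions of my own (101 bins, bin centre returned, a made-up time series at one grid point). It shows how a lower bound of -9999.9 stretches the bins so far that the estimate lands below zero.

```python
MISSING = -9999.9  # sentinel value quoted above for the TRMM sample


def histogram_percentile(values, lo, hi, pct, nbins=101):
    """Estimate a percentile from a fixed-range histogram (returns a bin centre).

    Simplified illustration only; nbins=101 and the bin-centre rule are
    assumptions, not a copy of what CDO does internally.
    """
    if hi <= lo:
        return lo
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        idx = min(int((v - lo) / width), nbins - 1)
        counts[idx] += 1
    target = pct / 100.0 * len(values)
    running = 0
    for idx, count in enumerate(counts):
        running += count
        if running >= target:
            return lo + (idx + 0.5) * width
    return hi


# Hypothetical rain rates at one grid point; two time steps are missing.
series = [0.0, 0.0, MISSING, 1.2, 0.0, MISSING, 3.4, 0.0, 0.5, 0.0]

# Sentinel treated as data: the histogram spans -9999.9 .. 3.4.
biased = histogram_percentile(series, min(series), max(series), 90)

# Sentinel masked first: the histogram spans 0.0 .. 3.4.
valid = [v for v in series if v != MISSING]
clean = histogram_percentile(valid, min(valid), max(valid), 90)

print("unmasked:", biased)
print("masked:  ", clean)
```

With the sentinel left in, each bin is roughly 99 units wide, so every real rain rate falls into the same top bin and the reported value is that bin's (negative) centre. Once the sentinel is masked, the bounds and the estimate stay within the real data range.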
Cheers,
J.
RE: histogram + percentile - Added by Marston Ward over 12 years ago
Hi,
I tried this method and the results look good, i.e., no negative values, but one question remains: what is it that I get? Is it the 90th percentile at each lat and lon position over time? It seems like this is what I get.
What I want is a scalar result, one single value that depicts the 90th percentile for the entire domain and over all time steps.
RE: histogram + percentile - Added by Jaison-Thomas Ambadan over 12 years ago
Hi Marston,
> what is it that I get? Is it the 90th percentile at each lat and lon position over time?
Yes, exactly!
> What I want is a scalar result, one single value that depicts the 90th percentile for the entire domain and over all time steps.
I don't think CDO can do that. The operator "timpctl" operates on the time axis ONLY and "fldpctl" on lat/lon fields; neither operator works on both axes at the same time. Besides, CDO operations are based on grid/coordinate info. For your needs, you have to convert your data into a single vector (flattening both the time and spatial axes), which CDO CAN do, BUT you can't store that vector in a NetCDF or GRIB file, ONLY in an ASCII file, on which neither "timpctl" nor "fldpctl" works.
----------------------------------------------------------------------
So the "easy way" would be: first convert your data to an ASCII file (just a vector without any lat/lon/time info). On Linux/Unix:
cdo -outputf,%13.6f,1 -seltimestep,1/5848 -setmissval,-9999.9 -selvar,RR trmm_2yrs_sample.nc > RR_data.dat
and then use another tool such as MATLAB to calculate the 90th percentile (loading an ASCII file is very easy).
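If MATLAB is not at hand, the same step can be sketched in plain Python. This is only a minimal sketch under a few assumptions: the file is the RR_data.dat written by the command above, with missing values printed as -9999.9; the 0.1 threshold is the one from the original question; and the percentile uses the nearest-rank definition, which is one of several common definitions, so other tools may give a slightly different number. The function names are made up for this example.

```python
import math

MISSING = -9999.9   # missing-value sentinel quoted in this thread
THRESHOLD = 0.1     # minimum rain rate from the original question


def read_values(path):
    """Yield every whitespace-separated number in the ASCII file."""
    with open(path) as handle:
        for line in handle:
            for token in line.split():
                yield float(token)


def pooled_percentile(path, pct=90.0, threshold=THRESHOLD, missing=MISSING):
    """Nearest-rank percentile of all values above `threshold`.

    All grid points and time steps are pooled into one sample; missing
    values and values at or below the threshold are dropped first.
    """
    wet = sorted(
        v for v in read_values(path)
        if abs(v - missing) > 1e-3 and v > threshold
    )
    if not wet:
        raise ValueError("no values above the threshold")
    rank = max(1, math.ceil(pct / 100.0 * len(wet)))
    return wet[rank - 1]
```

Calling pooled_percentile("RR_data.dat") then returns one scalar for the whole domain and period. Note that this sorts all wet values in memory, which should be fine for the regional sample but probably not for the full global dataset.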
----------------------------------------------------------------------
On the other hand, you could do a field average (over the globe or a region, it doesn't matter) and then apply "timpctl" (or do a time average and then apply "fldpctl"). You will then get a single value for the 90th percentile representing the region, which makes much more sense to me. In this case CDO can do all the calculations.
Hope this helps!
Cheers,
J.
RE: histogram + percentile - Added by Marston Ward over 12 years ago
Hi,
Thanks, your last method is indeed the neater one, but it cannot be applied to this dataset. Averaging first removes too much information: precipitation is a small-scale feature, so the resulting 90th percentile comes out wrong. But I learned more about CDO. Thanks.
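To illustrate the point with made-up numbers (not TRMM data), here is a small Python sketch comparing the 90th percentile of the pooled values with the 90th percentile taken after a field average at each time step. The nearest-rank percentile and the tiny 5x3 array are assumptions chosen only to keep the example short.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Made-up rain rates: rows are time steps, columns are grid points.
rain = [
    [0.0, 0.0, 12.0],
    [0.0, 0.0, 0.0],
    [8.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 20.0, 0.0],
]

# Pool every grid point and time step into one sample.
pooled = [v for step in rain for v in step]
pooled_p90 = nearest_rank_percentile(pooled, 90)

# Average over the field at each time step first, then take the percentile.
field_means = [sum(step) / len(step) for step in rain]
averaged_p90 = nearest_rank_percentile(field_means, 90)

print("pooled 90th percentile:        ", pooled_p90)
print("90th percentile of field means:", averaged_p90)
```

The isolated heavy showers are diluted by the dry neighbouring points before the percentile is taken, so the averaged result is well below the pooled one.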
/M