Counting number of days with a condition

Added by Matt Hyde over 1 year ago

Hi,

I need help with a CDO operation on a NetCDF file. I downloaded 40 years of daily mean temperature data from ERA5 over a gridded region (variable name: temp), and I masked temp values in a range (30-50 degrees) to 1 and all other values to 0 using cdo:
cdo -expr,'temp=temp*(temp>=30 && temp<50)' data1.nc data2.nc

Now I want to calculate the number of times each grid cell recorded temp = 1 for at least 5 but fewer than 10 consecutive days over the last 40 years. Is that possible using cdo?


Replies (3)

RE: Counting number of days with a condition - Added by Karin Meier-Fleischer about 1 year ago

Hi Nithin,

cdo -expr,'temp = temp*(temp >= 303.15 && temp < 323.15)' data1.nc data2.nc

cdo -timsum -expr,'count = ((temp != 0) ? 1 : 0)' data2.nc count.nc

RE: Counting number of days with a condition - Added by Matt Hyde about 1 year ago

Hi Karin,

Thanks for the reply. I get this part. You are right regarding the Kelvin-Celsius conversion. My concern is this:

The command cdo -timsum -expr,'count = ((temp != 0) ? 1 : 0)' data2.nc count.nc
counts all the temp events that meet the criteria over all time steps.

My question is different. I want to count the number of times a grid cell has recorded at least 303.15 K for at least 5 but fewer than 10 consecutive days. This is a sort of running count through all the time steps. Is it possible to check that with cdo or nco?

[More context: for example, if grid cell A gets 39 degrees for 7 consecutive days in 1990, that is to be counted, and then if it gets 36 degrees for 5 consecutive days in 1997, that should be added to the counter. I hope this makes clear what I mean; it is just an example. I don't mind if I can get a cdo command without the upper temperature limit (which I mentioned as 50 degrees or 323.15 K).]
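To make the request concrete, the logic can be sketched in plain Python for the daily series of a single grid cell. This only illustrates the intended counting rule and is not a cdo or nco command; the function name `count_spells` and the sample values are hypothetical, while the threshold (303.15 K) and the spell length (at least 5 but fewer than 10 days) come from the question.

```python
def count_spells(series, threshold=303.15, min_len=5, max_len=10):
    """Count runs of consecutive values >= threshold whose length n
    satisfies min_len <= n < max_len (hypothetical helper)."""
    spells = 0
    run = 0  # length of the run currently in progress
    for value in series:
        if value >= threshold:
            run += 1
        else:
            # The run just ended: count it if its length qualifies.
            if min_len <= run < max_len:
                spells += 1
            run = 0
    # A run may still be open when the series ends.
    if min_len <= run < max_len:
        spells += 1
    return spells


# Made-up daily values in Kelvin: runs of 7, 5, 12 and 3 hot days.
hot, cool = 310.0, 295.0
series = [hot] * 7 + [cool] * 3 + [hot] * 5 + [cool] * 2 + [hot] * 12 + [cool] + [hot] * 3
print(count_spells(series))  # prints 2 (only the 7-day and 5-day runs qualify)
```

Applied per grid cell, this would give one counter value per cell, which is the kind of output field asked for here.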

RE: Counting number of days with a condition - Added by Karin Meier-Fleischer about 1 year ago

That is exactly what the command above does. (Sorry for the Kelvin example but without your data I had to use my own.)

Just to explain: you need to know your input data (in general: are there missing values, what is the variable range, ...). In the following I describe what I did to get the result.

Have a look at my example data (variable tas) with 'cdo sinfon':

cdo sinfon infile.nc

   File format : NetCDF
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : MPIMET   MPI-ESM-LR v instant       1   1     18432   1  F32  : tas           
   Grid coordinates :
     1 : gaussian                 : points=18432 (192x96)  F48
                              lon : 0 to 358.125 by 1.875 degrees_east  circular
                              lat : -88.57217 to 88.57217 degrees_north
                        available : cellbounds
   Vertical coordinates :
     1 : height                   : levels=1  scalar
                           height : 2 m
   Time coordinate :
                             time : 1140 steps
     RefTime =  1850-01-01 00:00:00  Units = days  Calendar = proleptic_gregorian  Bounds = true
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2006-01-16 12:00:00  2006-02-15 00:00:00  2006-03-16 12:00:00  2006-04-16 00:00:00
  2006-05-16 12:00:00  2006-06-16 00:00:00  2006-07-16 12:00:00  2006-08-16 12:00:00
  2006-09-16 00:00:00  2006-10-16 12:00:00  2006-11-16 00:00:00  2006-12-16 12:00:00
  ...

and a more detailed view of the data for each time step:

cdo infon infile.nc

    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter name
     1 : 2006-01-16 12:00:00       2    18432       0 :      229.32      276.73      308.33 : tas           
     2 : 2006-02-15 00:00:00       2    18432       0 :      222.84      276.70      307.96 : tas           
     3 : 2006-03-16 12:00:00       2    18432       0 :      217.00      277.31      306.88 : tas           
     4 : 2006-04-16 00:00:00       2    18432       0 :      208.94      278.52      309.75 : tas           
     5 : 2006-05-16 12:00:00       2    18432       0 :      207.32      280.00      316.96 : tas           
     6 : 2006-06-16 00:00:00       2    18432       0 :      206.89      281.00      318.31 : tas        
    ...

1. Extract the data in the wanted range (in my case there are no missing values)

cdo -expr,'tas = tas*(tas >= 303.15 && tas < 323.15)' infile.nc tas_value_range.nc

2. Have a look at this output. You can see that the values in the given range are extracted and all other values are set to zero.

cdo infon tas_value_range.nc 

    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter name
     1 : 2006-01-16 12:00:00       2    18432       0 :      0.0000     0.94210      308.33 : tas           
     2 : 2006-02-15 00:00:00       2    18432       0 :      0.0000     0.67735      307.96 : tas           
     3 : 2006-03-16 12:00:00       2    18432       0 :      0.0000      1.9990      306.88 : tas           
     4 : 2006-04-16 00:00:00       2    18432       0 :      0.0000      4.0911      309.75 : tas      
     ...    

The grid itself has not been changed.

cdo sinfon tas_value_range.nc

   File format : NetCDF
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : unknown  unknown  v instant       1   1     18432   1  F32  : tas           
   Grid coordinates :
     1 : gaussian                 : points=18432 (192x96)  F48
                              lon : 0 to 358.125 by 1.875 degrees_east  circular
                              lat : -88.57217 to 88.57217 degrees_north
                        available : cellbounds
   Vertical coordinates :
     1 : height                   : levels=1  scalar
                           height : 2 m
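
The masking in step 1 works because the comparison evaluates to 1 where the condition holds and to 0 elsewhere, so multiplying by it keeps in-range values and zeroes the rest. A minimal Python sketch of the same arithmetic for a handful of values (the helper name `mask_range` and the sample numbers are only for illustration; the bounds are those used in step 1):

```python
def mask_range(values, low=303.15, high=323.15):
    """Keep values with low <= v < high, set all others to 0 (hypothetical helper)."""
    # The comparison yields True/False, which act as 1/0 in the multiplication.
    return [v * (low <= v < high) for v in values]


print(mask_range([229.32, 303.15, 308.33, 323.15]))
# prints [0.0, 303.15, 308.33, 0.0]
```

Note that the upper bound is exclusive, so a value of exactly 323.15 is set to zero.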

3. For each grid cell, count the time steps where the value is NOT zero (i.e. where it fulfills your condition)

cdo -timsum -expr,'count = ((tas != 0) ? 1 : 0)' tas_value_range.nc tas_count.nc

4. Let's see what the output looks like now. It is one single time step that contains the grid, and each grid cell contains the number of time steps that fulfill your condition.

cdo sinfon tas_count.nc

   File format : NetCDF
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : unknown  unknown  v instant       1   1     18432   1  F32  : count         
   Grid coordinates :
     1 : gaussian                 : points=18432 (192x96)  F48
                              lon : 0 to 358.125 by 1.875 degrees_east  circular
                              lat : -88.57217 to 88.57217 degrees_north
                        available : cellbounds
   Vertical coordinates :
     1 : height                   : levels=1  scalar
                           height : 2 m
   Time coordinate :
                             time : 1 step
     RefTime =  1850-01-01 00:00:00  Units = days  Calendar = proleptic_gregorian  Bounds = true
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2053-07-01 06:00:00

Or

cdo infon tas_count.nc

    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter name
     1 : 2053-07-01 06:00:00       2    18432       0 :      0.0000      28.592      1017.0 : count   
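
For a single grid cell, the counting in step 3 amounts to turning every nonzero value into 1 and summing over time. A small Python sketch of that per-cell arithmetic (the helper name `count_nonzero_steps` and the sample series are made up for illustration):

```python
def count_nonzero_steps(masked_series):
    """Sum the 0/1 indicator (value != 0) over all time steps (hypothetical helper)."""
    return sum(1 if v != 0 else 0 for v in masked_series)


# Masked values of one grid cell over five time steps.
masked = [0.0, 303.15, 308.33, 0.0, 305.0]
print(count_nonzero_steps(masked))  # prints 3
```

The result is the total number of time steps in the range for that cell; each time step contributes on its own, whether or not its neighbours are also in the range.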