Broadcasting climate indices
Added by Gruel Axles over 1 year ago
What is the recommended approach to calculate ECASU - Summer days index per time period for multiple temperature thresholds?
Intuitively, I want to broadcast the greater-than comparison over an array of constants rather than a single constant. Is this possible with the CDO indices?
Replies (6)
RE: Broadcasting climate indices - Added by Karin Meier-Fleischer over 1 year ago
Hi Gruel,
see the ECA documentation https://code.mpimet.mpg.de/projects/cdo/embedded/cdo_eca.pdf#subsection.2.0.3.
RE: Broadcasting climate indices - Added by Gruel Axles over 1 year ago
Thanks for the quick response. I had seen the documentation, but I’m still not clear on the best approach to computing the ECASU index for multiple threshold values. The documentation says T is an integer, so can it only be done for a single threshold at a time? And do multiple thresholds require running the command multiple times, which duplicates all the I/O? I'm just looking for a way to avoid all the extra I/O when computing multiple threshold values. I don’t mind bypassing the eca_su operator and manually crafting the index calculation using more primitive operators that can handle broadcasting.
RE: Broadcasting climate indices - Added by Gruel Axles over 1 year ago
Gruel Axles wrote in RE: Broadcasting climate indices:
Thanks for the quick response. I had seen the documentation, but I’m still not clear on the best approach to computing the ECASU index for multiple threshold values. The documentation says T is a FLOAT, so can it only be done for a single threshold at a time? And do multiple thresholds require running the command multiple times, which duplicates all the I/O? I'm just looking for a way to avoid all the extra I/O when computing multiple threshold values. I don’t mind bypassing the eca_su operator and manually crafting the index calculation using more primitive operators that can handle broadcasting.
RE: Broadcasting climate indices - Added by Ralf Mueller over 1 year ago
hi!
There is no way to calculate the index for multiple thresholds at once. You have to create additional output (a single 2D field) for each threshold.
BTW: I don't understand the term broadcasting in that context.
cheers
ralf
RE: Broadcasting climate indices - Added by Gruel Axles over 1 year ago
Ah sorry. I use broadcasting in the way NumPy describes https://numpy.org/doc/stable/user/basics.broadcasting.html
So if you have a size (M, N) 2D array of temperature data and a size (Q,) 1D array of threshold values, then the operation
T_data[:, :, None] > T_thresh
will be broadcast, resulting in a 3D array of size (M, N, Q) where each slice along the Q dimension is the comparison with one constant in the threshold array, independently of the others.
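To make the idea concrete, here is a minimal NumPy sketch of a summer-days style count for several thresholds at once. It is not how CDO implements eca_su; the function name `summer_days`, the (time, lat, lon) layout, the Celsius units and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def summer_days(tx_celsius, thresholds):
    """Count, per grid cell, the time steps with tx > each threshold.

    tx_celsius: array of shape (time, lat, lon), assumed daily maxima in Celsius
    thresholds: 1-D sequence of length Q
    returns:    integer array of shape (lat, lon, Q)
    """
    tx = np.asarray(tx_celsius, dtype=float)
    thr = np.asarray(thresholds, dtype=float)
    # Append a length-1 axis so (time, lat, lon, 1) broadcasts against (Q,)
    exceeds = tx[..., np.newaxis] > thr
    # Reduce over time only; the threshold axis is kept
    return exceeds.sum(axis=0)

# Synthetic stand-in for real data: 90 days on a 4 x 5 grid
rng = np.random.default_rng(0)
tx = rng.uniform(15.0, 35.0, size=(90, 4, 5))
thresholds = [25.0, 30.0]

counts = summer_days(tx, thresholds)
print(counts.shape)  # (4, 5, 2)
```

The data is read once and compared against every threshold in memory, which is the I/O saving I am after. The intermediate boolean array grows with the number of thresholds, so for large grids it would probably need to be processed in chunks.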
RE: Broadcasting climate indices - Added by Ralf Mueller over 1 year ago
I see. In the case of ecasu, the operation is a reduction of the time dimension (T(time,lon,lat) -> ecasu(lon,lat,T)). Sure, you can save multiple results in a single array. But in the same way you can merge each output file (one for each given T threshold) into a single file. The algorithm itself will not be parallelized in any way with CDO. Instead you could run multiple CDO commands for process-level parallelization.
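As a rough sketch of the process-level parallelization, the following Python snippet builds one `cdo eca_su,T` call per threshold and can run them side by side. The file names, the output naming scheme and the helper names `build_commands` and `run_all` are hypothetical; only the `eca_su,T infile outfile` form is taken from the ECA documentation.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def build_commands(infile, thresholds, outdir):
    """Return one `cdo eca_su,T infile outfile` argument list per threshold."""
    outdir = Path(outdir)
    commands = []
    for t in thresholds:
        outfile = outdir / f"su_{t:g}.nc"  # hypothetical naming scheme
        commands.append(["cdo", f"eca_su,{t:g}", str(infile), str(outfile)])
    return commands

def run_all(commands, max_workers=4):
    """Run the commands as separate processes, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each thread only waits for its own cdo process to finish
        return list(pool.map(lambda c: subprocess.run(c, check=True), commands))

# Hypothetical input file and thresholds
cmds = build_commands("tx_daily.nc", [25, 27.5, 30], "out")
for c in cmds:
    print(" ".join(c))
# run_all(cmds)  # uncomment on a machine with cdo installed and "out" created
```

Each process still reads the input file on its own, so this shortens the wall-clock time but does not reduce the total I/O.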
You can rewrite the whole implementation with dask to add the parallelization within a single process, but writing it will take time and I doubt there will be any performance benefit left in the end.
If you want to avoid IO-overhead, you can use /dev/shm on Linux systems. It's limited in terms of space, but very fast.
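One possible way to use it from a script, as a sketch only: pick /dev/shm as scratch space when it exists and is writable, and fall back to the normal temp directory otherwise. The helper name `pick_scratch_dir` is made up for this example.

```python
import os
import tempfile

def pick_scratch_dir(preferred="/dev/shm"):
    """Return `preferred` if it is a writable directory, else the default temp dir."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        return preferred
    return tempfile.gettempdir()

scratch = pick_scratch_dir()
# The working directory and everything in it is removed when the block exits
with tempfile.TemporaryDirectory(dir=scratch) as workdir:
    print(workdir)  # write intermediate CDO output here
```

Files in /dev/shm live in RAM, so intermediate results should be copied to regular storage before the temporary directory is cleaned up.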
cheers
ralf