Broadcasting climate indices

Added by Gruel Axles over 1 year ago

What is the recommended approach to calculate ECASU - Summer days index per time period for multiple temperature thresholds?

Intuitively, I want to broadcast the greater-than comparison over an array of constants rather than a single constant. Is this possible with the CDO climate indices operators?


Replies (6)

RE: Broadcasting climate indices - Added by Gruel Axles over 1 year ago

Thanks for the quick response. I had seen the documentation, but I’m still not clear on the best approach to computing the ECASU index for multiple threshold values. The documentation says T is an integer, so can it only be done for a single threshold at a time? And do multiple thresholds require running the command multiple times, which duplicates all the I/O? I'm just looking for a way to avoid all the extra I/O when computing the index for multiple threshold values. I don’t mind bypassing the eca_su operator and manually crafting the index calculation using more primitive operators that can handle broadcasting.

RE: Broadcasting climate indices - Added by Gruel Axles over 1 year ago

Gruel Axles wrote in RE: Broadcasting climate indices:

Thanks for the quick response. I had seen the documentation, but I’m still not clear on the best approach to computing the ECASU index for multiple threshold values. The documentation says T is a FLOAT, so can it only be done for a single threshold at a time? And do multiple thresholds require running the command multiple times, which duplicates all the I/O? I'm just looking for a way to avoid all the extra I/O when computing the index for multiple threshold values. I don’t mind bypassing the eca_su operator and manually crafting the index calculation using more primitive operators that can handle broadcasting.

RE: Broadcasting climate indices - Added by Ralf Mueller over 1 year ago

Hi!
There is no way to calculate the index for multiple thresholds at once. You have to create an additional output (a single 2D field) for each threshold.

BTW: I don't understand the term "broadcasting" in that context.

cheers
ralf

RE: Broadcasting climate indices - Added by Gruel Axles over 1 year ago

Ah, sorry. I use "broadcasting" in the way NumPy describes it: https://numpy.org/doc/stable/user/basics.broadcasting.html

So if you have a size (M, N) 2D array of temperature data and a size (Q,) 1D array of threshold values, then the operation

T_data[:, :, None] > T_thresh

will be broadcast, resulting in a 3D array of size (M, N, Q) where each slice along the Q dimension holds the comparison against one constant in the threshold array, independently of the others. (The trailing length-1 axis is needed: a bare T_data > T_thresh only broadcasts when N equals Q or one of them is 1.)
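A minimal NumPy sketch of the comparison described above, using made-up values (the array contents and the two thresholds are assumptions for illustration only). Note that shapes (M, N) and (Q,) broadcast to (M, N, Q) only once a trailing length-1 axis is added to the data:

```python
import numpy as np

# Hypothetical toy data: an (M, N) = (2, 3) temperature field in deg C.
T_data = np.array([[20.0, 26.0, 31.0],
                   [24.0, 25.0, 35.0]])

# Q = 2 hypothetical thresholds, also in deg C.
T_thresh = np.array([25.0, 30.0])

# Add a trailing axis so shapes (2, 3, 1) and (2,) broadcast to (2, 3, 2).
exceeds = T_data[:, :, np.newaxis] > T_thresh

print(exceeds.shape)     # (2, 3, 2)
print(exceeds[:, :, 0])  # element-wise comparison against 25.0
print(exceeds[:, :, 1])  # element-wise comparison against 30.0
```

Each slice `exceeds[:, :, q]` is the boolean mask for threshold `T_thresh[q]`.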

RE: Broadcasting climate indices - Added by Ralf Mueller over 1 year ago

I see. In the case of ecasu, the operation is a reduction over the time dimension: T(time,lon,lat) -> ecasu(lon,lat,T). Sure, you can save multiple results in a single array. But in the same way, you can merge the output files (one per threshold T) into a single file. The algorithm itself will not be parallelized in any way with CDO. Instead, you could run multiple CDO commands for process-level parallelization.
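For illustration, the reduction over time for several thresholds at once could be sketched in NumPy as follows. This is not how CDO implements eca_su; the function name `summer_days`, the toy data, and the assumption that data and thresholds share the same units are all made up for this example:

```python
import numpy as np

def summer_days(tx, thresholds):
    """Count, per grid cell, the time steps where tx exceeds each threshold.

    tx:         array of shape (time, lat, lon), daily maximum temperature.
    thresholds: 1D sequence of Q thresholds, assumed to be in the same units as tx.
    Returns an integer array of shape (lat, lon, Q).
    """
    tx = np.asarray(tx)
    thresholds = np.asarray(thresholds)
    # Broadcast the comparison to shape (time, lat, lon, Q) ...
    exceeds = tx[..., np.newaxis] > thresholds
    # ... then reduce (count) over the time axis.
    return exceeds.sum(axis=0)

# Hypothetical toy input: 4 days on a 1 x 2 grid, values in deg C.
tx = np.array([[[24.0, 31.0]],
               [[26.0, 32.0]],
               [[27.0, 29.0]],
               [[25.0, 30.5]]])

counts = summer_days(tx, [25.0, 30.0])
print(counts.shape)  # (1, 2, 2)
print(counts)
```

The data is read once and compared against every threshold, at the cost of a boolean intermediate that is Q times the size of the input. For large inputs, looping over the thresholds and counting one at a time avoids that intermediate.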

You can rewrite the whole implementation with dask to add parallelization within a single process, but writing it will take time and I doubt there will be any performance benefit left in the end.

If you want to avoid I/O overhead, you can use /dev/shm on Linux systems. It's limited in terms of space, but very fast.
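A rough sketch of how the one-command-per-threshold approach could be combined with /dev/shm from Python. The helper names `scratch_dir` and `eca_su_commands`, the input file name `tx_daily.nc`, and the output naming scheme are hypothetical. The block only builds the command lines and does not run CDO:

```python
import os
import tempfile

def scratch_dir(preferred="/dev/shm"):
    """Create a scratch directory, in /dev/shm if it exists and is writable."""
    usable = os.path.isdir(preferred) and os.access(preferred, os.W_OK)
    # dir=None makes tempfile fall back to the system default temp location.
    return tempfile.mkdtemp(prefix="ecasu_", dir=preferred if usable else None)

def eca_su_commands(infile, thresholds, outdir):
    """Build one 'cdo eca_su,T infile outfile' command per threshold."""
    commands = []
    for t in thresholds:
        outfile = os.path.join(outdir, f"ecasu_T{t:g}.nc")  # hypothetical naming
        commands.append(["cdo", f"eca_su,{t:g}", infile, outfile])
    return commands

outdir = scratch_dir()
cmds = eca_su_commands("tx_daily.nc", [25, 30], outdir)  # hypothetical input file
for cmd in cmds:
    print(" ".join(cmd))
```

Each command list could then be passed to `subprocess.run` or started concurrently with `subprocess.Popen` for the process-level parallelization mentioned above. Results kept in /dev/shm live in memory, so they should be copied to permanent storage and the scratch directory removed afterwards.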

cheers
ralf
