


Broadcasting climate indices

Added by Gruel Axles 12 months ago

What is the recommended approach to calculate ECASU (Summer days index per time period) for multiple temperature thresholds?

Intuitively, I want to broadcast the greater-than comparison over an array of constants rather than a single constant. Is this possible with the CDO indices?

Replies (6)

RE: Broadcasting climate indices - Added by Gruel Axles 12 months ago

Thanks for the quick response. I had seen the documentation, but I’m still not clear on the best approach to computing the ECASU index for multiple threshold values. The documentation says T is an integer, so can it only be done for a single threshold at a time? And do multiple thresholds require running the command multiple times, which duplicates all the I/O? I'm just looking for a way to avoid all the extra I/O when computing multiple threshold values. I don’t mind bypassing the eca_su operator and manually crafting the index calculation using more primitive operators that can handle broadcasting.

RE: Broadcasting climate indices - Added by Gruel Axles 12 months ago

Gruel Axles wrote in RE: Broadcasting climate indices:

Thanks for the quick response. I had seen the documentation, but I’m still not clear on the best approach to computing the ECASU index for multiple threshold values. The documentation says T is a FLOAT, so can it only be done for a single threshold at a time? And do multiple thresholds require running the command multiple times, which duplicates all the I/O? I'm just looking for a way to avoid all the extra I/O when computing multiple threshold values. I don’t mind bypassing the eca_su operator and manually crafting the index calculation using more primitive operators that can handle broadcasting.

RE: Broadcasting climate indices - Added by Ralf Mueller 12 months ago

There is no way to calculate the index for multiple thresholds at once. You have to create an additional output (a single 2D field) for each threshold.

BTW: I don't understand the term "broadcasting" in that context.


RE: Broadcasting climate indices - Added by Gruel Axles 12 months ago

Ah, sorry. I use "broadcasting" in the way NumPy describes it.

So if you have a size (M, N) 2D array of temperature data and a size (Q,) 1D array of threshold values, then the operation

T_data > T_thresh

will be broadcast (once T_data is given a trailing length-1 axis, e.g. T_data[:, :, None], so that the shapes line up), resulting in a 3D array of size (M, N, Q) where each slice along the Q dimension holds the comparison with one constant in the threshold array, independently of the others.
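A minimal NumPy sketch of the comparison described above. The sizes and values are made up for illustration. NumPy aligns shapes starting from the trailing dimension, so the (M, N) array needs an explicit length-1 axis before it can be compared against the (Q,) array.

```python
import numpy as np

# Hypothetical sizes: an M x N temperature field and Q thresholds.
M, N, Q = 4, 5, 3
rng = np.random.default_rng(0)
T_data = rng.uniform(15.0, 35.0, size=(M, N))
T_thresh = np.array([20.0, 25.0, 30.0])

# (M, N, 1) compared with (Q,) broadcasts to (M, N, Q).
exceeds = T_data[:, :, np.newaxis] > T_thresh

print(exceeds.shape)  # (4, 5, 3)
```

Each slice exceeds[:, :, k] is the same as T_data > T_thresh[k].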

RE: Broadcasting climate indices - Added by Ralf Mueller 12 months ago

I see. In the case of ecasu, the operation is a reduction over the time dimension (T(time,lon,lat) -> ecasu(lon,lat,T)). Sure, you can save multiple results in a single array. But you can just as well merge the output files (one per threshold T) into a single file. The algorithm itself will not be parallelized in any way with CDO. Instead, you could run multiple CDO commands for process-level parallelization.
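Outside of CDO, the reduction over time for several thresholds could be sketched in NumPy as follows. This is not how eca_su is implemented. It assumes the index counts days with TX strictly above the threshold, that TX is in Kelvin, and that thresholds are in degrees Celsius. The function name summer_days is hypothetical.

```python
import numpy as np


def summer_days(tx_kelvin, thresholds_celsius):
    """Count days with TX above each threshold.

    tx_kelvin: array of shape (time, lat, lon), daily maximum temperature in K.
    thresholds_celsius: 1D sequence of thresholds in degrees Celsius.
    Returns an integer array of shape (lat, lon, n_thresholds).
    """
    tx = np.asarray(tx_kelvin, dtype=float)
    thresholds_k = np.asarray(thresholds_celsius, dtype=float) + 273.15
    # (time, lat, lon, 1) > (Q,) broadcasts to (time, lat, lon, Q).
    above = tx[..., np.newaxis] > thresholds_k
    # Reduce the time axis: number of days above each threshold per grid cell.
    return above.sum(axis=0)


# Tiny made-up example: 3 days, 1 x 2 grid, thresholds of 25 and 30 degC.
tx = np.array([[[300.0, 290.0]],
               [[299.0, 305.0]],
               [[295.0, 304.0]]])
counts = summer_days(tx, [25.0, 30.0])
print(counts.shape)  # (1, 2, 2)
```

The data is read once, but the intermediate boolean array is Q times the size of the input, so for long time series it may be necessary to loop over chunks of time and accumulate the counts.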

You can rewrite the whole implementation with dask to add the parallelization within a single process, but writing it will take time and I doubt there will be any performance benefit left in the end.

If you want to avoid I/O overhead, you can use /dev/shm on Linux systems. It's limited in terms of space, but very fast.
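One way to combine the two suggestions (one CDO process per threshold, outputs written to /dev/shm) is sketched below in Python. The helper names build_commands and run_all, the input file name, and the output naming scheme are all hypothetical. The only CDO syntax assumed is the documented form cdo eca_su,T infile outfile.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_commands(infile, thresholds, outdir="/dev/shm"):
    """Return one `cdo eca_su,T infile outfile` argument list per threshold."""
    commands = []
    for t in thresholds:
        # Hypothetical naming scheme: ecasu_<threshold>.nc
        outfile = f"{outdir}/ecasu_{t:g}.nc"
        commands.append(["cdo", f"eca_su,{t:g}", str(infile), outfile])
    return commands


def run_all(commands, max_workers=4):
    """Run the commands as separate processes, a few at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # check=True raises CalledProcessError if any cdo call fails.
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))


# Usage (requires cdo on the PATH and an existing input file):
# run_all(build_commands("tx_daily.nc", [25, 27.5, 30]))
```

Every process still reads the input file itself, so copying the input to /dev/shm first may also help, space permitting.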

