GPU support

Added by Erdem YILMAZ over 4 years ago

Hi Everyone,

I couldn't find anything in the forum, so I wanted to ask whether there is an ongoing project to make use of a computer's GPU resources for the calculations.
I was wondering whether OpenCL could be utilised for operations where data can be processed in parallel.
Is there a previous design decision in CDO that prevents us from using GPU resources where available?
Regards.


Replies (2)

RE: GPU support - Added by Ralf Mueller over 4 years ago

hi Erdem!

Several years ago there was some development work using GPUs for certain computations, and it turned out that it is indeed doable. The problem is that the overall design of CDO is not well suited to GPU usage in general: given the ability to chain operators and the bottleneck of data transfer between GPU and CPU, a lot of operators would have to be ported to the GPU before using GPUs is beneficial at all. Hence a lot of programming for a questionable effect. In fact, the operators of a chain already run in parallel (in different threads).
Many operators do very basic stuff - that's why it's so powerful to combine them. But for GPUs that's bad, because it limits the possible workload per operator - again, there would be no big benefit from using GPUs.
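
As a rough illustration of the transfer argument, here is a back-of-the-envelope model in Python. It assumes every operator is purely memory-bound and that data has to cross the CPU-GPU link once in each direction per GPU-resident stretch of a chain. All bandwidth figures and function names are made up for the sketch; they are not measurements of CDO or of any particular hardware.

```python
# Assumed, illustrative bandwidths in bytes per second (not measured values).
CPU_BW = 20e9    # CPU main-memory bandwidth
GPU_BW = 400e9   # GPU device-memory bandwidth
LINK_BW = 12e9   # CPU <-> GPU transfer bandwidth


def cpu_chain_seconds(n_ops, n_bytes):
    """Time for n_ops memory-bound operators running on the CPU."""
    return n_ops * n_bytes / CPU_BW


def gpu_chain_seconds(n_ops, n_bytes):
    """Time for n_ops consecutive operators on the GPU.

    The data is copied to the GPU once and back once, so the transfer
    cost is paid a single time for the whole GPU-resident stretch.
    """
    transfer = 2 * n_bytes / LINK_BW
    compute = n_ops * n_bytes / GPU_BW
    return transfer + compute


def break_even_ops(n_bytes, max_ops=1000):
    """Smallest number of consecutive GPU operators that beats the CPU."""
    for k in range(1, max_ops + 1):
        if gpu_chain_seconds(k, n_bytes) < cpu_chain_seconds(k, n_bytes):
            return k
    return None  # the GPU never wins within max_ops


if __name__ == "__main__":
    one_gb = 1e9
    print("1 op on CPU:", cpu_chain_seconds(1, one_gb), "s")
    print("1 op on GPU:", gpu_chain_seconds(1, one_gb), "s")
    print("break-even chain length:", break_even_ops(one_gb))
```

With these assumed numbers a single ported operator is slower on the GPU than on the CPU, and the GPU only comes out ahead once four consecutive operators of a chain stay on the device. Different bandwidths shift the break-even point, but the shape of the argument stays the same.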
Finally, CDO is a general tool that should be able to run on many systems. Although GPU clusters have become more popular in the last decade, users of regular PCs, laptops or workstations are a second important user group besides people with access to HPC systems.
The only GPU use case I can currently imagine is AI: if it becomes a more popular tool in climate and weather prediction, there might be some operators using GPUs based on existing libraries.

After all, these are just my thoughts ;-)
cheers
ralf

RE: GPU support - Added by Erdem YILMAZ over 4 years ago

Hi Ralf,

Thanks for the detailed response.
Each operator implemented in CDO might require a matching GPU kernel function, and that's a bit of a hassle.
However, I believe that if we can identify a commonly used subset of existing operators that work on data blocks without inter-dependencies (embarrassingly parallel cases), the overall performance of a workflow would gain a lot. Of course, that is easier said than done :)
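
To show what "no inter-dependency" means in practice, here is a minimal Python sketch of an element-wise operation (adding a constant) applied block by block. The function names are hypothetical and this is not how CDO is implemented. Threads are used only to show that the blocks can be handled independently; on a GPU each block, or each element, would map to its own group of threads.

```python
from concurrent.futures import ThreadPoolExecutor


def split_blocks(values, block_size):
    """Cut a flat field into consecutive, non-overlapping blocks."""
    return [values[i:i + block_size] for i in range(0, len(values), block_size)]


def add_constant(block, c):
    """Element-wise operation: each output depends on one input value only."""
    return [v + c for v in block]


def apply_blockwise(values, c, block_size=4, workers=2):
    """Process every block independently, then join results in order."""
    blocks = split_blocks(values, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves block order, so the field can be reassembled directly.
        results = list(pool.map(lambda b: add_constant(b, c), blocks))
    return [v for block in results for v in block]


if __name__ == "__main__":
    field = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(apply_blockwise(field, 0.5, block_size=2))
```

Because no block needs values from any other block, the result is the same as a plain serial loop regardless of block size or worker count. Operators that need neighbouring values or a reduction over the whole field do not split this cleanly.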
Regards.
