
Processing data using your GPU cores – with or without OpenCL

I’ve been fascinated by the progress big data computing is making these days. For the past few weeks I’ve been experimenting with H2O (an in-memory, cluster-wide math engine written in Java) that runs all kinds of algorithms across a cluster, ALL in memory.

What I haven’t been able to figure out is what’s happening with the GPUs we have in our everyday laptops and desktops.


I recently reached out to a lead engineer on AMD’s Aparapi project and asked him what’s happening with the project; after all, there hasn’t been a release in over a year!

Gary Frost, lead engineer and contributor to Aparapi, wrote:

It is active, but mostly in the lambda branch.   if you like AMD GPU's (well APU's) you will love the lambda branch it allows you to use Java 8 lambda features *AND* allows you to execute code on HSA enabled devices (no OpenCL required).  This means no buffer transfers and much less restrictions on the type of code you can execute. 

So the new API's map/match the new Java 8 stream APIs

Aparapi.range(0,100).parallel().forEach(i -> square[i]=in[i]*in[i]);

If you drop the 'parallel' the lambda is executed sequentially, if you include the parallel Aparapi looks for HSA, then OpenCL then if neither exist will fall back to a thread pool. 

The reason that there are less 'checkins' in trunk, and no merges from lambda into trunk is because we can't check the lambda stuff into trunk without forcing all users to move to Java 8.

This is really exciting and interesting!

As a more practical use case, what I would imagine doing is running map-reduce on each node, with the data provided by a data grid (e.g. Hazelcast), and using AMD’s GPUs to process that data much faster than a costly 12-core Xeon can. This would provide affordable scalability.

For the time being, there will still be a need for something like Hadoop’s data/name nodes and job tracker to control which piece of data is processed and where, since GPUs won’t be able to share data with GPUs on remote nodes (at least for now).
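To make the idea concrete, here is a minimal sketch of that split: each node maps and reduces its own partition locally, and a coordinator combines the partial results. It uses only plain JDK 8 streams, so it runs on CPU threads; no Aparapi, Hazelcast or Hadoop API is involved. The class name, the method names and the in-memory "partitions" are all made up for illustration, and the square-and-sum job is just a stand-in for real work.

```java
import java.util.Arrays;
import java.util.List;

class PartitionedSumOfSquares {

    // Work done on a single node: map each value to its square and
    // reduce to a partial sum. parallel() spreads the work over local
    // CPU threads; this is the step a GPU kernel would take over.
    static long localSumOfSquares(int[] partition) {
        return Arrays.stream(partition)
                .parallel()
                .mapToLong(v -> (long) v * v)
                .sum();
    }

    // Work done by the coordinator (the job tracker role): combine the
    // partial sums. Partitions are plain arrays here (an assumption);
    // in a real setup each would live on a different node.
    static long combine(List<int[]> partitions) {
        return partitions.stream()
                .mapToLong(PartitionedSumOfSquares::localSumOfSquares)
                .sum();
    }

    public static void main(String[] args) {
        List<int[]> partitions = Arrays.asList(
                new int[] {1, 2, 3},
                new int[] {4, 5},
                new int[] {6});
        System.out.println(combine(partitions)); // prints 91
    }
}
```

Only the partial sums cross node boundaries, which is why the nodes (and their GPUs) never need to share raw data with each other.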

Next steps to try it out:

Check out the “lambda” branch, then compile and run:

Aparapi.range(0,100).parallel().forEach(i -> square[i]=in[i]*in[i]);

There are plenty of other examples in the project’s source.
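If you want to get a feel for the programming model before building anything, the same loop can be written with the JDK’s own Java 8 streams, which the Aparapi API is said to mirror. This is only a sketch of the stream shape using java.util.stream.IntStream; it is not Aparapi code, it runs on CPU threads only, and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class SquareDemo {

    // Sequential version: the lambda runs for each index in order
    // on the calling thread.
    static int[] squareSequential(int[] in) {
        int[] square = new int[in.length];
        IntStream.range(0, in.length)
                .forEach(i -> square[i] = in[i] * in[i]);
        return square;
    }

    // Parallel version: adding parallel() lets the JDK split the range
    // across the common fork/join pool. Each index is written by exactly
    // one task, so the writes to square[] do not conflict.
    static int[] squareParallel(int[] in) {
        int[] square = new int[in.length];
        IntStream.range(0, in.length)
                .parallel()
                .forEach(i -> square[i] = in[i] * in[i]);
        return square;
    }

    public static void main(String[] args) {
        int[] in = IntStream.range(0, 100).toArray();
        // Both versions should produce the same result.
        System.out.println(Arrays.equals(squareSequential(in), squareParallel(in)));
    }
}
```

The only difference between the two methods is the parallel() call, which is the same switch the Aparapi one-liner uses to choose between sequential and accelerated execution.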



