JayBeams  0.1
Another project to have fun coding.
jb::opencl::generic_reduce< reducer, input_type_t, output_type_t > Class Template Reference

Implement a generic reducer for OpenCL.

#include <generic_reduce.hpp>

Public Types

Type traits
using input_type = input_type_t
 The host type used to represent the input into the reduction.
 
using output_type = output_type_t
 The host type representing the output of the reduction.
 
using vector_iterator = typename boost::compute::vector< input_type >::iterator
 The iterator type for the vector used to store the results.
 

Public Member Functions

 generic_reduce (std::size_t size, boost::compute::command_queue const &queue)
 Constructor.
 
template<typename InputIterator >
boost::compute::future< vector_iterator > execute (InputIterator begin, InputIterator end, boost::compute::wait_list const &wait=boost::compute::wait_list())
 Schedule the execution of a reduction.
 
boost::compute::future< vector_iterator > execute (boost::compute::vector< input_type > const &src, boost::compute::wait_list const &wait=boost::compute::wait_list())
 Schedule a reduction for a full vector.
 

Static Public Member Functions

static boost::compute::program create_program (boost::compute::command_queue const &queue)
 

Private Attributes

boost::compute::command_queue queue_
 
boost::compute::program program_
 
boost::compute::kernel initial_
 
boost::compute::kernel intermediate_
 
std::size_t max_workgroup_size_
 
std::size_t sizeof_output_type_
 
std::size_t scratch_size_
 
std::size_t effective_workgroup_size_
 
boost::compute::vector< output_type > ping_
 
boost::compute::vector< output_type > pong_
 

Detailed Description

template<typename reducer, typename input_type_t, typename output_type_t>
class jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >

Implement a generic reducer for OpenCL.

Aggregating all the values in a vector into a single value, also known as a reduction, is a common building block in parallel algorithms. All reductions follow a common form, so this template class implements a generic reduction given the aggregation function and its input and output types.

This implementation uses a parallel reduction. For a general motivation and description, please see: http://developer.amd.com/resources/articles-whitepapers/opencl-optimization-case-study-simple-reductions/
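To illustrate the general shape of a parallel reduction, the following is a minimal host-side sketch in plain C++. It is not the JayBeams kernel: the function name tree_reduce and its parameters are hypothetical, and the sketch assumes the aggregation function is associative and commutative and that an identity value is available to pad the input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side simulation of a tree reduction within a single work-group.
// `combine` plays the role of the aggregation function (hypothetical name).
template <typename Output, typename Input, typename Combine>
Output tree_reduce(std::vector<Input> const& input, Output identity,
                   Combine combine) {
  // Pad the scratch area to a power of two so every stride halves evenly.
  std::size_t n = 1;
  while (n < input.size()) {
    n *= 2;
  }
  assert((n & (n - 1)) == 0);  // n is a power of two

  std::vector<Output> scratch(n, identity);
  for (std::size_t i = 0; i != input.size(); ++i) {
    scratch[i] = static_cast<Output>(input[i]);
  }

  // Each pass combines element i with element i + stride. On a device,
  // all iterations of the inner loop would run in parallel.
  for (std::size_t stride = n / 2; stride > 0; stride /= 2) {
    for (std::size_t i = 0; i != stride; ++i) {
      scratch[i] = combine(scratch[i], scratch[i + stride]);
    }
  }
  return scratch[0];
}
```

For example, calling tree_reduce with a std::vector<int>, an identity of 0L, and a lambda that adds two long values computes the sum of the vector as a long. The separate input and output types mirror the input_type_t and output_type_t template parameters of this class.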

TODO(coryan): this class is a work in progress and is not fully implemented.

Template Parameters
reducer: a class derived from generic_reduce<reducer,...>. Please see jb::opencl::reducer_concept for details.
input_type_t: the host type that represents the input.
output_type_t: the host type that represents the output.

Definition at line 37 of file generic_reduce.hpp.

Member Typedef Documentation

◆ input_type

template<typename reducer, typename input_type_t, typename output_type_t>
using jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::input_type = input_type_t

The host type used to represent the input into the reduction.

Definition at line 44 of file generic_reduce.hpp.

◆ output_type

template<typename reducer, typename input_type_t, typename output_type_t>
using jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::output_type = output_type_t

The host type representing the output of the reduction.

Definition at line 47 of file generic_reduce.hpp.

◆ vector_iterator

template<typename reducer, typename input_type_t, typename output_type_t>
using jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::vector_iterator = typename boost::compute::vector<input_type>::iterator

The iterator type for the vector used to store the results.

The final output is a single element, but OpenCL makes it easier to treat it as a vector with a single element.

Definition at line 55 of file generic_reduce.hpp.

Constructor & Destructor Documentation

◆ generic_reduce()

template<typename reducer, typename input_type_t, typename output_type_t>
jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::generic_reduce ( std::size_t  size,
boost::compute::command_queue const &  queue 
)
inline

Member Function Documentation

◆ create_program()

template<typename reducer, typename input_type_t, typename output_type_t>
static boost::compute::program jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::create_program ( boost::compute::command_queue const &  queue)
inline static

Definition at line 238 of file generic_reduce.hpp.

References jb::error, jb::opencl::generic_reduce_program_source, and JB_LOG.

◆ execute() [1/2]

template<typename reducer, typename input_type_t, typename output_type_t>
template<typename InputIterator >
boost::compute::future<vector_iterator> jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::execute ( InputIterator  begin,
InputIterator  end,
boost::compute::wait_list const &  wait = boost::compute::wait_list() 
)
inline

Schedule the execution of a reduction.

The algorithm works in phases. Each phase runs on the OpenCL device and reduces its input to a (typically much smaller) vector, which is stored in either the ping_ or pong_ member variable.

If necessary, the algorithm schedules multiple phases asynchronously, each waiting for the previous one to complete, until the output has been reduced to a vector with a single element.
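The phased structure described above can be sketched on the host as follows. This is a simplified illustration, not the code in generic_reduce.hpp: the names run_phase and reduce_in_phases are hypothetical, the reduction is fixed to a sum of long values, and each simulated work-group is assumed to produce exactly one partial result per phase.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One phase: each simulated work-group sums up to `workgroup_size`
// consecutive inputs and writes a single partial result.
std::vector<long> run_phase(std::vector<long> const& in,
                            std::size_t workgroup_size) {
  std::size_t const groups = (in.size() + workgroup_size - 1) / workgroup_size;
  std::vector<long> out(groups, 0L);
  for (std::size_t i = 0; i != in.size(); ++i) {
    out[i / workgroup_size] += in[i];
  }
  return out;
}

// Repeat phases, alternating between two buffers, until one element is left.
// Returns the reduced value and the number of phases that were executed.
std::pair<long, std::size_t> reduce_in_phases(std::vector<long> const& input,
                                              std::size_t workgroup_size) {
  assert(workgroup_size >= 2);  // otherwise a phase would not shrink the data
  if (input.empty()) {
    return std::make_pair(0L, std::size_t{0});
  }
  std::vector<long> ping = run_phase(input, workgroup_size);
  std::vector<long> pong;
  std::size_t phases = 1;
  while (ping.size() > 1) {
    // The previous output becomes the input of the next phase.
    pong = run_phase(ping, workgroup_size);
    std::swap(ping, pong);
    ++phases;
  }
  return std::make_pair(ping[0], phases);
}
```

With 1000 inputs and a work-group size of 8, the intermediate vectors in this sketch have 125, 16, 2, and finally 1 element, so four phases are needed. On a real device each phase would be enqueued with a dependency on the event of the previous phase rather than executed synchronously.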

Parameters
begin: the beginning of the range to be reduced.
end: the end of the range to be reduced.
wait: a list of events to wait for before any work starts on the device.
Returns
a boost::compute::future<>. When the future is ready, it contains an iterator pointing to the result. Subsequent calls to execute() invalidate this iterator.

Definition at line 126 of file generic_reduce.hpp.

References jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::effective_workgroup_size_, jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::initial_, jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::intermediate_, JB_ASSERT_THROW, jb::p2ceil(), jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::ping_, jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::pong_, jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::queue_, and jb::testing::defaults::size.

Referenced by jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::execute().

◆ execute() [2/2]

template<typename reducer, typename input_type_t, typename output_type_t>
boost::compute::future<vector_iterator> jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::execute ( boost::compute::vector< input_type > const &  src,
boost::compute::wait_list const &  wait = boost::compute::wait_list() 
)
inline

Schedule a reduction for a full vector.

See the other overload of this member function for details.

Parameters
src: the vector to be reduced.
wait: a wait list that must be completed before the reduction starts.
Returns
a boost::compute::future<>. When the future is ready, it contains an iterator pointing to the result. Subsequent calls to execute() invalidate this iterator.

Definition at line 231 of file generic_reduce.hpp.

References jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::execute().

Member Data Documentation

◆ effective_workgroup_size_

template<typename reducer, typename input_type_t, typename output_type_t>
std::size_t jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::effective_workgroup_size_
private

◆ initial_

template<typename reducer, typename input_type_t, typename output_type_t>
boost::compute::kernel jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::initial_
private

◆ intermediate_

template<typename reducer, typename input_type_t, typename output_type_t>
boost::compute::kernel jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::intermediate_
private

◆ max_workgroup_size_

template<typename reducer, typename input_type_t, typename output_type_t>
std::size_t jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::max_workgroup_size_
private

◆ ping_

template<typename reducer, typename input_type_t, typename output_type_t>
boost::compute::vector<output_type> jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::ping_
private

◆ pong_

template<typename reducer, typename input_type_t, typename output_type_t>
boost::compute::vector<output_type> jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::pong_
private

◆ program_

template<typename reducer, typename input_type_t, typename output_type_t>
boost::compute::program jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::program_
private

◆ queue_

template<typename reducer, typename input_type_t, typename output_type_t>
boost::compute::command_queue jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::queue_
private

◆ scratch_size_

template<typename reducer, typename input_type_t, typename output_type_t>
std::size_t jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::scratch_size_
private

◆ sizeof_output_type_

template<typename reducer, typename input_type_t, typename output_type_t>
std::size_t jb::opencl::generic_reduce< reducer, input_type_t, output_type_t >::sizeof_output_type_
private

The documentation for this class was generated from the following file: