c++ - without - openmp tutorial




Is it possible to do a reduction on an array with openmp? (4)

Does OpenMP natively support reduction of a variable that represents an array?

This would work something like the following...

float* a = (float*) calloc(4*sizeof(float));
omp_set_num_threads(13);
#pragma omp parallel reduction(+:a)
for(i=0;i<4;i++){
   a[i] += 1;  // Thread-local copy of a incremented by something interesting
}
// a now contains [13 13 13 13]

Ideally, there would be something similar for an omp parallel for, and if you have a large enough number of threads for it to make sense, the accumulation would happen via binary tree.


Array reduction is now possible with OpenMP 4.5 for C and C++. Here's an example:

#include <iostream>

int main()
{

  int myArray[6] = {};

  #pragma omp parallel for reduction(+:myArray[:6])
  for (int i=0; i<50; ++i)
  {
    double a = 2.0; // Or something non-trivial justifying the parallelism...
    for (int n = 0; n<6; ++n)
    {
      myArray[n] += a;
    }
  }
  // Print the array elements to see them summed   
  for (int n = 0; n<6; ++n)
  {
    std::cout << myArray[n] << " " << std::endl;
  } 
}

Outputs:

100
100
100
100
100
100

I compiled this with GCC 6.2. You can see which common compiler versions support the OpenMP 4.5 features here: http://www.openmp.org/resources/openmp-compilers/

Note from the comments above that while this is convenient syntax, it may invoke a lot of overheads from creating copies of each array section for each thread.



OpenMP can perform this operation as of OpenMP 4.5 and GCC 6.3 (and possibly lower) supports it. An example program looks as follows:

#include <vector>
#include <iostream>

int main(){
  std::vector<int> vec;

  #pragma omp declare reduction (merge : std::vector<int> : omp_out.insert(omp_out.end(), omp_in.begin(), omp_in.end()))

  #pragma omp parallel for default(none) schedule(static) reduction(merge: vec)
  for(int i=0;i<100;i++)
    vec.push_back(i);

  for(const auto x: vec)
    std::cout<<x<<"\n";

  return 0;
}

Note that omp_out and omp_in are special variables and that the type of the declare reduction must match the vector you are planning to reduce on.


OpenMP cannot perform reductions on array or structure type variables (see restrictions).

You also might want to read up on private and shared clauses. private declares a variable to be private to each thread, where as shared declares a variable to be shared among all threads. I also found the answer to this question very useful with regards to OpenMP and arrays.





reduction