It seems like Apple's built-in clang will never support OpenMP. I really used OpenMP only for its #omp parallel for
. Now that C++11
has support for multi-threading, I tried to write a very basic parallel for without OpenMP.
#include <vector>
#include <thread>
#include <iostream>
int main(int argc, char *argv[])
{
const size_t nloop = 11;
// Serial version
{
// Pre loop
std::cout<<"serial:"<<std::endl;
// loop over all items
for(int i = 0;i<nloop;i++)
{
// inner loop
{
const int j = i*i;
std::cout<<j<<std::endl;
}
}
// Post loop
std::cout<<std::endl;
}
// Parallel version
// number of threads
const size_t nthreads = std::thread::hardware_concurrency();
{
// Pre loop
std::cout<<"parallel ("<<nthreads<<" threads):"<<std::endl;
std::vector<std::thread> threads(nthreads);
std::mutex critical;
for(int t = 0;t<nthreads;t++)
{
threads[t] = std::thread(std::bind(
[&](const int bi, const int ei, const int t)
{
// loop over all items
for(int i = bi;i<ei;i++)
{
// inner loop
{
const int j = i*i;
// (optional) make output critical
std::lock_guard<std::mutex> lock(critical);
std::cout<<j<<std::endl;
}
}
},t*nloop/nthreads,(t+1)==nthreads?nloop:(t+1)*nloop/nthreads,t));
}
std::for_each(threads.begin(),threads.end(),[](std::thread& x){x.join();});
// Post loop
std::cout<<std::endl;
}
}
It's not as simple as slapping down #omp parallel for
but it's really just a few lines above and below the for loop. It can even determine the number of cores available and handle simple atomic operations.