OpenMP on Ubuntu

JArmstrong
Oct 16, 2018 · 3 min read

OpenMP is an API for running C, C++ and Fortran code on multiple processor cores at the same time. It can make loop-heavy code much faster by putting every core of your CPU to work. In case you just want to see the code, the link is here.

Setting up OpenMP on Ubuntu / Linux

I am not entirely sure how this works on other platforms. It takes extra work on Mac, and there are some weird setup instructions for Windows, but on Ubuntu and other Linux distributions it is really easy.

  1. Run sudo apt-get install libomp-dev in your Terminal. (GCC already ships its own OpenMP runtime, libgomp, so this package mostly matters if you build with Clang.)
  2. Create a C++ Project in your IDE (the menu names below match Eclipse CDT), and title it HelloOpenMP.
  3. Select your project, and go to the Properties dialog.
  4. Go to C/C++ Build -> Settings.
  5. Select GCC C++ Compiler / Miscellaneous.
  6. In the Other flags input, add -fopenmp.
  7. Select GCC C++ Linker / Libraries.
  8. In the Libraries (-l) field, click the add button and type in gomp.

Afterwards, your project properties should list -fopenmp under Other flags and gomp under Libraries (-l).

That’s it!

Using OpenMP

Say you have an awesome program that prints out a list of 10 numbers:

#include <stdio.h>

int main(){
    for(int i=0;i<10;i++){
        printf("%i\n",i);
    }
    return 0;
}

This outputs something like:

0
1
2
3
4
5
6
7
8
9

Now, let's OpenMPinize it!

#include <stdio.h>

int main(){
    #pragma omp parallel for
    for(int i=0;i<10;i++){
        printf("%i\n",i);
    }
    return 0;
}

This outputs something like:

4
7
6
9
0
1
8
2
3
5

The numbers are out of order because the loop iterations are split across several threads that run in parallel, so each iteration reaches its printf at a slightly different time.

Wait, what? “How could it be that easy?” I hear you say. It actually is this easy, if your compiler supports OpenMP. Any recent version of GCC should be fine, as long as you pass -fopenmp. And if your compiler doesn’t support it, the pragmas are ignored and your code falls back to single-core sluggishness. So the same source still builds and runs anywhere, just without the speedup.

The source code can be found here.

The End

I made a little Mandelbrot program using my custom PPM image library:

#include <math.h>
#include <stdio.h>
#include <chrono>
#include <omp.h>
#include "ppm.h"
#include "complex.h"
using namespace std::chrono;
// Timing helper adapted from https://stackoverflow.com/a/19555298/9609025
long curTime(){
milliseconds ms = duration_cast< milliseconds >(system_clock::now().time_since_epoch());
return ms.count();
}
int main(){
int w=1000;
int h=1000;

ppm img;
img.setSize(w,h);
img.allocMem();
long start,end;
start=curTime();
#pragma omp parallel for
for(int x=0;x<w;x++){
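// No second pragma here: the outer loop is already split across threads,
// and a nested parallel for would only add overhead.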
for(int y=0;y<h;y++){
// printf("%i %i\n",x,y);
float fx=x;
float fy=y;
fx/=w;
fy/=h;
fx*=4;
fy*=4;
fx-=2;
fy-=2;
complex c=fromXY(fx,fy);
complex c0=c;
int max=50;
int i=0;
for(i=0;i<max&&c.r<10000;i++){
c=c^2;
c=c+c0;
}
float f=((float)i)/((float)max);
img.setPixel(x,y,f);
}
}
end=curTime();
unsigned long diff1=end-start;
start=curTime();
for(int x=0;x<w;x++){
for(int y=0;y<h;y++){
// printf("%i %i\n",x,y);
float fx=x;
float fy=y;
fx/=w;
fy/=h;
fx*=4;
fy*=4;
fx-=2;
fy-=2;
complex c=fromXY(fx,fy);
complex c0=c;
int max=50;
int i=0;
for(i=0;i<max&&c.r<10000;i++){
c=c^2;
c=c+c0;
}
float f=((float)i)/((float)max);
img.setPixel(x,y,f);
}
}
end=curTime();
unsigned long diff2=end-start; // unsigned to match the %lu format below
printf("With OMP : %lums\n",diff1);
printf("Without OMP : %lums\n",diff2);
printf("Speedup : %lums\n",diff2-diff1);
img.clamp();
img.save("mandelbrot.ppm");
img.dealloc();
return 0;
}

On my computer, this gives the following:

With OMP    : 451ms
Without OMP : 1475ms
Speedup : 1024ms

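Pretty awesome! The "Speedup" line is the time saved; as a ratio, the OpenMP version is roughly 3.3 times faster (1475 ms / 451 ms).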

This can be applied to any C++ loop whose iterations can execute independently of each other, often with a HUGE speedup. I get an even bigger speedup when using OpenMP with some of my pathtracing algorithms.

OpenMP makes it much easier to run multiple things on the CPU at once, and it is much easier to use than OpenCL.

This story is published in The Startup, Medium’s largest entrepreneurship publication followed by + 379,528 people.
