OpenMP is an API that allows developers to easily write shared-memory parallel applications in C/C++ and Fortran. A shared-memory program runs on a single system and uses its multiple processing units, or cores, to compute concurrently within the same memory space. OpenMP provides high-level compiler directives such as `#pragma omp parallel`, which marks the parts of the code that should run concurrently, and `#pragma omp critical`, which marks critical sections. The runtime then automatically handles many of the low-level complications, such as managing threads (each core's line of execution) and locks (constructs used to avoid problems like races and deadlocks). It even handles reductions, combining each thread's partial result into a final answer.
Open MPI is an implementation of the Message Passing Interface (MPI), used on distributed-memory architectures. Unlike the shared-memory model described above, distributed memory uses a collection of independent core-memory pairs that synchronize over a network, an arrangement most commonly found in supercomputers. Because each core or node has a memory of its own, it does not require locks the way shared memory does. However, synchronization is still required to distribute the computation and collect results, and that is done through message passing. Open MPI provides API calls such as `MPI_Send` and `MPI_Recv` for communication between computation nodes. Unlike with OpenMP, each computational unit has to send its results to a master process, which manually aggregates them into the final result.
Both of these libraries thus implement different standards of parallel computing for two different architectures.
The table below summarizes the differences:
| OpenMP | Open MPI |
| --- | --- |
| High-level API for shared-memory parallel computing | High-level implementation of the Message Passing Interface (MPI) for distributed-memory systems |
| Runs parallel code on a single multi-core system | Runs parallel code on multiple systems connected by a network |
| Automatically creates threads and handles synchronization | Provides an API that lets the programmer control communication between distributed nodes |
| Automatically reduces/combines the final results | Programmer has to manually receive and combine the results |
| Can run offline in isolation | Needs a network to function |
| Can execute directly on any multi-core system | Needs a setup step before execution to identify each node in the network |
| Maintained by the OpenMP Architecture Review Board | Open source under a BSD license |
| Available in C, C++, and Fortran | Available in many languages through bindings; ships wrapper compilers for C, C++, and Fortran |
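The build-and-run differences in the table show up directly on the command line. The commands below are a typical sketch (program names and the hostfile are hypothetical; an installed GCC and Open MPI toolchain is assumed):

```shell
# OpenMP: compile with the OpenMP flag, run directly on one machine
gcc -fopenmp sum_omp.c -o sum_omp
OMP_NUM_THREADS=4 ./sum_omp

# Open MPI: compile with the wrapper compiler, then launch across nodes
mpicc sum_mpi.c -o sum_mpi
# hosts.txt lists the machines taking part in the computation
mpirun -np 4 --hostfile hosts.txt ./sum_mpi
```

The hostfile step is what the "needs a setup before execution" row refers to: Open MPI must know which nodes make up the distributed system before it can launch processes on them.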