Introduction to OpenMP


Postby Cuchulainn » Sun Jun 10, 2007 2:16 pm

This blog is the first in a series of three on the application of parallel programming techniques to Computational Finance, with special attention paid to Monte Carlo and PDE techniques. The current blog is meant as an overview of OpenMP at 30,000 feet.

OpenMP is a specification consisting of a collection of compiler directives (called pragmas), library functions and environment variables that developers can use to express shared-memory parallelism in C. By shared memory we mean that all threads share a single address space and communicate with each other by writing and reading shared variables. The OpenMP Application Programming Interface (API) is portable across shared-memory architectures. It has been designed so that the same program can run in both parallel and sequential modes. It is especially useful for large array-based applications. The main features are:

Support for parallelization of loops and iterations

Work-sharing constructs and the construction of parallel regions

Synchronisation constructs

Sharing and privatization of data

We shall discuss how to realize the above features in OpenMP in the next blog.
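As a foretaste of the next blog, the features in the list above can be sketched in a single small routine: a work-sharing loop inside a parallel region, with per-thread private partial sums that are combined (synchronised) by a reduction clause. The function dot is a hypothetical example of mine, not taken from any library.

```c
#include <stddef.h>

/* Dot product parallelised with a work-sharing loop.  The
   reduction(+:sum) clause gives each thread a private copy of sum
   and combines the partial results at the implicit barrier that
   ends the loop. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    /* Signed loop index: required by OpenMP 2.x work-sharing loops. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

This one fragment touches all four bullets: loop parallelization, a work-sharing construct inside a parallel region, synchronisation (the implicit barrier), and privatization of data (the reduction variable).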

Many engineering, scientific and computational finance applications are expressed in the form of iterative constructs. In these cases the code consists of various kinds of loops. In general, loops are needed when we navigate in (hierarchical) data structures such as vectors, matrices, trees and graphs. Improving the performance of applications that rely heavily on loop constructs is a high priority. Given that many developers will choose to port their serial programs to equivalent parallel ones using OpenMP, we now discuss the forces to be reckoned with when we port loop-based code written for serial machines to parallelised loop code. There are a number of forces to examine that are related to the correctness and efficiency of the parallel code:

. Sequential equivalence: a given program should produce the same results when executed with one thread or with more than one thread. We apply a series of transformations to the loops in order to achieve this end. These are called semantically neutral transformations if they leave the semantics of the program unchanged. A point to note is that, due to round-off errors, the results from a serial loop may differ from those of the equivalent parallel loop. For example, adding floating-point numbers in a loop gives different answers depending on the order in which they are added, and in a parallel loop that order depends on how the iterations are distributed among the threads. If this is unacceptable we should decide not to parallelise the loop.
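The round-off point can be demonstrated with a small sketch of my own (the function names are hypothetical): the serial and parallel sums of the harmonic series agree only up to floating-point round-off, because the reduction combines the per-thread partial sums in an unspecified order.

```c
/* Serial sum of 1/1 + 1/2 + ... + 1/n, accumulated in index order. */
double harmonic_serial(int n)
{
    double s = 0.0;
    for (int i = 1; i <= n; ++i)
        s += 1.0 / i;
    return s;
}

/* Parallel version: each thread accumulates a private partial sum
   over its chunk of iterations; the partial sums are then combined
   in an order chosen by the runtime, so the last few bits of the
   result may differ from the serial answer. */
double harmonic_parallel(int n)
{
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (int i = 1; i <= n; ++i)
        s += 1.0 / i;
    return s;
}
```

Checking that the two results agree to within a small tolerance, rather than exactly, is the practical form of the sequential-equivalence test for floating-point loops.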

. Incremental parallelism: we parallelise a program incrementally, for example one loop at a time. We apply a sequence of incremental transformations and test sequential equivalence after each increment. In this way we can be assured of correctness on the one hand and measure the performance improvement on the other.
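A minimal sketch of one such increment (the function scale is my own illustration): a single loop is parallelised while the rest of the program is left serial, and the result is checked against the serial answer before moving on to the next loop.

```c
#include <stddef.h>

/* One increment of parallelisation: only this loop gets a pragma.
   Each iteration touches a distinct element of v, so the loop is
   trivially safe to parallelise; after verifying that the output
   matches the serial version, we would move on to the next loop. */
void scale(double *v, size_t n, double a)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i)
        v[i] *= a;
}
```

If compiled without OpenMP the pragma is ignored and the loop runs serially, which makes the increment easy to test in both modes.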

. Memory utilization: in order to achieve good performance it is important to ensure that the way data is accessed (for example, using indexing operators or by dereferencing) is consistent with the memory hierarchy of the machine, in particular its cache behaviour.
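As an illustrative sketch (matvec is a hypothetical name of mine), consider a matrix-vector product over a row-major matrix: parallelising over rows lets each thread stream through contiguous memory with unit stride, which is the cache-friendly access pattern; swapping the loops would make every access a stride-n jump.

```c
#include <stddef.h>

/* y = A x for a row-major n-by-n matrix stored in a flat array.
   The inner loop walks A with unit stride, matching the row-major
   layout, so each thread streams through contiguous memory.  The
   partial sum s is a per-iteration local, so there is no sharing
   between threads. */
void matvec(const double *A, const double *x, double *y, size_t n)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        double s = 0.0;
        for (size_t j = 0; j < n; ++j)
            s += A[(size_t)i * n + j] * x[j];
        y[i] = s;
    }
}
```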