OpenMP Example from Comp++

Postby Cuchulainn » Mon Jul 02, 2007 8:19 am

Here is the source code for one of the blogs on Comp++ (first register, then you can log in)
Attachments
TestSerial.cpp
(2.17 KiB) Downloaded 717 times
User avatar
Cuchulainn
 
Posts: 673
Joined: Mon Dec 18, 2006 2:48 pm
Location: Amsterdam, the Netherlands

Postby Cuchulainn » Mon Jul 09, 2007 12:39 am

Here is an example of loop-level optimisation

///



C++ Parallel Programming in Computational Finance, Part II: A "Hello World, 101" Example using OpenMP



Daniel J. Duffy



In the previous blog (Part I) I gave a high-level overview of the OpenMP Application Programming Interface (API). OpenMP consists of library functions, directives and environment variables that allow developers to create multi-threaded code on shared-memory architectures.

In this blog I would like to give a concrete example of using OpenMP in C++ code. The example is simple, but it shows how serial code can be made parallel. We concentrate on a problem that creates two STL vectors, calculates their inner product and then prints them on the console. The code was compiled under Microsoft VS2005, which supports OpenMP (enabled with the /openmp compiler option). We have created a Win32 console application and all code is placed in one file. The first statements tell the compiler that we are using the STL and OpenMP:





#include <vector>
#include <iostream>
#include <omp.h>

using namespace std;

// Forward declarations so that main() compiles; the function
// definitions follow further on in the same file
double InnerProduct(const vector<double>& v1, const vector<double>& v2);
void print(const vector<double>& vec);



After having done this we can then use the OpenMP API library functions to parallelise the code. The main program is very simple:





int main()
{
    // Preprocessing: input
    cout << "Give size of the arrays: ";
    int N; cin >> N;

    cout << "Give value in the first array: ";
    double val1; cin >> val1;

    cout << "Give value in the second array: ";
    double val2; cin >> val2;

    // Processing: data and algorithms
    vector<double> v1(N, val1);
    vector<double> v2(N, val2);

    double result = InnerProduct(v1, v2); // Sum of products

    // Postprocessing: output
    print(v1);
    print(v2);

    cout << endl << "Inner product is: " << result << endl;

    return 0;
}



This program prompts for input and then creates two STL vectors. It then calculates their inner product and prints both of them on the console. This program is serial but the functions for calculating the inner product and printing use loop-level parallel pragmas. First, the code for the inner product is:



double InnerProduct(const vector<double>& v1, const vector<double>& v2)
{
    // Assume the sizes of v1 and v2 are equal
    double result = 0.0;

    // Perform a reduction: each thread accumulates a private partial
    // sum that OpenMP adds into result at the end of the loop
    #pragma omp parallel for reduction (+: result)
    for (int j = 0; j < (int)v1.size(); ++j)
    {
        result += v1[j] * v2[j];
    }

    // Implicit barrier here
    return result;
}



The presence of the OpenMP directive ensures that the master thread forks a number of child threads. Each thread is allocated part of the loop's iteration space to calculate the inner product. Each thread's contribution is added to the shared variable result; the special keyword reduction adds the individual contributions while avoiding race conditions on result.

The code for printing a vector is given by:





void print(const vector<double>& vec)
{
    cout << endl;

    // We only read the values of vec, so the default shared variable
    // access is safe; note that the output order is non-deterministic,
    // since each thread prints its own chunk of indices
    #pragma omp parallel for
    for (int j = 0; j < (int)vec.size(); ++j)
    {
        cout << "vec[" << j << "] = " << vec[j] << endl;
    }

    // Implicit barrier here
    cout << endl;
}



In this case, multiple threads are created and each thread is responsible for printing one block of the vector (so the blocks may appear in any order). We mention that a so-called implicit barrier exists in both functions at the end of the loop: the child threads are joined there and the code returns to serial execution.

We can draw some conclusions. First, it is easy to incorporate parallel directives into serial code in order to improve speedup. Second, the OpenMP API takes care of thread creation and destruction, which lessens the burden on the developer. Finally, loop-level parallelism can be used to improve the performance of matrix-based computations in finance; for example, in some cases a speedup of 80% is possible on dual-core machines.

The full source code for the test program can be found at www.datasimfinancial.com (where you can register and log into the forum, see the OpenMP thread, http://www.datasimfinancial.com/frm/viewtopic.php?t=96).



In the next blog I shall discuss the application of coarse-grained techniques to the development of efficient code for Monte Carlo applications.