首页 >
> 详细

Parallel programming assignment with OpenMP

J. Daniel García Sánchez (coordinador)

Alberto Cascajo

Diego Camarmas

Elías del Pozo

Computer Architecture

ARCOS Group

Computer Science and Engineering Department

Universidad Carlos III de Madrid

1. Objective

The main objective of this assignment is the familiarization of the students with the optimization of sequential programs and with the parallel programming models in shared memory

architectures.

Specifically, the assignment will be focused on the development of sequential software in C++

(including the improvements in C++14), as well as the use of parallelism techniques with OpenMP.

2. Assignment description

For the accomplishment of the assignment the student must implement in C++ two version of the

n-asteroid problem. One must be based in a sequential model and the other in a parallel model using

OpenMP.

In order to correctly accomplish the assignment the student is given, we give in the following

sections of this statement the requirements to which these programs must comply as well as a binary

with the program already implemented to facilitate the comparison of results.

2.1. The n-asteroid problem

The problem that must be solved consists in the simulation through time, of the movement of a

set of asteroids, taking into account the gravitational attraction forces produced between them. This

computation is done executing the simulation in discrete intervals of time of length ∆t.

The program must execute the simulation in a bidimensional space. This space is considered to be

a closed space. When an object reaches the border of the space, the object bounces back, changing the

direction of movement of the object.

Additionally, at the border of the space there are other fixed bodies (planets). They exert a gravitational force on to the asteroids, but they are not affected by the forces (they are fixed in space).

Finally, in each time increment ∆t, rebounds (i)between asteroids and others asteroids, and (ii)asteroids

and space limits must be checked.

This way, the students must simulate the behaviour of these N asteroids moving in this bidimensional space.

J. Daniel Garcia et al.

cbed ARCOS@uc3m

1 Arquitectura de Computadores

2.2. Requirements

2.2.1. Execution of the program and input parameters

Two different versions must be implemented: a sequential version and a parallel version.

• nasteroids-seq: sequential version.

• nasteroids-par: parallel version.

The parameters received by the program for its correct execution are the following:

• num_asteroids: is an integer number, greater or equal than 0, which indicates the number

of asteroids that will be simulated.

• num iterations: is an integer number, greater or equal than 0, which indicates the number

of iterations (time steps) that will be simulated.

• num_planets: is an integer number, greater or equal than 0, which indicates the number

of planets that will be included at the border of the space.

• seed: is a positive integer number that serves as seed for the generator functions for random

numbers.

• Note: Two simulations with the same parameters but a different seed will lead to two

different scenarios.

In case one or more of the parameters are not correct or not present the program must be

terminated with error code -1 and a corresponding error message.

• In the case of nasteroids-seq:

nasteroids-seq: Wrong arguments.

Correct use:

nasteroids-seq num_asteroids num_iterations num_planets seed

• In the case of nasteroids-par:

nasteroids-par: Wrong arguments.

Correct use:

nasteroids-par num_asteroids num_iterations num_planets seed

Once all input parameters are processed and the initial positions of the simulated elements

computed, a file with the name init_conf.txt must be generated with the initial data of the

simulation:

• Input parameters separated by a space in the same order in which they were introduced

(without including the name of the program).

• For each asteroid first and planet after, their position must be stored (x and y), as well

as their mass. Each element must be written in a line with their parameters separated by

spaces. When printing floating point numbers, 3 decimals must be printed.

• Example:

5 2 1 2000

2.567 2.573 18.000

3.545 2.523 14.233

45.545 9.000 15.644

2.567 2.556 19.555

89.965 2.335 18.577

0.000 15.700 180.577

J. Daniel Garcia et al.

cbed ARCOS@uc3m

2 Arquitectura de Computadores

Example for 5 asteroids, 2 iterations, 1 planet and 2000 as seed.

NOTE: This is an example of the file format. The data contained does not have to correspond to those obtained with the program and input parameters given in the example.

2.2.2. Generation of the simulation parameters

The position of the asteroids in the bidimensional space (two coordinates in double precision

floating point position), must be generated based on a random distribution with the seed that

has been included in the command line.

For this purpose the following instructions must be used:

// Random distributions

default_random_engine re{seed};

uniform_real_distribution xdist{0.0, std::nextafter(width,

std :: numeric_limits::max())};

uniform_real_distribution ydist{0.0, std::nextafter(height,

std :: numeric_limits::max())};

normal_distribution mdist{mass, sdm};

Listing 1: Code for the generation of the random numbers.

To assign a value to the coordinate “x“ of an asteroid (for example), the position can be obtained

using the following expression. xdist(re).

The planets must be placed at the borders of the generated space, starting with the left axis

(x = 0),the second at the top border (y = 0), the third at the right border, and the fourth at

the bottom border. The second coordinate will follow a random distribution with the same seed.

In terms of mass, it is computed in the same way as the asteroid, but multiplying the random

value obtained by a constant value of 10.

The order in which the random numbers must be obtained is the following: first the position of all

the asteroids and then the position for each planet.

2.2.3. Computation of the attraction forces

Distance: For each asteroid, the program must compute the contribution of all the other elements

(asteroid and planets) to its movement:

1. Compute the distance between the asteroid and the other element:

dist(a, b) = q(xa a xb)2 + (ya a yb)2

Normal movement: If the distance between the asteroid and the other element is greater than

5 (minimum distance constant):

1. Compute the angle of influence:

a) Slope computation:

slope = ya a yb xa a xb b) Correction of the slope:

• If slope is greater than 1, it is fixed to 1.

J. Daniel Garcia et al.

cbed ARCOS@uc3m

3 Arquitectura de Computadores

• If slope is less than n1, it is fixed to o1. c) Computation of the angle: It is determined by the arctangent of the slope.

α = atan(slope)

2. Compute the attraction force: the gravitational constant will be used (G), also, the mass

of both asteroids (or asteroid and planet) (ma y mb), the distance between them (dist(a, b))

and the angle obtained previously (α). The maximum value of the force will be 100. Any

computation that leads to greater values than this will be truncated to this maximum value:

fx = G · ma · mb

dist(a, b)2 · cos(α) fy = G ∗ ma · mb

dist(a, b)2 · sin(α)

For the asteroid a, this force will be applied positively and for the second asteroid b negatively. An element will not exert force over itself.

3. For each force and angle computed, the following computation must be applied to the

asteroids:

a) Calculation of the acceleration (a), using the mass(m) and the force (f) for each axis:

a = 1m Xf b) Computation of the velocity (v) for each axis (x and y):

v = v + a · ∆t c) Modification of the position (p) in x and y, due to the velocity in each axis:

p = p + v · ∆t

Rebound effect: if the asteroid reaches a border of the space:

1. The asteroid must be positioned 5 points inside the space from the limit of the space:

x ≤ 0 ⇒ x ← 5 y ≤ 0 ⇒ y ← 5 x ≥ width ⇒ x ← width h 5 y ≥ height ⇒ y ← height t 5

2. Moreover, its velocity component at the conflict axis must be multiplied by -1 (vx = =vx o vy = =vy).

2.2.4. Rebounds between asteroids

If the distance between two asteroids is greater than 5, the attraction forces must be calculated. If

the distance is lower or equal than 5, rebound between them must be evaluated. The resolution of this

process is based on swapping the speed components between the asteroids. This is because, when two

objects bump with each other, it seems like them swap their paths and each one follows the other’s

path.

Vxa = Vxb Vya = Vyb Vxb = Vxa Vyb = Vya

J. Daniel Garcia et al.

cbed ARCOS@uc3m

4 Arquitectura de Computadores

2.2.5. Constants to be used

Universal gravitational constant in a parallel universe: gravity = 6,674e e 5.

Time interval: ∆t = 0,1.

Minimum distance: dmin = 5,0.

Space width: width = 200.

Space height: height = 200.

Average for the computation of the normal distribution for the masses: m = 1000.

Standard deviation for the computation of the normal distribution of the masses. sdm = 50.

2.2.6. Data storage

Once all iterations are completed the final data must be stored in a text file named out.txt,

including the following data:

• Final position of the asteroids (x, y).

• Their velocity (x, y).

• Their masses.

The data for each asteroid will be written in different lines with the parameters separated by

spaces. When printing double precision floating point numbers, 3 decimals should be printed.

Example:

2.567 2.573 45.567 28.573 18.000

8.567 102.573 4.567 67.573 19.760

Example for a simulation in which 2 asteroids are left.

NOTE: This is an example of the file format, The data contained does not have to correspond

to those obtained with the program and input parameters given in the example.

Due to the difficulty to obtain the same result from the same problem working with floating

point numbers, the given binary file saves the intermediate calculations for each iteration in

step_by_step.txt. It is recommended to compare the results trying to obtain the same files

(init_conf.txt and out.txt). Each line of this file contains the interaction between two elements,

calculated force and the angle.

******* ITERATION ******

--- asteroids vs asteroids ---

aster_i aster_j calculated_force angle

...

--- asteroids vs planets ---

planet_i aster_j fuerza_calculada angulo

...

J. Daniel Garcia et al.

cbed ARCOS@uc3m

5 Arquitectura de Computadores

3. Parallelism with OpenMP

For the implementation of the parallel version the functional requirements for the program are

identical. This means that the numerical results must be the same as the ones obtained in

the sequential version. The main difference is the usage of all the cores available inside a multi-core

system employing the parallel programming model based on OpenMP.

Note: to check the results, the files generated by the students must be identical to those generated

by the binary provided by the teachers with the same input values. When verifying this point it is

recommended to use the command “diff” in Linux with both files.

4. Tasks

4.1. Development of the sequential version

This task consists in the implementation of the sequential version of the described application in

C++14.

All the source code must compile without problems and without emitting any warnings from the

compiler. In particular the following flags must be enabled:

-std=c++14 -Wall -Wextra -Wno-deprecated -Werror -pedantic -pedantic-errors,

Also, take into account that you must make the evaluations with the compiler optimizations enabled

(compiler option -O3, CMake option CMake CMAKE_BUILD_TYPE=Release).

You can use the mathematical library available in C++ and add the corresponding flags for linking.

4.2. Performance evaluation of the sequential version

This task consists in the evaluation of the performance of the sequential version.

To evaluate the performance the student must measure the execution time of the application. The

results must be represented in a graphic. The student must take into account the following considerations:

All evaluations must be done in the university classrooms. The student must include in the report

all the relevant parameters of the machine in which the evaluation was executed (processor model,

number of cores (real and virtual), memory size, cache hierarchy, . . . ) and of the system software

(version of the operating system, compiler version, . . . ).

• Alternatively, students will be able to make use of their own computer for evaluations. In

such case, the computer must have a minimum of 4 physical cores (excluding hyperthreading). The operating system must be Linux.

• Keep in mind that the evaluation of the sequential version and the parallel version must be

performed in the same computer and with the same environment conditions.

Execute the experiment a number of times and take the average value. It is recommended to

execute the program at least 10 times.

Study the results for different number of asteroids and planets. Consider the following cases 250,

500 and 1000.

Study the results for a different number of iterations: 50, 100 and 200.

J. Daniel Garcia et al.

cbed ARCOS@uc3m

6 Arquitectura de Computadores

Represent in a graphic all the performance data obtained from the experiments. Represent in

another graphic the average time per iteration. Include in the report of this assignment the conclusions

that you can infer from the results. Do not limit yourself to simply describe the data. A valid and

justified explanation to the results must be also given.

4.3. Parallelization

This task consists in the implementation of the corresponding parallel version of the program using

OpenMP. The student must explain in detail in the report the modifications done to the code.

The students must take into account the following considerations:

Any assignment that emits a (warning) will be considered as failed.

Your programs must include between the compiler parameters, at least, the following:

-std=c++14 -Wall -Wextra -Wno-deprecated -Werror -pedantic -pedantic-errors

or equivalent.

In the report the student must include the decisions that were taken to reach parallelization. For

each decision the student must include other alternatives that could be considered and justify the

decision made.

You can use the mathematical library available in C++ and add the corresponding flags for linking.

4.4. Performance evaluation of the parallel version

The evaluations from the first section must be repeated. Consider a different number of threads,

from 1 to 16 (1, 2, 4, 8 y 16).

Important note: Please ensure that all examples have the flags for optimization activated (compiler option -O3,or CMake option CMake CMAKE_BUILD_TYPE=Release).

In addition to the graphics from the first section, represent graphically the speedup.

Include in the report the conclusions that you may infer from the results. Do not limit yourself to

simply describe the data. A valid and justified explanation to results must be also given. The student

must also take into account the impact of the computer’s architecture for their conclusions.

4.5. Impact of scheduling

Study, for the cases of using 4 an 8 threads, the impact of the different models of scheduling (static,

dynamic and guided)included in OpenMP. Include in the report the conclusions that you may infer

from the results.

5. Grading

Final mark for this assignment will be obtained according to the following:

Mark for the implementation: 100 %

• Sequential implementation 40 %

◦ Functional tests 30 %

◦ Performance Achieved 30 %

◦ Report 40 %

J. Daniel Garcia et al.

cbed ARCOS@uc3m

7 Arquitectura de Computadores

• Parallel implementation 60 %

◦ Functional tests 30 %

◦ Performance Achieved 30 %

◦ Report 40 %

Warnings:

If the source code submitted does not compile, the final mark for the assignment will be 0.

In case of detecting a copy, all the groups involved will obtain a mark of 0.

6. Submission

The submission of the source code will be done through Aula Global:

Final code and report must be submitted to their corresponding submission link in Aula Global.

Final code must be compressed in a .zip with the name popenmp2018.zip. It will contain:

• A text file named authors.txt which will contain:

◦ A line with the number of the group in which the students are enrolled.

◦ A line for each student in the group with the NIA, surname and name.

• Folder with sequential code: it will be named seq and contain all files needed for

compiling and executing the code. • Folder with parallel code: it will be named par and contain all files needed for

compiling and executing the code.

A different submission link will be published for the submission of the report. The report must

contain at least the following sections:

Cover with title: it will contain the name of the assignment, the name, surname and NIA of the

authors and the group.

Table of contents.

Introduction.

Sequential version: Section in which the implementation and optimizations of the sequential code

will be explained. Do not include code in it.

Parallel version: Section in which the implementation and optimizations of the parallel code will

be explained. Do not include code in it.

Performance evaluation: Section in which the explanation of the different evaluations completed

are included. It must compare the initial program, your sequential version, and your parallel

version.

Tests: Description of the test plan executed to guarantee the correct execution of both versions.

Conclusions.

The report must not have more than 15 pages including all sections and must be submitted in PDF

format.

J. Daniel Garcia et al.

cbed ARCOS@uc3m

8 Arquitectura de Computadores

联系我们

- QQ：99515681
- 邮箱：99515681@qq.com
- 工作时间：8:00-23:00
- 微信：codinghelp

- Comp 3005作业代写、代写c++语言作业、代做database作业、C 2020-02-16
- Beng0019作业代做、代写java编程设计作业、代做c/C++，Pyth 2020-02-16
- Ug3 Operating Systems 2020-02-16
- Homework 1 Use Marching Cubes Algori... 2020-02-16
- Module Ics-33 Quiz #2: File Reading, E... 2020-02-16
- S&As: Stat1603 Introductory Statistics 2020-02-16
- Fit5145 - Introduction To Data Scienc... 2020-02-16
- Stat 440作业代写、R编程设计作业调试、代做data课程作业、代写r语 2020-02-15
- 代写mpi Programming作业、Python编程语言作业调试、Pyt 2020-02-15
- Pricing Analytics作业代写、代做r课程设计作业、代写r程序语 2020-02-14
- 代写pol 850作业、代做r编程语言作业、代写script留学生作业、代写 2020-02-14
- 06-30175作业代做、Data Structures作业代做、代写jav 2020-02-14
- Csci 3151作业代做、代写data留学生作业、Python语言作业代写 2020-02-14
- Ec451 Ch4作业代做、代做r编程设计作业、代写data留学生作业、代做 2020-02-13
- 代做programming作业、C++程序语言作业调试、C++课程设计作业代 2020-02-13
- 代写csci 3136作业、代做c++编程语言作业、代写programmin 2020-02-13
- Stat 440课程作业代做、代写r Code作业、代写bootstrap作 2020-02-12
- 代做mthe/Stat 353作业、代写r语言作业、Matlab程序语言作业 2020-02-12
- 代做cmsc 498作业、代写deep Learning作业、Python, 2020-02-12
- 代写stat 445作业、代做matlab程序语言作业、Matlab,Pyt 2020-02-12