Program Assignment #2

Due date: Nov. 16, 2021

Problem 1: Matrix-Matrix Multiplication

This first hands-on lab introduces matrix-matrix multiplication, a well-known and widely used example application in parallel programming. You will complete key portions of a CUDA program that computes this widely applicable kernel.

In this lab you will learn:

‧ How to allocate and free memory on the GPU.

‧ How to copy data from the CPU to the GPU.

‧ How to copy data from the GPU to the CPU.

‧ How to measure the execution times for memory access and for computation separately.

‧ How to invoke GPU kernels.
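
The steps in the list above could be put together roughly as follows. This is only a minimal sketch under stated assumptions, not the lab's skeleton code: it assumes square, row-major `float` matrices of width `n` and a 16 x 16 thread block, and the names `CUDA_CHECK`, `matMulKernel`, `blocksFor`, and `matMulOnDevice` are hypothetical.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",           \
                         cudaGetErrorString(err_), __FILE__, __LINE__); \
            std::exit(EXIT_FAILURE);                                    \
        }                                                               \
    } while (0)

// Each thread computes one element of P = M * N (square, row-major, width n).
__global__ void matMulKernel(const float* M, const float* N, float* P, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k) {
            sum += M[row * n + k] * N[k * n + col];
        }
        P[row * n + col] = sum;
    }
}

// Number of blocks needed to cover n elements with `block` threads per block.
inline int blocksFor(int n, int block) { return (n + block - 1) / block; }

// Host side: allocate device memory, copy inputs in, launch, copy result out, free.
void matMulOnDevice(const float* hM, const float* hN, float* hP, int n) {
    assert(n > 0);
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dM = nullptr, *dN = nullptr, *dP = nullptr;

    CUDA_CHECK(cudaMalloc((void**)&dM, bytes));
    CUDA_CHECK(cudaMalloc((void**)&dN, bytes));
    CUDA_CHECK(cudaMalloc((void**)&dP, bytes));

    CUDA_CHECK(cudaMemcpy(dM, hM, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dN, hN, bytes, cudaMemcpyHostToDevice));

    dim3 block(16, 16);  // assumed block shape
    dim3 grid(blocksFor(n, 16), blocksFor(n, 16));
    matMulKernel<<<grid, block>>>(dM, dN, dP, n);
    CUDA_CHECK(cudaGetLastError());  // reports launch configuration errors

    // cudaMemcpy on the default stream waits for the kernel to finish.
    CUDA_CHECK(cudaMemcpy(hP, dP, bytes, cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaFree(dM));
    CUDA_CHECK(cudaFree(dN));
    CUDA_CHECK(cudaFree(dP));
}
```

Calling `matMulOnDevice(hM, hN, hP, n)` from `main` with host arrays of `n * n` floats fills `hP` with the product. The bounds check in the kernel lets the same code handle sizes such as 8 x 8 that are smaller than one block.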

Your output should look like this:

Input matrix file name:

Setup host side environment and launch kernel:

Allocate host memory for matrices M and N.

M:

N:

Allocate memory for the result on host side.

Initialize the input matrices.

Allocate device memory.

Copy host memory data to device.

Allocate device memory for results.

Setup kernel execution parameters.

# of threads in a block:

# of blocks in a grid :

Executing the kernel...

Copy result from device to host.

GPU memory access time:

GPU computation time :

GPU processing time :

Check results with those computed by CPU.

Computing reference solution.

CPU Processing time :

CPU checksum:

GPU checksum:
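
The three GPU timings in the output above could be collected with CUDA events, as in the sketch below. Several things here are assumptions rather than requirements of the handout: that "memory access time" means the host-to-device plus device-to-host copies (excluding `cudaMalloc`), that "processing time" is the sum of memory access and computation time, and the names `GpuTimer`, `PhaseTimes`, `timePhases`, and the stand-in `scaleKernel`, which would be replaced by the matrix kernel.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper that times work on the default stream with CUDA events.
struct GpuTimer {
    cudaEvent_t start, stop;
    GpuTimer()  { cudaEventCreate(&start); cudaEventCreate(&stop); }
    ~GpuTimer() { cudaEventDestroy(start); cudaEventDestroy(stop); }
    void begin() { cudaEventRecord(start, 0); }
    // Returns the milliseconds elapsed since begin(); waits for queued GPU work.
    float end() {
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        return ms;
    }
};

// Stand-in kernel (hypothetical): doubles every element.
__global__ void scaleKernel(float* data, int count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) data[i] *= 2.0f;
}

struct PhaseTimes {
    float memoryMs;   // host-to-device plus device-to-host copies
    float computeMs;  // kernel execution only
    float totalMs;    // assumed here to be memoryMs + computeMs
};

PhaseTimes timePhases(std::vector<float>& host) {
    assert(!host.empty());
    int count = static_cast<int>(host.size());
    size_t bytes = host.size() * sizeof(float);
    float* dev = nullptr;
    cudaMalloc((void**)&dev, bytes);

    GpuTimer timer;
    timer.begin();
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);
    float h2dMs = timer.end();

    timer.begin();
    scaleKernel<<<(count + 255) / 256, 256>>>(dev, count);
    float kernelMs = timer.end();  // end() synchronizes, so the kernel has finished

    timer.begin();
    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    float d2hMs = timer.end();

    cudaFree(dev);

    PhaseTimes t;
    t.memoryMs = h2dMs + d2hMs;
    t.computeMs = kernelMs;
    t.totalMs = t.memoryMs + t.computeMs;
    return t;
}
```

The first kernel launch in a process usually includes one-time setup cost, so an untimed warm-up launch before measuring may give steadier numbers.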

Record your runtime with respect to different input matrix sizes as follows:

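Matrix Size | GPU Memory Access Time (ms) | GPU Computation Time (ms) | GPU Processing Time (ms) | Ratio of Computation Time vs. 128 x 128
------------|-----------------------------|---------------------------|--------------------------|----------------------------------------
8 x 8       |                             |                           |                          |
128 x 128   |                             |                           |                          | 1
512 x 512   |                             |                           |                          |
3072 x 3072 |                             |                           |                          |
4096 x 4096 |                             |                           |                          |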

What do you see from these numbers?
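
As a rough yardstick for reading the ratio column, and assuming the kernel performs the standard 2n^3 floating-point operations for an n x n product (the handout does not state an operation count), the arithmetic work W(n) relative to the 128 x 128 case grows as shown below. Measured ratios may differ because of launch overhead, memory behavior, and how fully the GPU is occupied.

```latex
\[
\frac{W(n)}{W(128)} = \frac{2n^{3}}{2 \cdot 128^{3}} = \left(\frac{n}{128}\right)^{3}
\]
\[
n = 8:\ \tfrac{1}{4096}, \qquad
n = 512:\ 64, \qquad
n = 3072:\ 13824, \qquad
n = 4096:\ 32768
\]
```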

Problem 2: Matrix-Matrix Multiplication with Tiling and Shared Memory

This lab enhances the matrix-matrix multiplication of Problem 1 by using shared memory and synchronization between the threads in a block. Device shared memory is allocated to store the sub-matrix (tile) data used in the calculation, so the threads of a block share the data they load and reduce the demand on memory bandwidth, which was overtaxed in the previous matrix-matrix multiplication lab.

In this lab you will learn:

‧ How to apply tiling on matrix-matrix multiplication.

‧ How to use shared memory on the GPU.

‧ How to apply thread synchronization in a block.
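
A tiled kernel covering the three points above might look like the following sketch. It assumes square, row-major `float` matrices of width `n` and a tile width of 16; `TILE_WIDTH`, `matMulTiled`, and `matMulTiledOnDevice` are hypothetical names, and error checking is omitted for brevity.

```cuda
#include <cassert>
#include <cuda_runtime.h>

#define TILE_WIDTH 16  // assumed tile size; the block must be TILE_WIDTH x TILE_WIDTH

__global__ void matMulTiled(const float* M, const float* N, float* P, int n) {
    // One tile of M and one tile of N, shared by all threads in the block.
    __shared__ float Ms[TILE_WIDTH][TILE_WIDTH];
    __shared__ float Ns[TILE_WIDTH][TILE_WIDTH];

    int tx = threadIdx.x, ty = threadIdx.y;
    int row = blockIdx.y * TILE_WIDTH + ty;
    int col = blockIdx.x * TILE_WIDTH + tx;
    float sum = 0.0f;

    int numTiles = (n + TILE_WIDTH - 1) / TILE_WIDTH;
    for (int t = 0; t < numTiles; ++t) {
        int mCol = t * TILE_WIDTH + tx;  // column of M this thread loads
        int nRow = t * TILE_WIDTH + ty;  // row of N this thread loads
        // Out-of-range positions are padded with zeros so any n works.
        Ms[ty][tx] = (row < n && mCol < n) ? M[row * n + mCol] : 0.0f;
        Ns[ty][tx] = (nRow < n && col < n) ? N[nRow * n + col] : 0.0f;
        __syncthreads();  // wait until the whole tile is loaded

        for (int k = 0; k < TILE_WIDTH; ++k) {
            sum += Ms[ty][k] * Ns[k][tx];
        }
        __syncthreads();  // wait before the tile is overwritten in the next pass
    }
    if (row < n && col < n) P[row * n + col] = sum;
}

// Host side: same allocate / copy / launch / copy back / free flow as the untiled version.
void matMulTiledOnDevice(const float* hM, const float* hN, float* hP, int n) {
    assert(n > 0);
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dM = nullptr, *dN = nullptr, *dP = nullptr;
    cudaMalloc((void**)&dM, bytes);
    cudaMalloc((void**)&dN, bytes);
    cudaMalloc((void**)&dP, bytes);
    cudaMemcpy(dM, hM, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dN, hN, bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE_WIDTH, TILE_WIDTH);
    int blocksPerSide = (n + TILE_WIDTH - 1) / TILE_WIDTH;
    dim3 grid(blocksPerSide, blocksPerSide);
    matMulTiled<<<grid, block>>>(dM, dN, dP, n);

    cudaMemcpy(hP, dP, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dM);
    cudaFree(dN);
    cudaFree(dP);
}
```

With tiles of width 16, each element of M and N is read from global memory once per tile pass rather than once per multiply, which is the reuse that shared memory is meant to provide.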

Your output should look like this:

Input matrix file name:

Setup host side environment and launch kernel:

Allocate host memory for matrices M and N.

M:

N:

Allocate memory for the result on host side.

Initialize the input matrices.

Allocate device memory.

Copy host memory data to device.

Allocate device memory for results.

Setup kernel execution parameters.

# of threads in a block:

# of blocks in a grid :

Executing the kernel...

Copy result from device to host.

GPU memory access time:

GPU computation time :

GPU processing time :

Check results with those computed by CPU.

Computing reference solution.

CPU Processing time :

CPU checksum:

GPU checksum:
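
The "Check results with those computed by CPU" step in the output above could be sketched as follows. The handout does not define the checksum, so this sketch assumes it is the sum of all elements of the result matrix and compares the CPU and GPU values with a relative tolerance; `matMulCpu`, `checksum`, and `checksumsMatch` are hypothetical names. The code is host-only and can sit in the same `.cu` file as the kernels.

```cuda
#include <cassert>
#include <cmath>
#include <vector>

// CPU reference: P = M * N for square, row-major matrices of width n.
void matMulCpu(const std::vector<float>& M, const std::vector<float>& N,
               std::vector<float>& P, int n) {
    assert(M.size() == static_cast<size_t>(n) * n);
    assert(N.size() == M.size() && P.size() == M.size());
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;  // accumulate in double to limit rounding error
            for (int k = 0; k < n; ++k) {
                sum += static_cast<double>(M[i * n + k]) * N[k * n + j];
            }
            P[i * n + j] = static_cast<float>(sum);
        }
    }
}

// Assumed checksum: the sum of all matrix elements.
double checksum(const std::vector<float>& A) {
    double s = 0.0;
    for (float v : A) s += v;
    return s;
}

// Floating-point sums computed in different orders rarely match bit for bit,
// so the comparison uses a relative tolerance (value chosen arbitrarily here).
bool checksumsMatch(double cpu, double gpu, double relTol = 1e-4) {
    double scale = std::fmax(1.0, std::fabs(cpu));
    return std::fabs(cpu - gpu) <= relTol * scale;
}
```

For example, with M = [1 2; 3 4] and N = [5 6; 7 8] the reference product is [19 22; 43 50], and its checksum under this definition is 134.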

Record your runtime with respect to different input matrix sizes as follows:

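Matrix Size | GPU Memory Access Time (ms) | GPU Computation Time (ms) | GPU Processing Time (ms) | Ratio of Computation Time vs. 128 x 128
------------|-----------------------------|---------------------------|--------------------------|----------------------------------------
8 x 8       |                             |                           |                          |
128 x 128   |                             |                           |                          | 1
512 x 512   |                             |                           |                          |
3072 x 3072 |                             |                           |                          |
4096 x 4096 |                             |                           |                          |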

What do you see from these numbers? Have they improved much compared with the previous matrix-matrix multiplication implementation?

Problem 3: Matrix-Matrix Multiplication with Tiling and Constant Memory

This lab is a further enhancement of matrix-matrix multiplication that uses constant memory and synchronization between the threads in a block. Allocate constant memory for matrices M and N.
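
The mechanics of placing M and N in constant memory could look like the sketch below. It shows only the constant-memory part and leaves out tiling and block-level synchronization. CUDA devices provide 64 KB of constant memory, so this sketch caps the matrix width at an assumed `MAX_WIDTH` of 64 (two 64 x 64 `float` matrices take 32 KB). A single 128 x 128 `float` matrix already fills 64 KB, so the larger sizes in the table would need M and N to be moved through constant memory in pieces; how to do that is not specified in the handout. The names `Mc`, `Nc`, `matMulConst`, and `matMulConstOnDevice` are hypothetical, and error checking is omitted.

```cuda
#include <cassert>
#include <cuda_runtime.h>

#define MAX_WIDTH 64  // assumed cap so both matrices fit in 64 KB of constant memory

// Constant-memory copies of the input matrices (row-major, width n <= MAX_WIDTH).
__constant__ float Mc[MAX_WIDTH * MAX_WIDTH];
__constant__ float Nc[MAX_WIDTH * MAX_WIDTH];

// Each thread computes one element of P from the constant-memory inputs.
__global__ void matMulConst(float* P, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k) {
            sum += Mc[row * n + k] * Nc[k * n + col];
        }
        P[row * n + col] = sum;
    }
}

// Returns false when the matrices do not fit in the constant arrays.
bool matMulConstOnDevice(const float* hM, const float* hN, float* hP, int n) {
    assert(n > 0);
    if (n > MAX_WIDTH) return false;
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);

    // Constant memory is written from the host with cudaMemcpyToSymbol.
    cudaMemcpyToSymbol(Mc, hM, bytes);
    cudaMemcpyToSymbol(Nc, hN, bytes);

    float* dP = nullptr;
    cudaMalloc((void**)&dP, bytes);

    dim3 block(16, 16);  // assumed block shape
    dim3 grid((n + 15) / 16, (n + 15) / 16);
    matMulConst<<<grid, block>>>(dP, n);

    cudaMemcpy(hP, dP, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dP);
    return true;
}
```

When timing this version, the `cudaMemcpyToSymbol` calls play the role of the host-to-device copies and would count toward the memory access time.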

Record your runtime with respect to different input matrix sizes as follows:

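Matrix Size | GPU Memory Access Time (ms) | GPU Computation Time (ms) | GPU Processing Time (ms) | Ratio of Computation Time vs. 128 x 128
------------|-----------------------------|---------------------------|--------------------------|----------------------------------------
8 x 8       |                             |                           |                          |
128 x 128   |                             |                           |                          | 1
512 x 512   |                             |                           |                          |
3072 x 3072 |                             |                           |                          |
4096 x 4096 |                             |                           |                          |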

What do you see from these numbers? Have they improved much compared with the previous matrix-matrix multiplication implementation?
