Operating Systems, Assignment 5
This assignment includes two programming problems, one in C and one in CUDA for the ARC system
that you made an account for on assignment 4. As always, be sure your code compiles and runs on the EOS
Linux machines before you submit (or on the arc system for the second program). Also, remember that
your work needs to be done individually and that your programs need to be commented and consistently
indented. You must document your sources and, if you use sources, you must still write at least half your
code yourself from scratch.
1. (40 pts) We’re going to implement a multi-threaded Unix client/server program in C using TCP/IP
sockets for communication. The server will let users manage a calendar of reservations for a week. The
program will display the calendar like the following example. Each day of the week has times from
8:00 am up to (but not including) 5:00 pm. People can reserve blocks of time during any day, as long
as those times have not previously been reserved by someone else. Each reserved time is marked with
the first letter of the name of the person who reserved it. For example, alice reserved a block early in
the day on Monday and cindy reserved a block later in the morning.
8:00 9:00 10:00 11:00 12:00 1:00 2:00 3:00 4:00
Sunday | | | | | | | | |
Monday | aaaaaa | cccccc| | | | |
Tuesday | | | ccccccaaaaaa| bbbbbbbb | |
Wednesday | | | | | | | | |
Thursday | | | cccccc| | | | |
Friday | bbbbbbbbb| | | | | | aaaaaa
Saturday | | | | | | | | |
I’m providing you a skeleton of the server to help get you started, scheduleServer.c. You’ll complete
the implementation of this server, adding support for multi-threading and synchronization, and building
some representation for maintaining the schedule and supporting user commands. You won’t actually
need to write a client program for this assignment; for the client, we’re going to use a general-purpose
program for network communication, nc.
Once started, your server will run perpetually, accepting connections from clients until the user kills it
with ctrl-C. The program we’re using as a client, nc, will take the server’s hostname and port number
as command-line arguments. The sample execution at the end of this problems shows how to run nc
like this. After connecting to the server, nc will just echo to the screen any text sent by the server and
send any text the user enters to the server.
Schedule Representation
The schedule records time in 10-minute slots. There’s a slot for every 10-minute interval from 8:00 am
up to (but not including) 5:00 pm. For example, the first slot covers 8:00 am up to the end of 8:09.
The next slot covers 8:10 to the end of 8:19, and so on. If any time within one of these slots has been
reserved, it will be marked with the first letter of the name of the person who reserved it. For example,
if bill reserved times from 9:00 9:45, then he will get five slots, the ones from 9:00 - 9:09, 9:10 - 9:19,
9:20 - 9:29, 9:30 - 9:39 and 9:49.
The schedule will have slots for all days from Sunday up to Friday. When the calendar is printed out
by the report command, it will be displayed as shown above. The times for each slot are printed across
the top. The days of the week are printed over on the left, each right aligned in a 9-character field. A
report of the reservations for that day will follow, after a blank space. If a time slot has been reserved,
print the initial of the person who reserved it. Otherwise, print a vertical bar for the start of any hour,
or a blanks space otherwise.
1
Client / Server Interaction
When a client first connects to the server, it will prompt the user for a username, using the prompt,
“username> ”. After the user enters a name, the server will repeatedly prompt for a command using
the prompt “cmd> ”. In the starter code, the server just echoes each command back to the client, but
you’re going to modify it to support the commands described below.
The username can be any string of 1 to 10 lower-case letters. We’ll assume everyone has a different
lower-case letter at the start of their name, their initial. So, if a user gives the name bob and, later, a
user gives the name betty, then the program will think these are the same user. That’s OK.
The person with a name starting with ’a’ is considered the administrator. They can use the ’clear’
command, but other users can’t.
• reserve day H:MM H:MM
This requests a reservation on the given day, going from the given start time (the first time given),
up to but not including the given end time (the second time given). This attempts to reserve all
the slots that overlap any time from the given start time up to but not including the given end
time. If any of these slots are already reserved by someone else (i.e., a different initial), then the
reserve command is invalid.
It’s OK if the times for a reservation overlap with times of a previous reservation by the same
user. This should just extend the reservation.
The day name can be given with capital or lower-case letters, but the command is considered
invalid if it doesn’t match the name for a day of the week.
A single reservation applies to just one day; you can’t give a range of times that span multiple
days. The given start time must be between 8:00 and 12:59 or between 1:00 and 5:00. The minute
must be between 00 and 59 (inclusive), but it can be given as a single digit (this is just to make
the time easier to parse with fscanf()). For example, 3:1 would be the same as 3:01. The end time
must be after the start time (so, really, the end time could not be 8:00 and the start time could
not be 5:00). For parsing the time, remember that you can include literal characters that fscanf()
should match in the input. So, you can parse a time as an integer, followed by a ’:’, followed by
another integer.
• clear day H:MM H:MM
The clear command can only be run by a person who’s name starts with ’a’. This will erase a
reservation by any user for the given time range on the given day. The validity requirements for
the day and times are the same as with the reserve command.
• report
This command prints out the schedule, in the format shown above.
• quit
This is a request for the server to terminate the connection with this client. This behavior has
already been implemented for you, but see the “NC Hanging” section below; we seem to get
different behavior on different nc clients, depending on the OS and installation.
Invalid Client Input
If the client sends an invalid user name at the start (one that’s too long or contains something other
than capital letters), the server will print a line saying “Invalid username” and terminate the connection
with the client right away. See the note below about the behavior of nc on some systems. Like the
quit command described above, this also causes a problem when the user enters an invalid username.
After successfully entering the username, if the client sends an invalid command, the server will print
out a line saying “Invalid command”, ignore that input line and then prompt for the next command.
2
So, a bad username terminates the connection right away, but bad commands after that don’t terminate
the connection.
You can assume each user command is given on a single line of input. You don’t have to worry about
dealing with multiple commands given on the same line, or a single command spread over multiple
input lines. We won’t test your program with inputs like that. This should make the user input easier
to parse.
NC Client
For this program, we won’t need to write a separate client program. The server just needs a client that
can open a socket connection, send user input over the socket and write anything it reads from the
socket to the terminal. There are standard programs that do this kind of thing. In fact, the predecessor
to ssh, ’telnet’, did exactly this. It’s no longer installed on the university machines, but we can use a
the program ’nc’ to do some of the same thing.
You can run nc as follows. Here, hostname is the name of the system where you’re running your
server. The port-number is the unique port number your server is using for listening for connections
(see below).
nc hostname port-number.
Client/Server Interaction
Client/server communication is all done in text. In the server, we’re using fdopen() to create a FILE
pointer from the socket file descriptor. This lets us send and receive messages the same way we would
write and read files using the C standard I/O library. Of course, we could just read and write directly
to the socket file descriptor, but using the C I/O functions should make sending and receiving text a
little bit easier.
The partial server implementation just repeatedly prompts the client for commands and terminates
the connection when the client enters “quit”. You will extend the server to handle the commands
described above. There’s a sample execution below to help demonstrate how the server is expected to
respond to these commands.
Multi-Threading and Synchronization
In the starter code, the server uses just the main thread to accept new client connections and to
communicate with the client. It can only interact with one client at a time. You’re going to fix this
by making it a multi-threaded server. Each time a client connects, you will create a new thread to
handle interaction with that client. When it’s done talking to the client, that thread can close the
socket connection to the client and terminate. Use the pthread argument-passing mechanism to give
the new thread the socket associated with that client’s connection.1
With each client having its own thread on the server, there could be race conditions when threads try
to access any state stored in the server (e.g., when they modify the schedule or report it to a user).
You’ll need to add synchronization support to your server to prevent potential race conditions. You
can use POSIX semaphores or POSIX monitors for this. That way, if two clients enter a command
at the same time, the server will make sure their threads can’t access shared server state at the same
time.
1Remember that it’s easy to make a mistake in how you pass an argument to a new thread. Be careful here.
3
Detaching Threads
Previously, we’ve always had the main thread join with the threads it created. Here, we don’t need to
do that. The main thread can just forget about a thread once it has created it. Each new thread can
communicate with its client and then terminate when it’s done.
For this to work properly, after creating a thread, the server needs to detach it using pthread detach().
This tells pthreads to free memory for the given thread as soon as it terminates, rather than keeping
it around for the benefit of another thread that’s expected to join with it.
Buffer Overflow
In its current state, the server is vulnerable to buffer overflow where it reads commands and the
username from the client. If the user types in a long username, they can crash the server (try it and
see). You’ll need to fix this vulnerability in the existing code and make sure you don’t have potential
buffer overflows in any new code you add.2 Your server only needs to be able to handle the commands
listed above. If the user tries to type in a really long username, a really long command or a really long
day name, you should detect this, ignore it and print the “Invalid command” message.
Input Flushing
Because of the way we’re creating our file pointer, it looks like buffered input is discarded when the
server starts writing output to the socket. This ends up being fairly convenient, but its something to
keep in mind as you’re writing the code to parse and respond to user commands. For example, you’ll
need to read all parts of the user’s command before you start printing a response to it (otherwise,
you’ll lose the parts you haven’t read yet). Likewise, imagine the the user types in a really long day
name in a reserve command. Once you’ve detected that the day name is too long, you can print out
the “Invalid command” message and the rest of their command will be discarded. You wont’ have to
write code to read and discard the rest of the command.
Port Numbers
Since we may be developing on multi-user systems, we need to make sure we all use different port
numbers our servers can listen on. That way, a client written by one student won’t accidentally try to
connect to a server being written by another student. To help prevent this, I’ve assigned each student
their own port number. Visit the following URL to find out what port number your server is supposed
to use.
people.engr.ncsu.edu/dbsturgi/class/info/246/
NC Hanging
I haven’t really seen nc hang, but I’ve seen what might look like nc hanging. If the server terminates
the socket connection to the nc client, I’ve seen nc continue to run. This is what happens if you type in
a bad username at the start of a client connection, or if you type in the quit command. The server will
close the client’s socket connection and let the client’s thread terminate, but the nc client won’t exit.
When this happens, you can press ctrl-C to terminate the nc client. This works, but it’s inconvenient
to have to take this extra step to end the nc client. There are other client programs that automatically
exit when the socket connection is closed (putty for example), but it’s not clear that there are better
alternatives installed on the NC State machines.
2As you know, a buffer overflow is really bad in a server. It could permit an attacker anywhere in the world to gain access
to the system the server is running on.
4
The nc program seems to work fine on my MacBook, but, on the university machines it exhibits this
behavior. if you’re running nc on a university machine, be prepared to type ctrl-C to get nc to exit,
even after the server closed the socket connection.
Sad Truths about Sockets on EOS Linux Hosts
Since we’re using TCP/IP for communication, we can finally run our client and server programs on
different hosts. You should be able to run your server on one host and then start nc on any other
Internet-connected system in the world, giving it the server’s hostname and your port number on the
command line. However, if you try this on a typical university Linux machine, you’ll probably be
disappointed. These systems have a software firewall that blocks most IP traffic. As a result, if you
want to develop on a university system, you will still need to run the client and server on the same
host. If you have your own Linux machine, you should be able to run your server on it and connect
from anywhere (depending on how your network is set up).
If you’d like to use a machine in the Virtual Computing Lab (VCL), you should be able to run commands
as the administrator, so you can disable the software firewall. I had success with a reservation for a
CentOS 7 Base (64 bit VM) system. After logging in, uploading and building my server, I ran the
following command to permit TCP connections via my port. Then, if I ran a server on the virtual
machine, I was able to connect to it, even from off campus.
sudo /sbin/iptables -I INPUT 1 -p tcp --dport my-port-number -j ACCEPT
Also, if your server crashes, you may temporarily be unable to bind to your assigned port number,
with an error message “Can’t bind socket.” Just wait a minute or two and you should be able to run
the server again.
Implementation
You will complete your server by extending the file, scheduleServer.c from the starter. You’ll need
to compile as follows:
gcc -Wall -g -std=gnu99 scheduleServer.c -o scheduleServer -lpthread
Sample Execution
You should be able to run your server with any number of concurrent clients, each handled by its own
thread in the server. Below, we’re looking at three terminal sessions, with the server running in the
left-hand column and clients running in the other two columns (be sure to use our own port number,
instead of 28123). I’ve interleaved the input/output to illustrate the order of interaction with each
client, and I’ve added some comments to explain what’s going on. If you try this out on your own
server, don’t enter the comments, since the server doesn’t know what to do with them.
$ ./scheduleServer
# Run one client and reserve a few times
$ nc localhost 28123
username> alice
cmd> reserve Monday 8:00 12:00
cmd> reserve Tuesday 8:00 12:30
5
# Run another client and make
# a couple of reservations.
$ nc localhost 28123
cmd> reserve tuesday 12:00 1:30
cmd> reserve wednesday 12:00 1:30
# Ask for a time alice already has
cmd> reserve tuesday 12:00 1:30
Invalid command
# See what happened
cmd> report
8:00 9:00 10:00 11:00 12:00 1:00 2:00 3:00 4:00
Sunday | | | | | | | | |
Monday aaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbb | | |
Tuesday aaaaaaaaaaaaaaaaaaaaaaaaaaa | | | |
Wednesday | | | | bbbbbbbbb | | |
Thursday | | | | | | | | |
Friday | | | | | | | | |
Saturday | | | | | | | | |
# Try to make a conflicting reservation
cmd> reserve Wednesday 8:00 12:30
Invalid command
# We’re the admin. We can erase other reservations.
cmd> clear Wednesday 12:00 12:30
cmd> report
8:00 9:00 10:00 11:00 12:00 1:00 2:00 3:00 4:00
Sunday | | | | | | | | |
Monday aaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbb | | |
Tuesday aaaaaaaaaaaaaaaaaaaaaaaaaaa | | | |
Wednesday | | | | | bbbbbb | | |
Thursday | | | | | | | | |
Friday | | | | | | | | |
Saturday | | | | | | | | |
# Now, we can make our reservation (but this is probably
# an abuse of administrator rights)
cmd> reserve Wednesday 8:00 12:30
cmd> report
8:00 9:00 10:00 11:00 12:00 1:00 2:00 3:00 4:00
Sunday | | | | | | | | |
Monday aaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbb | | |
Tuesday aaaaaaaaaaaaaaaaaaaaaaaaaaa | | | |
Wednesday aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbb | | |
Thursday | | | | | | | | |
Friday | | | | | | | | |
Saturday | | | | | | | | |
# We can overwrite our own reservation
6
cmd> reserve Monday 1:00 2:30
cmd> report
8:00 9:00 10:00 11:00 12:00 1:00 2:00 3:00 4:00
Sunday | | | | | | | | |
Monday aaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbb | |
Tuesday aaaaaaaaaaaaaaaaaaaaaaaaaaa | | | |
Wednesday aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbb | | |
Thursday | | | | | | | | |
Friday | | | | | | | | |
Saturday | | | | | | | | |
# Try some invalid commands, then exit.
cmd> fsdklfjks
Invalid command
cmd> reserve Friday 7:30 8:30
Invalid command
cmd> reserve Friday 8:99 9:45
Invalid command
cmd> clear Monday 1:00 2:00
Invalid command
cmd> reserve lunes 3:00 4:59
Invalid command
cmd> quit
cmd> quit
Submitting your Work
When you’re done, submit your scheduleServer.c source file using the Homework 5 link in Moodle.
2. (40 points) For this problem, you’re going to implement the distancing program from assignments 1,
2 and 3 on a GPU, with lots and lots of threads. We’re going to use the CUDA framework on the
arc cluster at NC State. You know about arc; you should have activated your arc account as part of
assignment 4. Arc has a collection of about 100 hosts running Linux, each with a general-purpose CPU
and a GPU serving as a highly parallel co-processor.
Arc Filesystem
Arc doesn’t share user accounts or filesystems with other university machines. To connect to it, you’ll
need to ssh in from another university system (e.g., remote.eos.ncsu.edu), just like you did when you
activated your arc account for assignment 4. To successfully connect, you’ll need to use the same
machine where you made the your key pair on assignment 4, or some machine that has access to AFS
if you made your key pair on a machine with AFS.
Due Date Distribution
The arc system has a lot of nodes, but it doesn’t have enough for all of us to have one at the same
time. To make sure you get a chance to complete this problem, I’m implementing a soft due date for
this assignment. You can earn up to 8 points of extra credit by completing this problem early. Or,
if you submit your solution late, the late penalty will be applied gradually rather than all at once. I’m
hoping this will help to make sure we aren’t all trying to use the arc system on the same evening. Arc
7
has a lot of compute nodes, but if we’re all trying to use it at the same time, some users are going to
be disappointed.
The following table shows the extra credit or late penalty based on how early or late you turn in your
solution.
Bonus / Penalty Submission Time
+8 At least 96 hours early
+6 At least 72 hours early
+4 At least 48 hours early
+2 At least 24 hours early
-2 At most 24 hours late
-4 At most 48 hours late
-6 At most 72 hours late
Work Partitioning
For this assignment, we’re going to divide work statically among the threads. You can divide up the
work however you want, and, unlike on homework 3, you will have access to the whole list of locations
before any of your worker threads start running. Each thread has a unique index, i. It computes this
index based on the grid and block dimensions right after the thread starts up. You could have thread
number i compare the i
th location on the list with other locations, counting (and optionally reporting)
all the pairs that are too close. Then, when a worker is done, it will copy the count of pairs it finds to
element i of an array allocated in CUDA global memory. After all the threads are finished, code from
main will copy all the counts to host memory and add up the results to get an overall total.
For previous assignments, we made each process or thread responsible for several locations on the list.
We didn’t want to create too many workers. This won’t be a problem on a GPU. It can handle lots
of threads at once, and the CUDA software will take care of sharing the processing elements among
threads if we ask for more threads than the GPU can execute at once.
I’m providing you with a skeleton, distancing.cu, to get you started. This program reads the list of
locations into an array on the host, pList.
8
You’ll need to add code to allocate two arrays on the device, one array for holding a copy of the host’s
list of people and another array of integers for holding the counts computed by each thread.
Then you’ll need code to copy the list of people from the host over to the device, so threads on the
GPU can access it.
Now that the device is ready, you’ll need to complete the implementation of the checkDistance() kernel.
Each thread will be responsible for checking for some pairs of locations. When it’s done, it will store
its total count in the i
th index of the results array in device memory.
When you run this kernel from the host, it will start a thread for each location and wait for them to
finish computing their counts.
9
Then, the host will need to allocate host memory (malloc) and copy the results into it.
Then, the main thread on the host can quickly add up the counts from all the threads and print out a
total count in the format:
Count: 39
Thread Implementation
You get to write most of the checkDistance() function, the code that actually runs on the GPU. I’ve
written some starter code for this function. Right now, each thread just computes a unique index and
10
makes sure it has a value to work on3 and then returns. You’ll need to add parameters to the kernel
function for passing in pointers to the input and output arrays, testing pairs of locations and counting
the ones that are too close.
Working on Arc
To access arc, you’ll have to log in from another university machine, where you’ve stored the key pair
that you used to activate your arc account (probably remote.csc.ncsu.edu). Use your ssh client (e.g.,
putty or mobaXterm or ssh) to connect to remote.csc.ncsu.edu, then you can get to arc from there.
Uploading your Files
The nodes on the arc cluster share a filesystem, but they don’t use the university AFS. To get files
onto the machine, you’ll need to upload them from another university machine. You can just upload
your files to one of the university Linux machines and then transfer them from there using the sftp
program. From one of the remote Linux machines (not from ARC), commands like the following should
let you upload files. Starting from a directory containing your distancing.cu and the input files:
sftp arc.csc.ncsu.edu
sftp> put distancing.cu
Uploading distancing.cu ...
sftp> put input-1.txt
Uploading input-1.txt
sftp> put input-2.txt
Uploading input-2.txt
sftp> bye
Logging in on Arc
After logging in to one of the machines at remote.eos.ncsu.edu, you should be able to enter the following
to log in to the head node on arc:
ssh arc.csc.ncsu.edu
Editing, Compiling and Running
When you log in to arc, you connect to a head node that doesn’t even have a GPU. From the head
node, you’ll need to connect to one of the compute nodes. Arc uses batch processing software to assign
jobs to the compute nodes. The following commands let you connect from the head node to one of
the compute nodes. Each time you want to work on arc, you should just need to run one of these
commands. It will give you an interactive prompt on one of the compute nodes as soon as one is
available. The comments below tell you what kind of GPU you’ll get on each node. Depending on how
busy the system is, you may have to wait a little while to get a prompt. Also be sure you run this
from an a shell prompt (not from inside sftp command described above):
3
If the block size doesn’t evenly divide the number of people, there may be some threads in the last block that don’t actually
have anything to do.
11
# there are 13 nodes with an nVidia RTX 2060
srun -p rtx2060 --pty /bin/bash
# there are 17 nodes with an nVidia rtx2070.
# Once, when I tried to use one of the rtx2070 systems, it failed
# on my first call to a cuda function. Later, when I logged back
# in to another rtx2070 system, it worked fine. This makes me think
# there may have been one of these nodes that had a bad GPU or maybe
# it wasn’t configured correctly. If you have this problem, you may
# want to try another system.
srun -p rtx2070 --pty /bin/bash
# there are 2 nodes with an nVidia RTX 2080
srun -p rtx2080 --pty /bin/bash
# there are 21 nodes with an nVidia RTX 2060 Super
srun -p rtx2060super --pty /bin/bash
# there are 2 nodes with an nVidia RTX 2080 Super
srun -p rtx2080super --pty /bin/bash
# there are still some systems with the older, nVideo gtx480 graphics
# card. If you can’t connect to one of the newer systems above, you
# could try one of these older systems instead.
srun -p gtx480 --pty /bin/bash
Once you have a prompt on a compute node, you should still be able to see the files you uploaded. Arc
used to require you to enter the following command before you could do any work with CUDA, but it
seems like this is no longer required. If you try to run the nvcc compiler and you get a complaint that
your shell can’t find a program with that name, you could try running this command to see if that
fixes it.
module load cuda
For editing your program, you can make edits in AFS and just use sftp to copy the modified file to
arc every time you make a change. If you’re comfortable editing directly on Linux machine, it may be
easier for you to edit your source file directly on the arc machine. Editors vim and emacs are there,
but it doesn’t look like the really simple editors like pico or nano are installed (vim and emacs are
better editors anyway, but it takes a little space in your brain to be ready to use them).
Once you’re ready to build your program, you can compile as follows:
nvcc -I/usr/local/cuda/include -L/usr/local/cuda/lib64 distancing.cu -o distancing
When you’re ready to run, be sure you’re on a compute node. You can run your program just like you
would an ordinary program. This version of the program will still have the optional report flag, but it
won’t need to be told how many threads to use. It just makes a thread for every person on the list.
As usual, if report is enabled, then your threads will report pairs of nearby people as they find them,
so they may get printed in any order.
12
$ ./distancing report < input-2.txt
9 13 5.297
5 6 5.561
6 10 5.906
13 14 5.616
3 7 5.215
3 4 5.462
Count: 6
The time command is available when you want to measure the execution time for your program
(especially on the larger inputs).
$ time ./distancing < input-4.txt
Count: 1260629
real ?m?.???s
user ?m?.???s
sys ?m?.???s
Recording Execution Time
At the top of the prime.cu file, I’ve put comments for recording running times for this program on the
inpput-4.txt file. Once your program is working, time its execution on this input file and enter the
(real) execution time you got in this comment. Also, report what type of compute node you were using
(i.e., what type of GPU it has). Don’t use the report flag when you’re measuring execution time; that
would just slow it down. You’ll probably notice that start-up time seems to be significant, so pushing
work out to the GPU pays off most with the largest input files. The different compute nodes on arc
may have different capabilities (at least, that’s been the case in the past), so don’t worry if you get
different execution times from some of your classmates. Either way, I hope you’ll find these runtimes
to be faster than what you got on a general-purpose CPU.
Logging Out
When you’re ready to log out, you’ll have several levels to log out from. From a compute node, you
can type exit to fall back to the head node of the arc cluster. Typing exit from there will drop you
back to whatever machine you connected from (probably remote.eos.ncsu.edu). One more exit should
get you out.
Submitting your Work
When you’re done, submit a copy of your working program, distancing.cu, with the runtime filled
in at the top. Use the Homework 5 link in Moodle.