CS3214 Spring 2020 Project 1 - “Extensible Shell”


Due Date: See website for due date (Late days may be used.)
This project must be done in groups of 2 students. Use Piazza and the grouper app to find
a partner (URL).
1 Introduction
This assignment introduces you to the principles of process management and job control
in a Unix-like operating system. In addition, the assignment will give you insights into
the design and use of extensible systems.
This is an open-ended assignment. In addition to implementing the required functionality, we encourage you to define the scope of this project yourself.
2 Base Functionality
A shell receives line-by-line input from a terminal. If the user inputs a built-in command,
the shell will execute this command. Otherwise, the shell will interpret the input as the
name of a program to be executed, along with arguments to be passed to it. In this case,
the shell will fork a new child process and execute the program in the context of the
child. Normally, the shell will wait for a command to complete before reading the next
command from the user. Such programs are said to run as “foreground” jobs. If the user
appends an ampersand ‘&’ to a command, the command is started in the “background”
and the shell will return to the prompt immediately.
The shell provides job control. A user may interrupt foreground jobs, send foreground jobs
into the background, and vice versa. At a given point in time, a shell may run zero or more
background jobs and zero or one foreground jobs. If there is a foreground job, the shell
waits for it to complete before printing another prompt and reading the next command.
In addition, the shell informs the user about status changes of the jobs it manages. For
instance, jobs may exit, or terminate due to a signal, or be stopped for several reasons.
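The three kinds of status change named above map onto the standard macros for decoding a waitpid() status. The following is a minimal sketch, not part of the provided code: describe_status() and status_of_child() are hypothetical helper names chosen for illustration.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: turn a waitpid() status into a short message. */
static const char *describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with status %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "terminated by signal %d", WTERMSIG(status));
    else if (WIFSTOPPED(status))
        snprintf(buf, len, "stopped by signal %d", WSTOPSIG(status));
    else
        snprintf(buf, len, "unknown status change");
    return buf;
}

/* Hypothetical helper: fork a child that either raises `sig` (if nonzero)
 * or exits with `code`, and return the status the parent collects.
 * Returns -1 if fork() or waitpid() fails. */
static int status_of_child(int code, int sig)
{
    int status = -1;
    pid_t pid = fork();

    if (pid == 0) {
        if (sig != 0)
            raise(sig);
        _exit(code);
    }
    if (pid < 0 || waitpid(pid, &status, WUNTRACED) < 0)
        return -1;
    return status;
}
```

A shell would typically call something like describe_status() whenever waitpid() reports a change for one of its jobs, and print the result next to the job number.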
At a minimum, we expect that your shell has the ability to start foreground and background
jobs and implements the built-in commands ‘jobs,’ ‘fg,’ ‘bg,’ ‘kill,’ and ‘stop.’ The
semantics of these commands should match the semantics of the same-named commands
in bash or tcsh. The ability to correctly respond to ^C (SIGINT) and ^Z (SIGTSTP) is
expected, as are informative messages about the status of the children managed. Like bash
or tcsh, you should use consecutively numbered small integers to enumerate your jobs.
For the minimum functionality, the shell need not support pipes (|), I/O redirection
(< > >>), nor the ability to run programs that require exclusive access to the terminal
(e.g., vim).
Created by G. Back (gback@cs.vt.edu)
We expect most students to implement pipes, I/O redirection, and management of the
controlling terminal to ensure that jobs that require exclusive access to the terminal obtain such
access. Beyond that, esh’s extensibility, described in Section 7, should allow for plenty of
creative freedom.
3 Strategy
You will need to use fork(), a variant of exec*(), and the waitpid() system calls.
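As a rough illustration of how these three calls fit together for a foreground command, consider the sketch below. It is a simplified example, not the structure of the provided code: run_foreground() is a hypothetical name, and job control, signals, and process groups are deliberately left out.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments and wait for it.
 * Returns the command's exit status, or -1 if it could not be started
 * or did not exit normally. */
static int run_foreground(char *const argv[])
{
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execvp(argv[0], argv);
        perror("execvp");       /* only reached if exec failed */
        _exit(127);
    }

    /* Parent: block until this particular child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For a background job, the parent would skip the blocking waitpid() call and learn about the child's fate later, which is where signal handling comes in.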
3.1 Signal Handling
You will need to catch SIGCHLD to learn about when the shell’s child processes change
status. Since child processes execute concurrently with respect to the parent shell, it is
impossible to predict when a child will exit (or terminate with a signal), and thus it is
impossible to predict when this signal will arrive. In the worst case, a child may have
terminated by the time the parent returns from fork()!
You will need to block the signal in those sections of your code where you access data
structures that are also needed by the handler that is executed when this signal arrives.
For example, consider the data structure used to maintain the current set of jobs. A new
job is added after a child process has been forked; a job may be removed when SIGCHLD
is received. To avoid a situation where the job has not yet been added when SIGCHLD
arrives, or - worse - a situation in which SIGCHLD arrives while the shell is adding the job,
the parent should block SIGCHLD until after it has completed adding the job to the list. If the
SIGCHLD signal is delivered to the shell while the shell blocks this signal, it is marked
pending and will be received as soon as the shell unblocks this signal.
Use sigprocmask(2) to block and unblock signals. To set up signal handlers, use the sigaction(2)
system call. Set sa_flags to SA_RESTART. The mask of blocked signals is inherited
when fork() is called. Consequently, the child will need to unblock any signals the parent
blocked before calling fork().
3.2 Process Groups
Each process in Unix is part of a group. Process groups are treated as an ensemble for
the purpose of signal delivery and when waiting for processes. Specifically, the kill(2),
killpg(2), and waitpid(2) system calls support the naming of process groups as possible
targets.[1]
[1] Note the idiosyncrasies of the API: kill(-pid, sig) does the same as killpg(pid, sig). Make sure to use
the correct call.
Each process group has a designated leader, which is one of the processes in the group.
To create a new group with itself as the leader, a process simply calls setpgid(0, 0). The
id of a process group is the process id of the leader. Child processes inherit the process
group of their parent process initially. They can then form their own group if desired, or
their parent process can place them into a different process group via setpgid().
In addition to signals and waitpid, process groups are used to manage access to the
terminal, as described next.
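To make the group mechanics concrete, the sketch below forks a child and places it into a new process group whose id equals the child's pid. It is an illustration only: spawn_in_own_group() is a hypothetical name, and the small "gate" pipe exists solely to keep the child alive until the parent has inspected it. Calling setpgid() in both parent and child is a common way to make sure the group exists regardless of which process runs first.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child into its own process group.
 * Stores the child's pid in *child_pid and returns the child's
 * process group id as seen by the parent, or -1 on error. */
static pid_t spawn_in_own_group(pid_t *child_pid)
{
    int gate[2];    /* keeps the child alive until the parent is done */

    if (pipe(gate) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(gate[0]);
        close(gate[1]);
        return -1;
    }
    if (pid == 0) {
        char c;

        close(gate[1]);
        if (setpgid(0, 0) < 0)          /* become leader of a new group */
            _exit(1);
        if (read(gate[0], &c, 1) < 0)   /* returns 0 once the parent closes gate[1] */
            _exit(1);
        _exit(0);
    }

    close(gate[0]);

    /* The parent makes the equivalent call for the child. */
    pid_t pgid = -1;
    if (setpgid(pid, pid) == 0)
        pgid = getpgid(pid);

    close(gate[1]);                     /* release the child */
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    *child_pid = pid;
    return pgid;
}
```

For a pipeline, later children would instead be placed into the group of the first child by passing that child's pid as the second argument to setpgid().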
3.3 Managing Access To The Terminal
Running multiple processes on the same terminal creates a sharing issue: if multiple
processes attempt to read from the terminal, which process should receive the input?
Similarly, some programs - such as vi - output to the terminal in a way that does not allow
them to share the terminal with others.[2]
To solve this problem, Unix introduced the concept of a foreground process group. Each
terminal maintains such a group. If a process in a process group that is not the foreground
process group attempts to perform an operation that would require exclusive access to a
terminal, it is sent a signal: SIGTTOU or SIGTTIN, depending on whether the attempted
use was for an output (write) or input (read) operation. The default action taken in
response to these signals is to suspend the process. If that happens, the process’s parent (i.e.,
your shell) can learn about this status change by calling waitpid() with the WUNTRACED
option. WIFSTOPPED(status) will be true in this case. To allow this process to continue,
its process group must be made the foreground process group of the controlling terminal
via tcsetpgrp(), and then the process must be sent a SIGCONT signal. The state of the
terminal must be saved when the process is suspended and restored when it is continued.
(The shell will typically take this action in response to a ’fg’ command issued by the user.)
Signals that are sent as a result of user input, such as SIGINT or SIGTSTP, are also sent to
a terminal’s foreground process group by the operating system.
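The stop-and-continue cycle can be sketched without a terminal. In the example below, which is an illustration rather than the provided code, the child stops itself with SIGSTOP as a stand-in for being stopped by SIGTTIN, SIGTTOU, or SIGTSTP; the terminal handoff via tcsetpgrp() and the saving and restoring of terminal state are omitted so that the sketch also runs where no controlling terminal is available. stop_then_continue() is a hypothetical name.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Returns 1 if the child was observed as stopped and,
 * after SIGCONT, ran to completion; 0 if not; -1 on error. */
static int stop_then_continue(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        raise(SIGSTOP);     /* stand-in for a terminal-generated stop */
        _exit(42);          /* only reached after being continued */
    }

    /* Without WUNTRACED, waitpid() would not report the stop at all. */
    int status;
    if (waitpid(pid, &status, WUNTRACED) < 0)
        return -1;
    if (!WIFSTOPPED(status))
        return 0;

    /* A real shell would first make the job's process group the terminal's
     * foreground group with tcsetpgrp() and restore the saved terminal
     * state; both steps are omitted in this sketch. */
    if (kill(pid, SIGCONT) < 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```

In a shell with job control, the signal would normally be sent to the whole process group of the job rather than to a single pid.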
3.4 Pipes and I/O Redirection
To implement pipes, use the pipe(2) system call. A pipe must be set up by the parent
shell process before a child is forked. A forked child inherits the file descriptors that
refer to the pipe’s ends. The child must then redirect its stdout or stdin file descriptor to the
pipe’s write or read end, respectively, using the dup2(2) system call.
Note that all processes that are part of a pipeline are children of the shell, e.g., if a user
runs a | b then the process executing b is not a child process of the process executing
the program a.
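A two-command pipeline along these lines might look like the sketch below. It is a simplified illustration: run_pipeline() and exec_stage() are hypothetical names, the out_fd parameter exists only so the result can be captured, and process groups and signal handling are left out.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: wire up stdin/stdout, drop the original pipe descriptors,
 * then exec. A negative in_fd or out_fd leaves that stream untouched. */
static void exec_stage(char *const argv[], int in_fd, int out_fd,
                       const int pipe_fds[2])
{
    if (in_fd >= 0 && dup2(in_fd, STDIN_FILENO) < 0)
        _exit(126);
    if (out_fd >= 0 && dup2(out_fd, STDOUT_FILENO) < 0)
        _exit(126);
    close(pipe_fds[0]);
    close(pipe_fds[1]);
    execvp(argv[0], argv);
    perror("execvp");
    _exit(127);
}

/* Run `left | right`. If out_fd >= 0, right's stdout is redirected to it.
 * Returns the exit status of `right`, or -1 on error. */
static int run_pipeline(char *const left[], char *const right[], int out_fd)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;

    /* Both commands are children of the shell, not of each other. */
    pid_t lpid = fork();
    if (lpid == 0)
        exec_stage(left, -1, fds[1], fds);

    pid_t rpid = fork();
    if (rpid == 0)
        exec_stage(right, fds[0], out_fd, fds);

    /* The parent never uses the pipe itself, so it closes both ends. */
    close(fds[0]);
    close(fds[1]);

    int status = -1, left_status;
    if (lpid > 0)
        waitpid(lpid, &left_status, 0);
    if (rpid < 0 || waitpid(rpid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_pipeline() with the argument vectors for `echo hello` and `tr a-z A-Z` and an out_fd of -1 would let the second command write to the shell's own standard output.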
[2] Note that regular output via write(2) does not require exclusive access, unless the terminal’s ’tostop’
flag is set. The terminal will simply interleave such output.
Generally, a pipeline of commands is considered one job. All processes that form part of
a pipeline should thus be part of the same process group.
Although the parent shell process creates pipes for each pair of communicating children
before they are forked, it will not itself write to the pipes or read from the pipes it creates.
Therefore, you must make sure that the parent shell process closes the file descriptors
referring to the pipe’s ends after each child has been forked. This is necessary for two reasons:
first, to avoid leaking file descriptors; second, to ensure the proper behavior of
programs such as /bin/cat if the user asks the shell to execute them. To see why, we must
first discuss what happens to file descriptors on fork(), close(), and exit().
Each file descriptor represents a reference to an underlying kernel object. Upon fork(),
both the child and the parent process have access to any object the parent process may
have created (i.e., open files or other kernel objects). Closing a file descriptor in the
(parent) shell process affects only the current process’s access to the underlying object. Hence
when the parent shell closes the file descriptors referring to the pipe it created, the child
processes will still be able to access the pipe’s ends, allowing them to communicate with the
other commands in the pipeline.
The actual object (such as a pipe or file) is closed only when the last process that has an
open file descriptor referring to the object closes that file descriptor. If you fail to close
the pipe’s file descriptors in the parent process (your shell), you compromise the
correct functioning of programs that rely on taking action when their standard input stream
reaches end of file. For instance, the /bin/cat program will exit if its standard input
stream reaches EOF, which in the case of a pipe happens if and only if all descriptors
referring to the pipe’s write end are closed. So if cat’s standard input stream is connected
to a pipe for which the shell still has an open file descriptor to the write end, cat will
never “see” EOF for its standard input stream and will appear stuck.
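The EOF rule can be observed within a single process. The sketch below is an illustration with hypothetical function names: a duplicate of the write end plays the role of the copy a shell forgot to close, and the read end is made non-blocking so that "no EOF yet" shows up as EAGAIN instead of a hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if read_fd is at EOF, 0 if a read would block (no data, but
 * some write end is still open), and -1 otherwise. */
static int probe_eof(int read_fd)
{
    char c;
    ssize_t n = read(read_fd, &c, 1);

    if (n == 0)
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}

/* Returns 1 if EOF appears only after the last write end was closed,
 * 0 if it did not behave that way, -1 on setup errors. */
static int eof_needs_all_writers_closed(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;

    int flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    /* Stand-in for the descriptor the shell forgot to close. */
    int forgotten = dup(fds[1]);
    if (forgotten < 0)
        return -1;

    close(fds[1]);                      /* the "real" writer is done */
    int before = probe_eof(fds[0]);     /* a write end is still open */

    close(forgotten);                   /* now the last write end is gone */
    int after = probe_eof(fds[0]);

    close(fds[0]);
    return before == 0 && after == 1;
}
```

With a blocking read end, as cat would use, the first probe would simply never return, which is the "stuck" behavior described above.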
Lastly, note that when a process exits for whatever reason, including a signal, all file
descriptors it had open are closed by the kernel as if the process had called close() before
exiting.
Additional information can be found in the GNU C library manual, available at
http://www.gnu.org/s/libc/manual/html_node/index.html. Read, in particular, the
sections on Signal Handling and Job Control.
4 Use of Git
You will use Git for managing your source code. Git is a distributed version control
system in which every working directory contains a full repository, and thus the system
can be used independently of a (centralized) repository server. Developers can commit
changes to their local repository. However, in order to share their code with others, they
must then push those commits to a remote repository. Your remote repository will be
hosted on git.cs.vt.edu, which provides a facility to share this repository among
group members. For further information on Git in general, you may browse the official
Git documentation: http://git-scm.com/documentation, but feel free to ask questions
on the forum as well! The use of Git (or any distributed source code control system)
may be new to a fair number of students, but it is a prerequisite skill for most
programming-related internships or jobs.
You will use a departmental instance of Gitlab for this class. You can access the instance
with your SLO credentials at https://git.cs.vt.edu/.
The provided base code for the project is available on Gitlab at https://git.cs.vt.edu/cs3214-
One team member should fork this repository by viewing this page and clicking the fork
link. This will create a new repository for you with a copy of the contents. From there
you must view your repository settings, and set the visibility level to private. On the
settings page you may also invite your other team member to the project so that they can
view and contribute.
Group members may then make a local copy of the repository by issuing a git clone
command. The repository reference can be found on the project page; it looks
like git@git.cs.vt.edu:teammemberwhoclonedit/cs3214-esh.git. To clone
over SSH (which you may need to do on rlogin), you will have to add an SSH public
key to your profile by visiting https://git.cs.vt.edu/profile/keys. If you
are unsure how to do this, you may view the provided documentation here: https:
If updates or bug fixes to this code are required, they will be announced on the forum. You
will be required to use version control for this project. For grading purposes, you may
need to give teaching staff read permissions to your repository so that they can assign
a portion of the project credit for making proper use of a version control facility, which
includes continuous check-ins of intermediate milestones.
4.1 Code Base
The code contains a command line parser that implements the following grammar:
cmd_line : cmd_list

cmd_list :
         | pipeline
         | cmd_list ';'
         | cmd_list '&'
         | cmd_list ';' pipeline
         | cmd_list '&' pipeline

pipeline : command
         | pipeline '|' command

command  : WORD
         | input
         | output
         | command WORD
         | command input
         | command output

input    : '<' WORD

output   : '>' WORD
         | '>>' WORD
Look at the provided esh.c main function to see how to invoke the parser. If a command
line is semantically correct, the parser code will create an esh_command_line data
structure, which refers to a list of esh_pipeline structures. Each esh_pipeline
corresponds to a job. It may consist of one or more individual commands that form a pipeline.
Each command is represented as an esh_command structure. Study the definitions of these
By default, the provided code will read a line, parse it, and dump the parsed command
line to stdout.
The file esh-sys-utils.c contains a number of utility functions for dealing with ttys
and signals. We strongly recommend you use these functions rather than directly calling
the functions described in the textbook.
5 Testing
We will provide a test driver to test your project, and tests for the basic and advanced
functionality. The tests may be found on rlogin in
/web/courses/cs3214/spring2020/projects/eshtests/. The basic and advanced
tests are also in the Gitlab repository that you forked to start the project. If updates to the
tests come out you will have to pull from the remote repository to update your local copy.
6 Static and Dynamic Analysis Tools
While we encourage you to utilize the normal debugging practices (such as using gdb,
strace, and printf), we have developed analysis tools designed to flag common errors
that students encounter when programming this project.
These analysis tools—EshMD and ShellTrace—use static and dynamic analysis to reason
about your code.
Static analysis involves looking at your source code without running it to find paths that
could potentially lead to a buggy execution. A static analysis performs symbolic
execution to reason about the possible states your program can reach.
Dynamic analysis runs your code on various test cases and looks at the behavior of your
program to detect bugs. This differs from static analysis because it only looks at how
your code runs for that specific input/test.
Together, the analysis tools will look at your submitted program and point to locations in
your code where your shell is not operating properly. This can be a very useful tool to
determine the reasons your shell is crashing or why a test is failing.
You can submit your project for analysis using these tools by running (in your src folder):
make analysis
or by using the course website (URL).
You can submit your code for analysis as many times as you want, at any point in the
development cycle. This might be useful to catch bugs as you implement more
There will be a small survey at the end to share your thoughts and experiences using the
analysis tools to help debug this project.
7 Plug-Ins
It is often impossible to anticipate the future uses and needs of a system or application.
Extensible architectures address this problem by allowing the loading of plug-ins that
provide additional functionality or enhance built-in functionality.
When started with the ’-p dir’ flag, ’esh’ will dynamically load shared libraries contained
in the directory ’dir.’ Multiple -p flags may be provided. Each shared library must define
a strong global symbol named esh_module, which shall refer to an instance of struct
esh_plugin. This struct contains information about the plug-in, including a set of
function pointers to invoke the plug-in’s functionality.
Multiple plug-ins may be loaded; a plug-in may specify its rank relative to others. Your
shell should invoke the plug-ins’ functions in increasing rank order. If plug-ins share
the same rank, their execution order is not defined. Some functionality (e.g., built-ins)
requires that invocation stop if a plug-in provides this functionality.
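The rank-ordered, stop-on-first-match invocation described above can be sketched as follows. The struct below is a hypothetical stand-in and does not reproduce struct esh_plugin, whose actual fields are defined in the provided code; the field name process_builtin and all function names here are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a loaded plug-in (NOT the real struct esh_plugin). */
struct demo_plugin {
    int rank;
    const char *name;
    bool (*process_builtin)(const char *cmd);   /* true if the command was handled */
};

/* qsort() comparator: increasing rank. */
static int compare_rank(const void *a, const void *b)
{
    const struct demo_plugin *pa = a, *pb = b;

    return (pa->rank > pb->rank) - (pa->rank < pb->rank);
}

/* Sort once after loading so later dispatch can walk the array in order. */
static void sort_plugins(struct demo_plugin *plugins, size_t n)
{
    qsort(plugins, n, sizeof plugins[0], compare_rank);
}

/* Offer `cmd` to each plug-in in rank order and stop at the first one that
 * handles it. Returns that plug-in's name, or NULL if none did. */
static const char *dispatch_builtin(const struct demo_plugin *plugins, size_t n,
                                    const char *cmd)
{
    for (size_t i = 0; i < n; i++) {
        if (plugins[i].process_builtin != NULL && plugins[i].process_builtin(cmd))
            return plugins[i].name;
    }
    return NULL;
}

/* Two toy handlers used to exercise the dispatch logic. */
static bool handle_cd(const char *cmd)  { return strcmp(cmd, "cd") == 0; }
static bool handle_any(const char *cmd) { (void) cmd; return true; }
```

Because qsort() is not guaranteed to be stable, plug-ins of equal rank may end up in either order, which is consistent with the execution order being undefined in that case.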
You will need to make some modifications to your shell to be able to host plugins. We
recommend that you first be able to host plugins on your shell before attempting to write
Here are some ideas for plug-ins:
• Change current directory (cd)
• Glob expansion (e.g., *.c)
• Setting and unsetting environment variables
• Timing commands: “time” or time-outs.
• Aliases
• Shell variables
• pushd, popd, etc.
• Command-line history (perhaps using the GNU History library)
• Backquote substitution
• Smart command-line completion
• Embedding applications: scripting languages, web servers, etc.
A side-note on Unix philosophy - in general, Unix implements functionality using many
small programs and utilities. As such, built-in commands are often only those that must
be implemented within the shell, such as cd. In addition, essential commands such as
’kill’ are often built-in to make sure an operator can execute those commands even if no
new processes can be forked. Your plug-ins should generally stay with this philosophy
and implement only functionality that is not already available using Unix commands or
that would be better implemented using separate programs. If in doubt, ask.
You will note that the functions to read from the terminal and to parse the command
line are invoked indirectly as function pointers that are part of esh_shell. Advanced
plug-ins may replace those if desired.
8 Honor Code
You will receive credit for every plug-in you write, and for every plug-in written by others
which your shell can successfully load and run. You should publish plug-ins you have
developed on the forum.
It is ok to sit together and debug a situation that arises if a plug-in written by one group
does not run successfully in another group’s shell.
However, you may not share any code - electronically or otherwise - for the shell or a plug-in
- across groups. To allow others access to your plug-ins, we provide a shared location to
which you can copy them. Create a directory named after your SLO id in
/web/courses/cs3214/spring2020/projects/student-plugins. For each plugin
you wish to share, create a subdirectory within that directory, e.g. gback/cd, gback/glob,
etc. Copy the .so file into that subdirectory, but do not include the corresponding .c file. In
addition, provide a description of the plugin as a .txt file and a Python test for the plugin,
as described below.
In addition, note that the code contained in the plug-ins you load will run with the full
privileges of the user executing the shell. In practice, this setup requires that you trust the
provider of the plug-in. The “Acceptable Use of Information Systems” policy, published
at http://www.vt.edu/about/acceptable-use.html, applies. If you are in doubt
whether a plug-in you’ve written would violate this policy, please ask first.
9 Grading
Rubrics. This project will account for 140 points: 50 points will be assigned for passing
the base tests, 50 points for passing the advanced tests, and up to 20 additional points
can be earned through plug-ins.
You may earn points for plug-ins only if you pass at least 50% of advanced tests. At
least two of the advanced functionalities (IO redirection, pipes, exclusive access) should
be implemented and sufficiently tested before we award credit for plug-ins. This rule
ensures that you focus on the core of the assignment before writing plug-ins.
10 points are awarded for correct use of version control, and 10 points for documentation.
In addition, deductions may be taken for deficiencies in coding style and lack of
Coding Style. Your coding style should match the style of the provided code. You
should follow proper coding conventions with respect to documentation, naming, and
You must check the return values of all system calls and library functions, with the sole
exception of malloc(3). (Production code would need to check for those as well; this is a
simplification for this project.) This includes calls such as kill(2) and close(2).
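One way to keep such checks from cluttering the code is a pair of small wrappers, sketched below. The names checked_close() and checked_kill() are hypothetical, and whether to report, retry, or abort on failure is a design decision for your shell; this sketch merely reports the error and passes the failure on.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: close fd and report a failure on stderr.
 * Returns 0 on success, -1 on failure. */
static int checked_close(int fd)
{
    if (close(fd) < 0) {
        fprintf(stderr, "close(%d): %s\n", fd, strerror(errno));
        return -1;
    }
    return 0;
}

/* Hypothetical wrapper: send sig to pid and report a failure on stderr.
 * Returns 0 on success, -1 on failure. */
static int checked_kill(pid_t pid, int sig)
{
    if (kill(pid, sig) < 0) {
        fprintf(stderr, "kill(%ld, %d): %s\n", (long) pid, sig, strerror(errno));
        return -1;
    }
    return 0;
}
```

Closing the same descriptor twice, for example, makes the second checked_close() call fail with EBADF, which often points to a bookkeeping mistake in pipe handling.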
Submission. You must submit a design document, README.txt, as an ASCII document
using the following format to describe your implementation:
Student Information
How to execute the shell
Important Notes
Description of Base Functionality
jobs, fg, bg, kill, stop, ^C, ^Z >
Description of Extended Functionality
I/O, Pipes, Exclusive Access >
List of Plugins Implemented
(Written by Your Team)
(Written by Others)
For each plugin that you implement, include the following files with this naming standard:
• group#_pluginName.so (the shared library of your plugin)
• group#_pluginName_readme.txt (the readme file of your plugin)
• group#_pluginName_test.py (the .py test file that tests your plugin)
The TA will assign credit only for the functionality for which test cases and documentation exist.
Additional points may be awarded for plugins that implement advanced and useful functionality.
You must submit a .tar.gz file of your ’src’ directory, which contains a Makefile. Please
use the submit.pl script or web page and submit as ’p1’. Only one group member may
submit. You need to run ’make clean’ on your directory before you create your tarball.
Make sure to also delete all temporary folders and files (i.e., reduce your submission to
the pertinent files).
You can earn up to 20 additional points for plugins. You can score up to 6 points for
running the reference plugins created by the course staff; partial credit will be awarded
for the number of reference plugins that successfully execute on your shell. You can write
up to 3 plugins (for a grade) to be run by other students. For each of the plugins written
you can score up to 2 points based on the quality of the plugin, documentation, and tests.
You can receive 1 point for every plugin written by another group that you are able to run
in your shell, up to a maximum of 8 points. Please keep in mind that the emphasis is on
mastering process control in Unix - make sure your shell passes all tests before attempting
Good Luck!
