
DEPARTMENT OF COMPUTER SCIENCE
SUMMATIVE COURSEWORK ASSIGNMENT BRIEF
KEY INFORMATION
• Module title: Data Security & Ethics
• Module code: CSMDE21
• Lecturer responsible: Dr Martin Lester
• Type of assignment (coursework/online test): Coursework — Data Security
• Individual/group assignment: Individual
• Weighting of the assignment: 50%
• Page limit/word count: None, but roughly equivalent to 4 pages of A4
• Expected hours spent on this assignment: 8 hours in practicals + 2 hours independently
• Items to be submitted: .zip or .tar.gz archive as described
• Work to be submitted on-line via Blackboard Learn by: 12:00 midday on Fri 24th Mar 2023
• Work will be marked and returned by: Tue 18th Apr 2023
PLAGIARISM
By submitting this assignment, you are certifying that it is all your own work. Any sentences, figures,
tables, equations, code snippets, artworks and illustrations in this report must be original and must
not have been taken from any other person’s work, except where explicitly acknowledged, quoted,
and referenced. You understand that failing to follow this requirement will be considered plagiarism.
Plagiarism is a form of academic misconduct and will be penalised accordingly. The University’s
Statement of Academic Misconduct is available on the University web pages.
LATE SUBMISSION
If your work is submitted after the deadline, 10% of the maximum possible mark will be deducted for
each working day (or part thereof) it is late. A mark of zero will be awarded if your work is submitted
more than 5 working days late. It is strongly recommended that you hand work in by the deadline as a
late submission of one piece of work can have an impact on other work.
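As a worked illustration of the late-submission arithmetic above, the following Python sketch computes a penalised mark. The function name late_mark, the rounding of part-days upwards, and the decision to clamp the result at zero are assumptions made for this example. They are not taken from the University's regulations.

```python
import math


def late_mark(raw_mark: float, working_days_late: float, max_mark: float = 100.0) -> float:
    """Illustrative only: apply the late penalty described above.

    raw_mark          -- mark awarded before any penalty
    working_days_late -- working days past the deadline (may be fractional)
    max_mark          -- maximum possible mark (assumed to be 100)
    """
    if working_days_late <= 0:
        return raw_mark
    # "each working day (or part thereof)": a part-day counts as a whole day.
    days = math.ceil(working_days_late)
    if days > 5:
        # More than 5 working days late: a mark of zero is awarded.
        return 0.0
    penalty = max_mark * days / 10  # 10% of the maximum mark per day
    # Assumption: the penalised mark cannot fall below zero.
    return max(0.0, raw_mark - penalty)


print(late_mark(65, 0.5))  # half a working day late counts as one day
```

Under these assumptions, work marked at 65 and submitted two working days late would receive 45.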
If you believe that you have a valid reason for failing to meet a deadline, then you should complete an
Exceptional Circumstances Form and submit it to the Student Support Centre before the deadline, or
as soon as is practicable afterwards, explaining why.
1. ASSIGNMENT DESCRIPTION
SUMMARY
This assignment is split into two parts.
1. You will be supplied with disk images for three virtual machines, called company, router
and internet. Connect the virtual machines using a virtual network, with router serving as
the router joining company to internet. Use the tools installed on internet to scan company.
Then configure a firewall on router to block potential attacks from internet and to prevent
unintentional or undesired leaks of data to internet. Submit the firewall configuration and
evidence that the firewall works.
You must not connect the virtual machines to the University network or the public Internet. If you port scan the University network, you may be subject to disciplinary procedures.
2. You will be supplied with a type-checker for a subset of Python. You must implement a security
type system. Submit the source code file you changed and an image showing a type derivation
for a short program.
TASK DESCRIPTION (PART 1)
Scenario
You are an employee of a company and have been asked to secure the company’s network by configuring
a firewall on the router that connects the company’s network to the Internet. As well as your own
machine within the company network (company), you have access to some other machine elsewhere
on the Internet (internet), which you can use to scan the company’s network.
Setup
Download the disk images for company, router and internet. Unzip them; each should be about 60
MB in size uncompressed. Create a new virtual machine from each disk image. The operating system
type/version in VirtualBox should be set to Linux 2.x/3.x/4.x (32-bit).
Important: Before you start the virtual machines, you must ensure they are connected to an internal
virtual network. For each machine, go to Settings, select Network and look at the Adapter tabs:
• company:
– Adapter 1: Change “Attached to:” from “NAT” to “Internal Network” and enter the network
name inside.
• internet:
– Adapter 1: Change “Attached to:” from “NAT” to “Internal Network” and enter the network
name outside.
• router:
– Adapter 1: Change “Attached to:” from “NAT” to “Internal Network” and enter the network
name inside.
– Adapter 2: Tick “Enable Network Adapter”. Change “Attached to:” from “NAT” to
“Internal Network” and enter the network name outside.
All the virtual machines run a minimal version of Linux called TinyCore Linux. This keeps the disk
images and memory requirements small, so you can easily run all of them at once on a single machine.
Most changes made to the virtual machines will not persist across reboots. Unless you have specifically
tried to back up any changes to the configuration or files stored, they will disappear as soon as you
reboot the virtual machine. This means that you do not have to worry too much about destroying your
setup. However, you will want to keep a separate record of anything interesting you do.
One way you can do this is using the menu option View > Take Screenshot. But for the purposes of this
coursework, it is better to set up a serial port. With each virtual machine in turn, go to Settings, select
Serial Ports and look at the Port 1 tab. Tick “Enable Serial Port”, set Port Mode to “Raw File” and
enter the name of a file (such as company.txt, internet.txt or router.txt) you would
like to store your work in. Then, when using the virtual machine, you will be able to copy output from
a command into the file by redirecting it to /dev/ttyS0. For example, to save the output of the
command ifconfig, you would type ifconfig > /dev/ttyS0.
Start all three virtual machines. You will be automatically logged in as a normal user on company,
router and internet. The virtual machines have no GUI installed, so you will have to enter commands
directly from the command line. Many commands will only work when logged in as the administrator
(root). You can execute a single command as administrator by prefixing it with sudo. Or you can
switch to root for a longer period (until you type exit) by typing sudo su. This is not normally
recommended from a security perspective for two reasons. Firstly, sudo can log every command
used, so it can be reviewed and potentially audited later; with sudo su, only the su command gets
logged. Secondly, being logged in as root when it is not necessary increases the risk that you will
accidentally destroy or damage data by carelessly entering the wrong command or by running malware.
However, as there is no need for your work to be audited and no interesting data for you to destroy,
you may safely work as root for this coursework.
Check that everything is set up correctly by typing sudo ping internet from company and by
typing sudo ping company from internet. You should see a series of ping responses. You can
terminate the ping command (and many other commands) by pressing Ctrl and C simultaneously
(often abbreviated as Ctrl+C).
When you have finished working on a virtual machine, you can shut it down cleanly by typing sudo
poweroff or choosing the menu option Input > Keyboard > Insert Ctrl-Alt-Del.
Subtask 1: Port scanning (8%)
Before you start to configure the firewall to defend the network, you should scan it to find out what
you need to defend.
1. From internet, use the supplied port scanner to collect a list of open ports (corresponding to
network services) running on company. Try to find an option that will show open ports and
versions of software listening on those ports, while keeping the output relatively brief (at most
25 lines).
• In a file called 01-scan.txt, log the command you typed and the output it produced.
2. In reality, there would likely be many machines on the company network running different
services and you might configure the firewall to treat some of them differently. What command
could you use to scan the whole network range to which company is attached, not just one
machine?
• Add the command you would use to the end of 01-scan.txt.
Before you move to the next task, you should log the state of the firewall:
• From router, run sudo ufw show added and log the output in 01-firewall.sh.
• From router, run sudo ufw status verbose and log the output in 01-firewall.txt.
Subtask 2: Blocking incoming connections (16%)
On router, use the firewall configuration command ufw to turn on the firewall and set the following
policy for handling incoming connections from internet to company:
1. By default, incoming connections should be denied.
Then, if someone accidentally starts running an insecure server program or malware on the
company network, no-one outside will be able to connect to it.
2. Connections to encrypted HTTP (HTTPS) should be allowed.
This will allow the company’s website to be visible from the Internet.
3. Connections to unencrypted HTTP should be rejected, rather than denied.
Various organisations are currently encouraging webservers to offer HTTPS only and some
browsers now default to this. But this is not universal. If a visitor to a site tries to connect to
HTTP, but the site operator only wants to offer HTTPS, perhaps the best option is to configure the
webserver to redirect to the HTTPS version of the site. However, if changing the configuration
of the webserver is not possible, it may be friendlier to reject the connection (so an error appears
immediately) rather than denying it (which may make the visitor think the site is offline).
4. Connections to the SSH server should be allowed, but rate-limited.
This will allow encrypted login from outside, but will thwart dictionary attacks that attempt to
guess passwords by trying many likely options from a list.
• Repeat your scan from subtask 1 and log it in 02-scan.txt.
• Log the firewall rules and status in 02-firewall.sh and 02-firewall.txt, as in subtask 1.
You can compare the results of your scan with those from subtask 1 to check that you have implemented
points 1-3 correctly. (Not for credit: Think about how to check point 4.)
Note that the output from ufw show added is in the same format that you used to set up the firewall,
so you can refer back to it if you need to restore the firewall after rebooting.
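The difference between a rejected and a denied connection (point 3 above) can be seen from the client's side. The Python sketch below is for understanding only. It assumes a machine with Python 3 available, which may not be the case on the supplied virtual machines, and the function name probe and its return labels are invented for this example. A refused connection returns almost immediately, whereas a silently dropped one only fails when the timeout expires.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a single TCP connection attempt from the client's point of view."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # An active refusal arrived straight away: the port is closed,
        # or something on the path answered with a TCP reset.
        return "refused"
    except socket.timeout:
        # Nothing came back before the timeout: packets were silently
        # dropped, or the host is unreachable.
        return "no response"
    except OSError:
        # Any other failure (for example, the name could not be resolved).
        return "error"


if __name__ == "__main__":
    # Hypothetical usage; replace the host and port with ones you are allowed to test.
    print(probe("127.0.0.1", 80))
```

A port scanner draws the same distinction, usually by reporting a port as closed when it is refused and as filtered when no response arrives.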
Subtask 3: Blocking outgoing connections (4%)
Set the following firewall policy for connections from company to internet:
1. By default, outgoing connections should be allowed. Users of a network often get upset or
frustrated when their network connections are blocked. In some situations, such as in a school
computer room for children, this may be appropriate, but it is otherwise unproductive. Annoyed
users may attempt to circumvent overzealous controls, for example by using a commercial VPN,
or by tethering their computer to a mobile phone, with the result that reasonable controls in the
firewall are also circumvented.
Connections to telnet should be rejected. Telnet is an unencrypted remote login protocol, which
has almost entirely been replaced by SSH. A networked eavesdropper can easily see usernames
and passwords sent over a telnet connection. Nonetheless, some systems still offer telnet login
and some users may try telnet first, even when SSH is available, out of habit. Such habits are
difficult to break other than by force.
• From company, scan internet, using the same technique as in subtask 1. Log the output in
03-scan.txt.
• Log the firewall rules and status in 03-firewall.sh and 03-firewall.txt, as in subtask 1.
Subtask 4: Locking down the router (12%)
Set the following firewall policy for connections to and from router:
1. By default, all connections should be denied.
2. SSH connections from company should be allowed, but rate-limited. This will allow configuration of the router and firewall.
3. HTTPS connections from router to internet should be allowed. The router might occasionally
need to download operating system updates from the Internet, including security patches. It will
need to make outgoing connections to do this.
• Log the firewall rules and status in 04-firewall.sh and 04-firewall.txt, as in subtask 1.
In a real situation, you probably would not be able to log in to router directly, as routers and other
servers often run in a physically inaccessible location, such as a data centre, server room or cupboard,
with no attached keyboard or monitor. In that case, you would have to be careful not to block yourself
from accessing the router during this task.
Subtask 5: Final checks (8%)
1. Check that company and router still respond to ping from internet. Blocking ping may have
some minor security benefits, but this has to be weighed against the cost of making it harder to
monitor and diagnose faults in the network.
Check that company is still able to download files via FTP from internet. FTP was once the
main method of downloading large files (such as software) from the Internet. Google recently
removed FTP support from the Chrome browser, but it is still fairly common. While many
common network protocols operate using a single TCP connection from client to server on a
fixed port, this is not universally true. Some protocols make connections in both directions and
use a range of ports, which makes it harder to control them with a firewall.
If either of these checks fails, change the firewall configuration so that they pass.
• In a file called 05-checks.txt, describe in English what you did to check the above.
• Log the firewall rules and status in 05-firewall.sh and 05-firewall.txt, as in
subtask 1.
2. After several weeks, you realise that router has not downloaded any operating system updates,
despite updates being available and a weekly automatic update process being scheduled. Why
might this be?
• At the bottom of 05-checks.txt, suggest what the problem might be and how you
might change the firewall to fix this.
Hints
Here are some programs that are available, some of which will be useful:
ls less nano ifconfig
nmap links ssh ftp
telnet nc sudo su
mv cp rm ufw
If you run a program with the option -h or --help (for example, ls --help), it will often
display a short summary of how to use it. Normally, you would be able to use the program man to
display a more detailed manual for any program, but these “manpages” are not included in TinyCore
Linux to keep the disk image small. Instead, you can view manpages from Debian Linux here:
https://manpages.debian.org/
Remember that cd is used to change the current working directory, although you probably will not
need to do so.
TASK DESCRIPTION (PART 2)
Introduction
This part of the coursework will guide you through adding support for an Information Flow Control
type system to a type-checker for a subset of Python. The type system will enforce noninterference
in functions whose names begin with h or l, treating variables whose names begin with h as High.
Only the most basic imperative programming features will be handled, namely assignment, while,
if, local variables and arithmetic expressions. Breaking the typing rules (for example, by writing a
High value to a Low variable) or attempting to use other features (such as function calls) will result in
a type-checking error.
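To make the idea of noninterference concrete, here is a small sketch in ordinary Python (not the restricted subset, so the supplied type-checker may not parse it as written). It follows the naming convention above: names beginning with h are High and the rest are Low. All function and variable names are invented for this example. The helper low_outputs_agree is a run-time experiment over a few inputs, not a type system; it only shows what the type system is meant to rule out.

```python
def l_explicit(h_secret):
    # Explicit flow: a High value is copied directly into a Low variable.
    l_out = h_secret
    return l_out


def l_implicit(h_secret):
    # Implicit flow: no High value is assigned to l_out, but which
    # assignment runs depends on a High value.
    l_out = 0
    if h_secret > 0:
        l_out = 1
    return l_out


def l_safe(h_secret, l_public):
    # Low data may flow into High variables, but not the other way round.
    h_tmp = h_secret + l_public
    l_out = l_public * 2
    return l_out


def low_outputs_agree(f, l_args, h_values):
    """Return True if f gives the same Low result for every High input tried."""
    results = {f(h, *l_args) for h in h_values}
    return len(results) == 1


print(low_outputs_agree(l_explicit, (), [0, 1]))    # False: the secret leaks
print(low_outputs_agree(l_implicit, (), [0, 1]))    # False: the branch leaks
print(low_outputs_agree(l_safe, (5,), [0, 1, 2]))   # True: Low output ignores High input
```

A security type system should reject the first two functions without running them and accept the third.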
Setup
The practical is based around modifying a type-checker. The type-checker is written in Python, so you
will need a Python 3 installation. It uses a parser generated by the Java parser generator ANTLR, but
the generated Python parser and the accompanying ANTLR Python runtime library are provided, so
you should not need to run ANTLR or install Java for the coursework.
Download and extract the archive python-ifc.tar.gz from Blackboard.
The Security Type-Checker
While Python is a dynamically-typed language, it is possible to write statically typed programs and
type-check them using a separate tool. This reduces the likelihood of run-time type errors.
The type system used for type-checking depends on the tool. In this coursework, you are provided
with an incomplete security type-checker, which you need to finish. In order to keep the parsing and
type-checking as simple as possible, the type-checker recognises only a very minimal subset of Python.
It will be sufficient for the test cases included, but you should not expect to be able to run it on ordinary
Python programs.
First, check that the type-checker is working correctly on your computer. Run python3
python-ifc.py simple.py. The program should finish with the message OK: parsing
and type-checking successful. This tells you that the sample program simple.py was
parsed correctly. If you got some other error, your environment is not set up correctly.
Look at the source code for simple.py. You might notice that it looks stylistically different from
other Python programs you have encountered; this is necessary to stick to the restrictive grammar.
Have a look at the grammar in Python3Parser.g4 and satisfy yourself that the test program you
compiled was syntactically valid. You will not need to modify the grammar, but you will need to
understand how different language features correspond to different syntactic classes in the grammar.
Look through the source file python-ifc.py and familiarise yourself with its structure. To add
support for the information flow control type system, you will only need to modify the parts marked
with comments saying “FILL IN HERE”.
Type-checking is implemented in SecurityTypeCheckerListener using ANTLR’s Listener API. ANTLR
provides a class called ParseTreeWalker that traverses a syntax tree from top to bottom and from left to
right. The Listener API allows a class to define functions that are called when the ParseTreeWalker first
enters a syntax tree node (from its parent node) and when it exits a syntax tree node (after traversing
all its children). These functions have names beginning enter... and exit... respectively.
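The order in which enter and exit functions are called can be sketched with a toy tree walker. Node, TraceListener and walk are stand-ins written for this example. They are not ANTLR's classes, and a real ANTLR listener has a separate pair of functions for each grammar rule rather than one generic pair.

```python
class Node:
    """Toy syntax tree node: a label plus zero or more children."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


class TraceListener:
    """Records every enter and exit call in the order it happens."""

    def __init__(self):
        self.events = []

    def enter(self, node):
        self.events.append("enter " + node.label)

    def exit(self, node):
        self.events.append("exit " + node.label)


def walk(listener, node):
    # Enter the node, visit its children left to right, then exit the node.
    listener.enter(node)
    for child in node.children:
        walk(listener, child)
    listener.exit(node)


# Tree for the expression x + (y * 2).
tree = Node("+", [Node("x"), Node("*", [Node("y"), Node("2")])])
listener = TraceListener()
walk(listener, tree)
print("\n".join(listener.events))
```

Every node is exited only after all of its children have been exited, which is why exit functions are the natural place to combine results from child nodes.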
The idea is that the type-checker maintains a stack of types, which it uses during type-checking. The
convention it should follow is that, after exiting a syntax tree node for a statement, block or expression,
the top of the stack must contain its type. There is an exit... function for each kind of syntax tree
node, which pops off the types of any child statements, blocks or expressions, checks that they are
compatible with whatever the node does, then pushes the type of the node onto the stack. If ever it is
impossible to give a node a valid type, the function handling the node should raise a SecurityException.
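The stack convention can be illustrated on a different problem: evaluating arithmetic instead of computing security types. In this sketch, each exit call pops the results of the node's children and pushes the node's own result, and raises an exception when no valid result exists. Node, EvalListener and walk are hypothetical names for this example and are unrelated to the supplied python-ifc.py.

```python
class Node:
    """Toy syntax tree node: a label plus zero or more children."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


class EvalListener:
    """Keeps a stack holding the result of every node exited so far."""

    def __init__(self):
        self.stack = []

    def enter(self, node):
        pass  # nothing to do on the way down

    def exit(self, node):
        if not node.children:
            # Leaf: push its own value.
            self.stack.append(int(node.label))
            return
        # Children were visited left to right, so the right operand's
        # result is on top of the stack and must be popped first.
        right = self.stack.pop()
        left = self.stack.pop()
        if node.label == "-":
            self.stack.append(left - right)
        elif node.label == "*":
            self.stack.append(left * right)
        else:
            raise ValueError("unsupported operator: " + node.label)


def walk(listener, node):
    listener.enter(node)
    for child in node.children:
        walk(listener, child)
    listener.exit(node)


# Tree for the expression 10 - (2 * 3).
tree = Node("-", [Node("10"), Node("*", [Node("2"), Node("3")])])
listener = EvalListener()
walk(listener, tree)
print(listener.stack)  # [4]
```

Subtraction is used here because it makes the pop order visible: swapping left and right would give -4 instead of 4.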
At the moment, most of the functions in SecurityTypeCheckerListener are empty stubs. You will need
to fill in the stubs in order to implement the typing rules. There is also a class called Level, which
represents the levels in a security lattice. You will need to fill in a couple of stubs there too.
If you are unsure what rule to implement, you can consult the lecture slides or (for example) Andrew
Myers’ Proving noninterference for a while-language using small-step operational semantics. You
will need to adapt the rules to Python syntax. You will also need to consider how to adapt the rules for
commands, which are written for use “bottom-up”, from conclusion to premises. The type-checker will
work in the opposite direction, starting from the syntax tree leaf nodes and working towards the root.
Subtask 1: Defining LUB and GLB operations (8%)
The security type system supposes that we have a known lattice of security levels. We will restrict
ourselves to the simplest interesting lattice, with just two levels, Low and High, with Low less than
High.
The class Level represents this within the type-checker. It defines two constants, Low and High, for Low
and High security respectively. It also defines a comparison method le(l1, l2), which returns true
if l1 is less than or equal to l2 in the security lattice. Look at the definition to make sure it makes
sense to you.
There are two stubs of functions that you need to complete here: lub(l1, l2) and glb(l1,
l2).
1. The function lub(l1, l2) should return the least upper bound of l1 and l2. That is, it
should return the lowest possible level l3 such that le(l1, l3) and le(l2, l3) both hold.
2. Conversely, glb(l1, l2) should return the greatest lower bound of l1 and l2. That is, it
should return the greatest possible level l3 such that le(l3, l1) and le(l3, l2) both hold.
If the definitions of lub() and glb() sound overly technical, you might find it helpful to think of
lub() as being similar to max() and glb() as being similar to min().
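The definitions apply to any lattice, not just the two-level one. As an illustration on a different lattice, the sketch below orders sets by inclusion, where the least upper bound is the union and the greatest lower bound is the intersection. The function names (subset_le, subset_lub, subset_glb, is_lub, is_glb) are invented for this example and are unrelated to the Level class. The two checker functions test the defining properties directly by brute force.

```python
from itertools import combinations


def powerset(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def subset_le(a, b):
    return a <= b  # a is below b when a is a subset of b


def subset_lub(a, b):
    return a | b   # union


def subset_glb(a, b):
    return a & b   # intersection


def is_lub(candidate, a, b, universe):
    """candidate is an upper bound of a and b, and lies below every other upper bound."""
    upper = [u for u in universe if subset_le(a, u) and subset_le(b, u)]
    return candidate in upper and all(subset_le(candidate, u) for u in upper)


def is_glb(candidate, a, b, universe):
    """candidate is a lower bound of a and b, and lies above every other lower bound."""
    lower = [x for x in universe if subset_le(x, a) and subset_le(x, b)]
    return candidate in lower and all(subset_le(x, candidate) for x in lower)


levels = powerset({"alice", "bob", "carol"})
ok = all(
    is_lub(subset_lub(a, b), a, b, levels) and is_glb(subset_glb(a, b), a, b, levels)
    for a in levels
    for b in levels
)
print(ok)  # True
```

In this lattice some pairs, such as {alice} and {bob}, are incomparable, so max() and min() would not make sense. In a lattice where every pair is comparable, the analogy with max() and min() is exact.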
Subtask 2: Type-checking expressions (16%)
Look at the class SecurityTypeCheckerListener. The class already has code to activate security
type-checking only in functions whose names begin with l or h. If the function name begins with l, it
expects that the value returned by the function is Low.
You will see that SecurityTypeCheckerListener has stub functions for different Python statements and
expressions. The stub methods should compile fine, but will give an error if used at run time. Begin by
completing the stubs for expressions.
1. Finish the stubs for the constants True and False and for numerical constants.
2. Finish the stub for local variable lookup.
3. Finish the stub for binary operations (such as +, * and the Boolean operator and).
4. Finish the stubs for logical NOT (not) and grouping using parentheses.
Hints:
• If you have a parse tree context (ctx) for a node that, according to the Python grammar, has one
name as a child (and possibly other, non-name children), you can get the name as a string with
ctx.name().getText().
• If you have a string (id) naming a variable, you can get the security level of the variable with
varLevel(id).
• Use self.types.pop() to pop a security type (Level) off the type stack and
self.types.append(l) to push a security type (l) onto it.
• You can use Level.lub() and Level.glb() if you think you need them.
• Where there is a choice, the type-checker should give an expression the lowest possible level.
After you implement type-checking for each kind of expression, run the type-checker on a test program
from the directory tests/. Check that the outcome is what you would expect: either a type-checking
error or success, depending on the test program. There is no need to run the test program.
Subtask 3: Type-checking statements (16%)
Now you can complete the stubs for Python statements and blocks.
1. Finish the stub for assignment to a variable.
2. Finish the stub for while loops.
3. Finish the stub for if statements. Think carefully about how to pick the type for the whole
statement.
4. Finish the stub for blocks (multiple statements in an if or while). Think carefully about what
the type of an empty block would be, if it were allowed.
Hints:
• If you have a parse tree context (ctx) for a node that, according to the Python grammar, can
have multiple statements as children, you can get them as a list with ctx.stmt().
• Remember that stacks are first-in last-out (FILO), and that ANTLR’s ParseTreeWalker visits a
node’s children from left to right.
• You can use Level.le() if you think you need it.
• Where there is a choice, the type-checker should give a statement the highest possible level.
As in the previous exercise, test the type-checker after each step.
Subtask 4: Type-checking by hand (8%)
In the lecture, you saw how to use the rules of a type system to type-check a program by drawing a
type derivation.
1. Draw a type derivation for the following program (in the simple imperative language shown on
the slides):
if l = 10 then h := l * 2 else skip
You should assume that l is Low and h is High.
2. Check that your derivation allocates the most general types possible. Expressions should be Low
if possible. Commands should be High if possible.
You can draw your type derivation by hand using a pen and paper and scan it, or draw it on a computer
using any suitable program (such as Inkscape, MS Paint, Xournal or LaTeX with the bussproofs package).
Save your drawing as a PNG, JPEG or PDF called derivation.png, derivation.jpeg or
derivation.pdf.
ADDITIONAL INFORMATION
Resources supplied or required
You can download the disk images company.vdi, router.vdi and internet.vdi, as well as
the source code archive python-ifc.tar.gz, from Blackboard.
For this coursework, you can either work on your own computer, or on the workstations in the computer
rooms in the Polly Vacher Building (available through a remote access service).
If you use your own computer, for part 1 you will first need to install Oracle VM VirtualBox, which
you can download from www.virtualbox.org or (for Linux users) through your distribution’s package
manager (for example, sudo apt-get install virtualbox). For part 2 you will need a
Python installation.
Practicals
There are 8 x 1-hour practicals timetabled for completing this coursework: 4 for Part 1 and 4 for Part 2.
You are advised to attend these practicals, as this will be the easiest way to obtain support. The lecturer
and a student demonstrator will be available to help. Please ask us if you are stuck. We will not tell
you exactly what to do, but we will try to provide you with guidance.
You may also wish to discuss the coursework with other students in practicals. This is both permitted
and encouraged, but please remember that this is individual coursework. Any output from a virtual
machine or code you submit must be generated by you. Every sentence submitted in answer to a
question must be written by you.
To stay on target to submit by the deadline, you are advised to follow this schedule:
• Week 2: Set up virtual machines in practical. Do part 1, subtask 1.
• Week 3: Do subtask 2.
• Week 4: Do subtasks 3 and 4.
• Week 5: Do subtask 5. Ensure part 1 is complete after the practical.
• Week 7: Set up type-checker in practical. Do part 2, subtask 1.
• Week 8: Do subtask 2.
• Week 9: Do subtask 3.
• Week 10: Do subtask 4. Ensure part 2 is complete after the practical.
Please ask questions about coursework in practicals if you can, but you are also welcome to ask
questions during the lecturer’s drop-in office hours. Support for part 1 will not be available after
the end of week 5.
2. ASSIGNMENT SUBMISSION REQUIREMENTS
FRONT PAGE
Enter the following information alongside your submission in Blackboard:
• Module code: CSMDE21
• Assignment report title: Data security
• Student Number (for example, 25098635):
• Date of completion:
• Actual time spent on the assignment (hours):
We will use information about how long you spent on the assignment when we review and balance
coursework between modules for later years. An exact answer is not necessary, but please try to give a
reasonable approximation.
ASSIGNMENT CONTENT
You should submit your work as a .zip or .tar.gz archive through Blackboard, following the
instructions at the submission point.
Make sure you include every file you were asked to create in part 1, the Python source file you edited
in part 2 (python-ifc.py), and your typing derivation (which could be a .png, .jpeg or .pdf). Here is a list of
the expected files, assuming your typing derivation is a .png:
01-firewall.sh 01-firewall.txt 01-scan.txt
02-firewall.sh 02-firewall.txt 02-scan.txt
03-firewall.sh 03-firewall.txt 03-scan.txt
04-firewall.sh 04-firewall.txt
05-firewall.sh 05-firewall.txt 05-checks.txt
python-ifc.py
derivation.png
You may put the files in separate directories within the archive for tidiness if you would like, but this is
entirely optional. You may also include a file called readme.txt if you feel the need to include any
other information with your submission, but this is not expected.
3. ASSESSMENT CLASSIFICATIONS
This coursework assesses your ability to: scan a system using a port scanner; configure a firewall; and
implement an information flow control type system.
You will gain credit for:
• providing logs to demonstrate successful configuration of a firewall;
• answering technical questions about port scanning and firewall configuration;
• successfully implementing information flow typing rules;
• writing or drawing a correct typing derivation tree;
• following instructions about the format of your submission.
Your assignment will be marked according to the mark scheme outlined in Section 4. The mark scheme
is designed so that the mark obtained in this way will correspond to the following qualitative degree
classification descriptions:
Degree Classification          Description
First Class (>= 70%)           Excellent
Upper Second Class (60-69%)    Good
Lower Second Class (50-59%)    Satisfactory
Third Class (40-49%)           Poor
Pass (35-39%)                  Very Poor
Fail (0-34%)                   Inadequate
4. MARKING SCHEME
There are 12 points to answer in part 1 and 12 points to answer in part 2. Each numbered point is
explained in detail in the task description (Section 1) and summarised below. 4% will be awarded for a
correct answer to each point. 2% will be awarded for a reasonable but incomplete or incorrect attempt.
A final 4% will be awarded for submitting an archive in the correct format (.zip or .tar.gz) and
containing the correct files.
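As a quick check of the arithmetic, the Python sketch below tallies the scheme. The point counts per subtask are taken from the lists in this section, while the variable and function names are invented for this example.

```python
POINT_WEIGHT = 4        # percent awarded for each correctly answered point
SUBMISSION_FORMAT = 4   # percent for a correctly formatted archive

# Number of numbered points in each subtask.
part1 = {"Subtask 1": 2, "Subtask 2": 4, "Subtask 3": 1, "Subtask 4": 3, "Subtask 5": 2}
part2 = {"Subtask 1": 2, "Subtask 2": 4, "Subtask 3": 4, "Subtask 4": 2}


def total_percent():
    """Total marks available across both parts plus the submission format."""
    points = sum(part1.values()) + sum(part2.values())
    return POINT_WEIGHT * points + SUBMISSION_FORMAT


print(sum(part1.values()), sum(part2.values()), total_percent())  # 12 12 100
```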
Part 1
• Subtask 1: Port scanning (8%)
1. Host port scanning (4%)
2. Subnet port scanning (4%)
• Subtask 2: Blocking incoming connections (16%)
1. Default policy (4%)
2. HTTPS OK (4%)
3. HTTP blocked (4%)
4. Rate-limited SSH (4%)
• Subtask 3: Blocking outgoing connections (4%)
1. Telnet only blocked (4%)
• Subtask 4: Locking down the router (12%)
1. Default policy (4%)
2. Rate-limited incoming SSH (4%)
3. Outgoing HTTPS OK (4%)
• Subtask 5: Final checks (8%)
1. Ping and FTP (4%)
2. OS updates (4%)
Part 2
• Subtask 1: Defining LUB and GLB operations (8%)
1. Least upper bound (4%)
2. Greatest lower bound (4%)
• Subtask 2: Type-checking expressions (16%)
1. Booleans (4%)
2. Variables (4%)
3. Binary operations (4%)
4. NOT and grouping (4%)
• Subtask 3: Type-checking statements (16%)
1. Assignment (4%)
2. While loops (4%)
3. If (4%)
4. Blocks (4%)
• Subtask 4: Type-checking by hand (8%)
1. Derivation correct (4%)
2. Types most general (4%)
Submission
• Correct submission format (4%)