CSCI 3151: Assignment 2

In this assignment you will:

a) review and extend your understanding of vector algebra and derivatives of functions of multiple

variables.

b) experiment with building and evaluating various machine learning models on different data sets. You will

learn how to handle the practicalities of running machine learning algorithms, and critically assess their

performance on the given data sets. You will also practice digging into the sklearn documentation and

online resources.

Start working on the assignment as soon as you receive it.

Use the discussion group on Brightspace to post questions you have as new threads. You will get

feedback from classmates and me (the instructor), and you will collect points for class participation.

Q1 [3]. Digital Probabilities.

In this question you will explore various properties of random variables using Python generators for

them.

a) Write a Python program that throws a single fair die with f faces, numbered 1..f, n times, with the probability of each face equal to 1/f, and returns the numeric average s of the face-up values over the n throws. Run this experiment m times.

(i) Compute the experimental (sample) mean and experimental (sample) variance of s based on the

data from m runs, as a function of m.

(ii) Plot the histogram of s, as a discrete function over the interval 0, 1, 2, ..., n. Discuss the shape of

the resulting histogram as a function of m.

(iii) What are the theoretical values of the mean and variance of s? Explain your answer.

(iv) Plot the absolute difference of the experimental mean and variance of s from their theoretical values as a function of m. Discuss the result.
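One possible starting point for part (a) is sketched below. It assumes NumPy is available; the function names (`average_of_throws`, `run_experiment`, `theoretical_moments`) and the parameter values in the example run are illustrative choices, not part of the assignment. The theoretical values rely on the throws being independent: a single throw has mean (f + 1)/2 and variance (f^2 - 1)/12, so the average s of n throws has the same mean and variance (f^2 - 1)/(12n).

```python
import numpy as np


def average_of_throws(n, f, rng):
    """Throw a fair f-faced die n times and return the average face value s."""
    throws = rng.integers(1, f + 1, size=n)  # faces 1..f, each with probability 1/f
    return throws.mean()


def run_experiment(m, n, f, seed=0):
    """Repeat the n-throw experiment m times and return the m averages."""
    rng = np.random.default_rng(seed)
    return np.array([average_of_throws(n, f, rng) for _ in range(m)])


def theoretical_moments(n, f):
    """Mean and variance of s, assuming independent throws of a fair die."""
    mean = (f + 1) / 2
    variance = (f**2 - 1) / (12 * n)
    return mean, variance


# Example run with assumed values m = 20000, n = 10, f = 6.
samples = run_experiment(m=20_000, n=10, f=6)
sample_mean = samples.mean()
sample_var = samples.var(ddof=1)  # unbiased sample variance
theory_mean, theory_var = theoretical_moments(n=10, f=6)
```

To study the behaviour as a function of m, the same statistics can be recomputed on the first k samples for increasing k (for example `samples[:k].mean()`), and the absolute differences from `theory_mean` and `theory_var` plotted against k.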

b) Building on the program in (a) for n = 2,

(i) estimate the probability of event A (both throws resulted in an even number), of event B (at least

one throw resulted in an even number) and of the conditional probability of A given B, as a function of

m. Estimating the probability is equivalent to counting the frequency of occurrence of the event in the

number of runs m scaled appropriately to a value in the interval [0, 1].

(ii) Compute the theoretical probabilities of P(A) and P(A|B).

(iii) Plot the absolute difference of the computed values from their theoretical values as a function of m.

(iv) Formulate the estimation of the probabilities in part (i) in the context of the discussion of estimation in the lecture.
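The frequency estimates in part (b) could be computed along the following lines. This is a sketch under assumptions: it uses NumPy, takes f = 6 by default (the assignment leaves f general), and the function name `estimate_event_probabilities` is hypothetical. For f = 6 the theoretical values are P(A) = 1/4, P(B) = 3/4 and P(A|B) = P(A)/P(B) = 1/3, since A implies B.

```python
import numpy as np


def estimate_event_probabilities(m, f=6, seed=0):
    """Estimate P(A), P(B) and P(A|B) from m runs of two throws each.

    A: both throws are even. B: at least one throw is even.
    """
    rng = np.random.default_rng(seed)
    throws = rng.integers(1, f + 1, size=(m, 2))  # one row per run
    even = throws % 2 == 0
    a = even.all(axis=1)
    b = even.any(axis=1)
    p_a = a.mean()
    p_b = b.mean()
    # Conditional frequency: runs where A and B both hold, among runs where B holds.
    # For very small m, b.sum() can be zero and this ratio is undefined.
    p_a_given_b = (a & b).sum() / b.sum()
    return p_a, p_b, p_a_given_b


p_a, p_b, p_a_given_b = estimate_event_probabilities(m=200_000)
```

Calling the function for a range of m values gives the curves needed for the plots of the absolute error against m.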

c) Consider event C (first throw is even) and D (second throw is ≤ 3). Are these two events independent?

(i) Prove your answer theoretically.

(ii) Intuitively verify your theoretical answer by computing the required probabilities as frequencies,

and see if the independence condition is approximately satisfied.
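The frequency check in part (c)(ii) might look like the sketch below, which compares the estimated P(C and D) with the product of the estimated P(C) and P(D). It assumes NumPy and f = 6; the variable names and the number of runs are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
m, f = 200_000, 6  # assumed number of runs and number of faces

first = rng.integers(1, f + 1, size=m)
second = rng.integers(1, f + 1, size=m)

c = first % 2 == 0  # event C: first throw is even
d = second <= 3     # event D: second throw is at most 3

p_c = c.mean()
p_d = d.mean()
p_cd = (c & d).mean()

# Independence requires P(C and D) = P(C) P(D); the gap should shrink as m grows.
independence_gap = abs(p_cd - p_c * p_d)
```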

d) Given a population, we define two events, H = “Have a headache”, and F = “Coming down with

Flu”. The associated probabilities are P(H) = 1/8, P(F) = 1/30, P(H|F) = 1/3.

i) Calculate theoretically $P(HF)$, $P(H\bar{F})$, $P(\bar{H}F)$, $P(\bar{H}\bar{F})$.

ii) Build a generative model of a population of m persons, according to the above probabilities. Clearly

justify your approach.

iii) Using your population data, estimate using frequencies P(H), P(F) and P(H|F), and plot as a

function of m.
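One way to build the generative model in part (d) is to sample F first and then sample H conditionally on F. The sketch below takes that approach; it is only one of several valid constructions (sampling directly from the four joint outcomes is another). It assumes NumPy, and the names `generate_population` and `P_H_GIVEN_NOT_F` are hypothetical. The value of P(H | not F) is derived from the given probabilities via P(H) = P(H|F)P(F) + P(H | not F)(1 - P(F)).

```python
from fractions import Fraction

import numpy as np

P_H = Fraction(1, 8)
P_F = Fraction(1, 30)
P_H_GIVEN_F = Fraction(1, 3)

# Derived quantities (exact arithmetic).
P_HF = P_H_GIVEN_F * P_F                    # P(H and F)
P_H_GIVEN_NOT_F = (P_H - P_HF) / (1 - P_F)  # from the law of total probability


def generate_population(m, seed=0):
    """Return boolean arrays (headache, flu) for a population of m persons."""
    rng = np.random.default_rng(seed)
    flu = rng.random(m) < float(P_F)
    # Each person's headache probability depends on whether they have the flu.
    p_headache = np.where(flu, float(P_H_GIVEN_F), float(P_H_GIVEN_NOT_F))
    headache = rng.random(m) < p_headache
    return headache, flu


headache, flu = generate_population(500_000)
est_p_h = headache.mean()
est_p_f = flu.mean()
est_p_h_given_f = headache[flu].mean()  # frequency of H among persons with F
```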

Q2 [3]. Optimization in action

In this question you will implement gradient descent in Python.

Consider a function of two variables, $z = f(x, y) = (x - 2)^2 + (y - 3)^2$. Implement in Python, from first principles, the gradient descent algorithm for estimating the minimum of this function. Pick a random initial point $(x_0, y_0)$, and update it by making a small step of size α in the opposite direction of the gradient of f(x, y) calculated at $(x_0, y_0)$. Iterate the computation. The new point at each step is $(x_{i+1}, y_{i+1}) = (x_i, y_i) - \alpha \nabla f(x_i, y_i)$. Organize your code to be as general as possible (with respect to the choice of function). Follow good programming practices: add liberal comments, use good naming conventions for variables, and use matrices and vectors instead of for loops over their scalar elements. Plot on the (x, y) plane the trajectories of the points $(x_i, y_i)$ until convergence for different values of α. Convergence is defined as the condition $\lVert (x_{i+1}, y_{i+1}) - (x_i, y_i) \rVert < \epsilon$. Select a meaningful value of $\epsilon$.

Discuss the speed of convergence as α varies.
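A minimal sketch of the update rule described above follows. It assumes NumPy; the function name `gradient_descent`, the tolerance `eps`, the iteration cap `max_iter` and the starting point are illustrative choices rather than values given in the assignment. The gradient is passed in as a function so the routine does not depend on the particular f.

```python
import numpy as np


def gradient_descent(grad, start, alpha, eps=1e-6, max_iter=10_000):
    """Iterate p <- p - alpha * grad(p) until the step length falls below eps.

    Returns the full trajectory as an array with one row per visited point.
    """
    point = np.asarray(start, dtype=float)
    trajectory = [point]
    for _ in range(max_iter):
        new_point = point - alpha * grad(point)
        trajectory.append(new_point)
        if np.linalg.norm(new_point - point) < eps:  # convergence test
            break
        point = new_point
    return np.array(trajectory)


MINIMUM = np.array([2.0, 3.0])


def grad_f(p):
    """Gradient of f(x, y) = (x - 2)^2 + (y - 3)^2, evaluated at p = (x, y)."""
    return 2.0 * (p - MINIMUM)


path_slow = gradient_descent(grad_f, start=[-4.0, 5.0], alpha=0.1)
path_exact = gradient_descent(grad_f, start=[-4.0, 5.0], alpha=0.5)
```

For this quadratic, each step scales the distance to the minimum by |1 - 2α|, so α = 0.5 reaches the minimum in a single step, while α = 1 oscillates between two points and stops only at `max_iter`. A trajectory can be drawn with Matplotlib, for example `plt.plot(path_slow[:, 0], path_slow[:, 1], marker="o")`.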

Q3 [3]. Analog Probabilities

In this question, you will review probability basics. The recommended format for your solution is

as markdown cells in your notebook, formatted as markdown text and equations using LaTeX. Use

an online equation editor, like http://www.sciweavers.org/free-online-latex-equation-editor,

click on "Convert" to view the formatted equation, and copy-paste the resulting LaTeX code into your

markdown cell, enclosed in $...$. Or write them by hand and scan them (use the CamScanner app on

your cell phone for this). Save as image on Google Drive or One Drive and link to it from a markdown

cell in your notebook.

a) A box contains three fair coins and one biased coin. For the biased coin, the probability that any

flip will result in a head is 1/3. Al draws two coins from the box, flips each of them once, observes an

outcome of one head and one tail and returns the coins to the box. Bo then draws one coin from the

box and flips it. The result is a tail. Determine the probability that neither Al nor Bo removed the

biased coin from the box.
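A derivation for part (a) can be checked numerically by enumerating the draws with exact fractions, as in the sketch below. It rests on one modelling assumption: because Al returns the coins, Bo's draw is treated as independent of Al's, so the two posterior probabilities multiply. Variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# P(head) for each coin; index 3 is the biased coin.
HEAD_PROB = [Fraction(1, 2)] * 3 + [Fraction(1, 3)]
BIASED = 3

# Al draws 2 of the 4 coins uniformly and observes one head and one tail.
pairs = list(combinations(range(4), 2))
al_joint = {}  # pair -> P(pair drawn AND one head, one tail)
for i, j in pairs:
    p, q = HEAD_PROB[i], HEAD_PROB[j]
    al_joint[(i, j)] = Fraction(1, len(pairs)) * (p * (1 - q) + (1 - p) * q)
al_evidence = sum(al_joint.values())
p_al_fair = sum(v for pair, v in al_joint.items() if BIASED not in pair) / al_evidence

# Bo draws 1 of the 4 coins uniformly and observes a tail.
bo_joint = {k: Fraction(1, 4) * (1 - HEAD_PROB[k]) for k in range(4)}
bo_evidence = sum(bo_joint.values())
p_bo_fair = sum(v for k, v in bo_joint.items() if k != BIASED) / bo_evidence

# Posterior probability that neither draw included the biased coin.
p_neither = p_al_fair * p_bo_fair
```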

b) A box contains N items, K of which are defective. A sample of M items is drawn from the box at

random. What is the probability that the sample includes no defective items if the sample is taken:

i) with replacement

ii) without replacement
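Closed-form answers to part (b) can be sanity-checked with a small helper such as the one below. The function names are hypothetical, and the formulas are the standard ones: with replacement each of the M draws is independently non-defective with probability (N - K)/N, and without replacement the count of all-good samples is C(N - K, M) out of C(N, M).

```python
from fractions import Fraction
from math import comb


def p_no_defective_with_replacement(N, K, M):
    """Each draw is independent; P(good item) = (N - K) / N on every draw."""
    return Fraction(N - K, N) ** M


def p_no_defective_without_replacement(N, K, M):
    """Number of all-good samples of size M over all samples of size M."""
    return Fraction(comb(N - K, M), comb(N, M))


# Example with assumed values N = 10, K = 3, M = 2.
with_repl = p_no_defective_with_replacement(10, 3, 2)
without_repl = p_no_defective_without_replacement(10, 3, 2)
```

Note that `comb(N - K, M)` is zero when M exceeds N - K, which matches the fact that a sample without replacement larger than the number of good items must contain a defective one.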

c) Show from first principles that the expected value of the sum of two random variables is equal to the

sum of the expected values of the random variables, i.e. E(x + y) = E(x) + E(y).

d) Show from first principles that the variance of a random variable x satisfies $\sigma_x^2 = E\{[x - E(x)]^2\} = E(x^2) - [E(x)]^2$.
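As an outline of how the argument in part (d) can go, write μ = E(x), expand the square, and apply linearity of expectation (part c), noting that μ is a constant. The notation μ is introduced here for convenience; a full first-principles answer would also write the expectations as sums or integrals over the distribution of x.

```latex
\begin{aligned}
\sigma_x^2 &= E\{[x - \mu]^2\} \\
           &= E\{x^2 - 2\mu x + \mu^2\} \\
           &= E(x^2) - 2\mu E(x) + \mu^2 \\
           &= E(x^2) - 2\mu^2 + \mu^2 \\
           &= E(x^2) - [E(x)]^2
\end{aligned}
```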

e) Show from first principles that the expected value of the product xy is equal to the product of

the expected values of the individual random variables x and y, E(xy) = E(x)E(y), if x and y are

independent.

Marking the assignment

Refer to the rubric that will be used for marking the assignment.

Submitting the assignment

1. Your assignment as a single .ipynb file including your answers to both the math and the experimental

questions, in the correct order, should be submitted before the deadline on Brightspace.

Use markdown syntax to format your answers

For equations, you can either

a. Format them using latex (enclose latex code in $...$ for inline equations and $$...$$ for

displayed equations). For a quick reference of latex syntax, visit here.

b. Write them with neat handwriting, scan into a png file, and include the png file in your notebook,

using this syntax: ![alt text](imageURL)

Consider the CamScanner app on your mobile phone for scanning.

2. You can submit multiple editions of your assignment. Only the last one will be marked. It is

recommended to upload a complete submission, even if you are still improving it, so that you have

something in the system if your computer fails for whatever reason.

3. IMPORTANT: PLEASE NAME YOUR PYTHON NOTEBOOK FILE AS:

Milios-Evangelos-Assignment-1.ipynb

A 10% penalty to the assignment mark will be applied for a misnamed notebook file, i.e. your

mark will be multiplied by 0.9.

4. In addition to your .ipynb file, please upload a blank rubric file, which you download from this

URL. A 10% penalty to the assignment mark will be applied for uploading a zip file, instead of

two separate files (notebook + rubric).

5. The markers will enter your marks and their overall feedback in the rubric file on Brightspace, and

they will upload your Python notebook file with comments on specific cells, as a new markdown

cell below the cell being commented on.
