CSCI 3151: Assignment 2

In this assignment you will:

a) review and extend your understanding of vector algebra and derivatives of functions of multiple

variables.

b) experiment with building and evaluating various machine learning models on different data sets. You will

learn how to handle the practicalities of running machine learning algorithms, and critically assess their

performance on the given data sets. You will also practice digging into the sklearn documentation and

online resources.

Start working on the assignment as soon as you receive it.

Use the discussion group on Brightspace to post questions you have as new threads. You will get

feedback from classmates and me (the instructor), and you will collect points for class participation.

Q1 [3]. Digital Probabilities.

In this question you will explore various properties of random variables using Python generators for

them.

a) Write a Python program that throws a single fair die n times. The die has f faces, numbered 1..f, each with probability 1/f. The program returns the numeric average s of the face-up values over the n throws. Run this experiment m times.

(i) Compute the experimental (sample) mean and experimental (sample) variance of s based on the

data from m runs, as a function of m.

(ii) Plot the histogram of s, as a discrete function over the interval 0, 1, 2, ..., n. Discuss the shape of

the resulting histogram as a function of m.

(iii) What are the theoretical values of the mean and variance of s? Explain your answer.

(iv) Plot the absolute difference of the experimental mean and variance of s from their theoretical values

as a function of m. Discuss the result.
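
One way to organize the simulation in part (a) is sketched below. It is a minimal sketch, not a reference solution: the function names `simulate_averages` and `running_mean_and_variance`, the fixed seed, and the use of NumPy's `default_rng` are assumptions made for illustration. The running variance uses the biased (divide-by-k) estimator; an unbiased version would divide by k - 1.

```python
import numpy as np


def simulate_averages(m, n, f, seed=0):
    """Run m experiments; each throws a fair f-faced die n times.

    Returns an array of length m holding the average face value s of each run.
    """
    rng = np.random.default_rng(seed)
    # integers() draws uniformly from 1..f (the upper bound f + 1 is exclusive).
    throws = rng.integers(1, f + 1, size=(m, n))
    return throws.mean(axis=1)


def running_mean_and_variance(s):
    """Sample mean and biased sample variance of the first k values, k = 1..m."""
    k = np.arange(1, len(s) + 1)
    mean = np.cumsum(s) / k
    # Variance via E[s^2] - (E[s])^2, computed on the first k values.
    variance = np.cumsum(s ** 2) / k - mean ** 2
    return k, mean, variance


s = simulate_averages(m=2000, n=10, f=6)
k, mean, variance = running_mean_and_variance(s)
print(f"after {k[-1]} runs: mean={mean[-1]:.3f}, variance={variance[-1]:.3f}")
```

The arrays `mean` and `variance` can then be plotted against `k`, and `s` can be passed to a histogram routine of your choice.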

b) Building on the program in (a) for n = 2,

(i) estimate the probability of event A (both throws resulted in an even number), of event B (at least

one throw resulted in an even number) and of the conditional probability of A given B, as a function of

m. Estimating the probability is equivalent to counting the frequency of occurrence of the event in the

number of runs m scaled appropriately to a value in the interval [0, 1].

(ii) Compute the theoretical probabilities P(A) and P(A|B).

(iii) Plot the absolute difference of the computed values from their theoretical values as a function of m.

(iv) Formulate the estimation of the probabilities in part (i) in the context of the discussion of estimation in the lecture.
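
The frequency-counting idea in part (b) can be sketched as follows. This is an illustrative sketch under assumptions: the function name `estimate_event_probabilities`, the default of six faces, and the seed are not part of the assignment. The conditional probability is estimated as the number of runs where A occurred divided by the number of runs where B occurred, which is valid here because A implies B.

```python
import numpy as np


def estimate_event_probabilities(m, f=6, seed=0):
    """Estimate P(A), P(B) and P(A|B) from m runs of two throws each.

    A: both throws are even.  B: at least one throw is even.
    """
    rng = np.random.default_rng(seed)
    throws = rng.integers(1, f + 1, size=(m, 2))
    even = throws % 2 == 0
    a = even.all(axis=1)  # both throws even
    b = even.any(axis=1)  # at least one throw even
    p_a = a.mean()        # frequency of A, a value in [0, 1]
    p_b = b.mean()        # frequency of B, a value in [0, 1]
    # A implies B, so the count of (A and B) equals the count of A.
    p_a_given_b = a.sum() / b.sum() if b.sum() > 0 else float("nan")
    return p_a, p_b, p_a_given_b


for m in (100, 1_000, 10_000, 100_000):
    p_a, p_b, p_a_given_b = estimate_event_probabilities(m)
    print(f"m={m}: P(A)~{p_a:.4f}, P(B)~{p_b:.4f}, P(A|B)~{p_a_given_b:.4f}")
```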

c) Consider event C (first throw is even) and D (second throw is ≤ 3). Are these two events independent?

(i) Prove your answer theoretically.

(ii) Intuitively verify your theoretical answer by computing the required probabilities as frequencies,

and see if the independence condition is approximately satisfied.
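
A frequency-based check of the independence condition P(CD) = P(C)P(D) might look like the sketch below. The function name `estimate_c_and_d`, the six-faced die, and the seed are assumptions for illustration; a small gap between the two sides is only suggestive and does not replace the theoretical argument in (i).

```python
import numpy as np


def estimate_c_and_d(m, f=6, seed=0):
    """Estimate P(C), P(D) and P(C and D) from m runs of two throws.

    C: first throw is even.  D: second throw is <= 3.
    """
    rng = np.random.default_rng(seed)
    throws = rng.integers(1, f + 1, size=(m, 2))
    c = throws[:, 0] % 2 == 0
    d = throws[:, 1] <= 3
    return c.mean(), d.mean(), (c & d).mean()


p_c, p_d, p_cd = estimate_c_and_d(100_000)
# Under independence the joint frequency should be close to the product.
print(f"P(CD)~{p_cd:.4f}, P(C)P(D)~{p_c * p_d:.4f}, gap={abs(p_cd - p_c * p_d):.4f}")
```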

d) Given a population, we define two events, H = “Have a headache”, and F = “Coming down with

Flu”. The associated probabilities are P(H) = 1/8, P(F) = 1/30, P(H|F) = 1/3.

i) Calculate theoretically $P(HF)$, $P(H\bar{F})$, $P(\bar{H}F)$, and $P(\bar{H}\bar{F})$.

ii) Build a generative model of a population of m persons, according to the above probabilities. Clearly

justify your approach.

iii) Using your population data, estimate using frequencies P(H), P(F) and P(H|F), and plot as a

function of m.
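
One possible generative approach, sketched here under stated assumptions, is to sample F for each person first and then sample H conditionally on F. The conditional probability of a headache without flu is not given in the question; the sketch derives it from the law of total probability. The names `generate_population` and `P_H_GIVEN_NOT_F` and the seed are illustrative, and other constructions (for example, sampling directly from the four joint outcomes) are equally valid.

```python
from fractions import Fraction

import numpy as np

P_H = Fraction(1, 8)
P_F = Fraction(1, 30)
P_H_GIVEN_F = Fraction(1, 3)

# Law of total probability: P(H) = P(H|F) P(F) + P(H|not F) P(not F),
# solved for the one quantity the question does not state.
P_H_GIVEN_NOT_F = (P_H - P_H_GIVEN_F * P_F) / (1 - P_F)


def generate_population(m, seed=0):
    """Return boolean arrays (headache, flu) for a population of m persons."""
    rng = np.random.default_rng(seed)
    flu = rng.random(m) < float(P_F)
    # Each person's headache probability depends on their flu status.
    p_headache = np.where(flu, float(P_H_GIVEN_F), float(P_H_GIVEN_NOT_F))
    headache = rng.random(m) < p_headache
    return headache, flu


headache, flu = generate_population(100_000)
print("P(H)   ~", headache.mean())
print("P(F)   ~", flu.mean())
print("P(H|F) ~", headache[flu].mean())
```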

Q2 [3]. Optimization in action

In this question you will implement gradient descent in Python.

Consider a function of two variables, $z = f(x, y) = (x - 2)^2 + (y - 3)^2$. Implement in Python based on first principles the gradient descent algorithm for estimating the minimum of this function. Pick a random initial point $(x_0, y_0)$, and update it by making a small step of size $\alpha$ in the opposite direction of the gradient of $f(x, y)$ calculated at $(x_0, y_0)$. Iterate the computation. The new point at each step is

$$(x_{i+1}, y_{i+1}) = (x_i, y_i) - \alpha \nabla f(x_i, y_i).$$

Organize your code to be as general as possible (with respect to choice of function). Follow good programming practices: add liberal comments, use good naming conventions for variables, use matrices and vectors, instead of for loops on their scalar elements. Plot on the $(x, y)$ plane the trajectories of the points $(x_i, y_i)$ until convergence for different values of $\alpha$. Convergence is defined as the condition $\|(x_{i+1}, y_{i+1}) - (x_i, y_i)\| < \epsilon$. Select a meaningful value of $\epsilon$.

Discuss the speed of convergence as $\alpha$ varies.
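
The update rule and stopping condition described above can be sketched as follows. This is one possible structure, not a prescribed solution: the names `gradient_descent`, `grad_f`, and `CENTER`, the tolerance `eps=1e-6`, the iteration cap `max_iter`, and the sampling range for the start point are all assumptions. The routine takes the gradient as a function argument so that it does not depend on the particular f.

```python
import numpy as np

CENTER = np.array([2.0, 3.0])


def grad_f(point):
    """Gradient of f(x, y) = (x - 2)^2 + (y - 3)^2 at the given point."""
    return 2.0 * (point - CENTER)


def gradient_descent(grad, start, alpha, eps=1e-6, max_iter=10_000):
    """Iterate p <- p - alpha * grad(p) until the step length drops below eps.

    Returns the full trajectory as an array of shape (steps + 1, dimension).
    Stops after max_iter steps even if the condition is never met.
    """
    path = [np.asarray(start, dtype=float)]
    for _ in range(max_iter):
        new_point = path[-1] - alpha * grad(path[-1])
        path.append(new_point)
        if np.linalg.norm(new_point - path[-2]) < eps:
            break
    return np.array(path)


rng = np.random.default_rng(0)
start = rng.uniform(-5.0, 5.0, size=2)  # random initial point (x0, y0)
for alpha in (0.01, 0.1, 0.4):
    path = gradient_descent(grad_f, start, alpha)
    print(f"alpha={alpha}: {len(path) - 1} steps, final point {path[-1]}")
```

Each returned `path` holds the points (x_i, y_i) row by row, so its two columns can be plotted directly as a trajectory on the (x, y) plane.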

Q3 [3]. Analog Probabilities

In this question, you will review probability basics. The recommended format for your solution is

as markdown cells in your notebook, formatted as markdown text and equations using LaTeX. Use

an online equation editor, like http://www.sciweavers.org/free-online-latex-equation-editor,

click on "Convert" to view the formatted equation, and copy-paste the resulting LaTeX code into your

markdown cell, enclosed in $...$. Or write them by hand and scan them (use the CamScanner app on

your cell phone for this). Save the scan as an image on Google Drive or OneDrive and link to it from a markdown

cell in your notebook.

a) A box contains three fair coins and one biased coin. For the biased coin, the probability that any

flip will result in a head is 1/3. Al draws two coins from the box, flips each of them once, observes an

outcome of one head and one tail and returns the coins to the box. Bo then draws one coin from the

box and flips it. The result is a tail. Determine the probability that neither Al nor Bo removed the

biased coin from the box.

b) A box contains N items, K of which are defective. A sample of M items is drawn from the box at

random. What is the probability that the sample includes no defective items if the sample is taken:

i) with replacement

ii) without replacement
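
Once you have derived closed-form answers for part (b), a small exact calculation can help check them on concrete numbers. The sketch below encodes the standard counting results (independent draws with replacement, and a ratio of combinations without replacement); the function names and the example values N = 10, K = 3, M = 2 are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def p_no_defective_with_replacement(N, K, M):
    """Each of the M independent draws must pick one of the N - K good items."""
    return Fraction(N - K, N) ** M


def p_no_defective_without_replacement(N, K, M):
    """Ways to choose M good items divided by ways to choose any M items.

    Requires M <= N; math.comb returns 0 when M > N - K, giving probability 0.
    """
    return Fraction(comb(N - K, M), comb(N, M))


print(p_no_defective_with_replacement(10, 3, 2))     # 49/100
print(p_no_defective_without_replacement(10, 3, 2))  # 7/15
```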

c) Show from first principles that the expected value of the sum of two random variables is equal to the

sum of the expected values of the random variables, i.e. E(x + y) = E(x) + E(y).

d) Show from first principles that the variance of a random variable satisfies $\sigma_x^2 = E\{[x - E(x)]^2\} = E(x^2) - [E(x)]^2$.

e) Show from first principles that the expected value of the product xy is equal to the product of

the expected values of the individual random variables x and y, E(xy) = E(x)E(y), if x and y are

independent.
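
The identities in (c), (d) and (e) can be sanity-checked on a small discrete example before writing the proofs. The sketch below uses two independent fair six-sided dice and exact rational arithmetic; the example distribution and the helper name `expect` are assumptions for illustration. A check on one distribution is not a proof and does not replace the first-principles derivations asked for above.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)
P = Fraction(1, 36)  # probability of each ordered pair (x, y) of two independent fair dice


def expect(g):
    """Exact expectation of g(x, y) over the joint distribution of the two dice."""
    return sum(P * g(x, y) for x, y in product(FACES, FACES))


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)

# (c) E(x + y) = E(x) + E(y)
assert expect(lambda x, y: x + y) == e_x + e_y
# (d) E{[x - E(x)]^2} = E(x^2) - [E(x)]^2
assert expect(lambda x, y: (x - e_x) ** 2) == expect(lambda x, y: x * x) - e_x ** 2
# (e) E(xy) = E(x) E(y), which relies on x and y being independent
assert expect(lambda x, y: x * y) == e_x * e_y
print("all three identities hold for this example")
```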

Marking the assignment

Refer to the rubric that will be used for marking the assignment.

Submitting the assignment

1. Your assignment as a single .ipynb file including your answers to both the math and the experimental

questions, in the correct order, should be submitted before the deadline on Brightspace.

Use markdown syntax to format your answers

For equations, you can either

a. Format them using LaTeX (enclose LaTeX code in $...$ for inline equations and $$...$$ for

displayed equations). For a quick reference of LaTeX syntax, visit here.

b. Write them with neat handwriting, scan into a png file, and include the png file in your notebook,

using this syntax: ![alt text](imageURL)

Consider the CamScanner app on your mobile phone for scanning.

2. You can submit multiple versions of your assignment. Only the last one will be marked. It is

recommended to upload a complete submission, even if you are still improving it, so that you have

something in the system if your computer fails for whatever reason.

3. IMPORTANT: PLEASE NAME YOUR PYTHON NOTEBOOK FILE AS:

Milios-Evangelos-Assignment-1.ipynb

A 10% penalty to the assignment mark will be applied for a misnamed notebook file, i.e. your

mark will be multiplied by 0.9.

4. In addition to your .ipynb file, please upload a blank rubric file, which you download from this

URL. A 10% penalty to the assignment mark will be applied for uploading a zip file, instead of

two separate files (notebook + rubric).

5. The markers will enter your marks and their overall feedback in the rubric file on Brightspace, and

they will upload your Python notebook file with comments on specific cells, as a new markdown

cell below the cell being commented on.
