IE 6200 Midterm 2 Project

190 points

Fall 2019

For your final project you will find a dataset online and complete an extensive statistical analysis. The material relevant to this final project will be covered through the December 2nd lecture, but I recommend that you find your dataset and complete most of the analysis this week, as you already have the tools to complete a large portion of this project.

1 Logistics

Due: December 6th, 9:00 pm PST - No Exceptions

2 Grading

Grading of the project: The point distribution for the project will be as follows:

50% Writing, professionalism, organization, clarity and completeness.

50% Statistical analysis.

You will need to turn in:

1. A complete, well-formatted, professional report in PDF format. The report should not contain code or raw R output. The report can be completed in R Markdown; just use the appropriate formatting to render the report so that it looks good. The appendix should contain a copy of your code and a bibliography that lists your sources, including the source of your data.

2. An .R file that contains all code and allows reproduction of the analysis by pushing "Run."

3. The complete dataset in CSV or Excel format that can be loaded and run with your R script. We will run your R script to ensure that it loads the data, runs, and matches the analysis presented in your report.

3 Dataset Requirements

To complete this final project you'll need to use a dataset where the samples are independent and the sample size is at least 30. The dataset should contain at least two quantitative variables and at least 3 categorical variables. One of the categorical variables should have at least 3 categories. The dataset cannot have been used previously in class, homework, or any other setting for this course. You cannot use the same dataset as any other student. As soon as you know which dataset you'll be using, post a link to it in the discussion area on Blackboard.

4 Analysis Objectives

You will complete all of the following tests using your chosen dataset.

1. One sample t-test

(a) traditional statistical tools

(b) bootstrap methods

2. One sample test of proportion

(a) traditional statistical tools

(b) bootstrap methods

3. Two sample t-test for difference in means

(a) traditional statistical tools

(b) bootstrap methods

4. Two sample test for difference in proportions

(a) traditional statistical tools

(b) bootstrap methods

5. ANOVA

(a) traditional statistical tools

6. Chi-square goodness of fit OR test of association (you can do both, but you only need to pick one).

(a) traditional statistical tools
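To illustrate what the "bootstrap methods" item means for the one sample test of a mean, here is a minimal sketch. It is written in Python with only the standard library purely to show the logic; the project itself must be submitted in R. The function name `bootstrap_one_sample_mean`, the toy data, and the hypothesized value `mu0` are all hypothetical, and the toy sample is far smaller than the 30 observations the project requires. The percentile interval and the "shift the data to satisfy the null" approach are one common choice, not necessarily the approach taught in lecture.

```python
import random
import statistics


def bootstrap_one_sample_mean(sample, mu0, n_boot=5000, seed=0):
    """Bootstrap a two-sided one-sample test of H0: mean == mu0.

    Returns the observed mean, a 95% percentile confidence interval,
    and a bootstrap p-value. All names here are illustrative.
    """
    rng = random.Random(seed)
    n = len(sample)
    observed = statistics.mean(sample)

    # Sampling distribution: resample the data as-is (used for the CI).
    boot_means = sorted(
        statistics.mean(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    lower = boot_means[int(0.025 * n_boot)]
    upper = boot_means[int(0.975 * n_boot) - 1]

    # Null distribution: shift the data so its mean equals mu0, then resample.
    shifted = [x - observed + mu0 for x in sample]
    null_means = [
        statistics.mean(rng.choices(shifted, k=n)) for _ in range(n_boot)
    ]

    # Two-sided p-value: share of null means at least as far from mu0
    # as the observed mean is.
    extreme = sum(abs(m - mu0) >= abs(observed - mu0) for m in null_means)
    p_value = extreme / n_boot
    return observed, (lower, upper), p_value


# Hypothetical toy data, for illustration only.
sample = [12.1, 9.8, 11.4, 10.9, 13.2, 10.1, 11.7, 12.5, 9.5, 11.0]
observed, ci, p_value = bootstrap_one_sample_mean(sample, mu0=10.0)
print(observed, ci, p_value)
```

The two lists built inside the function, `boot_means` and `null_means`, correspond to the sampling distribution and the null distribution that the report asks you to plot as histograms.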


5 Report

The analysis should contain the following parts.

Part 1. Introduction: Describe the dataset that you have chosen. Describe why you chose this dataset and why it is of interest to you. You will want to consider which questions can be answered with your chosen dataset and how you should frame the questions for each of the tests.

• You will need to provide an introduction to your data and question that draws the reader in: why should I care about your analysis?

• You must use a dataset that has at least 30 data points.

• Try to find information about the sampling strategy used for your data and summarize it. Describe your concerns about the sampling strategy and surface any questions you have about the methodology.

• You will need to describe why this dataset is of interest to you.

• You will need to describe what each variable measures, the type of the variable, and the scale of the variable.

• Data must be included in the appendix and should also be uploaded in CSV or Excel format so that it can be used to re-create your analysis.

Part 2. Exploratory analysis and data visualization: Familiarize the reader with your data using visual tools.

• Provide an exploratory analysis of your data.

• Provide at least 4 different types of graphs that help the reader understand important aspects of your dataset. Graphs must have all appropriate titles, labels, and legends. For each graph, provide a description of why it is relevant to your study and the question you are trying to answer. It should be clear that the visualization adds color and interest to the statistical analysis and question of interest.

Part 3. Statistical analysis: You must complete all of the tests listed in the Analysis Objectives section using the methods listed there: traditional statistical methods and, where indicated, bootstrap methods. Where both are used, you should compare the results of the two methods for each test.

• You will also need to include and discuss these points for each test in your report.

– Finalize your question of interest.

– What is the statistical test you are going to use?

– What is the population parameter you would like to make inference about?

– What is the test statistic (aka sample statistic)?

– What is your null hypothesis? State clearly in words and using correct mathematical notation.

– What is your alternative hypothesis? State clearly in words and using correct mathematical notation.

• Explain your choice of statistical methodology and why it is the right choice in answering your question.

• Confirm that the requirements to use the statistical method have been met. If they are not met, explain why they are not met and what the impact will be on your analysis.

• Provide the results of your analysis in the context of your problem, complete with correct units and interpretation.

• You must include a histogram of the sampling distribution of your statistic along with a description of the distribution.

• You must include a histogram of the null distribution.

• You must provide a confidence interval and interpret it.
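As an illustration of what "correct mathematical notation" might look like, here is a sketch for the one sample t-test only, using the standard textbook formulation. The symbols ($\mu$ for the population mean, $\mu_0$ for the hypothesized value, $\bar{x}$, $s$ and $n$ for the sample mean, sample standard deviation and sample size, and the 95% confidence level) are assumptions chosen for this example and are not prescribed by the assignment.

```latex
% Hypotheses for a two-sided one sample t-test
H_0 : \mu = \mu_0
\qquad \text{versus} \qquad
H_A : \mu \neq \mu_0

% Test statistic, compared against a t distribution with n - 1 degrees of freedom
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

% 95% confidence interval for \mu, where t^{*}_{n-1} is the 0.975 quantile
% of the t distribution with n - 1 degrees of freedom
\bar{x} \pm t^{*}_{n-1} \, \frac{s}{\sqrt{n}}
```

In words, the null hypothesis says the population mean equals the hypothesized value, and the alternative says it differs. The other tests follow the same pattern with their own parameter, such as a proportion or a difference in means.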

Part 4. Discussion: The discussion should gracefully conclude your analysis.

• Summary of your findings.

• Implications of your findings.

• Extensions and limitations.

• Further questions, next steps.

Part 5. Appendix: The Appendix should contain a copy of all of your code. It should also contain a bibliography that cites any resources used, including the dataset source. Use proper citation guidelines.
