
Group project and presentation
Marks
Report – 15%
Presentation – 15%
Project tasks and marks
Project 1: The BostonHousing.csv dataset contains these variables:
In the dataset, the outcome variable is MEDV (the median value of owner-occupied
homes in $1000s). The average MEDV is assumed to be approximately
normally distributed as a function of the covariates.
As a data analyst for a real estate company, you are given the task of modelling the
variables that are related to MEDV. The findings will help your
company understand the factors that may influence the median value of
homes. In the long run, the company plans to use the insights from
your model to select the houses that can bring in the maximum
profit.
Write and present a report that includes the objectives of the analysis, the
methods for analysis, the results from the analysis, the interpretation and the
conclusion from the analysis.
Refer to Attachment 1 for the format of the report and presentation.
Marks: 8%
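For Project 1, the analysis could look something like the sketch below. It is only an illustration under stated assumptions: the data frame is simulated so the code runs without the file, and the covariate names RM and LSTAT are assumed stand-ins, not a confirmed list of the columns in BostonHousing.csv. Only MEDV is taken from the task description.

```r
# Simulated stand-in for BostonHousing.csv (assumption: RM and LSTAT are
# hypothetical covariate names used for illustration only).
set.seed(1)
n <- 200
boston <- data.frame(
  RM    = rnorm(n, mean = 6, sd = 0.7),  # hypothetical covariate
  LSTAT = runif(n, min = 2, max = 35)    # hypothetical covariate
)
boston$MEDV <- 5 + 5 * boston$RM - 0.5 * boston$LSTAT + rnorm(n, sd = 3)

# With the real file, replace the simulation with:
# boston <- read.csv("BostonHousing.csv")

summary(boston)                               # descriptive statistics

mod1 <- lm(MEDV ~ RM + LSTAT, data = boston)  # multiple linear regression
coef(summary(mod1))                           # estimates, SEs, t and p values
confint(mod1)                                 # 95% confidence intervals

# Check the normality assumption on the residuals
shapiro.test(residuals(mod1))
```

Diagnostic plots from plot(mod1) are a common complement to the formal test, and tidy(mod1) from broom gives the same coefficient table as a data frame.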
Project 2: A major book store collected data to understand the factors that may influence
the purchase of a new book, ‘The Art History of Florence’. The dataset is
named CharlesBookClub.csv. It contains these variables:
Variables R, F and M refer to:
R = recency, time since last purchase
F = frequency, number of previous purchases from the company over a period
M = monetary, amount of money spent on the company’s products over a
period
The Recency, Frequency and Monetary variables were coded as
follows:
Recency:
0–2 months (Rcode = 1)
3–6 months (Rcode = 2)
7–12 months (Rcode = 3)
13 months and up (Rcode = 4)
Frequency:
1 book (Fcode = 1)
2 books (Fcode = 2)
3 books and up (Fcode = 3)
Monetary:
$0–$25 (Mcode = 1)
$26–$50 (Mcode = 2)
$51–$100 (Mcode = 3)
$101–$200 (Mcode = 4)
$201 and up (Mcode = 5)
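The coding scheme above can be reproduced from raw values with cut(). The sketch below is a minimal illustration, assuming R is recorded in whole months, F in whole books and M in whole dollars; the four example rows are made up and are not taken from CharlesBookClub.csv.

```r
# Hypothetical raw values: months since last purchase, books bought, dollars spent
raw <- data.frame(
  R = c(1, 4, 12, 20),
  F = c(1, 2, 5, 3),
  M = c(20, 50, 150, 300)
)

# cut() uses right-closed intervals by default, so (2, 6] covers 3-6 months
raw$Rcode <- cut(raw$R, breaks = c(-Inf, 2, 6, 12, Inf), labels = 1:4)
raw$Fcode <- cut(raw$F, breaks = c(-Inf, 1, 2, Inf), labels = 1:3)
raw$Mcode <- cut(raw$M, breaks = c(-Inf, 25, 50, 100, 200, Inf), labels = 1:5)

raw
```

cut() returns factors, which suits models that treat the codes as categories. If the raw values can be fractional, the break points would need to be reviewed against the intended ranges.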
Marks: 7%
The outcome variable is Florence, coded as 1 if The Art
History of Florence was bought and 0 if it was not.
Note: You need to convert this variable to a factor variable before analysis in R.
Use the dataset to fit three separate logistic regression models:
a. The full set of predictors in the dataset (excluding Seq and ID)
b. A subset of predictors that you judge to be the best
c. Only the R, F, and M variables
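One possible way to set up the three models is sketched below. The data frame is simulated so the code runs on its own; Seq, ID, R, F, M and Florence follow the names used in the task, while ArtBks is a hypothetical extra predictor standing in for the remaining columns. The choice of predictors in model b is arbitrary here and would need to be justified from the real data.

```r
# Simulated stand-in for CharlesBookClub.csv (assumption: ArtBks is a
# hypothetical column representing the other predictors in the file).
set.seed(42)
n <- 500
books <- data.frame(
  Seq    = seq_len(n),
  ID     = sample(10000:99999, n),
  R      = sample(1:36, n, replace = TRUE),
  F      = sample(1:12, n, replace = TRUE),
  M      = round(runif(n, 15, 480)),
  ArtBks = rpois(n, 0.4)
)
p <- plogis(-2 - 0.03 * books$R + 0.05 * books$F + 0.9 * books$ArtBks)
books$Florence <- factor(rbinom(n, 1, p))  # outcome as a factor (0 / 1)

# a. Full set of predictors, dropping the identifiers Seq and ID
mod_a <- glm(Florence ~ . - Seq - ID, family = binomial, data = books)
# b. A chosen subset (illustrative choice only)
mod_b <- glm(Florence ~ R + ArtBks, family = binomial, data = books)
# c. Only R, F and M
mod_c <- glm(Florence ~ R + F + M, family = binomial, data = books)

# Compare the three fits; a lower AIC indicates a better trade-off
# between fit and complexity
AIC(mod_a, mod_b, mod_c)
```

AIC is just one basis for comparison; likelihood ratio tests via anova(mod_c, mod_a, test = "Chisq") are another option when the models are nested.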
Write and present a report that includes the objectives of the analysis, the
methods for analysis, the results from the analysis, the interpretation and the
conclusion from the analysis.
Refer to Attachment 1 for the format of the report and presentation.
Attachment 1:
Instructions for writing the report:
1. Write a short report for each project.
2. The report should follow the IMRaD format, with these headings: Introduction, Methods,
Results and Discussion.
3. Introduction
a. Briefly write the introduction for the project
b. State the rationale of doing the analysis
c. List the appropriate study objectives
4. Methods
a. Describe the plan for data wrangling and data visualization
b. Describe the plan for statistical analysis
5. Results
a. Present the results of the analysis
b. Write the interpretation of the analysis
6. Discussion
a. Discuss the application of your results
b. Summarize your findings
7. The report should not be longer than 4 pages of A4 paper (max 2 pages for project 1 and
max 2 pages for project 2).
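R code (expand the code or use other relevant code as required)
library(tidyverse)  # data wrangling and plotting
library(broom)      # tidy() for model output
Project 1:
summary(yourdata)   # descriptive statistics for all variables
mymod <- lm(outcome ~ covariates, data = yourdata)  # linear regression
tidy(mymod)         # coefficient table as a data frame
Project 2:
yourdata <- yourdata %>% mutate(outcome = factor(outcome))  # outcome must be a factor
mymod2 <- glm(outcome ~ covariate, family = binomial, data = yourdata)  # logistic regression
tidy(mymod2)        # coefficients on the log-odds scale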
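As one way of expanding the Project 2 code, the logistic regression coefficients can be converted to odds ratios, which are usually easier to interpret in a report. The sketch below uses simulated data with the same placeholder names (yourdata, outcome, covariate) as the template; the values are made up for illustration.

```r
# Simulated placeholder data (illustration only)
set.seed(7)
n <- 300
yourdata <- data.frame(covariate = rnorm(n))
yourdata$outcome <- factor(
  rbinom(n, 1, plogis(-0.5 + 0.8 * yourdata$covariate))
)

mymod2 <- glm(outcome ~ covariate, family = binomial, data = yourdata)

# Odds ratios with Wald 95% confidence intervals:
# exponentiate the log-odds coefficients and their interval limits
or_table <- cbind(OR = exp(coef(mymod2)), exp(confint.default(mymod2)))
round(or_table, 2)
```

With broom loaded, tidy(mymod2, exponentiate = TRUE, conf.int = TRUE) should give a comparable table in tidy form.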
