ETB1100

Assignment:  A Regression Analysis

The Value of Linear Relationships for Decision Making in Business

Learning Objectives (LO):

LO1:  Understand how to use Excel to draw a random sample of data

LO2:  Develop a simple linear regression model using EXCEL

LO3:  Understand simple linear regression analysis including assessing the validity of the model and interpreting the findings.

LO4:  Develop the ability to analyze and interpret multiple linear regression models using ChatGPT as a collaborative tool for extending statistical knowledge.

LO5:  Describe the business implications of your multiple linear regression analysis.

Submission Details:

   This assignment is marked out of 72 and worth 10% of the assessment for this unit.

•    It is designed to test Learning Objective 4: “Interpret and evaluate relationships between variables for business decision-making, using the concept of correlation and simple linear regression.”

Due Date: 11:55pm, Sunday 13th October, 2024.

•   You must submit your completed assignment (including the Assignment Coversheet, correctly filled in AND signed) on-line via the Moodle site for this unit.

•    Name the soft copy of your assignment as follows:

   Student ID_Surname_Initial.doc

(this should include all tables, charts, exhibits etc produced using EXCEL)

•    DO NOT submit any EXCEL files (You should have already copied any relevant EXCEL output and pasted it into your Word document).

•   SUBMIT ONLY ONE FILE.

•    Upload this file on Moodle any time PRIOR to the deadline.  (After this time, the upload link will be closed).

   You will find the upload link in the ASSESSMENTS section on Moodle.

   Click on the “Click Here to Upload Assignment” link to upload.

•   Once you have uploaded and saved, the following message will appear momentarily, “File uploaded successfully.”


To confirm your upload was successful, you will then see your uploaded file’s name.

•   A penalty of up to 5% of the marks earned may apply for each day an assignment is late unless an extension of time has been sought.  Extensions will only be granted for substantive reasons at the discretion of the Chief Examiner and must be applied for before the assignment is due.

•  Please retain your own copy of the assignment until after the publication of final results for this unit.

Beyond the Haze: Decoding the True Impacts of Vaping

Using Regression Analysis to Investigate the Consequences of Vaping.

https://theconversation.com/vaping-now-more-common-than-smoking-among-young-people-and-the-risks-go-beyond-lung-and-brain-damage-223125

The Assignment Brief:

With the increasing popularity of vaping comes the need for a critical review of the real health consequences hidden behind the enticing clouds of vapour.  To this end, you have been provided with some relevant data and are tasked with analysing it using correlation and regression analysis.

There are SIX parts to this analysis:

1.  Background

(AI and Generative AI tools are required to be used in this Part).

2.  Sample Acquisition [LO1]

(AI and Generative AI tools must NOT be used in this Part because it requires students to demonstrate human knowledge and skill in using EXCEL).

3.  Model Development - Correlation Analysis & Simple Linear Regression [LO2]

(AI and Generative AI tools must NOT be used in this Part because it requires students to demonstrate human knowledge and skill in using EXCEL).

4.  Model Validation and Interpretation-Simple Linear Regression [LO3]

(AI and Generative AI tools must NOT be used in this Part because it requires students to demonstrate human knowledge and skill in using EXCEL).

5.  Analysis Extension to Multiple Linear Regression [LO4]

(AI and Generative AI tools may be used selectively within this Part as per explanation provided).

6.  Conclusions and Business Implications [LO5]

(AI and Generative AI tools may be used selectively within this Part as per explanation provided).

The data comprises 100 observations across six variables that provide a comprehensive view of individual vaping habits and their potential impacts on lung health in Australia.

The variables are labelled

  Lung Function Score (Y):

•     Definition: A numerical score ranging from about 40 – 150 that represents the lung health of the individual, with higher scores indicating poorer lung function.

•     Unit: Score (no specific unit, higher scores indicate worse health)

(Scores closer to 40 suggest better lung health; scores approaching or exceeding 100 suggest significantly impaired lung function.)

  Nicotine Concentration (mg/mL) (X1):

•     Definition: The concentration of nicotine in the e-liquid used in the vaping device.

•     Unit: Milligrams per millilitre (mg/mL)

  Years of Vaping (X2):

•     Definition: The total number of years the individual has been using vaping products.

•     Unit: Years

  Daily Usage Frequency (X3):

•     Definition: The average number of times per day the individual uses vaping products.

•     Unit: Times per day

  Number of Flavors Used Regularly (X4):

•     Definition: The number of different flavours of e-liquid that the individual uses on a regular basis.

•     Unit: Count (no specific unit)

  Age of Vaping Initiation (X5):

•     Definition: The age at which the individual first started using vaping products.

•     Unit: Years

and can be found in the data file labelled Vaping Health Impact.xlsx under the ASSIGNMENTS heading in the ASSESSMENTS section on Moodle, and is to be used to answer the questions listed here.

Assume that the population from which this data was drawn, was approximately normally distributed.

NOTE:   All relevant EXCEL output must be copied and pasted into a single document (.docx) for submission.

DATA PREPARATION AND ‘EXCEL HYGIENE’.

In all lecture examples involving the use of EXCEL, as well as the solutions to tutorial questions, I have been very particular about how to format the data and clean up the output (e.g. adjust to four decimal places, label everything, edit the charts etc).  This is because this ‘EXCEL hygiene’ is essential in the workplace and also highly valued.  Generating output is easy; consistently ensuring it is clearly labelled and easy to identify, understand and track is more difficult, simply because it takes more time.  This time is worth the investment and will be expected in your report.

PART ONE:  Background                                                                     (4 marks)

(AI and Generative AI tools are required to be used in this Part).

Write an introductory paragraph about vaping in Australia with the objective of providing some context for this assignment.  You are required to use generative AI software, such as ChatGPT.


It must be strictly no more than 200 words, and you must provide the prompt and prompt refinements you used, and of course, footnote your source.  Screenshots of ChatGPT output are acceptable.

Guideline for footnoting source:

“ChatGPT's Explanation of … … … … … … .," Generated by ChatGPT-3.5, OpenAI, September 15, 2021, [URL of the Chat or Platform]

Q1 NOTE 1:  Prompts will be assessed using the 3C’s criteria (0.5 mark each):

1.  Clarity: Clear prompts prevent confusion and guide AI effectively

2.  Context:  Rich context avoids incomplete or inaccurate outputs

3.  Creativity:  Open-ended prompts encourage diverse and innovative content

Q1 NOTE 2:  Response will be assessed using the following criteria (0.5 mark each):

1.  Relevance:  Does the response directly address the prompt's intent and context?

2.  Coherence:  Is the response logically organized and structured, ensuring easy comprehension?

3.  Completeness:  Does the response cover all relevant aspects of the prompt or leave critical gaps?

4.  Accuracy:  Are the facts, information, and details presented in the response correct and reliable?

5.  Appropriateness:  Is the tone, style, and language of the response suitable for the intended audience?

PART TWO [LO1]:  Sample Acquisition:                                            (5 marks)

(AI and Generative AI tools must NOT be used in this Part because it requires students to demonstrate human knowledge and skill in using EXCEL).

Begin your analysis by using the Random Sampling procedure demonstrated in both the lecture and tutorials in Week 10, to select a RANDOM SAMPLE of 80 observations from your data and copy and paste all EIGHT variables (Observation Number, Lung Function, Nicotine, Years of Vaping, Daily Usage, Flavours, Initiation Age, Random Number) into a separate worksheet labelled ‘Sample_80’, in columns B - I respectively.  In column A, you are to number the rows (1 - 80) and label this column ‘Count’.

Include a screenshot of this ‘Sample_80’ worksheet here to demonstrate you have sampled correctly.  Label this as EXHIBIT 1 and include a relevant title.
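
For readers who want to see the logic of such a sampling step outside EXCEL, here is a minimal Python sketch.  It assumes the procedure is the common "attach a random number to every row, sort by it, keep the first 80" approach; the Week 10 procedure itself is not reproduced in this brief, so treat that as an assumption.  The function name draw_sample and the stand-in data are hypothetical, and the assignment itself still requires the sample to be drawn in EXCEL.

```python
import random

def draw_sample(rows, k, seed=None):
    """Tag each row with a random number, sort by that number, keep the first k rows."""
    rng = random.Random(seed)
    tagged = [(rng.random(), row) for row in rows]
    tagged.sort(key=lambda pair: pair[0])
    # Output layout mirrors the worksheet: Count, original columns, Random Number
    return [(count, *row, rnd) for count, (rnd, row) in enumerate(tagged[:k], start=1)]

# Hypothetical stand-in for the 100 observations: (observation number, lung function score)
population = [(i, 40 + i) for i in range(1, 101)]
sample = draw_sample(population, 80, seed=1)

print(len(sample))   # number of sampled rows
print(sample[0])     # first row: Count, Observation Number, Lung Function, Random Number
```

Because every row receives its own random number before sorting, no observation can be selected twice, which is the property the ‘Sample_80’ screenshot is meant to demonstrate.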


PART THREE [LO2]:  Model Development - Correlation Analysis & Simple Linear Regression                                     (22 marks)

(AI and Generative AI tools must NOT be used in this Part because it requires students to demonstrate human knowledge and skill in using EXCEL).

(a) Use EXCEL to produce a correlation matrix for all variables (dependent and independent), remembering to follow the approach demonstrated in Lecture 10.   (3 marks)

(b) Now use this correlation matrix to identify which independent variable has the strongest relationship with Lung Function (Y) to be used later in a regression model.  State which variable this is and what evidence led you to choose it.   (3 marks)

(c) To investigate if a linear relationship is a reasonable assumption, use EXCEL’s scatterplot option to produce a graph of these two variables.   Include the line of best fit (DO NOT INCLUDE R²; it is not to be discussed here).

Label this graph as EXHIBIT 3 with a relevant title and remember to optimise its presentation via the various formatting options available.   (4 marks)

(d) Based ONLY on the scatterplot you produced as EXHIBIT 3, does a linear relationship seem reasonable?  If so, is it a positive or negative slope?  Provide evidence for your answer and interpret what this means in context of this question.   (4 marks)

Regardless of your answer in (d), now assume that a linear relationship is reasonable.

(e) Using the Regression Analysis procedure in EXCEL, produce a simple linear regression model of Y vs X1, with the following requirements:

•   Select 99% Confidence Level in the Output Options.

•    Report all values to 4 decimal places where relevant.

•    Provide the Summary Output labelled as EXHIBIT 4 with an appropriate title.  (4 marks)

(f)  Based on this output, state the equation of this regression model (correct to 4 decimal places), remembering to define the variables.  (4 marks)
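
As a cross-check on the arithmetic behind steps (a), (b), (e) and (f), the following Python sketch computes Pearson correlations, picks the independent variable with the largest absolute correlation with Y, and fits the least-squares line.  It is only an illustration under stated assumptions: the numbers are made up (they are not the assignment data), the function names are hypothetical, and the brief still requires all output to come from EXCEL.

```python
from math import sqrt

def deviations(x, y):
    """Return (mean_x, mean_y, Sxx, Syy, Sxy) for paired data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return mean_x, mean_y, sxx, syy, sxy

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    _, _, sxx, syy, sxy = deviations(x, y)
    return sxy / sqrt(sxx * syy)

def strongest_predictor(y, predictors):
    """Return (name, r) for the predictor with the largest |r| against y."""
    scores = {name: pearson_r(values, y) for name, values in predictors.items()}
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores[best]

def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    mean_x, mean_y, sxx, _, sxy = deviations(x, y)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Made-up numbers for illustration only; NOT the assignment data.
y = [50, 62, 71, 85, 90]
predictors = {"Nicotine": [3, 6, 9, 12, 18], "Flavours": [2, 1, 3, 1, 2]}

name, r = strongest_predictor(y, predictors)
b0, b1 = fit_line(predictors[name], y)
print(f"{name}: r = {r:.4f}")
print(f"Y-hat = {b0:.4f} {b1:+.4f} * X")   # reported to 4 decimal places
```

Strength is judged on the absolute value of r, so a strongly negative correlation would be chosen over a weakly positive one.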

PART FOUR [LO3]:  Model Validation and Interpretation - Simple Linear Regression                                                                                              (18 marks)

(AI and Generative AI tools must NOT be used in this Part because it requires students to demonstrate human knowledge and skill in using EXCEL).

Before interpreting this model, it is first essential to determine whether or not it is a true representation of the relationship that exists in the population between Lung Function (Y) and the independent variable (X1) you selected in (b).  To do this, a hypothesis test of significance is required.

(a) Using a 5% level of significance, determine whether or not this relationship between Lung Function (Y) and the independent variable (X1) you selected in (b) is a statistically significant, linear relationship.   Ensure that you clearly state your hypothesis, show ALL steps, ALL working AND interpret your conclusion IN CONTEXT of this question.  (6 marks)

Assuming now that the model you have identified is statistically significant, it is time to interpret the model.

(b) State and provide an interpretation of the Y intercept b0 and the slope coefficient b1.  (7 marks)

(c) State and interpret the coefficient of determination for this model, in context of this question.  (5 marks)
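
The quantities used in (a) and (c) can be sketched in Python as follows.  This assumes the usual two-tailed t-test of the slope (null hypothesis: population slope equals zero) with n - 2 degrees of freedom, and R-Square computed as 1 - SSE/SST; the data are made up for illustration, the function name is hypothetical, and the critical value is read from a t-table rather than computed.

```python
from math import sqrt

def slope_test(x, y):
    """Return (b0, b1, t_stat, df, r_squared) for a simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)                      # total variation
    df = n - 2
    se_b1 = sqrt(sse / df) / sqrt(sxx)   # standard error of the slope
    t_stat = b1 / se_b1                  # test statistic under H0: slope = 0
    r_squared = 1 - sse / sst            # coefficient of determination
    return b0, b1, t_stat, df, r_squared

# Made-up numbers for illustration only; NOT the assignment data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, t_stat, df, r_squared = slope_test(x, y)

t_critical = 3.182   # two-tailed 5% critical value for df = 3, taken from a t-table
reject_h0 = abs(t_stat) > t_critical
print(f"t = {t_stat:.4f}, df = {df}, reject H0: {reject_h0}")
print(f"R-Square = {r_squared:.4f}")
```

With the assignment's sample of 80 observations the degrees of freedom would be 78, so the critical value must be looked up for that case; the value 3.182 above applies only to this five-point example.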

PART FIVE [LO4]:  Analysis Extension to Multiple Linear Regression   (15 marks)

(AI and Generative AI tools may be used selectively within this Part as per free text explanation provided).

(a) Using the Regression Analysis procedure in EXCEL, now include ALL FIVE independent variables in the regression against Lung Function (Y) to produce a Multiple Linear Regression model.

Label the regression output as EXHIBIT 5 with a relevant title, and remember to optimise its presentation via the various formatting options available (TIP: it is not ‘user-friendly’ for management if you leave any scientific notation in the output).   (3 marks)

(b) From what you have learned about Simple Linear Regression analysis, discuss what the Multiple Linear Regression output you produced in EXHIBIT 5 tells you.

SPECIAL INSTRUCTIONS:

Through our examination of Simple Linear Regression, we covered most of what you need to know to understand a Multiple Linear Regression model – however, not everything.

Use ChatGPT to help you fill in the gaps for this Multiple Linear Regression part of the assignment.  This does not mean you should use ChatGPT to do everything – if that was my intention, I would have said that.  Instead, I want to see how you can work WITH ChatGPT as your assistant, not boss.

For this part, you will be assessed on how you interact with ChatGPT, what prompts you use and then how you refine those prompts, review and add to the responses, draw your own conclusions from the responses etc.   You will be assessed less on the accuracy of your discussion and more on your intellectual engagement with the ChatGPT process and output.

If you choose poorly and use ChatGPT to do everything, with little to no involvement from yourself, you will be penalised heavily and likely score zero for this Multiple Linear Regression part of the assignment.

My whole intention is to provide the opportunity for you to harness the power of ChatGPT whilst remaining in the driver’s seat.   This will be a critically important experience for you, should you choose to accept it with integrity.

GUIDANCE TO FOLLOW:

1.  Even before calling on help from ChatGPT, you should comment on the things you already know about: R-Square and the p-values for each of the coefficients.

2.  Next, you should be curious about how to interpret each of the coefficients, now that there is more than one – ask for help from ChatGPT

3.  And what about the ‘Adjusted R-Square’ – is that relevant and why?   Ask for help from ChatGPT – remember, the better your question, the better the answer!

4.  And, is the ‘Significance F’ value relevant?  Ask for help from ChatGPT

Be sure to include in your answer whether or not the multiple linear regression model is better than the simple linear regression model and be able to explain why?  How?   (12 marks)
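
To make the ‘Adjusted R-Square’ and ‘Significance F’ items above more concrete, here is a small Python sketch of the standard textbook formulas that relate them to R-Square, the sample size n and the number of independent variables k.  The R-Square value of 0.60 is a hypothetical placeholder, not a result from the assignment data, and the function names are illustrative; ‘Significance F’ in the EXCEL output is the p-value attached to the F statistic computed here.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-Square: penalises R-Square for the number of predictors k."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

def f_statistic(r_squared, n, k):
    """Overall F statistic for testing that all k slope coefficients are zero."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

n = 80            # sample size used in this assignment
r_squared = 0.60  # HYPOTHETICAL value for illustration only

# Same R-Square, different numbers of predictors: more predictors means a larger penalty
print(f"Adjusted R-Square, k = 1: {adjusted_r_squared(r_squared, n, 1):.4f}")
print(f"Adjusted R-Square, k = 5: {adjusted_r_squared(r_squared, n, 5):.4f}")
print(f"F statistic, k = 5: {f_statistic(r_squared, n, 5):.4f}")
```

R-Square never falls when a variable is added, whereas Adjusted R-Square can fall, which is one reason it is often preferred when comparing a simple model with a multiple model.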


PART SIX [LO5]:  Conclusions and Business Implications              (5 marks) 

(AI and Generative AI tools may be used selectively within this Part as per explanation provided).

Time To Deliver Your Expert Opinion on This Matter

Now, referring ONLY to the Multiple Linear Regression analysis, that is, what you found and discussed in PART FIVE, in 200 words or less, describe what the business implications of your findings might be?  There will be many possible correct answers to this question, but the ones you present, must be consistent with your findings and context.

If you choose to get some help from ChatGPT, as always, you must provide the prompt and any prompt refinements you used, and of course, footnote your source.

PRESENTATION:                                                                                 (3 marks)

There are 3 marks available for presentation.  These marks will be awarded for things such as:   Easy to read; logical flow of answers; cohesive report; answers clearly labelled; appropriate font  size, borders, colour choice, labelling of graphs, care in spelling, grammar and punctuation.

ASSIGNMENT TOTAL = 4 + 5 + 22 + 18 + 15 + 5 + 3 = 72 MARKS

 

 

