Informatics MSc Programme Area

Henley Business School

University of Reading

Assessed Coursework Set Front Page

Module code: INMR77

Module name: Business Intelligence and Data Mining

Lecturer responsible: Dr Yin Leng Tan

Work to be handed in by:

Full time students: 26 May 2020

Part time students: 15 June 2020

Assignment Specification

The module is assessed 100% through this coursework assignment.

The aim of this coursework is to assess your understanding of business intelligence and ability

to perform data mining tasks by applying concepts, methods and techniques learned during

the lectures and practical sessions.

The coursework is carried out individually. Students are required to produce an individual

report for the tasks as set out below. The complete report should not exceed 20 pages of A4

(with a variation of 20%) with a minimum font size of 10, including tables and diagrams but

excluding references and appendices. An appendix can be used to include more detailed

materials to back up main body points but will not be assessed. In addition, you are also

required to submit the supplementary materials of your output from SAS Enterprise Miner via

Blackboard by the specified deadline.

Case Study - Airbnb and Inside Airbnb

Airbnb - Holiday Lets, Homes, Experiences & Places (airbnb.co.uk)

Airbnb is an online marketplace for arranging or offering lodging i.e. temporary

accommodation, primarily homestays, or tourism experiences. It was founded in August 2008

and has 12,736 employees as of 2019.

Service overview: Airbnb provides a platform for hosts to accommodate guests with short-term

lodging and tourism-related activities. Guests can search for accommodation using filters such

as location, price, and specific types of homes. Before booking, users must provide personal

and payment information. Some hosts also require a scan of government-issued identification

before accepting a reservation. Hosts provide prices and other details for their rental or listing

e.g. number of guests included in the price, type of property, type of room, number of

bathrooms, number of bedrooms, number of beds and type of bed, minimum number of nights

for a reservation, and amenities. In addition, Airbnb also provides a review system where hosts

and guests can leave reviews about their experience, and rate each other after a stay. By

October 2019, two million people were staying with Airbnb each night.

Cancellation policy: Airbnb allows hosts to choose between five types of cancellation policies,

made to protect both hosts and guests. Options include: strict_14_with_grace_period,

moderate, flexible, super_strict_30, super_strict_60.

(see https://www.airbnb.co.uk/home/cancellation_policies for the definition of each category)

Security Deposits: some reservations include a security deposit, which can be required by either

Airbnb or the host. This helps build trust for both guests and hosts. Some hosts require a

security deposit for their listing. If you are a guest and you are booking a listing with a host with

host-required security deposit, you will be shown the amount before you make your

reservation. The amount is set by the host, not Airbnb. In this case, no authorisation hold will

be placed, and you will only be charged if a host makes a claim on the security deposit.

(see https://www.airbnb.co.uk/help/article/140/how-does-airbnb-handle-security-deposits)

Sources: Wikipedia, Airbnb.co.uk

For further information about Airbnb, please visit: https://www.airbnb.co.uk/

Inside Airbnb – adding data to the debate (http://insideairbnb.com/index.html)

Inside Airbnb is an independent, non-commercial set of tools and data that allows an individual

to explore how Airbnb is really used in cities around the world. It was set up by Murray Cox and

John Morris in 2016.

Airbnb claims to be part of the “sharing economy” and disrupting the hotel industry. However,

data shows that the majority of Airbnb listings in most cities are entire homes, many of which

are rented all year round – disrupting housing and communities. For example, local residents

and governments are more concerned with people who are not present when the rental takes

place and those who have multiple listings on the site, as opposed to a user who is renting a

spare room.

By analysing publicly available information about a city’s Airbnb listings, Inside Airbnb

provides filters and key metrics so users can see how Airbnb is being used to compete with the

residential housing market. With Inside Airbnb, users can ask fundamental questions about

Airbnb in any neighbourhood, or across the city as a whole, such as:

• how many listings are in my neighbourhood and where are they?

• how many houses and apartments are being rented out frequently to tourists and not

to long-term residents?

• how much are hosts making from renting to tourists (compare that to long-term

rentals)?

• which hosts are running a business with multiple listings and where are they?

These questions (and the answers) get to the core of the debate for many cities around the

world, with Airbnb claiming that their hosts only occasionally rent the homes in which they live.

In addition, many city or state laws and ordinances that address residential housing, short-term

or vacation rentals, and zoning usually make reference to allowed use, including:

• how many nights a dwelling is rented per year

• minimum nights stay

• whether the host is present

• how many rooms are being rented in a building

• the number of occupants allowed in a rental

• whether the listing is licensed

The Inside Airbnb tool or data can be used to answer some of these questions. Some

understanding of how the Airbnb platform is being used will help clear up the laws as they

change.

Source: insideairbnb.com

For further information about Inside Airbnb, please visit: http://insideairbnb.com/index.html

Airbnb in Greater Manchester, UK

Dataset: Airbnb_man_reduced.csv (available to download on Blackboard). Two additional

datasets, man_reviews.csv and man_calander.csv, are also provided for information only.

Description of the dataset: The Airbnb data for Greater Manchester is made available by Inside

Airbnb. The original data set was downloaded from the website in November 2019. The

number of variables however is reduced from the original data set. There are 4,848 listings in

the data set with a total of 57 variables. Each row represents a single listing and contains

information about the host of the property, the property’s characteristics and overall rating of

the property, and its associated features by guests. Table 1 shows the name, description, and

type of the 57 variables.

Table 1: variable name and description of the variable for the dataset.

# | Variable Name | Description | Variable Type
1. | listing_id | Unique identifier for each Airbnb listing | Numeric
2. | listing_url | URL of the listing | Text
3. | description | Description of the listing | Text
4. | house_rule | Description of house rules | Text
5. | host_id | Unique identifier of the host | Numeric
6. | host_url | URL of the host | Text
7. | host_name | Name of the host | Text
8. | host_since | Date on which the host became a member | Date
9. | host_about | Description of the host | Text
10. | host_response_time | How quickly the host responds to inquiries. 5 categories: within a day, within an hour, a few days or more, within a few hours, N/A | Categorical
11. | host_response_rate | Rate at which host responded to inquiries (percentage value) | Numeric
12. | host_is_superhost | Is the host a superhost (1 = Yes, 0 = No) | Binary
13. | host_identity_verified | Whether the host is verified or not (1 = Yes, 0 = No) | Binary
14. | neighbourhood_cleased | Name of the neighbourhood (41 categories) | Categorical
15. | borough | Name of the borough (10 categories) | Categorical
16. | property_type | Type of the property (30 categories) | Categorical
17. | room_type | Type of the room. 4 categories: Entire home/apt, Private room, shared room, hotel room | Categorical
18. | accomodates | Number of people that can be accommodated | Numeric
19. | bathrooms | Number of bathrooms | Numeric
20. | bedrooms | Number of bedrooms | Numeric
21. | beds | Number of beds | Numeric
22. | bed_type | Type of bed. 6 categories: Real Bed, Pull-out Sofa, Futon, Couch, Airbed | Categorical
23. | amenities | List of amenities included | Text
24. | price | Price per night (in GBP) | Numeric
25. | weekly_price | Price per week (in GBP) | Numeric
26. | monthly_price | Price per month (in GBP) | Numeric
27. | Security_deposit | Amount of host-required security deposit | Numeric
28. | cleaning_fee | One-time fee charged by host to cover the cost of cleaning their space | Numeric
29. | guest_included | Number of guests included in the price | Numeric
30. | extra_people | Additional charge per person (GBP) | Numeric
31. | minimum_nights | Minimum number of nights for a reservation | Numeric
32. | maximum_nights | Maximum number of nights for a reservation | Numeric
33. | calendar_updated | Calendar last updated by the host (70 categories) | Categorical
34. | has availability | Whether the host has availability or not (1 = Yes, 0 = No) | Binary
35. | availability_30 | Number of days available for the next 30 days | Numeric
36. | availability_60 | Number of days available for the next 60 days | Numeric
37. | availability_90 | Number of days available for the next 90 days | Numeric
38. | availability_365 | Number of days available for the next 365 days | Numeric
39. | number_reviews | Number of reviews in total | Numeric
40. | first_review | Date of first review | Date/Time
41. | last_review | Date of last review | Date/Time
42. | review_scores_rating | Overall rating of the property (percentage value) | Numeric
43. | review_scores_accuracy | Rating for the accuracy of the description | Numeric
44. | review_scores_cleanliness | Rating for the cleanliness of the property | Numeric
45. | review_scores_checkin | Rating for the check in experience | Numeric
46. | review_scores_communication | Rating for the host communication with guests | Numeric
47. | review_scores_location | Rating for the location of the property | Numeric
48. | review_scores_value | Rating for the value of the property | Numeric
49. | instant_bookable | Whether the property can be booked instantly (1 = Yes, 0 = No) | Binary
50. | cancellation_policy | The cancellation policy for the host. 5 categories: strict_14_with_grace_period, moderate, flexible, super_strict_30, super_strict_60 | Categorical
51. | require_guest_profile_picture | Whether guest profile picture is required or not (1 = Yes, 0 = No) | Binary
52. | require_guest_phone_verificati | Whether guest phone verification is required or not (1 = Yes, 0 = No) | Binary
53. | host_listings_count | The number of listings of the host | Numeric
54. | host_listings_count_entire_homes | The number of listings of the entire home | Numeric
55. | host_listings_count_private_rooms | The number of listings of private rooms | Numeric
56. | host_listings_count_shared_roo | The number of listings of shared rooms | Numeric
57. | reviews_per_month | Number of reviews per month for the property | Numeric
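
As a starting point for data exploration, the sketch below reads a listings file with Python's standard csv module and converts a few of the Numeric columns from Table 1 into floats. It is only an illustration: the coursework itself is carried out in SAS Enterprise Miner, the helper names are hypothetical, and the idea that numeric cells may carry currency symbols, thousands separators, percent signs or "N/A" is an assumption about the file's formatting, not something stated in this brief.

```python
import csv
import io

# Columns assumed (from Table 1) to hold numeric values; extend as needed.
NUMERIC_COLUMNS = ["price", "host_response_rate", "minimum_nights",
                   "availability_365", "reviews_per_month"]


def to_number(cell):
    """Convert a raw CSV cell to float, or None if it is empty or not numeric."""
    cleaned = cell.strip()
    for symbol in ("£", "$", ",", "%"):
        cleaned = cleaned.replace(symbol, "")
    if cleaned == "" or cleaned.upper() == "N/A":
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None


def load_listings(handle, numeric_columns=NUMERIC_COLUMNS):
    """Read listings from an open CSV file and coerce the chosen columns."""
    rows = []
    for row in csv.DictReader(handle):
        for name in numeric_columns:
            if name in row:
                row[name] = to_number(row[name] or "")
        rows.append(row)
    return rows


# Tiny in-memory sample standing in for the real file (values are made up).
sample = io.StringIO(
    "listing_id,price,host_response_rate,minimum_nights\n"
    "1,\"£1,200.00\",95%,2\n"
    "2,45,N/A,\n"
)
listings = load_listings(sample)
print(len(listings), listings[0]["price"], listings[1]["host_response_rate"])
```

With the real dataset, the same function could be called on an opened Airbnb_man_reduced.csv; if the file matches the description above, the result should contain 4,848 rows. Missing values are kept as None so that they can be counted and handled explicitly during data preparation.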

The local government and residents would like to know how Airbnb is used in the region and

seek your help on this. In particular, they would like to know how many of the listings/hosts are

offering genuine lodging (i.e. temporary accommodation, primarily homestays, or tourism

experiences) and are not run as a business, and how many are hosts offering long-term lets across

multiple listings with no owner present (likely to be running a business), which could be illegal.

Your goals are to:

a) identify clusters of listings based on different (or a combination) set of variables e.g.

host’s characteristics, listings/property’s characteristics and availability, and reviews

from guests so as to provide insights to the local government and residents.

Note: There are many measurements that could be used to differentiate the two, e.g. single

listing vs multiple listings although a host may list separate rooms in the same

apartment, or multiple apartments or entire homes. Availability is another measure,

likewise, occupancy. You are asked to justify the variables/measurements used for your

clustering tasks. Greater Manchester uses the following parameters for the

measurements:

• a high availability metric and filter of 60 days per year

• a frequent rented filter of 60 days per year

• a review rate of 50% for the number of guests making a booking who leave a

review

• an average booking of 3 nights unless a higher minimum nights is configured

for a listing

• a maximum occupancy rate of 70% to ensure the occupancy model does not

produce artificially high results based on the available data (see

http://insideairbnb.com/greater-manchester/?neighbourhood=&filterEntireHomes=false&filterHighlyAvailable=false&filterRecentReviews=false&filterMultiListings=false)

b) select what you think is the best segmentation/clustering based on the results obtained

in a) and comment on the characteristics, e.g. clusters that best separate listings that

are genuine lodging from those that could be illegal, i.e. running as a business.

c) develop a classification model to distinguish genuine listings/hosts from those that could

be considered illegal, based on your results obtained in b).
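
The Greater Manchester parameters listed under goal a) can be turned into per-listing measures. The following is a rough sketch in Python, not the Inside Airbnb implementation. It assumes that bookings are estimated as reviews divided by the review rate, that reviews_per_month, minimum_nights and availability_365 are the Table 1 columns already converted to numbers, and that "more than 60 days" is the intended reading of the two 60-day filters. The function and constant names are hypothetical.

```python
REVIEW_RATE = 0.50        # share of guests assumed to leave a review
AVERAGE_STAY_NIGHTS = 3   # assumed stay length unless minimum_nights is higher
MAX_OCCUPANCY = 0.70      # cap on the estimated occupancy rate
THRESHOLD_DAYS = 60       # high-availability and frequently-rented filter
DAYS_PER_YEAR = 365


def estimate_usage(reviews_per_month, minimum_nights, availability_365):
    """Estimate yearly occupancy for one listing from its review activity."""
    # Each review is assumed to stand for 1 / REVIEW_RATE bookings.
    bookings_per_year = (reviews_per_month * 12) / REVIEW_RATE
    # Use the average stay unless the listing enforces a longer minimum.
    stay_nights = max(AVERAGE_STAY_NIGHTS, minimum_nights)
    # Cap the estimate so occupancy never exceeds MAX_OCCUPANCY.
    max_nights = MAX_OCCUPANCY * DAYS_PER_YEAR
    nights_per_year = min(bookings_per_year * stay_nights, max_nights)
    return {
        "estimated_nights": nights_per_year,
        "occupancy_rate": nights_per_year / DAYS_PER_YEAR,
        "frequently_rented": nights_per_year > THRESHOLD_DAYS,
        "highly_available": availability_365 > THRESHOLD_DAYS,
    }


# Hypothetical listing: 1 review a month, 2-night minimum, open 200 days a year.
usage = estimate_usage(reviews_per_month=1.0, minimum_nights=2,
                       availability_365=200)
print(usage)
```

Derived measures of this kind could serve as inputs to the clustering alongside the raw variables, provided the report justifies the assumptions behind them.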

Useful information/websites:

• Clampet (2014) Airbnb in NYC: The Real Numbers Behind the Sharing Story – available

at https://skift.com/2014/02/13/airbnb-in-nyc-the-real-numbers-behind-the-sharing-story/

• Inside Airbnb http://insideairbnb.com/index.html

What to deliver in the final report:

Your report should include the following sections:

1. Introduction: This should include background of Airbnb and Inside Airbnb,

opportunities and challenges of the sharing economy to the business (Airbnb), home

owners (hosts), local residents and governments, and guests/tourists, and how

business intelligence and data mining could be used to address the opportunities and

challenges for the various stakeholders. It should also outline how the report is

structured. Justify your answer with examples/data and findings from literature and

related work in this area.

2. Model building and Results Discussion

a) Identify clusters of listings

In this section, you should discuss the purpose of the data mining tasks, the data

mining process, including data exploration and data preparation/preprocessing,

and approaches taken e.g. variables used for the clustering. You are expected to

justify and discuss any action/decision you made during the data mining process

and models building, make references to your output in SAS Enterprise Miner

within your report where necessary.

Note: In deciding what k to use (and also how many variables to include), the

following factors should be considered: How distinct are the clusters? Is good

separation achieved? How consistent are they? If cluster#1 shows low values on

one measure, does it also show low values on other measures? How simple are they

to describe? Simple clusters are more interpretable by domain knowledge experts,

easier to take action on, and are more likely to be statistically stable and not the

result of random chance.

b) Discuss what is the best segmentation/clustering based on the results obtained

from the process in a). You should discuss what you think is the best

segmentation and comment on the characteristic of these clusters. Consider how

this information could be used by local government and residents. Use

screenshots and/or make references to your output in SAS Enterprise Miner to

illustrate important and interesting findings where necessary.

c) Develop a classification model that classifies the data into these segments.

In this section, you should discuss the purpose of the data mining, including the

target segment/cluster, the data mining process, including data

preparation/preprocessing, and rationale and approaches taken e.g. variables used

for the model building. You are expected to justify and discuss any action/decision

you made during the data mining process and models building, as well as model

evaluation, make references to your output in SAS Enterprise Miner within your

report where necessary.

3. Conclusion, critical evaluation and suggestion for improvement

In this section, you are required to conclude and provide a summary of your key

findings, and discuss the limitations of your data models/mining/analyses and

suggestion for improvement by taking into consideration current research issues in

data mining.
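
For the note in section 2 a) on deciding what k to use, one common way to compare candidate values is to run k-means for several k and look at how the within-cluster sum of squares falls as k grows. The sketch below is a minimal pure-Python illustration on made-up two-dimensional points. It is not the algorithm used by SAS Enterprise Miner, the function names are hypothetical, and real inputs would first need standardising so that no single variable dominates the distance.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=50, seed=0):
    """Basic Lloyd's algorithm; returns (centroids, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    wss = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, wss


# Made-up data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]

for k in (1, 2, 3):
    _, wss = k_means(points, k)
    print(k, round(wss, 3))
```

A sharp drop in the within-cluster sum of squares followed by little further improvement suggests that the smaller k already captures the main structure. This numeric check complements, rather than replaces, the questions above about how distinct, consistent and easy to describe the clusters are.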

The criteria used for grading the assignment:

Aspects/Criteria % Range Descriptors

Introduction

(ILO1, ILO3, ILO5)

70% and

above

A highly effective introduction, setting context and

indicating content that will follow.

Wide background reading; novel examples and use of

relevant literature/sources in supporting the

arguments/viewpoints.

60-69% A very good introduction, setting context and indicating

content that will follow.

Good background reading; generally very good use of

examples and relevant sources/literature in supporting

the arguments/viewpoints.

50-59% Adequate introduction incorporating one or more of

the above, yet lacking in clarity in some area(s).

Good use of examples and sources/literature in

supporting the arguments/viewpoints.

49% and

below

A basic introduction with a narrow or limited reference

to defining the area, setting the context and indicating

content that will follow.

Little evidence of appropriate reading or ability to

synthesise information. Few or no examples given.

Model Building,

Results Discussion

and Model

Evaluation

(ILO2, ILO3, ILO4,

ILO6)

70% and

above

Novelty and originality. A coherent, well-focused, original

approach to the model building, entirely relevant to

the tasks with excellent support and justifications for

the variables and techniques used for the modelling.

Excellent discussion and interpretation of the obtained

results/analysis with original insights.

Excellent model evaluations and comparisons provided

with clear evidence of critical analysis of findings.

60-69% A generally clear and coherent discussion with good

support or justification for the model building, which is

directly relevant to the tasks. Clear rationale for the

approaches taken.

Very good discussion and interpretation of the

obtained results/analysis.

Very good model evaluations and comparisons

provided with some critical analysis of findings.

50-59% Reasonable attempt at the modelling but prone to

being descriptive or narrative; little rationale for the

approaches taken or justification of the variables used.

Generally relevant to the stated tasks.

Reasonable discussion and interpretation of the

obtained results/analysis.

Reasonable discussion of model evaluations and

comparisons though with little evidence of critical

analysis of findings.

49% and

below

Little discussion and evidence of model building.

Failure to understand the purpose of the task.

Little discussion and interpretation of the obtained

results/analysis.

Little or no discussion of model evaluations and

comparisons.

Conclusion, critical

evaluation and

future

improvements

(ILO1, ILO5 and

ILO6)

70% and

above

Comprehensive and extremely well discussed with

original insights drawing from the analyses conducted

and suggestion for future improvements.

60-69% Very well discussed with interesting insight, drawing

from the results/analyses conducted. Very good critical

evaluation and suggestion for future improvement.

50-59% Reasonably discussed but prone to being descriptive

with little critical analysis based on the results/analyses

conducted. Generally relevant to the stated tasks.

Some critical analysis but prone to being descriptive or

narrative; evidence supports the conclusion, but not

always very directly /clearly. The question is not fully

addressed.

49% and

below

Largely descriptive. The discussion is limited in scope

and/or relevance. The question is only partially

addressed.
