The total mark for this assignment is 100.
This homework assignment accounts for 3% of the total marks for this course.
This homework is due in your lecture class in Week 5 (either 27 March or 28 March as
appropriate).
Please refer to the document “Homework Submission Requirements” to prepare your
submission.
The following three questions refer to four data sets regarding the company Orion Star:
employee_master, customer, orders, and products. All of them can be found in your OrionDB folder.
Question 1 (35%)
Summer is coming! Orion Star would like to target at teenager customers. Before the company
launches a marketing campaign, your manager asks you to provide the list of products whose
customers are teenagers.
Write a query to list the products and for each product find the average age of the customers (when
they made the orders) that falls between 13 and 19 years old. Your solution should look like the
following table. Sort your data based on customer’s age in increasing order. Submit the first and last
10 rows if it has more than 20 rows.
Question 2 (35%)
Supplier Eclipse Inc is a partner of Orion Star. To encourage employees to sell Eclipse products, Orion
Star is paying a commission to each employee who has sold Eclipse products. The commission is 5%
of the total profit that an employee has helped the company to make from selling Eclipse products.
Write a query to list the employees and their total commissions. Your result should look like the
following table. Sort your data based on Employee_ID in increasing order. Submit the first and last
10 rows if it has more than 20 rows.
Question 3 (30%)
Supplier Eclipse Inc is recalling their products. Eclipse asks your assistance to provide the list of
customers who bought their products delivered in 2008 and whose first three digits of product ID
are 220.
Write a query to list the information of the customers who should be notified for the recall. Your
result should look like the following table. Sort your data based on Customer_ID in increasing order.
Submit the first and last 10 rows if it has more than 20 rows.
Homework #3 Solutions
Question 1 (35%)
proc sql;
select p.Product_Name, mean((o.Order_Date-c.Birth_date)/365.25) as
age 'Customer_Avg_Age' format=5.2
from orion.orders as o
inner join
orion.customer as c
on o.Customer_ID=c.Customer_ID
inner join
orion.products as p
on p.Product_ID = o.Product_ID
group by p.Product_Name
having 13 <= calculated age <= 19
order by calculated age;
quit;
First 10 rows
Last 10 rows
Many of you put the condition 13 <= calculated age <= 19 in the WHERE clause and had
a wrong answer. What’s the difference? FROM and WHERE tell SAS where to search data. If you put
the above condition in WHERE, you are looking for the average age of the customers whose ages are
between 13 and 19. This is wrong because you want SAS to find the average age of all customers.
What will happen if you put this condition in the HAVING clause. Note that HAVING comes with
GROUP BY, and the GROUP BY clause is meaningful only when you use some summary function in
SELECT. Image there is a big table that SAS was working on, in each row you have one product
name, one customer who bought the product, and the customer’s age. Because many customers
bought the same product, you use the GROUP BY clause
to tell SAS for each
product name, perform the summary function, which is avg(.) of customers’ ages in this case. At the
end, the rows corresponding to the same product name were merged together. The resultant table
has all products and their corresponding average customer ages, some may be between 13 and 19,
some may not.
The HAVING clause asks SAS to display which portion of the resultant table. In this case we are only
interested in the products whose average customer ages are between 13 and 19. If you still have
questions, let me know.
Question 2 (35%)
proc sql;
select e.Employee_ID, sum(o.Profit)*0.05 'Commision'
format=dollar7.2
from orion.orders as o
inner join orion.products as p
on p.Product_ID = o.Product_ID
inner join orion.employee_master as e
on e.Employee_ID=o.Employee_ID
where p.Supplier_Name='Eclipse Inc'
group by e.Employee_ID
order by e.Employee_ID;
quit;
The result table (less than 20 rows) is given below.
Question 3 (30%)
proc sql;
select c.Customer_ID, p.Product_ID, c.Customer_Name,
c.Customer_Address from orion.orders as o
inner join
orion.products as p
on p.Product_ID=o.Product_ID
inner join
orion.customer as c
on o.Customer_ID=c.Customer_ID
where p.Supplier_Name='Eclipse Inc' and '01Jan2008'd <=
o.Delivery_Date <= '31Dec2008'd and 220 <= p.Product_ID/1000000000 <
221
order by c.Customer_ID;
quit;