Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 1 of 12

This exam paper must not be removed from the venue

School of Information

Technology and Electrical

Engineering

EXAMINATION

Semester One Final Examinations, 2017

INFS7901 Database Principles

This paper is for St Lucia Campus students.

Examination Duration: 90 minutes

Reading Time: 10 minutes

Exam Conditions:

This is a School Examination

This is a Closed Book Examination - specified materials permitted

During reading time - write only on the rough paper provided

This examination paper will be released to the Library

Materials Permitted In The Exam Venue:

(No electronic aids are permitted e.g. laptops, phones)

Calculators - Any calculator permitted - unrestricted

One A4 sheet of handwritten or typed notes double sided is permitted

Materials To Be Supplied To Students:

1 x 14 Page Answer Booklet

1 x Multiple Choice Answer Sheet

Instructions To Students:

Additional exam materials (eg. answer booklets, rough paper) will be

provided upon request.

Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 4 of 12

Question 3. (10 Marks) SQL

Answer the following questions using queries in SQL.

For this question, we will be using the schema that was introduced in Question 2:

• Student (sID, sName, age, GPA, sizeHS)

• College (cName, state, enrolment)

• Apply (sID, cName, major, decision)

a) (4 marks) Find IDs of students that are not from the smallest high school.

b) (6 marks) Find the id(s) and name(s) of the youngest student(s) that have been

accepted in the CS major in each of the states.

Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 5 of 12

Module 2

Question 4. (10 Marks) Asymptotic Analysis

Assume that you are asked to make a recommendation between two algorithms (A and

B) that accomplish exactly the same task. After measuring the run times for various

inputs sizes, you determine that the average running time (for input of size n) of algorithm

A is TA(n) = 3n2, and the average running time of algorithm B is TB(n) = 4n + 100.

a) (2 marks) For n =10, which algorithm would run faster? Explain your answer

b) (2 marks) For n = 1000, which algorithm would run faster? Explain your

answer

c) (2 marks) Determine the big-o notation of each of the algorithms.

d) (4 marks) Which algorithm would you recommend? Refer to your answers

from part a, b, and c in your recommendation.

Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 6 of 12

Question 5. (6 Marks) Sorting

In class, a variety of sorting algorithms were examined. For parts (a) to (c), choose

the best algorithm from the ones we have examined. Justify your answer (in a

sentence or two) referencing the running time of the algorithms.

a) (2 marks) You are a data analyst and you are trying to put an input of a few

million numbers into order. You are aware that the input is almost sorted and

there may only be a few numbers that you need to move around. Which

sorting algorithm would you use?

b) (2 marks) You are sorting a few million numbers in a cloud environment in

which read operations take O(1), constant time, and write operations take

O(n), linear time. Which sorting algorithm would you use?

c) (2 marks) You are part of an analysis team that is operating under very strict

and sensitive timelines, so you want to be able to guarantee that the sorting

algorithm that you use runs in O(n log n) in the absolute worst case. What

sorting algorithm would you use?

Question 6. (4 Marks) Sorting

An array is supplied to the partition routine used by quicksort(). After the very first call

to partition() the array contains the following numbers.:

what number(s) in the array can be the pivot? Note the actual number(s) is wanted,

not the position of the number(s) in the array.

Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 7 of 12

Question 7. (10 Marks) Binary Search Trees

This question makes reference to the following Binary Search Tree:

a) (1 mark) What is the height of this tree?

b) (1 mark) Which node is the root

node?

c) (1 mark) What is the depth of node

19?

d. (1 mark) Which nodes are leaves?

e. (2 marks) When building a tree using a sequence of insertions (i.e. the insertion

technique from class) which one of the following is the appropriate sequence of

insertions to build the tree?

I. 23, 60, 17, 80, 40, 19, 10, 90, 21, 11, 6, 81

II. 23, 81, 90, 80, 40, 60, 21, 19, 11, 6, 10, 17

III. 23, 17, 10, 6, 11, 19, 21, 40, 60, 80, 90, 81

IV. 6, 11, 10, 17, 21, 19, 23, 40, 60, 81, 90, 80

V. 6, 10, 11, 17, 19, 21, 23, 40, 60, 80, 81, 90

VI. 6, 11, 10, 21, 19, 17, 40, 81, 90, 80, 60, 23

VII. 81, 6, 11, 21, 90, 10, 19, 40, 80, 17, 60, 23

VIII. 81, 90, 21, 11, 6, 80, 40, 19, 10, 60, 17, 23

Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 9 of 12

Question 8. (6 Marks) Hashing

Consider a hash table of size 7 with hash function h(k) = k mod 7. Draw the contents

of the table after inserting, in the given order, the following values into the table: 9, 16,

27, 62, and 2.

a) (3 marks) When chaining is used to resolve collisions

Question 9. (4 Marks) Hashing

For good hashing performance, we need to have a good hash function. Can we create an ideal

hash function that would work well in all problems independent of the dataset? If so explain

how, and if not explain why?

Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 10 of 12

Module 3

Question 10. (10Marks) Indexing

For this question, we will be using the schema that was introduced in Question two:

• Student (sID, sName, age, GPA, sizeHS)

• College (cName, state, enrolment)

• Apply (sID, cName, major, decision)

Suppose you know that the following queries are the most common queries in the

database and that all are roughly equivalent in frequency and importance:

1. Find name, age, GPA, and the high school size of student based on their SID

2. Find name of colleges in each state

3. Find students IDS with GPA higher than 3.6

4. Find SID of students that have applied to each college

These queries occur much more frequently than updates, so you should build whatever

indexes you need to speed up these queries. However, you should not build any un-

necessary indexes, as updates will occur (and would be slowed down by unnecessary

indexes). Given this information, decide which attributes should be indexed and whether

each index should be a clustered index or an unclustered index. Assume that both B+

trees and hashed indexes are supported by the DBMS and that both single- and multiple-

attribute index search keys are permitted.

Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 11 of 12

Question 11. (10 Marks) Query Optimization

For this question, we will be using the schema that was introduced in Question two:

• Student (sID, sName, age, GPA, sizeHS)

• College (cName, state, enrolment)

• Apply (sID, cName, major, decision)

Consider the following query

Select s.sID

from Student s, College c, Apply a

Where s.sID = a.sID and c.cName=a.cName and

GPA>3.5 and c.cName='stanford' and decision='Y'

a) (5 marks) Draw the initial query tree for this query and explain briefly why this

plan is considered inefficient.

Semester One Final Examinations, 2017 INFS7901 Database Principles

Page 12 of 12

b) (5 marks) Transform the initial query into an equivalent final query tree that is

efficient to execute.

END OF EXAMINATION