Project: IS3230 Java Programming for Business
1
Project
Individual Assignment
Background
Axis-aligned Boxes Input
Query Points Input
OPEQ Algorithm
OPEQ Output Format
Visualization of 2D Inputs
Application Flow
Application Outputs
Submission
Group Project
Group Formation
Project Proposal
Java Code
Presentation Slide
Presentation
You can SELECT the course project to be
EITHER an individual assignment that is
designated by the course instructor OR a
group project that is designed by your group.
For individual assignment, the topic is
assigned by the course instructor. You are
only required to submit the java source code.
Presentation is NOT required.
For group project, each group is required to
submit the group members information,
project proposal, java source code and
presentation slide, and also give a
presentation for the project.
Individual Assignment [top]
To complete this assignment, you may need the knowledge of Java Fundamentals, Decision
structures, Loops, File I/O, Arrays, Methods, Classes, Data structure classes, Exception handling
and Graphical User Interface. Methods, Classes, Data structure classes and exception handling
are not compulsory for this individual assignment. However, with applying these knowledge, the
assignment can be completed in an easier, cleaner and higher scalability ways.
For reference only, this assignment can be completed in about 200 to 300 lines of code (the sum
of the number of lines in all Java source files). This is NOT a limitation. Just to let you know the
scale of this assignment.
Background: [top]
Data management is an important topic in Information System. One of the famous problems in
data management is called Orthogonal Point Enclosure Query (OPEQ). This individual
assignment requires you to build a software for solving the OPEQ problem in Java. The
OPEQ problem definition is as below:
Project: IS3230 Java Programming for Business
2
Given a set S of n axis-aligned boxes in a d-dimensional space, preprocess these n boxes
into a data structure so that the boxes in S containing a given query point q can be reported
efficiently.
Below is an example in 2D. In this example, there are 5 axis-aligned boxes in the 2D plane,
colored in different colors.
Suppose there is a query point in the position as shown in the black dot below.
The blue and yellow boxes should be output because these two boxes contains the query point.
Why this problem is important in data management?
Consider the following couple matching website example.
A couple matching website with n members. Each member submitted his/her age (e.g. 24), salary
(e.g. $32,000), preferred range of partner's age (e.g. 20 - 30) and preferred range of partner's
salary (e.g. $20,000 - $35,000). When the website wants to know a member with age 24 and
salary $32,000 is suitable to which members, the website needs to get all members preferences
that the preferred age range contains 24 (e.g. 20 - 30) and preferred salary range contains
$32,000 (e.g. $20,000 - $35,000). Imagine the x-axis represents salary and y-axis represents age.
The age and salary preferences of each member is then represented as an axis-aligned rectangle
in 2D. See the image below as an example. The age and salary of the member to be matched is
the query point. This is exactly the Orthogonal Point Enclosure Problem in 2D.
Project: IS3230 Java Programming for Business
3
Axis-Aligned Boxes Input: [top]
The axis-aligned boxes data that has to be input to the OPEQ problem is in a plain-text comma
separated value (.csv) file, named boxes.csv. Each line of the csv file represents an axis-aligned
box. For any line of the file, the first value represents the lower boundary of the box in the 1st
dimension, the second value represents the upper boundary of the box in the 1st dimension, the
third value represents the lower boundary of the box in the 2nd dimension, and so. In general, the
i-th value of the j-th line represents
the lower boundary of box j in the ceiling(i/2)-th dimension if i is odd, OR
the upper boundary of box j in the (i/2)-th dimension if i is even.
Below is an example (The file content for testing your submission is different with the
example below but in the same format):
20,30,20000,35000
18,22,15000.4,28000.5
30,45,18234.3,25835.7
In the above example, it represents 3 (because 3 lines) boxes in 2D (because every line has 4
values).
The lower and upper boundaries of box 1 in the 1st and 2nd dimensions are 20 and 30
respectively.
The lower and upper boundaries of box 1 in the 1st and 2nd dimensions are 20000 and 35000
respectively.
The lower and upper boundaries of box 2 in the 1st and 2nd dimensions are 18 and 22
respectively.
The lower and upper boundaries of box 2 in the 1st and 2nd dimensions are 15000.4 and 28000.5
respectively.
The lower and upper boundaries of box 3 in the 1st and 2nd dimensions are 30 and 45
respectively.
The lower and upper boundaries of box 3 in the 1st and 2nd dimensions are 18234.3 and 25835.7
respectively.
Project: IS3230 Java Programming for Business
4
You may assume the file contains no space, all values are numbers, number of values is even and
the same for every line, and the file is in correct csv format.
Query Points Input: [top]
The query points data that has to be input to the program for solving the OPEQ problem is in a
plain-text comma separated value (.csv) file, named query-points.csv. Each line of the csv file
represents a query point. For any line of the file, the first value represents the location of the point
in the 1st dimension, the second value represents the location of the query point in the 2nd
dimension, and so on. In general, the i-th value of the j-th line represents the location of the j-th
query point in the i-th dimension.
Below is an example (The file content for testing your submission is different with the
example below but in the same format):
17,21000.3
20,25000.5
45,20000.4
In the above example, it represents 3 (because 3 lines) query points in 2D (because every line
has 2 values).
The location of the first point in the 1st and 2nd dimensions are 17 and 21000.3 respectively.
The location of the second point in the 1st and 2nd dimensions are 20 and 25000.5 respectively.
The location of the third point in the 1st and 2nd dimensions are 45 and 20000.4 respectively.
You may assume the file contains no space, all values are numbers, number of values is the same
for every line, and the file is in correct csv format.
OPEQ Algorithm: [top]
You can use the simplest algorithm, called brute force algorithm, to solve the OPEQ problem. The
brute force algorithm simply checks all boxes one by one for every query point. If a box contains
the query point that you are current checking for, then put this box to some memory storage for
output later.
You are also welcome to solve the OPEQ problem by any other algorithms. Basically, no algorithm
will slower than the brute force algorithm. However, in case you implemented an algorithm that is
far slower than brute force, then some mark deduction may be given.
OPEQ Output Format: [top]
Suppose the boxes.csv and query-points.csv contains the content below as an example (Your
program may be tested with other content in the two files):
boxes.csv content
10.54,21.35,15.25,17.48,3.15,6.27
12.4,23.5,11.37,18.88,4.5,8.7
20.1,25,18,21,2.5,7.6
31.4,35.2,21.7,80.21,54,67
Project: IS3230 Java Programming for Business
5
33.5,36.5,54.5,70.8,45,85
21,24,17,20,3,8
query-points.csv content
11.3,16,4
13.2,16.8,6.1
22.1,17.7,4
21.6,18.5,5.1
After solving the OPEQ problem, the program should output the following content to a file named
results.txt.
Query Point 1:
1
Query Point 2:
1
2
Query Point 3:
6
Query Point 4:
2
3
6
The rule is that, for each query point, we write Query Point i to the file where i represents that this
is the i-th query point in the query-points.csv file (start from 1 and from top to bottom). After that,
we write the box ids that contains this query point, one line per box id, where the box id is the line
number of this box in the boxes.csv file. In the results.txt file, the box ids for each query point
has to be sorted in ascending order.
So, consider the given example above. It means the following.
The first box in boxes.csv (at line 1) is the only box that contains the first query point (at line 1 of
query-points.csv).
The first and second box in boxes.csv (at line 1 and 2) are the only two boxes that contains the
second query point (at line 2 of query-points.csv).
The sixth box in boxes.csv (at line 6) is the only box that contains the third query point (at line 3
of query-points.csv).
The second, third and sixth boxes in boxes.csv (at line 2, 3 and 6) are the boxes that contains the
fourth query point (at line 4 of query-points.csv).
Your program has to be able to output the results in this format after solving the OPEQ problem.
Visualization of 2D Inputs: [top]
If the input is not in 2-dimensional space, then prompt the user that Visualization only available
for 2D inputs. If the input is in 2-dimensional space, then a GUI window is opened that draws the
boxes and query points.
The window size has to be 1000px width and 1000px height excluding the title bar and window
boundaries, i.e. the actual window size is over 1000px width and over 1000px height when
including the title bar and window boundaries. The border width of rectangles (representing the
boxes) is 1px. The query point is represented by a square of 4px width and 4px height.
Project: IS3230 Java Programming for Business
6
The inputs has to be scaled and translated so that it fits the window size and centered. More
precisely, let BB denotes the tightest boundary box of the input boxes, i.e. the smallest rectangle
that encloses the boxes. See image below as an example, the black rectangle is the tightest
bounding box.
Suppose the width of BB is larger than the height of BB. Then BB is scaled so that the left (right)
boundary of BB touches the left (right) boundary of the window. The boxes and query points
inside BB are scaled accordingly. BB is also translated so that it is vertically centered, and thus
the boxes and query points inside BB are also translated accordingly. This is the case that BB's
width larger than BB's height. In the case of BB's height larger than BB's width, it is symmetrically
the same. Note again you can assume the query points must inside the tightest bounding box.
Image below shows some rectangles and query points inside the window content pane before
scale and translate. Recall that the black rectangle is the tightest bounding box which is not an
Project: IS3230 Java Programming for Business
7
input box.
In this example, the width of the tightest bounding box is larger than the height. After scale and
translate, it looks like below. Note again, the bounding box is just to illustrate the scale and
Project: IS3230 Java Programming for Business
8
translate process, it should not be displayed (see example in the application outputs section).
Note again you can assume the query points must inside the tightest bounding box.
Application Flow: [top]
The high-level flow of the application is shown as the flowchart below.
Project: IS3230 Java Programming for Business
9
Application Outputs: [top]
When the program runs, the boxes.csv and query-points.csv files are read, and the menu is
printed to screen. Note that you can assume the boxes.csv and query-points.csv files always
exist in the same directory as the application run but you cannot assume where your application is
placed for execution. In other words, if you used correct relative paths for reading the two files,
then you don't need to worry about the file doesn't exist.
Project: IS3230 Java Programming for Business
10
If option 4 is selected, a Goodbye message is printed to screen, and then the program terminates.
If option 3 is selected without selecting option 2 before, an empty file named results.txt is saved,
a file saved successfully message is printed to screen, and then the program terminates.
If the user input is non-integer, then prompt the user that the input has to be integer, display the
menu again, and ask for the user input again.
Project: IS3230 Java Programming for Business
11
Suppose option 1 is selected, but the input boxes and query points are not in 2D, e.g. the
boxes.csv and query-points.csv contains the following contents for 1D.
boxes.csv content
10.54,21.35
12.4,23.5
20.1,25
query-points.csv content
11.3
12.7
The program prompts the user that Visualization only available for 2D inputs, then display the
menu again and ask for the user input again.
Now suppose option 1 is selected with the following 2D content in the boxes.csv and querypoints.csv
files.
boxes.csv content
10.54,21.35,15.25,17.48
12.4,23.5,11.37,18.88
20.1,25,18,21
31.4,35.2,21.7,80.21
33.5,36.5,54.5,70.8
21,24,17,20
query-points.csv content
11.3,16
13.2,16.8
22.1,17.7
21.6,18.5
A GUI window appear for visualizing the boxes and query points (scaled and translated). In the
console window, it displays the menu again and ask for the user input.
Project: IS3230 Java Programming for Business
Note that you can still work with the program in the console window even though the GUI window
is opening. Close the GUI window will NOT terminate the program.
Now suppose the boxes.csv and query-points.csv file contains the following 3D content.
boxes.csv content
10.54,21.35,15.25,17.48,3.15,6.27
12.4,23.5,11.37,18.88,4.5,8.7
20.1,25,18,21,2.5,7.6
31.4,35.2,21.7,80.21,54,67
33.5,36.5,54.5,70.8,45,85
21,24,17,20,3,8
query-points.csv content
Project: IS3230 Java Programming for Business
13
11.3,16,4
13.2,16.8,6.1
22.1,17.7,4
21.6,18.5,5.1
If option 2 is selected, the program solves the OPEQ problem for all query points. The result is
kept in memory for later output. NO OUTPUT TO FILE AT THIS STEP. After solving the OPEQ
problem, then display the menu again and ask for the user input again.
If option 3 is selected after option 2 is selected, the OPEQ result obtained from the option 2 is
saved to the file named results.txt. A file successfully saved message is printed to screen, and
then the program terminates.
Submission: [top]