
CSE 6242 / CX 4242: Data and Visual Analytics

Spring 2025

HW 2: Tableau, D3 Graphs, and Visualization

"Visualization gives you answers to questions you didn't know you had" - Ben Shneiderman

Download the HW2 Skeleton before you begin

Homework Overview

Data visualization is an integral part of exploratory analysis and communicating key insights. This homework focuses on exploring and creating data visualizations using two of the most popular tools in the field: Tableau and D3.js. Questions 1-4 use data from BoardGameGeek, featuring games' ratings, popularity, and metadata, to showcase the uses and strengths of different types of visualizations. Question 5 shifts focus to a different dataset on wildlife trafficking incidents, offering an opportunity to apply visualization techniques to address a global issue.

Below are some terms you will often see in the questions:

•    Rating – a value from 0 to 10 given to each game. BoardGameGeek calculates a game's overall rating in different ways, including Average and Bayes, so make sure you are using the correct rating called for in a question. A higher rating is better than a lower rating.

•    Rank – the overall rank of a boardgame from 1 to n, with ranks closer to 1 being better and n being the total number of games. The rank may be for all games or for a subgroup of games such as abstract games or family games.

The maximum possible score for this homework is 100 points. Students have the option to complete any 90 points’ worth of work to receive 100% (equivalent to 15 course total grade points) for this assignment.  They can earn more than 100% if they submit additional work. For example, a student scoring 100 points will receive 111% for the assignment (equivalent to 16.67 course grade points, as shown on Canvas).
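The scoring arithmetic above can be checked with a short sketch. The function and constant names are hypothetical (they are not part of the skeleton); the only inputs taken from the handout are that 90 points count as 100% and that 100% corresponds to 15 course grade points.

```javascript
// Hypothetical helper, not part of the HW2 skeleton.
// Assumes the rule stated above: 90 points = 100% = 15 course grade points.
const FULL_CREDIT_POINTS = 90;
const COURSE_POINTS_AT_100_PERCENT = 15;

function assignmentGrade(pointsEarned) {
  const fraction = pointsEarned / FULL_CREDIT_POINTS;
  return {
    // The handout quotes whole percentages (e.g., 111%).
    percent: Math.round(fraction * 100),
    // Course grade points rounded to 2 decimal places (e.g., 16.67).
    coursePoints: Math.round(fraction * COURSE_POINTS_AT_100_PERCENT * 100) / 100,
  };
}

console.log(assignmentGrade(90));  // { percent: 100, coursePoints: 15 }
console.log(assignmentGrade(100)); // { percent: 111, coursePoints: 16.67 }
```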

Important Notes

A.  Submit your work by the due date on the course schedule.

a.   Every assignment has a generous 48-hour grace period, allowing students to address unexpected minor issues without facing penalties. You may use it without asking.

b.   Before the grace period expires, you may resubmit as many times as needed.

c.   TA assistance is not guaranteed during the grace period.

d.   Submissions during the grace period will display as "late" but will not incur a penalty.

e.   We will not accept any submissions made after the grace period ends.

B.  Always use the most up-to-date assignment (version number at the bottom right of this document). The latest version will be listed in Ed Discussion.

C.  You may discuss ideas with other students at the "whiteboard" level (e.g., how cross-validation works, use HashMap instead of array) and review any relevant materials online. However, each student must write up and submit the student's own answers.

D.  All incidents of suspected dishonesty, plagiarism, or violations of the Georgia Tech Honor Code will be subject to the institute's Academic Integrity procedures, directly handled by the Office of Student Integrity (OSI). Consequences can be severe, e.g., academic probation or dismissal, a 0 grade for assignments concerned, and prohibition from withdrawing from the class.

Submission Notes

A.  All questions are graded on the Gradescope platform, accessible through Canvas.

a.   Question 1 will be manually graded after the final HW due date and Grace Period.

b.  Questions 2-5 are auto-graded at the time of submission.

B.  We will not accept submissions anywhere outside of Gradescope.

C.  Submit all required files as specified in each question. Make sure they are named correctly.

D.  You may upload your code periodically to Gradescope to obtain feedback on your code. There are no hidden test cases. The score you see on Gradescope is what you will receive.

E.  You must not use Gradescope as the primary way to test your code. It provides only a few test cases and error messages may not be as informative as local debuggers. Iteratively develop and test your code locally, write more test cases, and follow good coding practices. Use Gradescope mainly as a "final" check.

F.  Gradescope cannot run code that contains syntax errors. If you get the “The autograder failed to execute correctly” error, verify:

a.   The code is free of syntax errors (by running locally)

b.  All methods have been implemented

c.   The correct file was submitted with the correct name

d.   No extra packages or files were imported

G.  When many students use Gradescope simultaneously, it may slow down or fail.  It can become even slower as the deadline approaches. You are responsible for submitting your work on time.

H.  Each submission and its score will be recorded and saved by Gradescope. By default, your last submission is used for grading. To use a different submission, you MUST “activate” it (click the “Submission History” button at the bottom toolbar, then “Activate”).

Do I need to use the specific version of the software listed?

Under each question, you will see a set of technologies with specific versions - this is what is installed on the autograder and what it will run your code with. Thus, installing those specific versions on your computer to complete the question is highly recommended. You may be able to complete the question with different versions installed locally, but you are responsible for determining the compatibility of your code. We will not award points for code that works locally but not on the autograder.

Q1 [25 points] Designing a good table. Visualizing data with Tableau.

Goal

Design a table, a grouped bar chart, and a stacked bar chart with filters in Tableau.

Technology

Tableau Desktop

Deliverables

Gradescope: After selecting HW2 - Q1, click Submit Images. You will be taken to a list of questions for your assignment. Click Select Images and submit the following four PNG images under the corresponding questions:

● table.png: Image/screenshot of the table in Q1.1

● grouped_barchart.png: Image of the chart in Q1.2

● stacked_barchart_1.png: Image of the chart in Q1.3 after filtering data for Max.Players = 2

● stacked_barchart_2.png: Image of the chart in Q1.3 after filtering data for Max.Players = 4

Q1 will be manually graded after the grace period.

Setting Up Tableau

Install and activate Tableau Desktop by following “HW2 Instructions” on Canvas. The product activation key is for your use in this course only. Do not share the key with anyone. If you already have Tableau Desktop installed on your machine, you may use this key to reactivate it.

If you do not have access to a Mac or Windows machine, use the 14-day trial version of Tableau Online:

1. Visit https://www.tableau.com/trial/tableau-online

2. Enter your information (name, email, GT details, etc.)

3. You will then receive an email to access your Tableau Online site

4. Go to your site and create a workbook

If neither of the above methods works, use Tableau for Students. Follow the link and select "Get Tableau For Free". By providing a valid Georgia Tech email, you should receive an activation key that gives you one year of Tableau Desktop at no cost.

Connecting to Data

1.   Using Tableau is optional for Q1.1. Complete all other parts using a single Tableau workbook.

2.   Q1 will require connecting Tableau to two different data sources. You can connect to multiple data sources within one workbook by following the directions here.

3.   For Q1.1 and Q1.2:

a.   Open Tableau and connect to a data source. Choose To a File – Text file. Select the popular_board_game.csv file from the skeleton.

b.   Click on the graph area at the bottom section next to "Data Source" to create worksheets.

4.   For Q1.3:

a.  You will need a data.world account to access the data for Q1.3. Add a new data source by clicking on Data – New Data Source.

b.  When connecting to a data source, choose To a Server – Web Data Connector.

c.   Enter this URL to connect to the data.world data set on board games. You may be prompted to log in to data.world and authorize Tableau. If you haven’t used data.world before, you will be required to create an account by clicking “Join Now”. Do not edit the provided SQL query.

NOTE: If you cannot connect to data.world, you can use the provided csv files for Q1 in the skeleton. The provided csv files are identical to those hosted online and can be loaded directly into Tableau.

d.   Click the graph area at the bottom section to create another worksheet, and Tableau will automatically create a data extract.

Table and Chart Design

1.   [5 points] Good table design. Visualize the data contained in popular_board_game.csv as a data table (known as a text table in Tableau). In this part (Q1.1), you can use any tool (e.g., Excel, HTML, Pandas, Tableau) to create the table.

We are interested in grouping popular games into "support solo" (min player = 1) and "not support solo" (min player > 1). Your table should clearly communicate information about these two groups simultaneously. For each group (Solo Supported, Solo Not Supported), show:

a.   Total number of games in each category (fighting, economic, ...)

b.   In each category, the game with the highest number of ratings. If more than one game has the same (highest) number of ratings, pick the game you prefer. NOTE: Level of Detail expressions may be useful if you use Tableau.

c.   Average rating of games in each category (use simple average), rounded to 2 decimal places.

d.  Average playtime of games in each category, rounded to 2 decimal places.

e.   In the bottom left corner below your table, include your GT username. (In Tableau, this can be done by including a caption when exporting an image of a worksheet or by adding a text box to a dashboard. If you use Tableau, refer to the tutorial here.)

f.   Save the table as table.png. (If you use Tableau, go to Worksheet/Dashboard → Export → Image.) NOTE: Do not take screenshots in Tableau since your image must have high resolution. You can take a screenshot if you use HTML, Pandas, etc.

Your learning goal here is to practice good table design, which is not strongly dependent on the tool that you use. Thus, we do not require that you use Tableau in this part. You may decide the most meaningful column names, the number of columns, and the column order. You are not limited to only the techniques described in the lecture. For OMS students, the lecture video on this topic is Week 4 - Fixing Common Visualization Issues - Fixing Bar Charts, Line Charts. For campus students, review lecture slides 42 and 43.

2.   [10 points] Grouped bar chart. Visualize popular_board_game.csv as a grouped bar chart in Tableau. Your chart should display game category (e.g., fighting, economic,...) along the horizontal axis and game count along the vertical axis. Show game playtime (e.g., <=30, (30, 60]) for each category. NOTE: Do not differentiate between “support solo” and “non-support solo” for this question.

a.   Design a vertically grouped bar chart. For each category, show the game count for each playtime.

b.   Include clearly labeled axes, a clear chart title, and a legend.

c.   In the bottom left corner of your image, include your GT username. NOTE: In Tableau, this can be done by including a caption when exporting an image of a worksheet or by adding a text box to a dashboard. Refer to the tutorial here.

d.   Save the chart as grouped_barchart.png (go to Worksheet/Dashboard → Export → Image). NOTE: Do not take screenshots in Tableau since your image must have high resolution.

The main goal here is for you to become familiar with Tableau. Thus, we kept this open-ended, so you can practice making design decisions. We will accept most designs. We show one possible design in Figure 1.2, based on the tutorial from Tableau.

3.   [10 points] Stacked bar chart. Visualize the data.world dataset (or games_detailed_info_filtered.csv if using the local files in the skeleton) as a stacked bar chart. Showcase the count of games in different categories and the relationship between game categories, their mechanics, and max player size.

a.   Create a Worksheet with a stacked bar chart that shows game counts for each playing mechanic (sub-bars) for each game category. NOTE: This data contains duplicate rows, as each row represents a distinct game. Do not remove duplicate rows from the data.

b.   Display game counts along the vertical axis and category along the horizontal axis.

c.   Include clear axis labels, a clear chart title, and a legend.

d.   Create a Dashboard using the worksheet you created.


e.  Add a filter for the number of 'Max.Players' allowed in each game. Update the chart using this filter to generate the following chart images (Refer to the tutorial here on how to add a filter in a dashboard. Make sure to add 'Max.Players' in the filter shelf in the Worksheet first, like this):

i.      Select "2 Players" only in the filter. Save the resulting chart as 'stacked_barchart_1.png'

ii.      Select "4 Players" only in the filter. Save the resulting chart as 'stacked_barchart_2.png'

iii.      Both images must include your GT username in the bottom left. This can be added using a text box. Refer to the tutorial here: https://youtu.be/fRwQenvBJ6I

iv.      In each image, the filter must be visible. If you are using Tableau Online, you may need to add your worksheet containing the chart to a dashboard and then download an image of the dashboard that contains both the filter and the chart.

Note: To save a dashboard image, go to Dashboard - Export Image. Do not submit screenshots. An example of a possible design is shown in Figure 1.3.

Optional Reading: The effectiveness of stacked bar charts is often debated; sometimes, they can be confusing, difficult to understand, and may make data series comparisons challenging.

Figure 1.2: Example of a grouped bar chart. Your chart may appear different and can earn full credit if it meets all the stated requirements. Your submitted image should include your GT username in the bottom left.


Figure 1.3: Example of a stacked bar chart after selecting "4 Players" in Max.Players filter. Your chart may appear different and can earn full credit if it meets all the stated requirements. Your submitted image should include your GT username in the bottom left.
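For Q1.1, if you build the table outside Tableau (for example, in an HTML page), the per-group aggregates can be computed along the following lines. This is only a sketch: the field names (name, category, minPlayers, numRatings, rating, playtime) and the sample rows are made up for illustration, and the actual columns in popular_board_game.csv may be named differently.

```javascript
// Hypothetical sample rows; real column names in popular_board_game.csv may differ.
const games = [
  { name: "Alpha", category: "fighting", minPlayers: 1, numRatings: 1200, rating: 7.0, playtime: 30 },
  { name: "Beta", category: "fighting", minPlayers: 1, numRatings: 3400, rating: 8.0, playtime: 60 },
  { name: "Gamma", category: "fighting", minPlayers: 2, numRatings: 900, rating: 6.5, playtime: 45 },
  { name: "Delta", category: "economic", minPlayers: 2, numRatings: 2100, rating: 7.25, playtime: 90 },
];

// Group by (solo support, category) and compute the four values Q1.1 asks for.
function summarize(rows) {
  const groups = new Map();
  for (const g of rows) {
    const solo = g.minPlayers === 1 ? "Solo Supported" : "Solo Not Supported";
    const key = `${solo}|${g.category}`;
    if (!groups.has(key)) {
      groups.set(key, { solo, category: g.category, count: 0, top: g, ratingSum: 0, playtimeSum: 0 });
    }
    const s = groups.get(key);
    s.count += 1;
    s.ratingSum += g.rating;
    s.playtimeSum += g.playtime;
    // On ties, the first game seen is kept (the handout lets you pick either).
    if (g.numRatings > s.top.numRatings) s.top = g;
  }
  return [...groups.values()].map((s) => ({
    solo: s.solo,
    category: s.category,
    games: s.count,
    mostRated: s.top.name,
    avgRating: (s.ratingSum / s.count).toFixed(2),     // simple average, 2 decimals
    avgPlaytime: (s.playtimeSum / s.count).toFixed(2), // 2 decimals
  }));
}

console.table(summarize(games));
```

The averages are returned as strings from toFixed(2) so that trailing zeros (e.g., 7.50) survive when the table is rendered.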

Important Points about Developing with D3 in Questions 2-5

1.  We highly recommend that you use the latest Chrome browser to complete this question. We will grade your work using Chrome v131 (or higher).

2.  You will work with version 5 of D3 in this homework. You must NOT use any D3 libraries (d3*.js) other than the ones provided in the lib folder.

3.   For Q3–5, your D3 visualization MUST produce a DOM structure as specified at the end of each question. Not only does the structure help guide your D3 code design, but it also enables your code to be auto-graded (the auto-grader identifies and evaluates relevant elements in the rendered HTML). We highly recommend you review the specified DOM structure before starting to code.

4.  You need to set up a local HTTP server in the root (hw2-skeleton) folder to run your D3 visualizations, as discussed in the D3 lecture (OMS students: the video "Week 5 - Data Visualization for the Web (D3) - Prerequisites: JavaScript and SVG". Campus students: see lecture PDF.). The easiest way is to use http.server for Python 3.x (for more details, see link).

5.  All d3*.js files in the lib folder must be referenced using relative paths, e.g., "../lib/<filename>" in your html files. For example, if the file "Q2/submission.html" uses d3, its header should contain:

<script type="text/javascript" src="../lib/d3.v5.min.js"></script>

It is incorrect to use an absolute path such as:

<script type="text/javascript" src="C:/Users/polo/hw2-skeleton/lib/d3.v5.min.js"></script>

6.   For questions that require reading from a dataset, use a relative path to read in the dataset file. For example, if a question reads data from earthquake.csv, the path should simply be "earthquake.csv" and NOT an absolute path such as "C:/Users/polo/hw2-skeleton/Q/earthquake.csv".

7.  You can and are encouraged to decouple the style, functionality and markup in the code for each question. That is, you can use separate files for CSS, JavaScript and html.

Q2 [15 points] Force-directed graph layout

Goal

Create a network graph that shows relationships between games in D3. Use interactive features like pinning nodes to give the viewer some control over the visualization.

Technology

D3 Version 5 (included in the lib folder)

Chrome v131.0.0 (or higher): the browser for grading your code

Python http server (for local testing)

Allowed Libraries

The D3 library is provided to you in the lib folder. You must NOT use any D3 libraries (d3*.js) other than the ones provided. On Gradescope, these libraries are provided for you in the auto-grading environment.

Deliverables

[Gradescope] Q2.(html/js/css): The HTML, JavaScript, CSS to render the graph. Do not include the D3 libraries or board_games.csv dataset.

You will experiment with many aspects of D3 for graph visualization. To help you get started, we have provided the Q2.html file (in the Q2 folder) and an undirected graph dataset of boardgames, board_games.csv (in the Q2 folder). The dataset for this question was inspired by a Reddit post about visualizing boardgames as a network, in which the author calculates the similarity between boardgames based on categories and game mechanics; the edge value between each pair of boardgames (nodes) is the total weighted similarity index. This dataset has been modified and simplified for this question and does not fully represent actual data found from the post. The provided Q2.html file will display a graph (network) in a web browser. The goal of this question is for you to experiment with the visual styling of this graph to make a more meaningful representation of the data. Here is a helpful resource (about graph layout) for this question.

Note: You can submit a single Q2.html that contains all the css and js components; or you can split Q2.html into Q2.html, Q2.css, and Q2.js.

1.   [2 points] Adding node labels: Modify Q2.html to show the node label (the node name, e.g., the source) at the top right of each node in bold. If a node is dragged, its label must move with it.

2.   [3 points] Styling edges: Style the edges based on the "value" field in the links array:

1.   If the value of the edge is equal to 0 (similar), the edge should be gray, thick, and solid (The dashed line with zero gap is not considered as solid).

2.   If the value of the edge is equal to 1 (not similar), the edge should be green, thin, and dashed.

3.   [3 points] Scaling nodes:

a.   [1.5 points] Scale the radius of each node in the graph based on the degree of the node (you may try linear or squared scale, but you are not limited to these choices).

Note: Regardless of which scale you decide to use, you should avoid extreme node sizes, which will likely lead to low-quality visualization (e.g., nodes that are mere points, barely visible, or of huge sizes with overlaps).

Note: D3 v5 does not support d.weight (which was the typical approach to obtain node degree in D3 v3). You may need to calculate node degrees yourself. An example of a relevant approach is here.

b.   [1.5 points] The degree of each node should be represented by varying colors. Pick a meaningful color scheme (hint: color gradients). There should be at least 3 color gradations and it must be visually evident that the nodes with a higher degree use darker/deeper colors and the nodes with lower degrees use lighter colors. You can find example color gradients at Color Brewer.

4.   [6 points] Pinning nodes:

a.   [2 points] Modify the code so that dragging a node will fix (i.e., "pin") the node's position such that it will not be modified by the graph layout algorithm (Note: pinned nodes can be further dragged around by the user. Additionally, pinning a node should not affect the free movement of the other nodes). Node pinning is an effective interaction technique to help users spatially organize nodes during graph exploration. The D3 API for pinning nodes has evolved over time. We recommend reading this post when you work on this sub-question.

b.   [1 point] Mark pinned nodes to visually distinguish them from unpinned nodes, i.e., show pinned nodes in a different color.

c.   [3 points] Double clicking a pinned node should unpin (unfreeze) its position and unmark it. When a node is no longer pinned, it should move freely again.

IMPORTANT:

1.   To pass the autograder consistently for part 1 (which tests if a dragged node becomes pinned and retains its position), you may need to increase the radius of highly weighted nodes and reduce their label sizes, so that the nodes can be more easily detected by the autograder's webdriver mouse cursor.

2.   To avoid timeout errors on Gradescope, complete the double click function in part 3 before submitting.

3.   If you receive timeout messages for all parts and your code works locally on your computer, verify that you are indeed using the appropriate ids provided in the "add the nodes" section in the skeleton code.

4.   D3 v5 does not support the d.fixed property (it was deprecated after D3 v3). For our purposes, it is used as a Boolean value to indicate whether a node has been pinned or not.

5.   [1 point] Add GT username: Add your Georgia Tech username (usually includes a mix of letters and numbers, e.g., gburdell3) to the top right corner of the force-directed graph (see example image). The GT username must be a <text> element having the id: "credit"
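The edge styling in part 2 and the degree-based scaling in part 3 both reduce to small pure functions, sketched below. Everything here is an assumption made for illustration: the sample links, the function names, the specific widths and dash pattern, the radius range, and the choice of the ColorBrewer 3-class Blues ramp. The skeleton may structure this differently, and any scale that avoids extreme sizes should be acceptable.

```javascript
// Made-up links in the shape described above: source, target, and a value of 0 or 1.
const links = [
  { source: "Catan", target: "Carcassonne", value: 0 },
  { source: "Catan", target: "Pandemic", value: 1 },
  { source: "Catan", target: "Azul", value: 0 },
  { source: "Azul", target: "Carcassonne", value: 1 },
];

// Part 2: value 0 -> gray, thick, solid; value 1 -> green, thin, dashed.
// Number() handles CSV values that arrive as strings. Solid edges get no
// dash array at all, since a zero-gap dash does not count as solid.
function edgeStyle(value) {
  return Number(value) === 0
    ? { stroke: "gray", strokeWidth: 4, dasharray: null }
    : { stroke: "green", strokeWidth: 1, dasharray: "5,5" };
}

// Part 3: count the edges touching each node. Run this on the raw links,
// before the force simulation replaces source/target names with node objects.
function nodeDegrees(rawLinks) {
  const degree = new Map();
  for (const { source, target } of rawLinks) {
    degree.set(source, (degree.get(source) || 0) + 1);
    degree.set(target, (degree.get(target) || 0) + 1);
  }
  return degree;
}

// Linear radius scale bounded by [minR, maxR] to avoid extreme node sizes.
function radiusFor(degree, maxDegree, minR = 5, maxR = 20) {
  if (maxDegree <= 1) return minR;
  return minR + ((degree - 1) / (maxDegree - 1)) * (maxR - minR);
}

// Three gradations, light to dark (ColorBrewer 3-class Blues).
const BLUES = ["#deebf7", "#9ecae1", "#3182bd"];
function colorFor(degree, maxDegree) {
  const step = Math.floor((degree / (maxDegree + 1)) * BLUES.length);
  return BLUES[Math.min(BLUES.length - 1, step)];
}

const degrees = nodeDegrees(links);
const maxDegree = Math.max(...degrees.values());
for (const [name, deg] of degrees) {
  console.log(name, deg, radiusFor(deg, maxDegree), colorFor(deg, maxDegree));
}
```

In a D3 selection, these values would typically be applied with calls such as .attr("r", ...), .style("fill", ...), and .style("stroke-dasharray", ...); the exact wiring depends on the skeleton code.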

Figure 2: Example of Visualization with pinned node (yellow). Your chart may appear different and can earn full credit if it meets all the stated requirements.
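The pinning behavior in part 4 can be sketched as plain functions on a node datum. In d3-force (D3 v4 and later), a node whose fx and fy properties are set is held at that position by the simulation, and setting them back to null releases it. The function names, the fixed flag as used here, and the yellow fill are assumptions for illustration, not the skeleton's actual code.

```javascript
// Pin a node at (x, y): the simulation keeps a node at (fx, fy) while they are set.
// `fixed` is only our own Boolean flag for styling; D3 v5 does not read it.
function pinNode(d, x, y) {
  d.fx = x;
  d.fy = y;
  d.fixed = true;
}

// Unpin a node: clearing fx/fy lets the layout move it freely again.
function unpinNode(d) {
  d.fx = null;
  d.fy = null;
  d.fixed = false;
}

// Pinned nodes get a distinct fill; any clearly different color works.
function fillFor(d, defaultColor) {
  return d.fixed ? "yellow" : defaultColor;
}

// Minimal walk-through with a stand-in node object.
const node = { name: "Catan", x: 100, y: 80 };
pinNode(node, 250, 120);
console.log(node.fx, node.fy, fillFor(node, "steelblue"));
unpinNode(node);
console.log(node.fx, node.fy, fillFor(node, "steelblue"));
```

In a D3 v5 page, pinNode would typically be called from the drag handlers with d3.event.x and d3.event.y, and unpinNode from a "dblclick" listener. Many force-layout examples clear fx and fy when a drag ends; for pinning, that reset is the part to leave out.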



