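This is an assignment from Georgia Tech's Data and Visual Analytics course. This time it uses the D3.js library to draw seven kinds of charts from data in seven different scenarios. The workload was huge; I worked on it on and off for about a week.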

## Q1. Designing a good table. Visualizing data with Tableau

Imagine you are a data scientist working with the United Nations High Commissioner for Refugees (UNHCR) and need to perform the following tasks to aid UNHCR’s understanding of persons of concern.

### Table

Create a table to display the details of the refugees (Total Population) in the year 2005 from the data provided in unhcr_persons_of_concern.csv. You can use any tool (e.g., Excel, HTML) to create the table. Keep suggestions from class in mind when designing your table (see the lecture slides for what to do and what not to do, but you are not limited to the techniques described). Describe your reasons for choosing the techniques you use in explanation.txt in no more than 50 words.

### Tableau

Visualize the demographic attributes (age, sex, country of origin, asylum-seeking country) in the file unhcr_popstats_demographics.csv (in the folder Q1) for any given year in one chart. Tableau is a popular InfoViz tool and the company has provided us with student licenses. Go to this link and select “Get Started”. On the form, enter your Georgia Tech email address for “Business email” and “Georgia Institute of Technology” for “Organization”. The Desktop Key for activation is available in T-Square Resources as “Tableau Desktop Key”. This key is for your use in this course only. Do not share the key with anyone.
Provide a rationale for your design choices in this step in the file explanation.txt in no more than 50 words.

## Q2. Force-directed graph layout

You will experiment with many aspects of D3 for graph visualization. To help you get started, we have provided the graph.html file (in the folder Q2).

Modify the graph.html to show labels to the right of each node in the graph. If a node is dragged, its label must also move with the node. (You are welcome to split graph.html into graph.html, graph.js and graph.css.)

Color the links based on the “value” field in the links array. Assign the following colors:

```
If the value of the edge is >= 1.5 : assign Blue color to the link.
If the value of the edge is < 1.5 : assign Green color to the link.
```
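
The rule above can be sketched as a small plain-JavaScript helper. The name `linkColor` and the sample links are illustrative assumptions, not part of the provided graph.html; only the 1.5 threshold and the two colors come from the assignment text.

```javascript
// Threshold taken from the assignment text; colors are plain CSS color names.
const VALUE_THRESHOLD = 1.5;

// Returns the stroke color for a link object of the form { source, target, value }.
function linkColor(link) {
  return link.value >= VALUE_THRESHOLD ? "blue" : "green";
}

// Hypothetical sample links, not taken from graph.html.
const sampleLinks = [
  { source: "A", target: "B", value: 1.5 },
  { source: "B", target: "C", value: 0.7 },
];
const sampleColors = sampleLinks.map(linkColor);
```

In a D3 script, a function like this could be passed directly to the link selection, for example `.style("stroke", linkColor)`, since D3 calls style functions with each element's datum.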

### Scaling node sizes

• Adjust the radius of each node in the graph based on the degree of the node.
• In explanation.txt, using no more than 40 words, discuss which metric you have used (possible metrics: scaling the radii linearly, scaling the radii by the square root of the degree, etc.) and explain why you think it is a good choice.
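
As one possible approach, the sketch below counts each node's degree from the links array and scales the radius by the square root of the degree. It assumes links reference nodes by name (a hypothetical format; graph.html may use indices instead), and `BASE_RADIUS` is an arbitrary illustrative constant.

```javascript
// Count how many links touch each node.
// Assumes links look like { source: "A", target: "B" } (hypothetical format).
function computeDegrees(links) {
  const degree = new Map();
  for (const { source, target } of links) {
    degree.set(source, (degree.get(source) || 0) + 1);
    degree.set(target, (degree.get(target) || 0) + 1);
  }
  return degree;
}

// Square-root scaling: the circle's area (pi * r^2) then grows linearly
// with the degree, so high-degree nodes do not dominate the picture.
const BASE_RADIUS = 4; // illustrative constant, in pixels
function nodeRadius(degree) {
  return BASE_RADIUS * Math.sqrt(degree);
}

// Hypothetical star graph: A is connected to four other nodes.
const starLinks = ["B", "C", "D", "E"].map((t) => ({ source: "A", target: t }));
const starDegrees = computeDegrees(starLinks);
```

With linear scaling, a node with four times the degree would get sixteen times the area; the square-root variant keeps area proportional to degree, which is one argument that could go into explanation.txt.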

### Pinning nodes (fixing node positions)

• Modify the HTML so that when you double click a node, it pins the node’s position such that it will not be modified by the graph layout algorithm (note: pinned nodes can still be dragged around by the user, but they will remain at their positions otherwise).
• Mark pinned nodes so that they are visually distinguishable from unpinned nodes, e.g., pinned nodes shown with a different color, or border thickness, or visually annotated with a “star” (*), etc.
• Double clicking a pinned node should unpin (unfreeze) its position and unmark it.
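
The pin/unpin behavior can be sketched as a toggle on the node's datum. This version assumes the d3-force convention from D3 v4 and later, where setting `fx`/`fy` fixes a node's position and setting them to `null` releases it; with the D3 v3 force layout the boolean `fixed` property plays that role instead. The `pinned` flag and the `strokeWidth` helper are hypothetical names used only for marking.

```javascript
// Toggle the pinned state of a node datum, intended for a "dblclick" handler.
// Assumes d3-force (v4+) semantics for fx/fy; `pinned` is a hypothetical flag.
function togglePin(node) {
  if (node.pinned) {
    node.fx = null; // release the node back to the layout
    node.fy = null;
    node.pinned = false;
  } else {
    node.fx = node.x; // freeze the node at its current position
    node.fy = node.y;
    node.pinned = true;
  }
  return node;
}

// Visual marking: pinned nodes get a thicker border (widths are arbitrary).
function strokeWidth(node) {
  return node.pinned ? 3 : 1;
}
```

If the drag handlers also write to `fx`/`fy`, the drag-end handler would need to leave them set for pinned nodes, otherwise dragging would silently unpin the node.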

## Q3. Visualizing scatter plots

Use the dataset provided in the file iris.tsv (in the folder Q3) to create a scatterplot.
Features/attributes in the dataset:

1. Sepal length in cm
2. Sepal width in cm
3. Petal length in cm
4. Petal width in cm
5. Class: Iris Setosa, Iris Versicolor, Iris Virginica

### Creating scatter plots

• Create two scatter plots, one for each feature combination specified below. In the scatter plots, visualize the different classes using different symbols (circle for setosa, square for versicolor and triangle for virginica) and add a legend showing how symbols map to the classes.
  • Features 1 and 2
  • Features 3 and 4
• In explanation.txt, using no more than 40 words, discuss which plot is better at separating the classes and why.
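
The class-to-symbol mapping can be kept in one place and reused for both the points and the legend. The sketch below matches on the species name rather than the exact string, because the spelling of the class values in iris.tsv is an assumption here; `symbolFor` and `legendEntries` are hypothetical names.

```javascript
// Map an iris class label to the shape name required by the assignment.
// Matching on a substring avoids depending on the exact spelling in iris.tsv.
function symbolFor(className) {
  const name = className.toLowerCase();
  if (name.includes("setosa")) return "circle";
  if (name.includes("versicolor")) return "square";
  if (name.includes("virginica")) return "triangle";
  throw new Error(`Unknown class: ${className}`);
}

// The same mapping drives the legend, so points and legend cannot disagree.
const legendEntries = ["Iris Setosa", "Iris Versicolor", "Iris Virginica"].map(
  (label) => ({ label, symbol: symbolFor(label) })
);
```

The returned shape names still have to be translated to the symbol types of the D3 version in use, for example `d3.symbolTriangle` in D3 v4 and later or `"triangle-up"` in D3 v3.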

Scatter plots should be placed one after the other in an html page as shown in the reference below. Please note that your design need not be identical to the given reference.
Based on the scatter plot created for features 1 and 2 (Sepal Length vs Sepal Width), create new plots for the following questions:

• Scaling symbol sizes. Set the size of each symbol in the plot to be proportional to the square root of the length parameter. Create a new plot for this part.
• Axis scales in D3. Create two plots for this part to try out two axis scales in D3, one using the square root scale (applied to both axes) and another using the log scale (also applied to both axes). Explain in no more than 40 words which scale works best for this dataset in explanation.txt.
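
To make the two requirements above concrete, here is a hand-rolled sketch of the arithmetic involved. It is a stand-in for illustration, not D3's implementation: in a real solution the scales would come from D3 itself (`d3.scaleSqrt` and `d3.scaleLog` in D3 v4 and later, `d3.scale.sqrt` and `d3.scale.log` in v3). `SIZE_FACTOR`, `symbolSize` and `makeScale` are hypothetical names, and the domains and ranges are arbitrary.

```javascript
// Symbol size proportional to the square root of the length value.
const SIZE_FACTOR = 30; // illustrative constant
function symbolSize(length) {
  return SIZE_FACTOR * Math.sqrt(length);
}

// Build a scale that applies `transform` to the data before mapping it
// linearly from the transformed domain onto the pixel range.
function makeScale(transform, domain, range) {
  const d0 = transform(domain[0]);
  const d1 = transform(domain[1]);
  const [r0, r1] = range;
  return (x) => r0 + ((transform(x) - d0) / (d1 - d0)) * (r1 - r0);
}

// Arbitrary example domains and a 0..100 pixel range.
const sqrtScale = makeScale(Math.sqrt, [1, 9], [0, 100]);
const logScale = makeScale(Math.log, [1, 100], [0, 100]);
```

Both transforms compress large values relative to small ones, the log more strongly than the square root. A log scale also needs a strictly positive domain, which is worth checking before choosing the axis minimum.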

## Q4. Visualizing heat map

Use the dataset provided in hourly_heatmap.json (in the folder Q4) that describes glucose readings over time, and visualize it using D3 heatmaps. To get started, refer to the heatmap example here.

• Plot the glucose readings against the time of the day (Hint: Use the glucose readings as a “z” parameter in the given example)
• Now use the file day_heatmap.json (in the folder Q4) to plot the glucose readings against the day of the week on the heatmap. Use the day names instead of numbers as the tick labels on the axis, e.g., day 1 being Monday.
• A pattern should emerge from the visualizations. Explain the pattern and why it occurs, using no more than 40 words in explanation.txt.

Please note that there will be two heat maps, one for the time-of-day plot (part i) and the other for the day-of-week plot (part ii). Place them one after the other on an HTML page (the one for part i goes first).
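
For the day-of-week tick labels, a small lookup is enough. The sketch below assumes the day numbers in day_heatmap.json run from 1 to 7 with 1 being Monday, as the text states; `dayLabel` is a hypothetical helper name.

```javascript
// Day names indexed so that day 1 is Monday, as stated in the assignment.
const DAY_NAMES = [
  "Monday", "Tuesday", "Wednesday", "Thursday",
  "Friday", "Saturday", "Sunday",
];

// Convert a 1-based day number into its name; reject anything outside 1..7.
function dayLabel(dayNumber) {
  const name = DAY_NAMES[dayNumber - 1];
  if (name === undefined) {
    throw new RangeError(`Day number out of range: ${dayNumber}`);
  }
  return name;
}
```

A function like this can be handed to a D3 axis through `axis.tickFormat(dayLabel)` so the axis prints names instead of numbers.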

## Q5. Sankey Chart

Formula One racing is a championship sport in which race drivers represent teams to compete for points over several races (also called Grand Prix) in a season. The team with the most points at the end of a season wins the prestigious Formula One World Constructors’ Championship award. You will visualize the flow of points for the races held in this season up to September 2016. The drivers win points according to their final standing in each race, which finally get added to their respective team’s total.

• Create a Sankey Chart using the datasets provided (races.csv and teams.csv) in the Q5 folder. The chart should visualize the flow of points in the order:
```
race → driver → team
```

You may refer to this example to create the chart (sankey.js is provided in the lib folder). You can keep the blocks’ vertical positions static. Your chart should look like the example Sankey Chart for the 2015 season as shown in Figure 5.

Hint: For this part, you will have to read in the CSV files and combine the data into a format that can be passed to the sankey library. To accomplish this, you may find the following JavaScript functions useful: d3.nest(), array.filter(), array.map()
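
The combining step in the hint might look roughly like the sketch below. The row shapes are assumptions made for illustration: `{ race, driver, points }` for races.csv and `{ driver, team, points }` for teams.csv, which may not match the real column names. The output uses the `nodes`/`links` shape of the classic D3 sankey plugin, where each link refers to its nodes by index and carries a `value`.

```javascript
// Build { nodes, links } for a sankey layout from two lists of rows.
// Row shapes are hypothetical; adapt the field names to the real CSV headers.
function buildSankeyGraph(raceRows, teamRows) {
  const names = [];
  const indexByName = new Map();

  // Return the node index for a name, registering the node on first use.
  const indexOf = (name) => {
    if (!indexByName.has(name)) {
      indexByName.set(name, names.length);
      names.push(name);
    }
    return indexByName.get(name);
  };

  // CSV values arrive as strings, so convert points to numbers and
  // drop zero-point rows before they can create empty nodes or links.
  const scored = (rows) => rows.filter((row) => Number(row.points) > 0);

  const links = [
    ...scored(raceRows).map((r) => ({
      source: indexOf(r.race),
      target: indexOf(r.driver),
      value: Number(r.points),
    })),
    ...scored(teamRows).map((t) => ({
      source: indexOf(t.driver),
      target: indexOf(t.team),
      value: Number(t.points),
    })),
  ];

  return { nodes: names.map((name) => ({ name })), links };
}

// Made-up rows for illustration only; not real race results.
const demoGraph = buildSankeyGraph(
  [
    { race: "Race 1", driver: "Driver A", points: "25" },
    { race: "Race 1", driver: "Driver B", points: "18" },
    { race: "Race 1", driver: "Driver C", points: "0" },
  ],
  [
    { driver: "Driver A", team: "Team X", points: "25" },
    { driver: "Driver B", team: "Team X", points: "18" },
  ]
);
```

This sketch identifies nodes by name alone, so it would merge a race, driver and team that happened to share the same name; prefixing names by kind is one way around that if it matters for the real data.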

• Use the d3-tip library to add tooltips as shown in Figure 5 (you can make your own visual style choices using CSS properties).
• From the visualization you have created, determine the following:
  • Which team has the best current standing?
  • Which driver has the most points currently?
  • Which driver won the Monaco Grand Prix?
  • Which two drivers switched their teams mid-season?

## Q6. Interactive visualization

Mr. Fluke runs a small company named FooBar. His company manufactures eight products around the year. He wants you to create an interactive visualization report using D3 so that he can see the total revenue generated per product type and the revenue breakdown across product types for the four quarters in 2015. Use the dataset provided in the Q6 folder. Integrate the dataset provided in dataset.txt directly in an array variable in the script.

• Create a horizontal bar chart with its vertical axis denoting the product names and its horizontal axis denoting the total revenue. Each bar should have the total revenue amount in dollars labelled inside it. See Figure 6 for an example.
• Create a legend with three columns.
  • Column 1: quarter labels: Q1, Q2, Q3, Q4
  • Column 2: initialized with each quarter’s total revenue (e.g., Q1’s value is initialized as the sum of all products’ revenues in Q1)
  • Column 3: presents the percentage share of each value in Column 2
• While hovering over any bar, the second and third columns in the legend should update to show the revenue generated (in value and percentage share, respectively) for each quarter of the selected product. For example, when hovering over Product C’s bar, the second and third columns in the legend should update to show Product C’s revenues in the four quarters and those revenues’ percentage shares. See Figure 6 for an example.

Note:

1. The vertical axis of the chart should use product names as labels.
2. On hovering over any horizontal bar, the color of the bar should change. You can use any color that is visually distinct from the regular bars.
3. The legend should reset to the initial values on mouseout (i.e., when the mouse leaves a bar).
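
The legend arithmetic is the same in the initial and hover states; only the set of products changes. The sketch below assumes a hypothetical record shape and made-up revenue figures, not the contents of dataset.txt.

```javascript
const QUARTERS = ["Q1", "Q2", "Q3", "Q4"];

// Hypothetical records; the real values come from dataset.txt.
const products = [
  { product: "Product A", revenue: { Q1: 100, Q2: 200, Q3: 300, Q4: 400 } },
  { product: "Product B", revenue: { Q1: 300, Q2: 200, Q3: 100, Q4: 400 } },
];

// Bar length and in-bar label: total revenue of one product over the year.
function totalRevenue(product) {
  return QUARTERS.reduce((sum, q) => sum + product.revenue[q], 0);
}

// Legend rows for a set of products: per-quarter amount (column 2)
// and that amount's percentage share of the set's total (column 3).
function legendRows(selected) {
  const amounts = QUARTERS.map((q) =>
    selected.reduce((sum, p) => sum + p.revenue[q], 0)
  );
  const total = amounts.reduce((sum, a) => sum + a, 0);
  return QUARTERS.map((quarter, i) => ({
    quarter,
    amount: amounts[i],
    share: total === 0 ? 0 : (100 * amounts[i]) / total,
  }));
}

const initialLegend = legendRows(products); // all products: initial state
const hoverLegend = legendRows([products[0]]); // hovering over Product A
```

In the chart, the bar's mouseover handler would redraw the legend from `legendRows([d])` and the mouseout handler from `legendRows(products)`, which gives the reset described in note 3.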

## Q7. Visualizing college scorecard data

This is a free-form question. We want you to apply the D3 knowledge that you have gained to assist decision making for a real-world problem: help students make college decisions.
Using D3, construct a visualization using the college scorecard dataset (located in the Q7 folder) which contains statistics about colleges (e.g., affordability, value).
Create one large visualization or multiple small ones using the entire dataset or a subset of it. If you want, you may also use the Bootstrap library, which is a popular framework used in frontend development, to organize your dashboard. We recommend Bootstrap because many student teams in previous semesters had good experiences using it for their projects. Place the Bootstrap library files in the lib/bootstrap folder. The visualization does not need to support any interactions.

• Points will be awarded for usability, functionality, and creativity.
• Summarize your main ideas behind the visualization in explanation.txt in no more than 50 words.