P1: Graph Wrangling in Jupyter
- Due Apr 11, 2017 by 11:59pm
- Points 10
- Submitting a file upload
- File Types html and pdf
Setup
Install Jupyter Notebook (http://jupyter.readthedocs.io/en/latest/install.html).
Install Graphviz (http://www.graphviz.org/).
Instructions
Create a Jupyter notebook that wrangles the county adjacency dataset into the DOT graph description language. Use the "fdp" (force-directed layout) tool from Graphviz to render an image file, and then display it in your notebook.
Rather than showing a graph of the counties found within the state of California (as in the example data wrangling notebook shown in the first lecture, Counties.html), show the graph at the level of states and territories. Note that this dataset includes information for territories like Puerto Rico and American Samoa, so you should expect more than 50 nodes in your graph.
What counts as a state or territory? For the purposes of this task, it counts if it has a two-letter code (e.g., "AK" for Alaska and "GU" for Guam).
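As a rough illustration of the two-letter-code rule, the sketch below collapses county-level rows into state-level nodes and edges. It assumes a tab-separated layout (county name, county id, neighbor name, neighbor id) in which continuation rows leave the first two fields empty. That layout, the inline SAMPLE rows, and the helper names state_of and state_graph are all hypothetical, so check them against your own preview of the raw dataset before reusing any of this.

```python
import re

# Hypothetical sample rows standing in for the real dataset (assumed layout:
# county name, county id, neighbor name, neighbor id, separated by tabs).
# Continuation rows leave the first two fields empty.
SAMPLE = (
    '"Del Norte County, CA"\t06015\t"Curry County, OR"\t41015\n'
    '\t\t"Del Norte County, CA"\t06015\n'
    '\t\t"Siskiyou County, CA"\t06093\n'
    '"Mohave County, AZ"\t04015\t"Clark County, NV"\t32003\n'
    '\t\t"San Bernardino County, CA"\t06071\n'
    '"Guam, GU"\t66010\t"Guam, GU"\t66010\n'
)

# A name counts only if it ends in a comma followed by a two-letter code.
STATE_CODE = re.compile(r",\s*([A-Z]{2})$")


def state_of(name):
    """Return the trailing two-letter code of a place name, or None."""
    match = STATE_CODE.search(name.strip().strip('"'))
    return match.group(1) if match else None


def state_graph(text):
    """Collapse county adjacency rows into state-level nodes and edges."""
    nodes, edges = set(), set()
    current = None
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip rows that do not match the assumed layout
        if fields[0].strip():
            # A non-empty first field starts a new county block.
            current = state_of(fields[0])
        neighbor = state_of(fields[2])
        if current is None or neighbor is None:
            continue
        nodes.update((current, neighbor))
        if current != neighbor:
            # Sort each pair so A--B and B--A count as one undirected edge.
            edges.add(tuple(sorted((current, neighbor))))
    return nodes, edges


nodes, edges = state_graph(SAMPLE)
print(len(nodes), "unique states and territories")
print(sorted(edges))
```

Keeping nodes separate from edges means a territory with no neighbors other than itself (Guam in the sample) still appears in the count and in the graph.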
Grading criteria
[1pt] A Notebook (html or pdf exported version) is turned in displaying at least some Python code execution.
[1pt] Some Markdown cells are used to give structure to the notebook. At least one text cell at the top of the notebook should describe the purpose of the notebook.
[1pt] One cell shows a preview of the raw dataset (e.g., using the "head" command).
[2pts] Some Python code wrangles the dataset into a dot-language graph description file.
[1pt] Some Python code directly displays the number of unique states and territories that will be shown in the graph.
[1pt] One cell shows a preview of the dot file.
[1pt] The fdp layout engine is used to render an image.
[1pt] The image is displayed within the notebook.
[1pt] The node for the state of California is highlighted in some way (such as changing the size, shape, or color of the node using node attributes in the dot language).
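One possible way to approach the dot file, fdp, and highlighting criteria is sketched below. It is only a sketch: the node and edge sets are a made-up subset, the names to_dot and render_fdp and the file stem "states" are hypothetical, and the gold fill is just one of many node attributes you could use. The rendering step runs only if an fdp executable is found on your PATH.

```python
import shutil
import subprocess


def to_dot(nodes, edges, highlight="CA"):
    """Build an undirected DOT graph, filling the highlighted node."""
    lines = ["graph states {"]
    for code in sorted(nodes):
        attrs = " [style=filled, fillcolor=gold]" if code == highlight else ""
        lines.append(f'  "{code}"{attrs};')
    for a, b in sorted(edges):
        lines.append(f'  "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines) + "\n"


def render_fdp(dot_text, stem="states"):
    """Write <stem>.dot and render <stem>.png with the fdp layout engine."""
    dot_path, png_path = f"{stem}.dot", f"{stem}.png"
    with open(dot_path, "w") as handle:
        handle.write(dot_text)
    subprocess.run(["fdp", "-Tpng", dot_path, "-o", png_path], check=True)
    return png_path


# Made-up subset of states, for illustration only.
nodes = {"AZ", "CA", "GU", "NV", "OR"}
edges = {("AZ", "CA"), ("AZ", "NV"), ("CA", "NV"), ("CA", "OR")}

dot_text = to_dot(nodes, edges)
print(dot_text)  # preview of the dot description

# Render only when Graphviz appears to be installed.
if shutil.which("fdp"):
    print("rendered", render_fdp(dot_text))
```

Inside a notebook, the resulting file can then be shown with `from IPython.display import Image` followed by `Image(filename="states.png")` as the last expression in a cell.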