Studio: Data Visualization with Python

Getting Started

Warning

We have created a new repo for the Class 19 and 20 exercises and studios. Please fork this repo to your GitHub account, and then clone it to your local machine.

Class 19 and 20 Exercise Studio Repo

Open up data-analysis-projects-class-19-and-20/class-19/studio/. You will be working in the file data-viz-studio.ipynb.

The Data

Your mentors will split you up into small groups and each group will be given something in the dataset to visualize.

For this studio, we will be using this Goodreads dataset on Kaggle.

Note

This dataset was cleaned before being uploaded and is included in the above GitHub repository, so you can focus on the visualizations.

Decide What to Visualize

As a group, decide which of the following aspects of the dataset you want to visualize:

  1. Number of books published per year.
  2. Number of books published by each publisher.
  3. Text ratings count versus reviews count.

Note

Some publishers’ names are in non-Latin scripts, such as Japanese and Russian. Matplotlib may not know how to display these names; that is okay!
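Before plotting, each of the three options above usually needs a small amount of preparation with pandas. The following is a minimal sketch, not the studio solution. It uses a tiny made-up table in place of the real data, and the column names (`publisher`, `publication_date`, `ratings_count`, `text_reviews_count`) are assumptions, so check `books.columns` in the notebook and adjust as needed.

```python
import pandas as pd

# Tiny made-up stand-in for the Goodreads data.
# The column names are assumptions; the real file may differ.
books = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "publisher": ["Penguin", "Penguin", "Vintage", "Penguin", "Vintage"],
    "publication_date": ["2006-09-16", "2004-09-01", "2003-11-01",
                         "2004-05-01", "2004-09-13"],
    "ratings_count": [2100, 1500, 60, 2300, 410],
    "text_reviews_count": [270, 190, 4, 360, 16],
})

# Option 1: number of books published per year.
books["year"] = pd.to_datetime(books["publication_date"]).dt.year
books_per_year = books["year"].value_counts().sort_index()

# Option 2: number of books per publisher.
# Keeping only the largest publishers makes the chart readable.
top_publishers = books["publisher"].value_counts().head(10)

# Option 3: the two numeric columns to compare, for example in a scatter plot.
counts = books[["ratings_count", "text_reviews_count"]]

print(books_per_year)
print(top_publishers)
```

Each result (`books_per_year`, `top_publishers`, `counts`) can be passed straight to a Matplotlib or Seaborn plotting call.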

Create Your Visualizations

Once you have your group, everyone in the group needs to make two visualizations:

  1. One made using Matplotlib.
  2. One made using Seaborn.

These visualizations can be of any chart type. Make sure that everyone is doing something different!
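As a starting point, here is a minimal sketch of the same small table drawn once with Matplotlib and once with Seaborn. The numbers are made up for illustration and the column names `year` and `book_count` are hypothetical; swap in whatever your group prepared from the real dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up counts for illustration only; not taken from the real dataset.
books_per_year = pd.DataFrame({
    "year": [2003, 2004, 2005, 2006],
    "book_count": [12, 30, 27, 41],
})

# Two side-by-side axes so both libraries can be compared directly.
fig, (ax_mpl, ax_sns) = plt.subplots(1, 2, figsize=(10, 4))

# Chart 1: a line chart drawn with Matplotlib directly.
ax_mpl.plot(books_per_year["year"], books_per_year["book_count"], marker="o")
ax_mpl.set_xticks(books_per_year["year"])
ax_mpl.set_title("Matplotlib: line chart")
ax_mpl.set_xlabel("Publication year")
ax_mpl.set_ylabel("Number of books")

# Chart 2: a bar chart drawn with Seaborn on the second axes.
sns.barplot(data=books_per_year, x="year", y="book_count", ax=ax_sns)
ax_sns.set_title("Seaborn: bar chart")
ax_sns.set_xlabel("Publication year")
ax_sns.set_ylabel("Number of books")

fig.tight_layout()
```

In a Jupyter notebook the figure is displayed when the cell finishes running. Seaborn draws on Matplotlib axes, so titles and axis labels are set the same way for both charts.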

Share with Your Group

When everyone is done creating their two charts, come back together and discuss as a group which chart most effectively tells the story.

Feel free to also try different color schemes and style the winning chart as a group to make it even stronger!
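One possible way to experiment with color schemes and styling is Seaborn's theme settings, sketched below on a made-up scatter plot. The data and column names are assumptions for illustration, and the `whitegrid` style, `colorblind` palette, and log-scaled axes are just one set of choices to try.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Apply a style and color palette to every chart drawn after this call.
sns.set_theme(style="whitegrid", palette="colorblind")

# Made-up values for illustration; the column names are assumptions.
counts = pd.DataFrame({
    "ratings_count": [60, 410, 1500, 2100, 2300],
    "text_reviews_count": [4, 16, 190, 270, 360],
})

fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(data=counts, x="ratings_count", y="text_reviews_count", ax=ax)

# Log scales can help when a few books have far more ratings than the rest.
ax.set_xscale("log")
ax.set_yscale("log")

# Descriptive labels help the chart tell its story without extra explanation.
ax.set_title("Ratings versus text reviews")
ax.set_xlabel("Number of ratings")
ax.set_ylabel("Number of text reviews")

fig.tight_layout()
```

Changing the `style` and `palette` arguments and rerunning the cell is a quick way to compare looks as a group.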

Present to the Class

When the whole class comes back together, each group will present their winning chart.

Each presentation should be about 5 minutes and cover:

  1. Why the group picked this chart.
  2. What the group found effective about the chart.
  3. What changes the group thinks would make the chart even better.

Submitting Your Work

When finished, make sure to push your changes up to GitHub.

Copy the link to your GitHub repository and paste it into the submission box in Canvas for Studio: Data Visualization w/Python and click Submit.