18.3. Studio: Data Visualization with Python

18.3.1. Getting Started

For this week's studio, fork this GitHub repository and clone it to your computer. If you need a refresher on how to do this, see Instruction for Using Github w/Jupyter Notebooks.

For this studio, we will be using this Goodreads dataset on Kaggle.

note

This dataset was cleaned before being uploaded and is included in the above GitHub repository, so you can focus on the visualizations.

Your mentors will split you up into small groups, and each group will be given one aspect of the dataset to visualize. The options are:

  1. Number of books published per year.
  2. Number of books published by each publisher.
    • Some publishers’ names are in non-Latin scripts, such as Japanese and Russian. Matplotlib may not know how to display these names; that is okay!
  3. Text ratings count versus reviews count.
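
Before charting, each option usually needs a little data preparation. The sketch below shows one possible way to do it with pandas. It uses a tiny invented DataFrame in place of the real file, and the column names (publisher, publication_date, ratings_count, text_reviews_count) and the month/day/year date format are assumptions, so check them against the CSV in your forked repository.

```python
import pandas as pd

# Toy stand-in for the studio dataset. The rows are invented and the
# column names are assumptions; check them against the real CSV.
books = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "publisher": ["Penguin", "Penguin", "Vintage", "Vintage", "Penguin"],
    "publication_date": ["9/1/2004", "5/12/2004", "3/3/2006",
                         "11/20/2006", "1/15/2006"],
    "ratings_count": [1200, 300, 4500, 50, 800],
    "text_reviews_count": [90, 20, 310, 4, 60],
})

# Option 1: number of books published per year.
# errors="coerce" turns unparseable dates into NaT instead of raising.
years = pd.to_datetime(
    books["publication_date"], format="%m/%d/%Y", errors="coerce"
).dt.year
books_per_year = years.value_counts().sort_index()

# Option 2: number of books per publisher, keeping only the largest
# publishers so the chart stays readable.
books_per_publisher = books["publisher"].value_counts().head(10)

# Option 3: the two count columns, ready for a scatter plot.
counts = books[["ratings_count", "text_reviews_count"]]

print(books_per_year)
print(books_per_publisher)
```

With the real data, you would replace the toy DataFrame with a pd.read_csv call on the file in the repository.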

18.3.2. Create Your Visualizations

Once you have your group, everyone in the group needs to make two visualizations:

  • One in Matplotlib.
  • One in Seaborn.
  • These visualizations can be of any chart type.
  • Make sure that everyone is doing something different!
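
As a starting point, here is a minimal sketch of one chart in each library. It again uses a small invented DataFrame, and the column names are assumptions rather than the confirmed names in the studio dataset. The chart types (a bar chart and a scatter plot) are only examples; any chart type is fine.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data with assumed column names; swap in the real dataset.
books = pd.DataFrame({
    "publisher": ["Penguin", "Penguin", "Vintage", "Vintage", "Penguin"],
    "ratings_count": [1200, 300, 4500, 50, 800],
    "text_reviews_count": [90, 20, 310, 4, 60],
})

# Matplotlib: bar chart of the number of books per publisher.
per_publisher = books["publisher"].value_counts()
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(per_publisher.index, per_publisher.values, color="steelblue")
ax.set_title("Books per publisher (toy data)")
ax.set_xlabel("Publisher")
ax.set_ylabel("Number of books")

# Seaborn: scatter plot comparing the two count columns.
# Passing ax= draws onto an axes we created, so the usual Matplotlib
# methods (set_title and so on) still work afterwards.
fig2, ax2 = plt.subplots(figsize=(6, 4))
sns.scatterplot(data=books, x="ratings_count", y="text_reviews_count", ax=ax2)
ax2.set_title("Ratings vs. text reviews (toy data)")
```

In a Jupyter notebook the figures render at the end of the cell; in a plain script, call plt.show() to display them.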

18.3.3. Share with Your Group

When everyone is done creating their two charts, come back together and discuss as a group which chart most effectively tells the story.

  • Feel free to also try different color schemes and style the winning chart as a group to make it even stronger!
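
One way to experiment with styling is sketched below using Matplotlib only. The yearly counts are invented for illustration, and the "ggplot" style and the highlight color are arbitrary choices, not a recommendation from the studio materials.

```python
import matplotlib.pyplot as plt

# Invented counts for illustration only.
years = [2003, 2004, 2005, 2006]
book_counts = [12, 30, 22, 41]

# style.context applies a built-in style to this figure only,
# leaving the rest of the notebook unchanged.
with plt.style.context("ggplot"):
    fig, ax = plt.subplots(figsize=(6, 4))
    bars = ax.bar([str(y) for y in years], book_counts, color="lightgray")

    # Draw the eye to the tallest bar with a contrasting color.
    peak = book_counts.index(max(book_counts))
    bars[peak].set_color("tab:orange")

    # A title that states the takeaway helps the chart tell its story.
    ax.set_title("2006 had the most books published (toy data)")
    ax.set_xlabel("Publication year")
    ax.set_ylabel("Number of books")
    ax.bar_label(bars)  # print each count above its bar
```

Muting most of the bars and highlighting one is a simple way to point the audience at the part of the chart your group wants to talk about.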

18.3.4. Present to the Class

When the whole class comes back together, each group will present their winning chart.

Each presentation should be about 5 minutes and cover:

  • Why the group picked this chart.
  • What made the chart effective for them.
  • What changes they think would make the chart even better.

18.3.5. Submitting Your Work

When finished, make sure to push your changes up to GitHub. Copy the link to your GitHub repository, paste it into the submission box in Canvas for Studio: Data Visualization w/Python, and click Submit.