29.3. Studio: Introduction to Statistics and Data Modeling

29.3.1. Getting Started

Fork this GitHub repository and clone it to your computer. If you need a refresher on how to do this, see Instructions for Using GitHub w/Jupyter Notebooks.

The Kaggle dataset we will be using can be found here and is already included in the repo.

  1. Break into small groups.

  2. Have each group choose a different variable from Group A and one from Group B.

    1. Group A:

      1. sex

      2. anemia

      3. diabetes

      4. high_blood_pressure

      5. smoking

    2. Group B:

      1. ejection_fraction

      2. creatinine_phosphokinase

      3. age

      4. serum_sodium

      5. platelets

  3. Once you have your group, each member should work along in their own notebook, adding code to help answer the questions.

  4. Discuss as a group what observations can be made and any inferences that you might make regarding each variable.

  5. Answer the questions and record your observations in the space provided.

  6. Each group member will submit their own notebook at the end of the studio.

  7. Use the last 30 minutes of studio time to regroup and discuss what each group has learned about their variables and what inferences they would want to explore further.
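
As a starting point for steps 3 and 4, the sketch below shows one possible way to compare a Group B variable across the categories of a Group A variable. It is only an illustration: the helper name summarize_by_group is hypothetical, the choice of smoking and age is arbitrary, and the rows are invented toy values, not records from the Kaggle dataset.

```python
import csv
import io
import statistics
from collections import defaultdict


def summarize_by_group(csv_file, group_col, value_col):
    """Return {group value: (count, mean, median)} of value_col per group.

    csv_file is any open text file (or file-like object) with a header row.
    Hypothetical helper written for this illustration only.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Group A values are read as strings; Group B values become floats.
        groups[row[group_col]].append(float(row[value_col]))
    return {
        key: (len(vals), statistics.mean(vals), statistics.median(vals))
        for key, vals in groups.items()
    }


# Toy rows invented purely for illustration; NOT taken from the Kaggle dataset.
toy_csv = io.StringIO(
    "smoking,age\n"
    "0,50\n"
    "0,60\n"
    "1,40\n"
    "1,70\n"
    "1,55\n"
)

summary = summarize_by_group(toy_csv, "smoking", "age")
for group, (count, mean, median) in sorted(summary.items()):
    print(f"smoking={group}: n={count}, mean={mean:.1f}, median={median:.1f}")
```

To run the same summary on the real data, pass an open handle to the CSV file included in the repo in place of toy_csv, and substitute your group's two variables. If you are working with pandas in your notebook, a grouped describe() call gives a comparable summary.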

29.3.2. Submitting Your Work

When finished, make sure to push your changes up to GitHub. Copy the link to your GitHub repository, paste it into the submission box in Canvas for Studio: Data Modeling Part 1, and click Submit.