14.1. Exploratory Data Analysis

Read the following articles and follow along where instructed.


You do not need to install pandas; it comes with Anaconda.


For Medium articles: if you run out of free articles, open the page in an incognito window.

14.1.1. Exploring Data with Python

Code along with this article.

  1. Exploratory data analysis in Python.

  2. Stop at Step #8 “Detecting Outliers”.

14.1.2. Get to Know Your Data

  1. Getting to know your data.

  2. Data Types in Statistics.

14.1.3. Python pandas

Code along with this article.

  1. Python Pandas Tutorial: A Complete Introduction for Beginners

  2. Stop at “Handling Duplicates”.

14.1.4. Statistics in pandas

  1. Basic statistics in pandas DataFrame.

14.1.5. What is a DataFrame?

A pandas DataFrame is similar to a Python dictionary: the column names act like keys, and the value for each key is the data in that column.

Diagram of a pandas DataFrame.

For more information about pandas DataFrames and the diagram above, visit w3resource.
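The dictionary comparison can be sketched in a few lines. This is a minimal illustration, not taken from the articles above; the `inventory` name and its data are hypothetical.

```python
import pandas as pd

# Hypothetical data: each dictionary key becomes a column name,
# and each list becomes the values of that column.
data = {
    "item": ["pencil", "notebook", "eraser"],
    "quantity": [12, 4, 7],
}

inventory = pd.DataFrame(data)

# The column names come from the dictionary keys.
print(inventory.columns.tolist())

# A column is looked up by name, much like a dictionary value by key.
print(inventory["quantity"].tolist())
```

Unlike a plain dictionary, a DataFrame also gives every row an index label, which defaults to 0, 1, 2, and so on.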

The values in each column are stored as a pandas Series. Here is how pandas Series are used to build a DataFrame.

Diagram of how pandas Series are combined to build a DataFrame.

For more information about pandas Series and the diagram above, visit w3resource.
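As a rough sketch of the same idea in code, the example below combines two Series into one DataFrame. The Series names and values are hypothetical, and `pd.concat` is only one of several ways to do this.

```python
import pandas as pd

# Hypothetical Series: each one holds the values for a single column.
names = pd.Series(["Ana", "Ben", "Cy"], name="name")
ages = pd.Series([31, 25, 40], name="age")

# Place the Series side by side; each Series name becomes a column name.
people = pd.concat([names, ages], axis=1)
print(people)

# Selecting a single column gives back a Series.
age_column = people["age"]
print(type(age_column))
```

Passing `axis=1` tells pandas to line the Series up as columns; the default, `axis=0`, would stack them end to end into one longer Series.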

14.1.6. Check Your Understanding


Which pandas attribute returns the number of rows and columns in a DataFrame?


True or false: column names cannot be changed in a DataFrame.

  1. True

  2. False


What can knowing the data types present in a data set tell us about the data being presented?


Which pandas function reads a CSV file into a DataFrame?


Visualized below is the “purchases” DataFrame. What is the pandas syntax to select Robert’s data?

DataFrame showing name of person and if they purchased apples and/or oranges.


How do we view only the first 13 rows of a DataFrame?


True or false: a DataFrame column is a Series.

  1. True

  2. False


Which pandas method returns the number of records, the three quartiles, and the mean, standard deviation, minimum, and maximum values of a DataFrame?

  1. .describe()

  2. .index()

  3. .statistics()

  4. .head()