13.1. Introduction to Data

Read the following articles:

13.1.1. Methods and Techniques

  1. Data Analysis Methods & Techniques.

13.1.2. Data Analytics in Python

  1. A Beginner’s Guide to Data Analysis in Python.

13.1.3. Data Project Life Cycle

  1. Data project life cycle.

  2. Asking the right questions.

13.1.4. What is a Dataset?

Think back to a time when you worked on a monthly budget. Let’s say you have a single spreadsheet tracking your automatic payments for May, with columns for the company name, the amount, and the date. That one spreadsheet is a dataset. It is small, and one month of automatic payments may not yield many insights, but it is still a dataset. For further insight into your monthly payments, your dataset might include every automatic-payment spreadsheet from the past year. For an overview of your whole budget, you might add spreadsheets of your taxes, paychecks, and grocery bills, as in the dataset below.

Diagram of my May budget dataset.
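
To make the idea concrete, here is a minimal Python sketch of the budget example. It treats one month of automatic payments as a small dataset, then grows it by adding a second month. Everything specific here is an assumption made for illustration: the company names, amounts, dates, the year, the June sheet, and the helper names load_payments and total_by_company. The spreadsheets are written as CSV text so the example runs on its own, using only the standard library.

```python
import csv
import io

# Hypothetical spreadsheets, written as CSV text. The companies, amounts,
# dates, and year are invented for illustration only.
MAY_CSV = """company,amount,date
Streamly,15.99,2023-05-03
City Power,82.40,2023-05-10
Gym Co,30.00,2023-05-15
"""

JUNE_CSV = """company,amount,date
Streamly,15.99,2023-06-03
City Power,76.10,2023-06-10
Gym Co,30.00,2023-06-15
"""


def load_payments(csv_text):
    """Parse one spreadsheet (CSV text) into a list of row dictionaries."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        # csv reads every value as text, so convert amounts to numbers.
        row["amount"] = float(row["amount"])
    return rows


def total_by_company(rows):
    """Add up the payment amounts for each company across all rows."""
    totals = {}
    for row in rows:
        totals[row["company"]] = totals.get(row["company"], 0.0) + row["amount"]
    return totals


# One spreadsheet on its own is already a (small) dataset.
may = load_payments(MAY_CSV)

# Adding another month's spreadsheet gives a larger dataset.
combined = may + load_payments(JUNE_CSV)

print(len(may), "rows in the May dataset")
print(len(combined), "rows in the combined dataset")
print(total_by_company(combined))
```

Each row is one payment and each column is one attribute of that payment. With a single month you can only list the payments. Once a second month is added, you can start comparing months or totaling what each company charged.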

13.1.5. Check Your Understanding

Question

Name the 5 main types of analysis.

Question

What does KPI stand for?

  1. Key Performance Indicators

  2. Key Performance Identifiers

  3. Key Performance Ideas

  4. Key Performance Idioms

Question

What does EDA stand for?

  1. Exploratory Data Analysis

  2. Explanatory Data Analysis

  3. Exploratory Data Audit

  4. Examinable Data Analysis

Question

What is between Data Understanding and Exploratory Analysis/Modeling in the project life cycle?

  1. Data Preparation

  2. Validation

  3. Visualization and Presentation

  4. Business Issue

Question

If you are a data analyst for a big box store, what is a standard KPI you would want to use to help drive up revenue?

Question

If you are a data analyst for a big box store, where would you get the data that would help answer the question?

Question

If you are a data analyst for a big box store, how would you ensure data quality?