Chapter 6

Cleaning Data with Spreadsheets

Learning Objectives

  • Understand common data cleaning techniques used to remove or correct dirty data
  • Identify different use cases for cleaning a data set
  • Identify corrupted data and handle it appropriately
  • Recognize the four types of “dirty” data: missing, irregular, unnecessary, and inconsistent

Key Terminology

Data Cleaning Techniques

  1. Filtering
  2. Sorting
  3. Removing redundant data
  4. Trimming trailing whitespace
  5. Functions
  6. Regular expressions (regex)
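
The techniques listed above are normally applied inside the spreadsheet itself. As a rough illustration only, the short Python sketch below shows what each one does to a small table. The sample rows, the column names, and the clean helper are all hypothetical and do not come from this chapter.

```python
import re

# Hypothetical sample rows; names and values are illustrative only.
rows = [
    {"name": "Ada  ", "phone": "555-0101"},    # trailing whitespace in name
    {"name": "Bob", "phone": "(555) 0102"},    # phone in a different format
    {"name": "Ada  ", "phone": "555-0101"},    # redundant (duplicate) row
    {"name": "", "phone": "555-0103"},         # missing name
]

def clean(rows):
    """Apply filtering, whitespace trimming, regex cleanup,
    duplicate removal, and sorting to a list of row dictionaries."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["name"].rstrip()               # trim trailing whitespace
        phone = re.sub(r"\D", "", row["phone"])   # regex: keep digits only
        if not name:                              # filtering: skip rows with no name
            continue
        key = (name, phone)
        if key in seen:                           # redundant data: skip duplicates
            continue
        seen.add(key)
        cleaned.append({"name": name, "phone": phone})
    # sorting: order the remaining rows by name
    return sorted(cleaned, key=lambda r: r["name"])

result = clean(rows)
for row in result:
    print(row["name"], row["phone"])
```

Trimming and reformatting happen before the duplicate check on purpose: two rows that differ only by stray spaces or punctuation should be treated as the same record.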

Types of Dirty Data

  1. Missing data
  2. Irregular data
  3. Unnecessary data
  4. Inconsistent data