17.1. Data Manipulation

17.1.1. Readings

The readings and exercises are organized a little differently for this chapter. Both have been combined into a GitHub repository containing two Jupyter notebooks. The link can be found here.

Note

You will find all the assigned reading within each notebook in the GitHub repository you forked for this week’s exercises.

17.1.2. Check Your Understanding

Question

What is the syntax for applying multiple aggregation functions to a single column, such as ‘age’?

Question

What data type does the syntax from the previous question use?

Question

What is this line of code doing with the data?

data_group = data.groupby("embark_town")
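To explore this question hands-on, the sketch below builds a small stand-in for `data` and inspects what `groupby` returns before any aggregation is applied. The rows are invented for illustration; only the `embark_town` column name comes from the question, and the notebook's dataset will differ.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's `data` table; values are invented.
data = pd.DataFrame({
    "embark_town": ["Southampton", "Cherbourg", "Southampton", "Queenstown"],
    "age": [22.0, 38.0, 35.0, 27.0],
    "fare": [7.25, 71.28, 53.10, 8.46],
})

data_group = data.groupby("embark_town")

# Inspect the result: its type, how many groups it holds,
# and the rows that belong to one of those groups.
print(type(data_group).__name__)
print(data_group.ngroups)
print(data_group.get_group("Southampton"))
```

Comparing the printed type with that of `data` itself is a useful starting point for answering the question.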

Question

What is this line of code doing with the data?

data_group = data.groupby("embark_town").agg("mean")
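A minimal sketch for trying this line yourself, again using an invented stand-in for `data`. The stand-in deliberately keeps only numeric columns besides the grouping key, on the assumption that a recent pandas version is installed; those versions raise an error when asked for the mean of text columns.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's `data` table; values are invented.
data = pd.DataFrame({
    "embark_town": ["Southampton", "Cherbourg", "Southampton", "Queenstown"],
    "age": [22.0, 38.0, 35.0, 27.0],
    "fare": [7.25, 71.28, 53.10, 8.46],
})

data_group = data.groupby("embark_town").agg("mean")

# Compare the shape and index of the result with the original table.
print(data.shape, data_group.shape)
print(data_group)
```

Note how the number of rows and the row labels change between `data` and `data_group`.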

Question

According to pandas documentation, using a for loop is the only way to update values in a column.

  1. True
  2. False

Question

What is the syntax to rename a column?

Question

Converting numeric data to categorical data is an example of what?

Question

The pivot() method is the only way to aggregate values in a table.

  1. True
  2. False

Question

Which method do you use to create a “Wide-to-Long” table?

Question

When creating a small table, you should store it in its own variable to keep your original table safe.

  1. True
  2. False

Question

When appending a new row that contains a column not present in the original table, an error will be thrown.

  1. True
  2. False

Question

Concatenation can act on which axes?

  1. 1 and 0
  2. 1 only
  3. 0 only
  4. As many axes as the table has columns

Question

Using our flowers and garden_supply tables, write the syntax to merge a subset of columns, where garden_supply is the left table and flowers is the right table. The subset should include only “Flower” and “Sold_As” from the garden_supply table, and only “Name” from the flowers table.

  1. garden_supply[["Flower", "Sold_As"]].merge(flowers[["Name"]], left_on="Flower", right_on="Name")
  2. flowers[["Flower", "Sold_As"]].merge(garden_supply[["Name"]], left_on="Flower", right_on="Name")
  3. garden_supply[["Flower", "Sold_As"]].merge(flowers[["Name"]], left_on="Name", right_on="Flowers")
  4. garden_supply[["Name"]].merge(flowers[["Flower", "Sold_As"]], left_on="Flower", right_on="Name")
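To check the candidate expressions above rather than guess, you could run each one against small stand-in tables. In the sketch below, only the column names “Flower”, “Sold_As”, and “Name” come from the question; the extra columns, all row values, and the `try_merge` helper are hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins; only "Flower", "Sold_As", and "Name" come from the question.
garden_supply = pd.DataFrame({
    "Flower": ["Rose", "Tulip", "Daisy"],
    "Sold_As": ["Bush", "Bulb", "Seed"],
    "Price": [12.0, 3.5, 1.25],
})
flowers = pd.DataFrame({
    "Name": ["Rose", "Tulip", "Lily"],
    "Color": ["Red", "Yellow", "White"],
})


def try_merge(build):
    """Run a candidate expression; return its column names or the KeyError it raises."""
    try:
        return list(build().columns)
    except KeyError as err:
        return f"KeyError: {err}"


# Example: a merge of the full tables, with no column subset selected.
print(try_merge(lambda: garden_supply.merge(flowers, left_on="Flower", right_on="Name")))
```

Wrapping each candidate in a `lambda` delays evaluation, so an expression that selects a column from the wrong table reports its error instead of stopping the script.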

Question

The default merge in the pandas merge() function is a left merge.

  1. True
  2. False

Question

Which merge combines ALL of the rows of the merged dataframes, filling in NaN if values are missing?

Question

(1 of 2) In the merge() function, there are the following parameters: on, left_on, and right_on. When would you use them?

Question

(2 of 2) What is the difference between on and left_on in the merge() function?

Question

When working with join(), the right table will always be joined based on its index and not a designated column.

  1. True
  2. False

Question

The default join() type is: