2.3. Complex Queries

Simple SQL queries are commands that perform straightforward data retrieval, usually from only one table at a time. Examples of these queries include using the SELECT and WHERE keywords, and they narrow the result set by focusing on a single parameter.

Example

1
2
3
SELECT *
FROM movies
WHERE release_date >= 2015;

Note that in line 1, we selected all of the fields in the movies table, but we could easily replace * with a smaller set of column names.

Complex SQL queries go beyond the standard requests by retrieving data from several tables and by limiting the result set with multiple conditions.

In the previous chapter, you learned how to use joins to display data that is spread over different tables. Another advanced SQL tool is called a subquery, which is a simple SQL command embedded within another query. By nesting queries, you can set up larger restrictions on the data included in the result set.

Complex queries often combine joins and subqueries, and they may also include multiple AND/OR operators within the WHERE clause.

2.3.1. Review Joins

Code along with the following examples using the storage database and tables you created earlier.

2.3.1.1. Inner Join

Each record in writing_supply stores a utensil_type value. We could use a simple query to filter the records by type, but this would not contain data such as quantity, which is stored in a different table.

To return a result set that contains information that appears in both the writing_supply and pencil_drawer tables, we use an INNER JOIN.

Example

1
2
3
SELECT *
FROM writing_supply
INNER JOIN pencil_drawer ON writing_supply.supply_id = pencil_drawer.supply_id;

The result set contains all of the records from both tables that have matching supply_id values.

Result set for an inner join of the writing_supply and pencil_drawer tables.

We could accomplish the same result using a series of simple queries, but this would be inefficient. We would need to first run a query on writing_supply and then use the results to shape one or more queries on pencil_drawer.

To reduce the size of the result set, we can request specific fields and add conditions to the query:

Example

1
2
3
4
SELECT writing_supply.supply_id, pencil_type, drawer_id, quantity
FROM writing_supply
INNER JOIN pencil_drawer ON writing_supply.supply_id = pencil_drawer.supply_id
WHERE refill = true AND pencil_type = "Mechanical";

Result Set

Result set for an inner join with a WHERE clause.

Note that in line 1, we need to specify the source for supply_id, since both tables contain a column with that name.

2.3.1.2. Left/Right Join

We can use a LEFT or RIGHT join to retain all of the records from one table and pull in overlapping data from another.

Example

1
2
3
SELECT writing_supply.supply_id, utensil_type, drawer_id, color
FROM writing_supply
LEFT JOIN pen_drawer ON writing_supply.supply_id = pen_drawer.supply_id;

The result set contains null values for any rows that involve pencils. The left join retains all of the data in writing_supply, but it can only combine that information with data from pen_drawer if the rows share supply_id values.

Result set for an inner join with a WHERE clause.

As with inner joins, we can restrict the size of the result set by adding one or more conditions:

Example

1
2
3
4
SELECT writing_supply.supply_id, utensil_type, drawer_id, color, quantity
FROM writing_supply
LEFT JOIN pen_drawer ON writing_supply.supply_id = pen_drawer.supply_id
WHERE refill = true;

Result Set

Result set for a left join with a WHERE clause.

2.3.1.3. Multiple Joins

The UNION keyword allows us to combine the results of separate SELECT commands. Run each of the following queries individually and examine the two result sets. Next, run the queries with UNION.

Example

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT writing_supply.supply_id, utensil_type, drawer_id, quantity FROM writing_supply
LEFT JOIN pencil_drawer ON writing_supply.supply_id = pencil_drawer.supply_id
WHERE refill = true

UNION

SELECT writing_supply.supply_id, utensil_type, drawer_id, quantity FROM writing_supply
RIGHT JOIN pen_drawer ON writing_supply.supply_id = pen_drawer.supply_id
WHERE refill = true
ORDER BY supply_id;

Result Set

Result set for UNION of a left and right join.

Lines 1 - 3 merge data from pencil_drawer into writing_supply, so long as the rows have matching supply_id values and have refill set as true. Lines 7 - 10 merge data from writing_supply into pen_drawer with the same conditions. The UNION command combines the two result sets.

Note

Recall that MySQL has no FULL OUTER JOIN syntax. If we want to combine all of the data from two separate tables, we must use the UNION keyword between LEFT JOIN and RIGHT JOIN queries.

2.3.2. Subqueries

Consider the following situations:

  1. Retrieve the supply_id values for any writing_supply containers that hold pens.
  2. Using the supply_id values, retrieve the ID and color values for any drawers in the last container that hold 60 or more pens.

We can accomplish these actions by using two simple SQL queries:

Example

1
2
3
4
5
6
SELECT supply_id FROM writing_supply
WHERE utensil_type = "Pen";
/* First result set contains the supply_id values 1, 2, and 5. */

SELECT drawer_id, color FROM pen_drawer
WHERE quantity >= 60 AND supply_id = 5;

Second Result Set

Result set of the two simple SQL queries.

To complete the second SQL query, we must examine the result set from the first, then hard-code the largest supply_id value into the line 6. This is inefficient.

By using a subquery, we can combine the two SQL commands to accomplish the same result. Let’s begin by embedding one simple SQL command inside the WHERE clause of a second.

Example

1
2
SELECT drawer_id, color FROM pen_drawer
WHERE supply_id IN (SELECT supply_id FROM writing_supply WHERE utensil_type = "Pen");

Result Set

Result set of the initial complex SQL query.

Items to note:

  1. An embedded inner query will always execute before the outer query. In this case, the SELECT statement in line 2 runs first, followed by the SELECT statement in line 1.
  2. The inner query in line 2 creates a result set of supply_id values from the writing_supply table, based on the condition utensil_type = "Pen".
  3. The outer query returns a result set of drawer_id and color values from the pen_drawer table.
  4. The condition WHERE supply_id IN checks if the supply_id value for a pen_drawer row matches one of the supply_id values returned from the inner query.

The result set from this complex SQL command is not yet what we want, since it returns values for ALL drawers in ALL of the pen supply containers. Let’s modify the query by adding the condition for quantity.

Example

1
2
3
SELECT drawer_id, color FROM pen_drawer
WHERE supply_id IN (SELECT supply_id FROM writing_supply WHERE utensil_type = "Pen")
AND quantity >= 60;

Result Set

Result set of the restricted complex SQL query.

Now the result set shows only the information for pen drawers with 60 or more items. This is good but still not quite complete, since we only want data from the last writing_supply row that contains pens. To fix this, we need to restrict the inner query to that single supply_id value.

The last pen container in writing_supply will have the largest value for supply_id. Fortunately, SQL has a defined function, MAX(column_name), that returns the largest value in the specified column.

Example

1
2
3
SELECT drawer_id, color FROM pen_drawer
WHERE supply_id = (SELECT MAX(supply_id) FROM writing_supply WHERE utensil_type = "Pen")
AND quantity >= 60;

Result Set

Result set of the final, complex SQL query.

Success! Our complex SQL query now produces the same result as the two separate, simple SQL queries. However, using a subquery is more flexible, since it does not rely on hard-coded values. We can see this benefit if we add 100 more entries to writing_supply. The original pair of queries still checks for entries with supply_id = 5, even though this may no longer be the last pen container. The complex query correctly identifies the last pen container regardless of how many entries writing_supply contains.

2.3.2.1. Where Else Can We Add Subqueries?

In the examples above, you added one subquery into the WHERE clause of another SQL command. However, it is also possible to place a subquery in the FROM clause. Instead of pulling values from an entire table, this setup retrieves data from the result set of the inner query.

Subqueries can be used with INSERT, UPDATE, and DELETE commands, and it is also possible to place subqueries inside subqueries. We will not go over these options here, but the links below provide some examples if you wish to explore the topics on your own:

  1. Subquery with INSERT,
  2. Subquery with UPDATE or DELETE,
  3. Nested subqueries.

2.3.2.2. Last Reminders

  1. In most cases, subqueries should be enclosed in parentheses ().
  2. If a subquery returns multiple rows in its result set, using the comparison operators (=, >, <=, etc.) in a WHERE clause throws an error. In these cases, use ANY, ALL, or IN to check the condition across all of the rows.
  3. In a WHERE clause, a subquery must be placed on the right hand side of the comparison operator (ANY, IN, =, >, etc.)

2.3.3. Check Your Understanding

Question

UNION and JOIN produce the same result set.

  1. True
  2. False

Question

A subquery and a UNION accomplish the same thing.

  1. True
  2. False

Question

What is the execution order for the following complex SQL query?

1
2
3
4
SELECT column_1 FROM table_1
WHERE column_1 IN (SELECT column_2 FROM table_2
   WHERE column_2 IN (SELECT column_3 FROM table_3
      WHERE num_items > 30));
  1. SELECT column_1, then SELECT column_2, then SELECT column_3
  2. SELECT column_1, then SELECT column_3, then SELECT column_2
  3. SELECT column_2, then SELECT column_1, then SELECT column_3
  4. SELECT column_2, then SELECT column_3, then SELECT column_1
  5. SELECT column_3, then SELECT column_1, then SELECT column_2
  6. SELECT column_3, then SELECT column_2, then SELECT column_1