Skip to Content
Course content

5.2 Subqueries and Nested Queries

Subqueries and nested queries are powerful SQL techniques that allow you to use a query within another query. They help you break down complex problems into smaller, more manageable parts, enabling you to perform operations like filtering data based on the results of another query, or retrieving aggregated values for comparisons.

1. What is a Subquery?

A subquery (also called an inner query or nested query) is a query embedded inside another query. It is typically used to retrieve a value or set of values that can be used by the outer query. The subquery can be placed in different parts of the main query, such as in the SELECT, WHERE, or FROM clauses.

  • Purpose: To retrieve data that will be used in the main query to filter, compare, or compute results.
  • Types of Subqueries:
    • Single-row subquery: Returns a single value.
    • Multi-row subquery: Returns multiple values.
    • Correlated subquery: A subquery that references columns from the outer query.

2. Syntax of Subqueries

A subquery is enclosed in parentheses and can be used in various clauses, depending on the requirement.

Example 1: Subquery in WHERE Clause

SELECT employee_name, salary
FROM employees
WHERE department_id = (
  SELECT department_id
  FROM departments
  WHERE department_name = 'Sales'
);
  • Explanation: In this example, the subquery returns the department_id of the "Sales" department. The outer query then uses this value to retrieve employees who belong to that department.

Example 2: Subquery in SELECT Clause

SELECT employee_name, 
       (SELECT department_name FROM departments WHERE departments.department_id = employees.department_id) AS department
FROM employees;
  • Explanation: This example uses a subquery to fetch the department_name for each employee based on their department_id.

3. Types of Subqueries

3.1 Single-Row Subqueries

A single-row subquery returns exactly one row and one column of data, which is then used in comparison operators such as =, >, <, etc.

Example:

SELECT employee_name
FROM employees
WHERE salary = (
  SELECT MAX(salary)
  FROM employees
);
  • Explanation: The subquery returns the highest salary, and the outer query retrieves the name of the employee who has that salary.

3.2 Multi-Row Subqueries

A multi-row subquery returns multiple rows and is used with operators like IN, ANY, or ALL.

Example:

SELECT employee_name
FROM employees
WHERE department_id IN (
  SELECT department_id
  FROM departments
  WHERE location = 'New York'
);
  • Explanation: The subquery returns all department_id values for departments located in "New York." The outer query then retrieves employees who work in any of those departments.

3.3 Correlated Subqueries

A correlated subquery references columns from the outer query, meaning the subquery is evaluated once for each row processed by the outer query. It is different from regular subqueries, which are evaluated once.

Example:

SELECT employee_name, salary
FROM employees e
WHERE salary > (
  SELECT AVG(salary)
  FROM employees
  WHERE department_id = e.department_id
);
  • Explanation: This subquery calculates the average salary for each department and then compares each employee's salary to that average. The e.department_id refers to the outer query's column, making the subquery correlated.

4. Nested Queries in the FROM Clause

A nested query can also appear in the FROM clause, creating a derived table or a temporary table that can be used like a regular table in the outer query.

Example:

SELECT sub.department_id, AVG(sub.salary) AS average_salary
FROM (
  SELECT department_id, salary
  FROM employees
  WHERE salary > 50000
) AS sub
GROUP BY sub.department_id;
  • Explanation: The subquery first filters out employees with a salary greater than 50,000. The outer query then calculates the average salary for each department based on the filtered data.

5. Using Subqueries for Complex Operations

Subqueries can simplify complex queries that would otherwise require multiple steps or temporary tables. They allow you to perform operations such as:

  • Filtering: Retrieve results based on calculations or results from another query.
  • Aggregation: Get aggregated values (e.g., sums, averages) in a separate query and use them for comparisons.
  • Comparison: Compare individual rows in the outer query to results from the subquery.

6. Best Practices with Subqueries

  • Performance Considerations: Subqueries, especially correlated subqueries, can sometimes result in performance issues. It is important to test and optimize queries that contain subqueries, particularly for large datasets.
  • Using EXISTS or IN: When you need to check for the existence of rows or compare a value against a list of values, using EXISTS or IN with subqueries can be more efficient than other operators.
  • Avoiding Nested Subqueries When Possible: Sometimes, joins or temporary tables can achieve the same result as subqueries and might be more efficient. Always consider if a subquery is the best approach for your query.

7. Conclusion

Subqueries and nested queries are powerful tools in SQL that allow you to break down complex operations into smaller, more manageable parts. By understanding and utilizing subqueries, you can perform advanced filtering, comparison, and aggregation operations, making SQL a highly flexible language for working with relational databases. However, careful consideration of performance is necessary when using subqueries, especially in large datasets.

Commenting is not enabled on this course.