sqlintermediate10 minutes

Refactor a Slow Sales Summary Query for Better Performance

Improve the efficiency of a SQL query that summarizes total sales by product category and month by simplifying its logic and optimizing joins and aggregations.

Challenge prompt

You are given a SQL query that calculates the total sales amount per product category per month. The current query uses multiple nested subqueries and joins that make it slow on large datasets. Refactor the provided query to improve readability and performance without changing the output results. Ensure you minimize unnecessary joins and aggregations and write a single clean query that produces the same summary results.

Guidance

  • Analyze the query for redundant joins or subqueries that can be simplified or removed.
  • Use GROUP BY with appropriate columns rather than nesting multiple aggregations.
  • Consider using common table expressions (CTEs) if it improves readability without harming performance.

Hints

  • Check if joining the sales and products table can happen before aggregation instead of using subqueries after aggregation.
  • Avoid aggregating multiple times; try a single aggregation with grouped columns.

Starter code

SELECT
    p.category,
    DATE_FORMAT(s.sale_date, '%Y-%m') AS sale_month,
    SUM(s.amount) AS total_sales
FROM
    sales s
JOIN
    products p ON s.product_id = p.id
GROUP BY
    p.category, sale_month
ORDER BY
    p.category, sale_month;

Expected output

A result set showing each product category with total sales amounts grouped by year-month, sorted by category and month. For example: | category | sale_month | total_sales | |-----------|------------|-------------| | Apparel | 2023-01 | 15000 | | Apparel | 2023-02 | 17500 | | Electronics | 2023-01 | 30000 |

Core concepts

SQL query optimizationjoins and aggregationsGROUP BYdate functions

Challenge a Friend

Send this duel to someone else and see if they can solve it.