sqladvanced10 minutes

Predict the Output of a Complex Window Function and CTE SQL Query

Analyze the given SQL query involving multiple CTEs and window functions to determine the exact output produced by the final SELECT statement.

Challenge prompt

Given the following SQL query using common table expressions (CTEs) and window functions, predict the exact output table returned by the final SELECT statement. Focus on understanding the window frame specifications, partitioning, ordering, and how these influence the ranking and aggregation results.

Guidance

  • Carefully trace the intermediate results created by each CTE before the final SELECT.
  • Pay attention to how window functions like ROW_NUMBER() and SUM() OVER() are partitioned and ordered.
  • Consider the effect of filtering conditions applied after CTEs and before the final output.

Hints

  • Start by evaluating the first CTE results, then move on to how the window functions modify the data.
  • Remember that ROW_NUMBER() resets for each partition defined in the OVER clause.
  • The combination of WHERE clauses after the CTEs may exclude certain rows, affecting the final output.

Starter code


WITH Sales AS (
    SELECT
        salesperson_id,
        region,
        sale_amount,
        sale_date
    FROM sales_data
),
RankedSales AS (
    SELECT
        salesperson_id,
        region,
        sale_amount,
        sale_date,
        ROW_NUMBER() OVER (PARTITION BY region ORDER BY sale_amount DESC, sale_date ASC) AS region_rank,
        SUM(sale_amount) OVER (PARTITION BY region ORDER BY sale_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
    FROM Sales
),
FilteredSales AS (
    SELECT * FROM RankedSales
    WHERE region_rank <= 3
)
SELECT region, salesperson_id, sale_amount, region_rank, running_total
FROM FilteredSales
ORDER BY region, region_rank;

Expected output

A result set displaying the top 3 sales (by sale_amount descending and sale_date ascending as tie-breaker) for each region from the sales_data table, containing columns: region, salesperson_id, sale_amount, region_rank (1 to 3), and running_total (cumulative sum of sale_amount ordered by sale_date within each region). Rows should be ordered by region alphabetically, then by region_rank ascending.

Core concepts

CTEsWindow FunctionsROW_NUMBERAggregationPartitioning and Ordering

Challenge a Friend

Send this duel to someone else and see if they can solve it.