Predict the Output of a Recursive CTE with Aggregations and Window Functions
Analyze the output of an advanced SQL query using recursive common table expressions (CTEs), aggregations, and window functions. Predict the final result set given the schema and sample data.
Challenge prompt
Given the following schema and sample data representing employees and their managers, analyze the provided recursive CTE query. Determine the output—specifically, the final columns and rows returned by the query. -- Employees Table CREATE TABLE Employees( EmployeeID INT PRIMARY KEY, ManagerID INT NULL, Salary INT NOT NULL ); INSERT INTO Employees (EmployeeID, ManagerID, Salary) VALUES (1, NULL, 100000), (2, 1, 80000), (3, 1, 90000), (4, 2, 70000), (5, 2, 60000), (6, 3, 85000), (7, 3, 65000); -- Query to analyze WITH RECURSIVE HierarchyCTE AS ( SELECT EmployeeID, ManagerID, Salary, 1 AS Level FROM Employees WHERE ManagerID IS NULL UNION ALL SELECT e.EmployeeID, e.ManagerID, e.Salary, h.Level + 1 FROM Employees e JOIN HierarchyCTE h ON e.ManagerID = h.EmployeeID ), Aggregated AS ( SELECT ManagerID, COUNT(EmployeeID) AS TeamSize, AVG(Salary) AS AvgSalary FROM HierarchyCTE GROUP BY ManagerID ), Ranked AS ( SELECT h.EmployeeID, h.ManagerID, h.Level, a.TeamSize, a.AvgSalary, RANK() OVER (PARTITION BY h.ManagerID ORDER BY a.TeamSize DESC) AS TeamRank FROM HierarchyCTE h LEFT JOIN Aggregated a ON h.EmployeeID = a.ManagerID ) SELECT * FROM Ranked ORDER BY Level, TeamRank, EmployeeID;
Guidance
- • Understand how the recursive CTE builds the hierarchy levels starting from the top manager (ManagerID IS NULL).
- • Consider how the Aggregated CTE calculates team size and average salary grouped by manager.
- • Analyze how the window function RANK() operates partitioned by ManagerID and ordered by team size descending.
- • Trace the joins carefully to see how employees are linked with their team aggregation.
Hints
- • Remember the base case of the recursion includes only EmployeeID 1 (top-level manager).
- • Note the LEFT JOIN in Ranked means some employees might have NULLs for aggregated data if they aren't managers themselves.
- • RANK() resets for each ManagerID partition; employees with the same manager get ranked according to their team size.
Starter code
CREATE TABLE Employees(
EmployeeID INT PRIMARY KEY,
ManagerID INT NULL,
Salary INT NOT NULL
);
INSERT INTO Employees (EmployeeID, ManagerID, Salary) VALUES
(1, NULL, 100000),
(2, 1, 80000),
(3, 1, 90000),
(4, 2, 70000),
(5, 2, 60000),
(6, 3, 85000),
(7, 3, 65000);
WITH RECURSIVE HierarchyCTE AS (
SELECT EmployeeID, ManagerID, Salary, 1 AS Level
FROM Employees
WHERE ManagerID IS NULL
UNION ALL
SELECT e.EmployeeID, e.ManagerID, e.Salary, h.Level + 1
FROM Employees e
JOIN HierarchyCTE h ON e.ManagerID = h.EmployeeID
),
Aggregated AS (
SELECT
ManagerID,
COUNT(EmployeeID) AS TeamSize,
AVG(Salary) AS AvgSalary
FROM HierarchyCTE
GROUP BY ManagerID
),
Ranked AS (
SELECT
h.EmployeeID,
h.ManagerID,
h.Level,
a.TeamSize,
a.AvgSalary,
RANK() OVER (PARTITION BY h.ManagerID ORDER BY a.TeamSize DESC) AS TeamRank
FROM HierarchyCTE h
LEFT JOIN Aggregated a ON h.EmployeeID = a.ManagerID
)
SELECT * FROM Ranked
ORDER BY Level, TeamRank, EmployeeID;Expected output
EmployeeID | ManagerID | Level | TeamSize | AvgSalary | TeamRank -----------|-----------|-------|----------|-----------|--------- 1 | NULL | 1 | 2 | 73333.33 | 1 2 | 1 | 2 | 2 | 65000 | 1 3 | 1 | 2 | 2 | 75000 | 1 4 | 2 | 3 | NULL | NULL | 1 5 | 2 | 3 | NULL | NULL | 1 6 | 3 | 3 | NULL | NULL | 1 7 | 3 | 3 | NULL | NULL | 1
Core concepts
Challenge a Friend
Send this duel to someone else and see if they can solve it.