Create a SQL Function to Calculate Moving Average Sales Over Dynamic Time Windows
Build a SQL function that calculates the moving average of daily sales for a specified product over a dynamic number of previous days. The function should handle gaps in dates and provide efficient performance for large sales datasets.
Challenge prompt
Write a SQL function named `moving_average_sales` that takes three parameters: a product ID, a date, and a window size N (integer). The function returns the average of the sales quantity for that product over the N days ending on the given date (inclusive). Assume the existence of a `sales` table with columns: `product_id` (int), `sale_date` (date), and `quantity` (int). Your function should handle cases where some dates in the window have no sales records by treating missing days as zero sales. Optimize the function for large datasets.
Guidance
- • Use window functions or aggregation with date range filtering to calculate sums over the specified date window.
- • Handle missing dates by generating all dates within the window for the product and joining with the sales data.
- • Ensure the function is optimized with indexes on `product_id` and `sale_date`.
Hints
- • Consider using a calendar table or generate_series function (if available) to fill in missing dates within the window.
- • Filtering the sales data before aggregation improves query performance.
- • Use COALESCE to treat NULL sales quantities as zero when calculating the average.
Starter code
CREATE OR REPLACE FUNCTION moving_average_sales(p_product_id INT, p_date DATE, p_window INT) RETURNS NUMERIC AS $$
BEGIN
-- Your implementation here
RETURN 0;
END;
$$ LANGUAGE plpgsql;Expected output
For example, moving_average_sales(101, '2024-06-30', 7) returns the average sales quantity for product 101 from 2024-06-24 to 2024-06-30 inclusive.
Core concepts
Challenge a Friend
Send this duel to someone else and see if they can solve it.