sqladvanced30 minutes

Design and Query a Normalized E-Commerce Sales Data Model

Create a relational schema to model an e-commerce sales environment and write advanced SQL queries to retrieve complex insights on sales performance and customer behavior.

Challenge prompt

You are tasked with designing a normalized data model for an e-commerce platform focusing on sales transactions, customers, products, and order details. Using PostgreSQL, create the necessary tables with proper relationships and constraints. Once your schema is ready, write a SQL query to determine the top 5 customers with the highest total purchase amount in the last quarter, listing their names, total amount spent, and total number of orders. Details: - Model tables for Customers, Products, Orders, OrderItems. - Enforce data integrity with not nulls, primary keys, and foreign keys. - Each order can contain multiple products with quantity and price. - Calculate total purchase amounts by summing product price * quantity across orders. Deliverables: 1. SQL DDL statements creating the schema. 2. SQL query retrieving the top 5 customers by total spending in the last quarter with their total spent and order counts.

Guidance

  • Normalize data to avoid redundancy, separating entities such as Customers, Products, and Orders into different tables.
  • Use appropriate data types for fields such as dates, prices (decimal), and quantities (integer).
  • Implement foreign key constraints to enforce the relationships between Orders and Customers, and between OrderItems and Orders/Products.
  • Use window functions or aggregation with filtering on dates to calculate totals and ranks efficiently.

Hints

  • Consider using a composite primary key on OrderItems combining order_id and product_id.
  • Use the DATE_TRUNC and CURRENT_DATE functions to determine the last quarter dynamically.
  • Aggregate order totals by joining OrderItems with Orders and Customers, then use ORDER BY and LIMIT to identify the top customers.

Starter code

CREATE TABLE Customers (
  customer_id SERIAL PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  email VARCHAR(100) UNIQUE NOT NULL,
  created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE TABLE Products (
  product_id SERIAL PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  price DECIMAL(10,2) NOT NULL CHECK (price >= 0)
);

CREATE TABLE Orders (
  order_id SERIAL PRIMARY KEY,
  customer_id INT NOT NULL REFERENCES Customers(customer_id),
  order_date DATE NOT NULL
);

CREATE TABLE OrderItems (
  order_id INT NOT NULL REFERENCES Orders(order_id),
  product_id INT NOT NULL REFERENCES Products(product_id),
  quantity INT NOT NULL CHECK (quantity > 0),
  price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
  PRIMARY KEY (order_id, product_id)
);

Expected output

A query result showing these columns: customer_name, total_amount_spent, total_orders For example: | customer_name | total_amount_spent | total_orders | |---------------|--------------------|--------------| | Alice Smith | 1450.75 | 12 | | Bob Johnson | 1320.40 | 9 | | Carla Evans | 1275.00 | 8 | | Daniel Wu | 1100.20 | 10 | | Emma Brown | 975.95 | 7 |

Core concepts

normalized database designforeign key constraintsadvanced SQL aggregationdate filtering with SQL functions

Challenge a Friend

Send this duel to someone else and see if they can solve it.