SQL Window Functions Demystified with Real Sales Data


SQL window functions are an underrated powerhouse for data manipulation, especially in analytics and reporting. When working with e-commerce data, such as tracking user sales, identifying top products, or calculating cumulative revenue, window functions often give shorter and faster queries than self-joins or correlated subqueries. In this post, we’ll use sample sales data to rank weekly top sellers, calculate running totals, segment users, and compute each product’s share of weekly sales, using at most a CTE or a derived table and no self-joins.

1. Understanding Window Functions: A Quick Introduction

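Window functions perform calculations across a set of rows related to the current row. Unlike aggregate functions combined with GROUP BY, they don’t collapse rows: every input row stays in the result, and the function’s value is added as an extra column. PARTITION BY splits the rows into groups, ORDER BY orders the rows within each group, and the optional frame clause (ROWS BETWEEN ...) limits which of those rows the function sees.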

Here’s the basic syntax of a window function:

SELECT
    column_name,
    function_name(...) OVER (
        PARTITION BY ...
        ORDER BY ...
        ROWS BETWEEN ...
    ) AS new_column
FROM table_name;

They’re incredibly useful for analytical operations like ranking, running totals, and segmentation — let’s explore these through e-commerce data examples.
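To make the "no collapsing" point concrete, here is a minimal sketch in PostgreSQL-style SQL. The inline orders data is made up purely for illustration and is not part of the sales dataset used in the rest of this post.

```sql
-- Hypothetical inline data: three orders from two users.
WITH orders (user_id, amount) AS (
    VALUES (1, 20.00), (1, 30.00), (2, 15.00)
)
SELECT
    user_id,
    amount,
    -- The per-user total is repeated on every row of that user.
    SUM(amount) OVER (PARTITION BY user_id) AS user_total
FROM orders
ORDER BY user_id, amount;
```

This returns all three input rows, each carrying its user's total (50.00 for user 1, 15.00 for user 2). The same SUM with GROUP BY user_id would return only two rows, one per user.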

2. Dataset: Weekly Sales Data

We’ll assume a sample table called sales with the following structure:

CREATE TABLE sales (
    order_id INT,
    user_id INT,
    sale_amount DECIMAL(10, 2),
    product_id INT,
    sale_date DATE
);

To keep the examples simple, we’ll assume the data spans several weeks, with multiple purchases per user and per product.
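If you want to run the queries in this post yourself, a handful of rows is enough. The rows below are invented for illustration only (the IDs, amounts, and dates are assumptions, not real data); they cover two calendar weeks starting on Monday 2024-01-01.

```sql
-- Made-up sample rows: three users, three products, two weeks.
INSERT INTO sales (order_id, user_id, sale_amount, product_id, sale_date) VALUES
    (1, 101, 25.00, 1, DATE '2024-01-01'),
    (2, 101, 40.00, 2, DATE '2024-01-03'),
    (3, 102, 15.50, 1, DATE '2024-01-03'),
    (4, 103, 60.00, 3, DATE '2024-01-05'),
    (5, 101, 10.00, 1, DATE '2024-01-08'),
    (6, 102, 35.00, 2, DATE '2024-01-09'),
    (7, 102, 35.00, 3, DATE '2024-01-10'),
    (8, 103, 80.00, 2, DATE '2024-01-12');
```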

3. Ranking Products by Weekly Sales

Let’s say we want to know the top-selling products each week. We can use the RANK() or DENSE_RANK() window function partitioned by week:

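SELECT
    product_id,
    DATE_TRUNC('week', sale_date) AS week_start,
    SUM(sale_amount) AS total_sales,
    -- Evaluated after GROUP BY, so it ranks the per-product weekly sums.
    RANK() OVER (
        PARTITION BY DATE_TRUNC('week', sale_date)
        ORDER BY SUM(sale_amount) DESC
    ) AS sales_rank  -- named sales_rank to avoid clashing with the RANK keyword
FROM sales
GROUP BY product_id, DATE_TRUNC('week', sale_date)
ORDER BY week_start, sales_rank;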

Explanation:

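  • DATE_TRUNC('week', sale_date) groups sales into weekly buckets. This is PostgreSQL-style syntax, where weeks start on Monday; other databases have their own equivalents.
  • RANK() assigns a rank to each product’s weekly sales in descending order. Window functions are evaluated after GROUP BY, which is why the window can order by SUM(sale_amount).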

Tip: RANK() leaves gaps after ties (1, 1, 3), while DENSE_RANK() does not (1, 1, 2). Use DENSE_RANK() if you want consecutive rank numbers.
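Ranking every product is only half of a "top sellers" report; usually you also want to keep just the top few per week. Window functions cannot be used directly in a WHERE clause, because they are evaluated after WHERE, GROUP BY, and HAVING, so the rank has to be computed in one step and filtered in the next. The sketch below does that with two CTEs; the CTE names and the cutoff of 3 are arbitrary choices for illustration.

```sql
WITH weekly_product_sales AS (
    -- Step 1: one row per product per week.
    SELECT
        product_id,
        DATE_TRUNC('week', sale_date) AS week_start,
        SUM(sale_amount) AS total_sales
    FROM sales
    GROUP BY product_id, DATE_TRUNC('week', sale_date)
),
ranked AS (
    -- Step 2: rank products within each week.
    SELECT
        product_id,
        week_start,
        total_sales,
        DENSE_RANK() OVER (
            PARTITION BY week_start
            ORDER BY total_sales DESC
        ) AS sales_rank
    FROM weekly_product_sales
)
-- Step 3: keep the top three ranks per week.
SELECT product_id, week_start, total_sales, sales_rank
FROM ranked
WHERE sales_rank <= 3
ORDER BY week_start, sales_rank;
```

Because DENSE_RANK() gives tied products the same rank, a week with ties can return more than three rows. Swap in ROW_NUMBER() if you need exactly three rows per week and don't mind ties being broken arbitrarily.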

4. Calculating Running Totals for Each User

To analyze user purchase behavior over time, we can calculate a running total of each user’s purchases:

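SELECT
    user_id,
    sale_date,
    sale_amount,
    SUM(sale_amount) OVER (
        PARTITION BY user_id
        -- order_id breaks ties when a user has several orders on one day,
        -- so the running total is the same on every run.
        ORDER BY sale_date, order_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM sales
ORDER BY user_id, sale_date, order_id;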

This running total shows how each user’s spending accumulates over time, and it is a common input to cohort and lifetime value analysis.

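Performance Tip: On large tables, an index that covers the PARTITION BY column followed by the ORDER BY column, here (user_id, sale_date), can help the database avoid a separate sort step. An index on sale_date alone is unlikely to help this query, because rows must be ordered by user first.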
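As a rough sketch of the indexing idea for the running-total query above (the index name is hypothetical, and whether the planner actually uses the index depends on your database, table size, and statistics):

```sql
-- Hypothetical index name; column order mirrors PARTITION BY, then ORDER BY.
CREATE INDEX idx_sales_user_date ON sales (user_id, sale_date);

-- Inspect the plan to see whether a separate sort step is still needed.
EXPLAIN
SELECT
    user_id,
    sale_date,
    sale_amount,
    SUM(sale_amount) OVER (
        PARTITION BY user_id
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM sales;
```

Compare the plan before and after creating the index rather than assuming it helps; on small tables a sequential scan plus sort is often cheaper.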

5. Segmenting Users Based on Weekly Purchase Behavior

Suppose we want to classify users into frequency buckets — such as occasional, regular, and power users — based on purchase frequency per week.

WITH weekly_user_sales AS (
    SELECT
        user_id,
        DATE_TRUNC('week', sale_date) AS week,
        COUNT(*) AS weekly_orders
    FROM sales
    GROUP BY user_id, DATE_TRUNC('week', sale_date)
)
SELECT *,
    CASE
        WHEN weekly_orders = 1 THEN 'Occasional'
        WHEN weekly_orders BETWEEN 2 AND 4 THEN 'Regular'
        ELSE 'Power'
    END AS user_segment
FROM weekly_user_sales;

We are:

  • Using a common table expression (CTE) to aggregate weekly orders per user.
  • Applying CASE logic to segment users dynamically based on behavior.

Real-world Extension: Instead of fixed thresholds, you could use a window function such as NTILE to split users into equal-sized buckets (quartiles, deciles, and so on) by order count or spend.
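A minimal sketch of the NTILE idea, assuming you want to segment users by total spend rather than weekly order count. The choice of four buckets and the names user_totals and spend_quartile are illustrative, not prescribed.

```sql
WITH user_totals AS (
    -- One row per user with their total spend across all orders.
    SELECT
        user_id,
        SUM(sale_amount) AS total_spent
    FROM sales
    GROUP BY user_id
)
SELECT
    user_id,
    total_spent,
    -- Quartile 1 holds the highest spenders, quartile 4 the lowest.
    NTILE(4) OVER (ORDER BY total_spent DESC) AS spend_quartile
FROM user_totals
ORDER BY spend_quartile, total_spent DESC;
```

NTILE splits rows into buckets of near-equal size, so two users with identical spend can land in different buckets at a boundary. If that matters for your use case, PERCENT_RANK() or CUME_DIST() with explicit thresholds may be a better fit.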

6. Percent of Total Sales Contribution

If you want to understand what share of sales each product contributes within a week, window functions help simplify this:

SELECT
    product_id,
    week,
    total_sales,
    -- NULLIF avoids a division-by-zero error if a week's total is 0.
    ROUND(100 * total_sales / NULLIF(SUM(total_sales) OVER (PARTITION BY week), 0), 2) AS percent_of_weekly_sales
FROM (
    SELECT
        product_id,
        DATE_TRUNC('week', sale_date) AS week,
        SUM(sale_amount) AS total_sales
    FROM sales
    GROUP BY product_id, DATE_TRUNC('week', sale_date)
) AS weekly_totals
ORDER BY week, percent_of_weekly_sales DESC;

Now each product is listed with its share of that week’s total revenue. This is useful for dashboards and for product mix decisions.
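The derived table is not strictly required. Because window functions run after GROUP BY, a window aggregate can wrap a regular aggregate in the same SELECT. The variant below is a sketch of that approach in PostgreSQL-style SQL; it should produce the same result as the query above, though some readers find the nested SUM(SUM(...)) harder to follow.

```sql
SELECT
    product_id,
    DATE_TRUNC('week', sale_date) AS week,
    SUM(sale_amount) AS total_sales,
    ROUND(
        100 * SUM(sale_amount)
            -- Inner SUM: per product and week (from GROUP BY).
            -- Outer SUM ... OVER: adds those sums up across the whole week.
            / NULLIF(
                SUM(SUM(sale_amount)) OVER (PARTITION BY DATE_TRUNC('week', sale_date)),
                0
              ),
        2
    ) AS percent_of_weekly_sales
FROM sales
GROUP BY product_id, DATE_TRUNC('week', sale_date)
ORDER BY week, percent_of_weekly_sales DESC;
```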

7. Performance Considerations and Best Practices

While window functions are powerful, here are a few performance tips to keep in mind:

  • Ensure appropriate indexes exist on PARTITION BY and ORDER BY columns.
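  • State the frame clause explicitly. When ORDER BY is present and no frame is given, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which includes every row that ties with the current row on the ORDER BY value. This can change results and, in some engines, is slower than an equivalent ROWS frame.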
  • Use CTEs to organize long queries and simplify nested logic.
  • Pre-aggregate large datasets in staging tables if performance becomes an issue.
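To see why the frame clause matters, the sketch below compares the default frame with an explicit ROWS frame. The inline data is made up for illustration; it contains two orders on the same date so that the tie-handling difference is visible.

```sql
-- Hypothetical inline data with a tie on sale_date.
WITH orders (sale_date, amount) AS (
    VALUES (DATE '2024-01-01', 10.00),
           (DATE '2024-01-01', 20.00),
           (DATE '2024-01-02', 5.00)
)
SELECT
    sale_date,
    amount,
    -- Default frame (RANGE): tied dates are peers and are summed together.
    SUM(amount) OVER (ORDER BY sale_date) AS default_range_total,
    -- Explicit ROWS frame: strictly row by row.
    SUM(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS rows_total
FROM orders;
```

With the default frame, both 2024-01-01 rows show 30.00 and the 2024-01-02 row shows 35.00. With the ROWS frame, the two 2024-01-01 rows show different values (which row comes first is not guaranteed without a tiebreaker in ORDER BY), and the last row again shows 35.00.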

Conclusion

SQL window functions unlock a set of analytics capabilities that are essential for making sense of sales metrics, user behavior, and product performance. By combining ranking, running totals, and share-of-total calculations, you can answer these questions in a single readable query, usually without giving up performance.

Next time you find yourself reaching for nested subqueries or temporary tables to summarize data, consider whether a window function would do the job.

 
