Understanding window functions in PostgreSQL (Part 2 - DENSE_RANK())

We continue our study (and enjoyment!) of PostgreSQL window functions from last time. And in case you didn’t read that one or don’t wish to, here’s a quick motivation for using window functions: they help us write “difficult” queries that would otherwise be inconceivable in basic SQL features!

Find the third-most costly product in a particular category

I don’t know about you, but a requirement like the above makes my heart freeze. You can feel your brain overloaded, whirring through options in a futile attempt to reconcile its basic SQL knowledge into something that could answer the above. But once again, window functions (this time DENSE_RANK()) make the job a whole lot easier.

Before jumping the gun, let’s make sure we understand the schema correctly:

CREATE TABLE product_groups (
    group_id serial PRIMARY KEY,
    group_name VARCHAR (255) NOT NULL
);

CREATE TABLE products (
    product_id serial PRIMARY KEY,
    product_name VARCHAR (255) NOT NULL,
    price DECIMAL (11, 2),
    group_id INT NOT NULL,
    FOREIGN KEY (group_id) REFERENCES product_groups (group_id)
);

We have here a typical products table with just one tiny twist: the category (group) the product belongs to is stored in the product_groups table. This wasn’t strictly needed; it’s just how one of the examples I came across, was arranged. That said, having the information split up like this allows us to apply the window function in the context of a JOIN – which is good learning!

Now, before I introduce the query, I want to explain the window function we’ll be using to solve this problem – DENSE_RANK(). If you can digest the fact that window functions can help you magically create groups (partitions) for a query like the one we’re after, DENSE_RANK() is easy to explain: it works on the rows in a group and assigns them a number (for ranking) based on their order in that group. (Yes, yes, an example is coming; just hold on for a minute!)

A related function is RANK() and its contrast to DENSE_RANK() is best explained through an example – assume we have the following order of products and their prices in a particular category:

Laptop-1    2000
Phone-1     1800
Laptop-2    1800
Laptop-3    1500

then RANK() would rank them as follows:

Laptop-1    2000    1
Phone-1     1800    2
Laptop-2    1800    2
Laptop-3    1500    4

and DENSE_RANK() would rank them like this:

Laptop-1    2000    1
Phone-1     1800    2
Laptop-2    1800    2
Laptop-3    1500    3

See the difference? Both RANK() and DENSE_RANK() assign a rank of 2 to the second-highest price in the group, but RANK() skips as many ranks for the next entry change as there were duplications (so, we go 1, 2, 2, 4) while DENSE_RANK() preserves a strict numeric order (1, 2, 2, 3 – 3 comes after 2 no matter what). Both have their uses, though in our case we need DENSE_RANK() as it fits our definition of finding the third-most costly product in each category.

A DENSE_RANK() example, with a Common Table Expression (CTE) thrown in

Let’s look at the query that’ll solve the given problem: find the third-most costly product in a given category:

WITH rank_data AS (
    SELECT product_name,
           price,
           group_name,
           DENSE_RANK() OVER (
               PARTITION BY group_name
               ORDER BY price
               ) price_rank
    FROM products
             INNER JOIN
         product_groups USING (group_id)
)
SELECT *
FROM rank_data
WHERE group_name = 'Laptop'
  AND price_rank = 3;

Now, let’s not lose our minds! The query is pretty simple; it just has a few parts to it. So, let’s look at these parts one by one:

Yes, that’s all there is to it! Granted that the syntax can throw you off initially, but don’t you agree how simple and powerful these window functions are?! Notice that many of the clauses in this query have nothing to do with window functions: JOIN, CTEs, WHERE, etc., but I’ve used them to create a more meaningful example.

Go ahead and practice a little by making your own similar scenarios. Or perhaps search online and look for some examples that extend the idea into different scenarios. One thing is sure: you’ll now easily understand the sorcery involved and will enjoy it as well!