MySQL - How To Get Top N Rows per Each Group

Question: How to return the 2 largest cities for each country?

Sample Data

Assume we have the following table definition and data:

   CREATE TABLE cities
   (
      city VARCHAR(80),
      country VARCHAR(80),
      population INT
   );
 
   INSERT INTO cities VALUES ('New York', 'United States', 8175133); 
   INSERT INTO cities VALUES ('Los Angeles', 'United States', 3792621); 
   INSERT INTO cities VALUES ('Chicago', 'United States', 2695598); 
 
   INSERT INTO cities VALUES ('Paris', 'France', 2181000);
   INSERT INTO cities VALUES ('Marseille', 'France', 808000);
   INSERT INTO cities VALUES ('Lyon', 'France', 422000);
 
   INSERT INTO cities VALUES ('London', 'United Kingdom',	7825300);
   INSERT INTO cities VALUES ('Birmingham', 'United Kingdom', 1016800);
   INSERT INTO cities VALUES ('Leeds', 'United Kingdom', 770800);

MySQL Query to Get Top 2

To get the 2 largest cities for each country, you can use the following query in MySQL:

   SELECT city, country, population
   FROM
     (SELECT city, country, population, 
                  @country_rank := IF(@current_country = country, @country_rank + 1, 1) AS country_rank,
                  @current_country := country 
       FROM cities
       ORDER BY country, population DESC
     ) ranked
   WHERE country_rank <= 2;

The query returns:

city country population
Paris France 2181000
Marseille France 808000
London United Kingdom 7825300
Birmingham United Kingdom 1016800
New York United States 8175133
Los Angeles United States 3792621

How It Works

Explanation of the MySQL query:

  • Session Variables

Currently MySQL does not support ROW_NUMBER() function that can assign a sequence number within a group, but as a workaround we can use MySQL session variables.

These variables do not require declaration, and can be used in a query to do calculations and to store intermediate results.

@current_country := country

This code is executed for each row and stores the value of country column to @current_country variable.

@country_rank := IF(@current_country = country, @country_rank + 1, 1)

In this code, if @current_country is the same we increment rank, otherwise set it to 1. For the first row @current_country is NULL, so rank is also set to 1.

For correct ranking, we need to have ORDER BY country, population DESC

So if we just execute the subquery:

   SELECT city, country, population, 
       @country_rank := IF(@current_country = country, @country_rank + 1, 1) AS country_rank,
       @current_country := country 
  FROM cities
  ORDER BY country, population DESC

We get the list of cities ranked by population within the country:

city country population country_rank current_country
Paris France 2181000 1 France
Marseille France 808000 2 France
Lyon France 422000 3 France
London United Kingdom 7825300 1 United Kingdom
Birmingham United Kingdom 1016800 2 United Kingdom
Leeds United Kingdom 770800 3 United Kingdom
New York United States 8175133 1 United States
Los Angeles United States 3792621 2 United States
Chicago United States 2695598 3 United States
  • Selecting Range

When we have a rank assigned to each city within its country, we can retrieve the required range:

   -- Get top 2 for each country
   SELECT city, country, population
   FROM (/*subquery above*/) ranked
   WHERE country_rank <= 2;
 
   -- Get the city with 3rd population for each country
   SELECT city, country, population
   FROM (/*subquery above*/) ranked
   WHERE country_rank = 3;

ROW_NUMBER() - Oracle, SQL Server and PostgreSQL

In Oracle, SQL Server and PostgreSQL you can achieve the same functionality using ROW_NUMBER function:

   SELECT city, country, population
   FROM
    (SELECT city, country, population, 
                  ROW_NUMBER() OVER (PARTITION BY country ORDER BY population DESC) as country_rank
      FROM cities) ranked
   WHERE country_rank <= 2;

This query works in Oracle, SQL Server and PostgreSQL without any changes and returns:

city country population
Paris France 2181000
Marseille France 808000
London United Kingdom 7825300
Birmingham United Kingdom 1016800
New York United States 8175133
Los Angeles United States 3792621

Resources