The Challenge

  • Name: Market Data

  • Description: This challenge uses the same db file as the challenge "PS2 games". I’m curious about the recent market in action games. Find the total revenue generated by the 5th most revenue generating game in the Action genre in the last year (2023). Revenue is equal to the subtotal times quantity for a given order details. Some games have the same name because they're reboots of older games, make sure to group them up by id instead of by title! Do not pay attention to discounts, the order's total_amount (that includes services), or the price of the game itself (I haggle on them anyway). Flag Format: texsaw{x} where x is a number in decimal format precise to two decimal places

  • Category: Miscellaneous, SQL

  • Points: 50

The Solution

  1. With the description, we are also provided an SQL file db.sql

  2. Import the file into MySQL:

    1. > sudo mysql -u root -p
    2. source ./db.sql
  3. Run the following query

    SELECT sum(quantity * subtotal) AS total_revenue
    FROM OrderDetails
    WHERE game_id IN (
    	SELECT game_id
    	FROM Games
    	NATURAL INNER JOIN Genres
    	WHERE genre_name = "Action"
    )
    AND order_id in (
    	SELECT order_id
    	FROM Orders
    	WHERE YEAR(order_date) = 2023
    )
    GROUP BY game_id
    ORDER BY total_revenue DESC
    LIMIT 5;
    
    +---------------+
    | total_revenue |
    +---------------+
    |        434.84 |
    |        320.39 |
    |        297.39 |
    |        258.38 |
    |        247.16 |
    +---------------+
    
  4. The answer is the last row in the column total_revenue, 247.16.

  5. Format the total into the flag texsaw{247.16}

The Steps

Five Second Database Run-Down

  • SQL stands for Structured Query Language and is a language to perform operations on data, commonly used for relational databases.
  • A database contains tables which typically represent a single entity (person, place, thing, etc.) or a relationship between two entities.
  • A table's rows are labelled by attributes, which describe the entity or relationship.
  • Relational databases use keys to form relations.
    • The primary key identifies the entity in a table
    • A foreign key identifies a different entity that's related to the primary
  • There are a lot of rules for data in a relational database but the most important is the primary key can not be null.

Collecting Information

  1. First let's gather they key information from the challenge description:
    • Find the total revenue generated by the 5th most revenue generating game in the Action genre in the last year (2023).
    • Revenue is equal to the subtotal times quantity for a given order details.
    • Some games have the same name because they're reboots of older games, make sure to group them up by id instead of by title!
  2. In addition to the challenge description, we are provided a db.sql file which contains a set of SQL queries. Let's open the file in a text editor to see what's happening:
    1. First, a brand new iteration of the database gamestore is created and switched to.
      -- Create the database
      DROP DATABASE IF EXISTS gamestore;
      CREATE DATABASE IF NOT EXISTS gamestore;
      
      -- Switch to the created database
      USE gamestore;
      
    2. Then a lot of tables inside the gamestore database are created.
      -- Create Clients Table
      CREATE TABLE Clients ...
      ...
      -- Create Wishlists Table
      CREATE TABLE Wishlists ...
      
    3. Finally data is inserted into their respective tables.
      insert into Clients (first_name, last_name, email, phone_number, address) values ('Selie', 'O''Hickee', 'sohickee0@live.com', '6412716528', '85482 8th Point');
      ...
      insert into Wishlists (client_id, game_id, added_date) values (176, 409, '2023-10-08');
      
  3. Now that we know what's happening, let's examine the structure of the important tables using the key information from Step 1. The important tables we need to worry about are Games, Genres, Orders, and OrderDetails.

Finding the Action Games

  1. Examining the Games table:
    CREATE TABLE Games (
    	game_id INT AUTO_INCREMENT PRIMARY KEY,
    	title VARCHAR(255),
    	genre_id INT,
    	release_date DATE,
    	price DECIMAL(10, 2),
    	publisher_id INT,
    	FOREIGN KEY (genre_id) REFERENCES Genres(genre_id),
    	FOREIGN KEY (publisher_id) REFERENCES Publishers(publisher_id)
    );
    
    • When looking through the details of a game the two things that catch my attention is the game_id and the genre_id.
    • We see that the genre_id is a foreign key pointing to the Genres table, and since we know we only want to consider action games in our result, let's look at that next.
  2. Examining the Genres table:
    CREATE TABLE Genres (
    	genre_id INT AUTO_INCREMENT PRIMARY KEY,
    	genre_name VARCHAR(50)
    );
    
    • This table is much simpler than the Games table, showing only the genre_id and the genre_name. I'm curious what genre_id ties to action games:

      SELECT * FROM Genres;
      +----------+--------------+
      | genre_id | genre_name   |
      +----------+--------------+
      |        1 | Action       |
      ...
      
    • The id which ties to action games is 1. Now let's use this table and select only the action games from the Games table.

      SELECT * FROM Games NATURAL INNER JOIN Genres WHERE genre_name = "Action";
      -- or the simpler: SELECT * FROM Games WHERE genre_id=1;
      
      +----------+---------+-------------------------------+--------------+-------+--------------+------------+
      | genre_id | game_id | title                         | release_date | price | publisher_id | genre_name |
      +----------+---------+-------------------------------+--------------+-------+--------------+------------+
      |        1 |       6 | Beneath the Cybersea          | 1988-05-24   | 10.34 |           19 | Action     |
      ...
      44 rows in set (0.001 sec)
      
  3. We can see that there are 44 action games and more importantly, we now have the first key to answering the challenge:
    • SELECT game_id FROM Games NATURAL INNER JOIN Genres WHERE genre_name = "Action";

Finding The Orders in 2023

  1. Examining the Orders table:
    CREATE TABLE Orders (
    	order_id INT AUTO_INCREMENT PRIMARY KEY,
    	client_id INT,
    	order_date DATE,
    	total_amount DECIMAL(10, 2),
    	FOREIGN KEY (client_id) REFERENCES Clients(client_id)
    );
    
    • In this table we see the order_date, which of course is type DATE. Let's select all of the orders which occurred in 2023.
      SELECT * FROM Orders WHERE YEAR(order_date) = 2023;
      
      +----------+-----------+------------+--------------+
      | order_id | client_id | order_date | total_amount |
      +----------+-----------+------------+--------------+
      |        1 |         1 | 2023-06-18 |       112.64 |
      ...
      922 rows in set (0.001 sec)
      
  2. This gives us the second piece of the challenge SELECT order_id FROM Orders WHERE YEAR(order_date) = 2023;

Finding the Game Revenue

  1. Examining the OrderDetails table:

    CREATE TABLE OrderDetails (
    	order_detail_id INT AUTO_INCREMENT PRIMARY KEY,
    	order_id INT,
    	game_id INT,
    	quantity INT,
    	subtotal DECIMAL(10, 2),
    	FOREIGN KEY (order_id) REFERENCES Orders(order_id),
    	FOREIGN KEY (game_id) REFERENCES Games(game_id)
    );
    
    • This is the final piece of the challenge, here we see a foreign key order_id pointing to the Orders table, the foreign key game_id which points to the Games table, the quantity of games sold, and finally the subtotal of the order.
  2. First, let's find only the order details which involved an action game using our first query.

    SELECT *
    FROM OrderDetails
    WHERE game_id IN (
    	SELECT game_id
    	FROM Games
    	NATURAL INNER JOIN Genres
    	WHERE genre_name = "Action"
    );
    
    +-----------------+----------+---------+----------+----------+
    | order_detail_id | order_id | game_id | quantity | subtotal |
    +-----------------+----------+---------+----------+----------+
    |              43 |       43 |       6 |        1 |    67.70 |
    ...
    78 rows in set (0.001 sec)
    
  3. We can see there are a total of 78 rows in OrderDetails which involve action games. Now let's add the revenue

    SELECT *, (quantity * subtotal) AS revenue -- added revenue
    FROM OrderDetails
    WHERE game_id IN (
    	SELECT game_id
    	FROM Games
    	NATURAL INNER JOIN Genres
    	WHERE genre_name = "Action"
    );
    
    +-----------------+----------+---------+----------+----------+---------+
    | order_detail_id | order_id | game_id | quantity | subtotal | revenue |
    +-----------------+----------+---------+----------+----------+---------+
    |              43 |       43 |       6 |        1 |    67.70 |   67.70 |
    ...
    78 rows in set (0.001 sec)
    
  4. Now that we have the revenue for each order detail, we need to combine (or group) the results for each game and sum the total per game.

    SELECT *, sum(quantity * subtotal) AS total_revenue -- added sum
    FROM OrderDetails
    WHERE game_id IN (
    	SELECT game_id
    	FROM Games
    	NATURAL INNER JOIN Genres
    	WHERE genre_name = "Action"
    )
    GROUP BY game_id; -- added group by
    
    +-----------------+----------+---------+----------+----------+---------------+
    | order_detail_id | order_id | game_id | quantity | subtotal | total_revenue |
    +-----------------+----------+---------+----------+----------+---------------+
    |              43 |       43 |       6 |        1 |    67.70 |        141.91 |
    ...
    38 rows in set (0.003 sec)
    
  5. These results sum the revenue for each game and places the result into the total_revenue category. Now that we know the numbers, we need to restrict the results further to only include orders which took place in 2023 using our second query.

    SELECT *, sum(quantity * subtotal) AS total_revenue
    FROM OrderDetails
    WHERE game_id IN (
    	SELECT game_id
    	FROM Games
    	NATURAL INNER JOIN Genres
    	WHERE genre_name = "Action"
    )
    AND order_id in ( -- added the second nested query
    	SELECT order_id
    	FROM Orders
    	WHERE YEAR(order_date) = 2023
    )
    GROUP BY game_id;
    
    +-----------------+----------+---------+----------+----------+---------------+
    | order_detail_id | order_id | game_id | quantity | subtotal | total_revenue |
    +-----------------+----------+---------+----------+----------+---------------+
    |              43 |       43 |       6 |        1 |    67.70 |        141.91 |
    ...
    37 rows in set (0.002 sec)
    
  6. Lastly, we need to sort the results, we can also remove the unneeded columns from the results and just keep the total_revenue. I also limited the rows to 5 since we only need the 5th highest revenue.

    SELECT sum(quantity * subtotal) AS total_revenue -- removed the glob (*)
    FROM OrderDetails
    WHERE game_id IN (
    	SELECT game_id
    	FROM Games
    	NATURAL INNER JOIN Genres
    	WHERE genre_name = "Action"
    )
    AND order_id in (
    	SELECT order_id
    	FROM Orders
    	WHERE YEAR(order_date) = 2023
    )
    GROUP BY game_id
    ORDER BY total_revenue DESC -- added the sort and limit
    LIMIT 5;
    
    +---------------+
    | total_revenue |
    +---------------+
    |        434.84 |
    |        320.39 |
    |        297.39 |
    |        258.38 |
    |        247.16 |
    +---------------+
    
  7. The last item in the column is our answer, format that into the flag and we're done! texsaw{247.16}