The provided MySQL query is composed of two parts, each performing a SELECT
operation and combined using the UNION ALL
operator. Let’s break down the query into its components and explain how it works step-by-step.
Query Breakdown
The query retrieves two separate results:
- The user who has rated the most movies.
- The highest-rated movie in February 2020.
Both results are fetched using a UNION ALL
operator, which combines the two result sets into a single output.
First Part: Top User by Number of Movies Rated
sql
(SELECT
Users.name AS results
FROM
MovieRating
INNER JOIN
Users USING (user_id)
GROUP BY user_id , name
ORDER BY COUNT(MovieRating.movie_id) DESC , Users.name
LIMIT 1)
Explanation:
- INNER JOIN MovieRating and Users: This joins the
MovieRating
table with the Users
table based on the user_id
column. It ensures that only the ratings for each user are included in the result set. - GROUP BY user_id, name: The
GROUP BY
clause groups the results by user_id
and name
, which means each row will represent a unique user. - COUNT(MovieRating.movie_id): This counts the number of movies rated by each user. The
COUNT
function is used to aggregate the data. - ORDER BY COUNT(MovieRating.movie_id) DESC, Users.name: The result is ordered first by the number of movies rated (in descending order). In case of a tie (users with the same number of ratings), it further sorts the users alphabetically by their names.
- LIMIT 1: Only the top user (the user who has rated the most movies) is returned. In case of a tie, the user whose name comes first alphabetically is selected.
Result: This part of the query selects the name of the user who has rated the most movies.
Second Part: Top-Rated Movie in February 2020
sql
UNION ALL
(SELECT
Movies.title AS results
FROM
MovieRating
INNER JOIN
Movies USING (movie_id)
WHERE
DATE_FORMAT(created_at, '%Y-%m') = '2020-02'
GROUP BY movie_id, title
ORDER BY AVG(MovieRating.rating) DESC , Movies.title
LIMIT 1)
Explanation:
- INNER JOIN MovieRating and Movies: This joins the
MovieRating
table with the Movies
table based on the movie_id
column. Each row in the result set will contain information about both the movie and its ratings. - WHERE DATE_FORMAT(created_at, '%Y-%m') = '2020-02': The
WHERE
clause filters the results to include only those ratings that were created in February 2020. The DATE_FORMAT
function is used to extract the year and month from the created_at
column. - GROUP BY movie_id, title: The results are grouped by
movie_id
and title
to calculate the average rating for each movie. - AVG(MovieRating.rating): The
AVG
function computes the average rating for each movie. - ORDER BY AVG(MovieRating.rating) DESC, Movies.title: The results are ordered by the average rating of the movies (in descending order). In case of a tie (multiple movies having the same average rating), it is sorted alphabetically by the movie title.
- LIMIT 1: Only the movie with the highest average rating is returned. If there is a tie, the movie whose title comes first alphabetically is selected.
Result: This part of the query retrieves the title of the highest-rated movie in February 2020.
Combining the Results with UNION ALL
sql
UNION ALL
The UNION ALL
operator is used to combine the results from the two separate SELECT
queries into one result set. Unlike UNION
, which removes duplicates, UNION ALL
includes all results, so both parts of the query are simply stacked on top of each other.
Final Output
The final output of the query consists of two rows:
- The name of the user who has rated the most movies.
- The title of the highest-rated movie in February 2020.
Each row is retrieved by the corresponding part of the query, and they are combined into a single result set. The results are listed in the order they are selected, without further sorting applied to the combined data set.
Conclusion
This query is designed to provide two insights from the database:
- The top user based on the number of movies rated.
- The top movie based on ratings from February 2020.
By using UNION ALL
, the query combines the results into a single output, providing a succinct way to display both the top user and top movie in one query.