Getting Freaky with MySQL Joins

Posted Dec 26, 2016
Tagged

One of the most powerful things about relational databases like mySQL is JOINing tables. Let's discuss writing different types of JOINS in mySQL. We've constructed a mini database for a fictional app called LewbowskiIn, basically LinkedIn, but only for characters in The Big Lebowski. You don't need to watch the movie, but it's recommended.

Our Mini Database

skills

skill_id
skill_name
1 Making White Russians
2 Marijuana Cultivation
3 Home Security
4 Dog boarding
5 Janitorial Services
6 Blow Jobs

user_skills

skill_id user_id
1 1
2 1
3 2
4 2
6 4
6 5

users

user_id first_name last_name
1 Jeffrey Lebwoski
2 Walter Sobchak
3 Donny Kerabatsos
4 Bunny Knutson

 

Basic INNER JOIN 

The user_skills table maps users to different skills. We want to join the user table to the user_skills table. We do this with the JOIN statement, also called INNER JOIN. They are the same thing. The important part of the JOIN statement is the ON clause. If the ON clause isn't included you'll get back every possible combination of the tables you're joining, also known as the cartesian product. In this example we only get back the rows where there is a match in both the users table and the user_skills table.

INNER JOIN Statement

mySQL Inner Join

QUERY:

SELECT * FROM users
INNER JOIN user_skills
ON users.user_id = user_skills.user_id;

Result:

user_id first_name last_name skill_id user_id
4 Bunny Knutson 6 4
2 Walter Sobchak 4 2
2 Walter Sobchak 3 2
1 Jeffrey Lebwoski 1 1
1 Jeffrey Lebwoski 2 1

INNER JOIN Analysis:

The returned results only contain row combinations where there is a matching user_id in each table. Since Walter and Jeffrey have two skills each, their names appear twice. An INNER JOIN will only return elements that exist IN each table, that's something you need to remember, in case you were wondering what the fuck happened to Donny. Not only is he out of his element, he's also outside the scope of the statement. If we want Donny back we need to do what's called an OUTER JOIN.

The OUTER JOIN

There are two types of outer joins, the LEFT OUTER JOIN and
RIGHT OUTER JOIN , also known as LEFT JOIN and RIGHT JOIN respectively. An outer join will include rows that don't have a common column value in each table, however it depends how the query is structured. If we do the following RIGHT JOIN statement we're joining the table on the right (user_skills) to the table on the left (users). Since user_skills does not contain Donny's user_id, he'll still be missing.

RIGHT JOIN Explained

mySQL Right Join

QUERY:

SELECT * FROM users
RIGHT JOIN user_skills
ON users.user_id = user_skills.user_id;

Results:

user_id first_name last_name skill_id user_id
1 Jeffrey Lebwoski 1 1
1 Jeffrey Lebwoski 2 1
2 Walter Sobchak 3 2
2 Walter Sobchak 4 2
4 Bunny Knutson 6 4
NULL NULL NULL 7 8

LEFT JOIN Statement:

But what if we want all the users returned, whether or not they have skills? We need a LEFT JOIN statement that will return all rows in the left table (users) whether or not they exist in the right table (user_skills). Now we've got Donny back, nobody really knows what Donny does, so he has no skills.

mySQL Left Join

QUERY:

SELECT * FROM users
LEFT JOIN user_skills
WHERE users.user_id = user_skills.user_id;

Result:

user_id first_name last_name skill_id user_id
1 Jeffrey Lebwoski 1 1
1 Jeffrey Lebwoski 2 1
2 Walter Sobchak 3 2
2 Walter Sobchak 4 2
3 Donny Kerabatsos NULL NULL
4 Bunny Knutson 6 4

 

We now see Donny in the results with a value of NULL for skill_id and user_id. This is because the user table was joined to user_skills table, so we get all the rows from users no matter what. One thing to note is that the right join could have returned a similar result to the left join if we just flipped the tables like so:

Flipped Tables Statement:

SELECT * FROM users
RIGHT JOIN user_skills
ON user_skills.user_id = users.user_id;

This statement will return the same results as the previous left join statement. 

Double join

You probably noticed those last results weren't super useful because they only returned the skill_id and not the actual skill name. Let's get freaky and create a double join statement.

SELECT * FROM users
LEFT JOIN user_skills
ON users.user_id = user_skills.user_id
INNER JOIN skills
ON user_skills.skill_id = skills.skill_id;

 

This statement joins all three tables so we get a result that also contains the skill name as shown here:

user_id first_name last_name skill_id user_id skill_id skill_name
1 Jeffrey Lebwoski 1 1 1 Making White Russians
1 Jeffrey Lebwoski 2 1 2 Marijuana Cultivation
2 Walter Sobchak 3 2 3 Home Security
2 Walter Sobchak 4 2 4 Dog Boarding
4 Bunny Knutson 6 4 6 Blow Jobs

Let's clean this up a little by specifying which columns we want returned. Also notice that we need to alias the users table to help us specify the user_id. If we didn't prefix the users table FROM users u we wouldn't be able to access the user_id column. The user_id column isn't unique, it exists in both the users and user_skills tables so our join statement needs to know which table we are referencing by aliasing the users table with a u.

QUERY:

SELECT u.user_id, first_name, last_name, skill_name
FROM users u
LEFT JOIN user_skills
ON u.user_id = user_skills.user_id
INNER JOIN skills
ON user_skills.skill_id = skills.skill_id;

Result: 

user_id first_name last_name skill_name
1 Jeffrey Lebwoski Making White Russians
1 Jeffrey Lebwoski Marijuana Cultivation
2 Walter Sobchak Home Security
2 Walter Sobchak Dog Boarding
4 Bunny Knutson Blow Jobs

 

In the double join query think about the statement in two steps, the first join creates a virtual table matching the results of our earlier examples, then the next join statement joins the next table to the one created by the first join. Virtual tables are an important concept to grasp when learning mySQL.

That's it for this lesson! We broke down MySQL Joins, one of the most important concepts to master in mySQL. Stay tuned for a new lesson on mySQL from us!