Get up to 80 % extra points for free! More info:

Lesson 7 - MySQL Step By Step: Queries Over Multipe Tables (JOIN)

In the previous lesson, MySQL step by step: Data Types and NULL, we went through data types and explained the NULL value. Today, we're going to start with a simple content management system, that will look similarly to the one here at ICT.social. We're going to show queries through multiple tables.

Conceptual model

In the next few lessons, we're going to create kind of a simlified ICT.social database. First of all, let's talk about how it's going to look like. Today, of course, we're going to make just a small part. And because a picture can tell more than a thousand words, let's start right with it:

Conceptual model of a content management system - MySQL Database Step by Step

What you see is a conceptual model. It's created using the UML notation (a graphical language) and in practice, we create these kind of diagrams very often created before we even start writing any code. This way, we can think about what we really need to create.

As we can see, the system user can write comments and articles. The articles belong into sections. So it's a database of a very simple content management system, which you can certainly imagine thanks to ICT.social.

Creating tables and data

Today, we're going to focus on queries through multiple tables. Let's create some tables first. Tables for the users and articles will just be enough for us right now.

Users

Because the CMS user's going to have a different structure than our current user, we're going to create the user table again. Let's drop the current one:

DROP TABLE `user`;

Then we'll create a new table. The user's going to have (besides its id) a nickname, email, and password:

CREATE TABLE `user` (
    `user_id` int AUTO_INCREMENT,
    `nickname` varchar(155),
    `email` varchar(155),
    `password` varchar(255),
    PRIMARY KEY (`user_id`)
);

Let's insert some users right away:

INSERT INTO `user` (`nickname`, `email`, `password`) VALUES
('Michael', '[email protected]', 'dGg#@$DetA53d'),
('David', '[email protected]', '$#fdfgfHBKBKS'),
('Denny', '[email protected]', 'Jmls_aSW2RFss'),
('Emma', '[email protected]', 'fw8QT32qmcsld');

Articles

Articles will be bound to the user who wrote them, that means their author. To make the relationship between the tables, we're going to add a column with the author id to the article table. The id of the user who wrote the article is going to be stored there (it's the primary key from the user table).

We're talking about the 1:N relationship (1 user can have N (multiple) articles and each article belongs exactly to one user). The part (here our article) always stores the id of the whole (the user in this case) where it belongs to.

The article will contain (in addition to its id again) the author id, short description, url, title, content and the publication date. Let's create the article table:

CREATE TABLE `article` (
    `article_id` int AUTO_INCREMENT,
    `author_id` int,
    `description` varchar(155),
    `url` varchar(155),
    `title` varchar(155),
    `content` text,
    `published` datetime,
    PRIMARY KEY (`article_id`)
);

The only thing worth mentioning is the use of the TEXT type for the article content.

Next, we're going to add articles and assign users as authors to them. I took 4 articles from ICT.social, which I greatly shortened and simplified. The query looks like this:

INSERT INTO `article` (`author_id`, `description`, `url`, `title`, `content`, `published`) VALUES
(1, "What's an algorithm? If you don't know, read this article.", 'what-is-an-algorithm', 'Algorithm', "When we talk about algorithms, let's agree on what an algorithm is all about. Simply put, the algorithm is a guide to solving a problem. When we look at it from a human perspective, the algorithm might be a guide to get up in the morning. Even if it sounds simple, it's quite a problem. Computers are machines and they do not think. Therefore, we must describe all the steps of the algorithm. This brings us to the first property of the algorithm - it must be elementary (consisting of a finite number of simple and easy-to-understand steps, i.e. commands). 'Get out of bed' is certainly not an algorithm. 'Open your eyes, take out the blanket, sit down, put your feet on the floor and rise' - this sounds quite detailed and would be a true algorithm. But we'll probably work in IT, so we'll solve problems of sorting elements by their value or of searching for an element based on its content. These are the two basic tasks that computers do most often and which need to be thought through and optimized to take as little time as possible. As another example of algorithm, it could be also to solve a quadratic equation or to solve a sudoku.", '2012-3-21'),
(2, 'Bacteria is a cellular automaton in combination with a game.', 'bacteria-cellular-automaton', 'Bacteria', 'Bacteria is a cellular automaton developed by the British mathematician John Horton Conway in 1970. This whole game is controlled by four simple rules:/n/n
1. A living bacteria with fewer than two living neighbors die./n
2. A living bacteria with more than three living neighbors dies due to overgrowth./n
3. A living bacteria with two or three living neighbors survives without a change to the next generation./n
4. A dead bacteria with exactly three living neighbors becomes alive again./n
With a right initial placement of the bacteria, these seemingly primitive rules can create walking formations, clusters "firing" walking fives, surprisingly complex symmetrical explosions, oscillators (periodically oscillating groups), or the infinite spectacle of how complex and perfect shapes can be created by these four conditions. The entire program is conceived as a game, where your goal is to create a colony living as long as possible.', '2012-2-14'),
(3, 'Cheese Mouse is a relaxing 2D game.', 'cheese-mouse-relaxing-platformer', 'Cheese Mouse', 'Cheese mouse is a 2D game with a "hot island atmosphere" where you control a mouse and you have to get to the cheese. But there are lots of pitfalls and enemies like snakes, rats, piranhas, robots, mummies, and all sorts of creatures. I created the game with several colorful worlds at the elementary school with Veisen and it ended up at the 2nd place in the Bonusweb game competition, where it won $200. It was created in Game Maker through summer holidays, in my childhood, which greatly influenced its graphic design. I like to play it sometimes to relax and for better mood.', '2004-6-22'),
(2, 'Pacman is a remake of the iconic game.', 'pacman-remake', 'Pacman', "This is an absolutely basic version of this game including a level editor, so you can create your own levels. Over time, I'm going to modify it a bit and add some new features, fullscreen and better graphics. The engine of the game will also be the core of my new Geckon man project, which is still in the writing stage.", '2011-6-3');

Queries through multiple tables

Now, we have the articles in the database and the users assigned to them. Let's make a query over these 2 tables, get the articles and join the nicknames of their authors to them. I didn't use the word "join" accidentally, the command to access another table in a query is called JOIN. Let's write the query and then explain it. We're going to write queries on multiple lines to make them more readable:

SELECT `title`, `nickname`
FROM `article`
JOIN `user` ON `author_id` = `user_id`
ORDER BY `nickname`;

The results:

Pacman          David
Bacteria        David
Cheese Mouse    Denny
Algorithm       Michael

In the first line of the SELECT statement, we select the columns as they were all from a single table, simply listing what we're interested in. Since we're selecting articles and attaching the users to them, we select from the article table. To attach data of another table, we use the JOIN clause specifying the table we want to attach, and then the ON clause. ON is similar to WHERE, but it only applies to the attached table and not to the one which we primarily select the data from. In the condition, we define that to every article the user whose user_id is stored in the author_id column should be attached. We ordere the results by the users' nickname. In case we needed just some of the articles, we'd place a WHERE clause before the ORDER BY, as we're used to.

INNER JOIN and OUTER JOIN

The INNER and OUTER JOINs are two types of the JOIN statements. They work the same way and the only difference is what happens when the item, to which the relationship refers to, doesn't exist.

INNER JOIN

If we use just the JOIN keyword in an SQL query, it's considered an INNER JOIN by MySQL. If there's no user with the id that is stored in the article, such an article without a user wouldn't be included in the results. The relationship is unbreakable.

Let's try it. We'll add an article that'll refer to the id of a non-existing user:

INSERT INTO `article` (`author_id`, `description`, `url`, `title`, `content`, `published`) VALUES
(99, 'An article with a non-existent user to test different JOINS.', 'article-without-user', 'Article without an author', 'This article is assigned to a nonexistent user with the ID of 99 and is used to test different types of JOINS in the MySQL database.', '2012-10-21');

The inserted article refers to a user with the user_id equal to 99, who is not in the database. Let's run our SQL query with the JOIN again. Just to be clear, it's better to define that we want the INNER JOIN explicitly:

SELECT `title`, `nickname`
FROM `article`
INNER JOIN `user` ON `author_id` = `user_id`
ORDER BY `nickname`;

The results:

Pacman          David
Bacteria        David
Cheese Mouse    Denny
Algorithm       Michael

The results are still the same, and the article without an author isn't in the results.

LEFT OUTER JOIN

OUTER JOINs allow you to select even those results, that have failed to join due to missing entries. Let's try the LEFT JOIN, which considers the result valid even if the left part of the join condition (article) exists and the right one (the one that's being joined, the user) doesn't. The columns of the part being joined will contain the NULL value:

SELECT `title`, `nickname`
FROM `article`
LEFT JOIN `user` ON `author_id` = `user_id`
ORDER BY `nickname`;

The results:

Article without an author    NULL
Pacman                       David
Bacteria                     David
Cheese Mouse                 Denny
Algorithm                    Michael

We can see that the article's been selected anyway, even the right part (the one being joined, the user) hasn't been selected. Before we join tables, it's a good idea to think about whether the join can fail and what we want to happen in that case. In a real application, this probably shouldn't happen.

RIGHT OUTER JOIN

Similarly, as the LEFT outer JOIN considered the result valid when at least the left part existed, the RIGHT JOIN does the opposite. If there's the user (the right part being joined) and there's no article (the left part), it's still going to be included in the results. Personally, I haven't used this JOIN yet. We already have one such user in the table, it's Emma. Let's try the RIGHT JOIN:

SELECT `title`, `nickname`
FROM `article`
RIGHT JOIN `user` ON `author_id` = `user_id`
ORDER BY `nickname`;

The results:

Pacman          David
Bacteria        David
Cheese Mouse    Denny
NULL            Emma
Algorithm       Michael

As expected, the article without an author disappeared, and the Emma showed up.

We'd certainly find some other JOIN statements in MySQL, but this is enough for our purposes.

Using the WHERE clause

Theoretically, we can avoid using JOINS and use the FROM and WHERE clauses instead. In FROM, we can list multiple tables separated by commas. Then, in WHERE, we specify the condition to join the table. Ideally, the database converts this query to the INNER JOIN first and then processes it.

SELECT `title`, `nickname`
FROM `article`, `user`
WHERE `author_id` = `user_id`
ORDER BY `nickname`;

The results are the same as with INNER JOIN:

Pacman          David
Bacteria        David
Cheese Mouse    Denny
Algorithm       Michael

The disadvantage of joining with the WHERE clause is that we wouldn't be able to perform all kinds of JOINs and in some cases, the queries can be less optimized. We never know how the database will optimize such a query and it will vary according to the used database. Let's consider this approach rather a curiosity than for a real use.

In the next lesson, MySQL Step By Step: More Queries And the M:N Relationship, we'll continue with queries through multiple tables and add another database part of our content management system.


 

Previous article
MySQL step by step: Data Types and NULL
All articles in this section
MySQL Database Step by Step
Skip article
(not recommended)
MySQL Step By Step: More Queries And the M:N Relationship
Article has been written for you by David Capka Hartinger
Avatar
User rating:
No one has rated this quite yet, be the first one!
The author is a programmer, who likes web technologies and being the lead/chief article writer at ICT.social. He shares his knowledge with the community and is always looking to improve. He believes that anyone can do what they set their mind to.
Unicorn university David learned IT at the Unicorn University - a prestigious college providing education on IT and economics.
Activities