sql - left - outer join example




What is the difference between “INNER JOIN” and “OUTER JOIN”? (20)

Also how do LEFT JOIN, RIGHT JOIN and FULL JOIN fit in?


What is the difference between “INNER JOIN” and “OUTER JOIN”?

They are the most commonly used existential operators in SQL, where INNER JOIN is used for 'exists' and LEFT OUTER JOIN is used for 'does not exist'.

Consider these queries:

users who have posted and have votes
users who have posted but have no badges

People who look for set-based solutions (an industry term) would recognise the respective queries as:

users who have posted INTERSECT users who have votes
users who have posted MINUS users who have badges

Translating these into standard SQL:

SELECT UserId FROM Posts
INTERSECT 
SELECT UserId FROM Votes;

SELECT UserId FROM Posts
EXCEPT 
SELECT UserId FROM Badges;

Others will think along similar lines of set inclusion:

users who have posted and IN the set of users who have votes
users who have posted and NOT IN the set of users who have badges

Translating these into standard SQL:

SELECT UserId 
  FROM Posts
 WHERE UserId IN ( SELECT UserId FROM Votes );

SELECT UserId 
  FROM Posts
 WHERE UserId NOT IN ( SELECT UserId FROM Badges );

Some will think in terms of 'existance' within sets e.g.

users who have posted and EXIST in the set of users who have votes
users who have posted and do NOT EXIST in the set of users who have badges

Translating these into standard SQL (note we now need to use range variables i.e. p, v, b):

SELECT p.UserId 
  FROM Posts p
 WHERE EXISTS ( SELECT *
                  FROM Votes v
                 WHERE v.UserId = p.UserId );

SELECT p.UserId 
  FROM Posts p
 WHERE NOT EXISTS ( SELECT *
                      FROM Badges b
                     WHERE b.UserId = p.UserId );

However, I've found that the "industry standard" approach is to exclusively use joins. I don't know what the thinking is here (Law of the Instrument? Premature optimization?), so I'll go straight to the syntax:

SELECT p.UserId 
  FROM Posts p
       INNER JOIN Votes v ON v.UserId = p.UserId;

SELECT p.UserId 
  FROM Posts p
       LEFT JOIN Badges b ON b.UserId = p.UserId
 WHERE b.UserId IS NULL;

Things to note:

  • The only projection is from Users but we still need all those range variables (p, v, b) for search conditions.
  • The UserId IS NULL search condition 'belongs' to the the OUTER JOIN but is disconnected in the query.
  • LEFT is the industry standard: professionals will rewrite a query to avoid using RIGHT!
  • The OUTER keyword from LEFT OUTER JOIN is omitted.

Closing remarks:

Sometimes joins are used in queries solely to determine whether values exist or do not exists in another set. Learn to look carefully at the attributes being projected (the columns in the SELECT clause): if there are none from the joined table then they are simply being used as existential operators. Additionally for outer join, look for instances of <key_column> IS NULL in the WHERE clause.


Inner Join

Retrieve the matched rows only, that is, A intersect B.

SELECT *
FROM dbo.Students S
INNER JOIN dbo.Advisors A
    ON S.Advisor_ID = A.Advisor_ID

Left Outer Join

Select all records from the first table, and any records in the second table that match the joined keys.

SELECT *
FROM dbo.Students S
LEFT JOIN dbo.Advisors A
    ON S.Advisor_ID = A.Advisor_ID

Full Outer Join

Select all records from the second table, and any records in the first table that match the joined keys.

SELECT *
FROM dbo.Students S
FULL JOIN dbo.Advisors A
    ON S.Advisor_ID = A.Advisor_ID

References


  • INNER JOIN most typical join for two or more tables. It returns data match on both table ON primarykey and forignkey relation.
  • OUTER JOIN is same as INNER JOIN, but it also include NULL data on ResultSet.
    • LEFT JOIN = INNER JOIN + Unmatched data of left table with Null match on right table.
    • RIGHT JOIN = INNER JOIN + Unmatched data of right table with Null match on left table.
    • FULL JOIN = INNER JOIN + Unmatched data on both right and left tables with null matches.
  • Self join is not a keyword in SQL, when a table references data in itself knows as self join. Using INNER JOIN and OUTER JOIN we can write self join queries.

For example:

SELECT * 
FROM   tablea a 
       INNER JOIN tableb b 
               ON a.primary_key = b.foreign_key 
       INNER JOIN tablec c 
               ON b.primary_key = c.foreign_key 


INNER JOIN

An inner join produces a result set that is limited to the rows where there is a match in both tables for what we're looking for. If you don't know which kind of join you need, this will usually be your best bet.

LEFT OUTER JOIN

A left outer join, or left join, results in a set where all of the rows from the first, or left hand side, table are preserved. The rows from the second, or right hand side table only show up if they have a match with the rows from the first table. Where there are values from the left table but not from the right, the table will read null, which means that the value has not been set.

RIGHT OUTER JOIN

A right outer join, or right join, is the same as a left join, except the roles are reversed. All of the rows from the right hand side table show up in the result, but the rows from the table on the left are only there if they match the table on the right. Empty spaces are null, just like with the the left join.

FULL OUTER JOIN

A full outer join, or just outer join, produces a result set with all of the rows of both tables, regardless of whether there are any matches. Similarly to the left and right joins, we call the empty spaces null.

For More Reference


Inner Join An inner join focuses on the commonality between two tables. When using an inner join, there must be at least some matching data between two (or more) tables that are being compared. An inner join searches tables for matching or overlapping data. Upon finding it, the inner join combines and returns the information into one new table.

Outer Join An outer join returns a set of records (or rows) that include what an inner join would return but also includes other rows for which no corresponding match is found in the other table.

There are three types of outer joins:

Left Outer Join (or Left Join) Right Outer Join (or Right Join) Full Outer Join (or Full Join) Each of these outer joins refers to the part of the data that is being compared, combined, and returned. Sometimes nulls will be produced in this process as some data is shared while other data is not.


Joins are used to combine the data from two tables, with the result being a new, temporary table. Joins are performed based on something called a predicate, which specifies the condition to use in order to perform a join. The difference between an inner join and an outer join is that an inner join will return only the rows that actually match based on the join predicate. Lets consider Employee and Location table:

Inner Join:- Inner join creates a new result table by combining column values of two tables (Employee and Location) based upon the join-predicate. The query compares each row of Employee with each row of Location to find all pairs of rows which satisfy the join-predicate. When the join-predicate is satisfied by matching non-NULL values, column values for each matched pair of rows of Employee and Location are combined into a result row. Here’s what the SQL for an inner join will look like:

select  * from employee inner join location on employee.empID = location.empID
OR
select  * from employee, location where employee.empID = location.empID

Now, here is what the result of running that SQL would look like:

Outer Join:- An outer join does not require each record in the two joined tables to have a matching record. The joined table retains each record—even if no other matching record exists. Outer joins subdivide further into left outer joins and right outer joins, depending on which table's rows are retained (left or right).

Left Outer Join:- The result of a left outer join (or simply left join) for tables Employee and Location always contains all records of the "left" table (Employee), even if the join-condition does not find any matching record in the "right" table (Location). Here is what the SQL for a left outer join would look like, using the tables above:

select  * from employee left outer join location on employee.empID = location.empID;
//Use of outer keyword is optional

Now, here is what the result of running this SQL would look like:

Right Outer Join:- A right outer join (or right join) closely resembles a left outer join, except with the treatment of the tables reversed. Every row from the "right" table (Location) will appear in the joined table at least once. If no matching row from the "left" table (Employee) exists, NULL will appear in columns from Employee for those records that have no match in Location. This is what the SQL looks like:

select * from employee right outer join location  on employee.empID = location.empID;
//Use of outer keyword is optional

Using the tables above, we can show what the result set of a right outer join would look like:

Full Outer Joins:- Full Outer Join or Full Join is to retain the nonmatching information by including nonmatching rows in the results of a join, use a full outer join. It includes all rows from both tables, regardless of whether or not the other table has a matching value.

Image Source

MySQL 8.0 Reference Manual - Join Syntax

Oracle Join operations


Simplest Definitions

Inner Join: Returns matched records from both tables.

Full Outer Join: Returns matched and unmatched records from both tables with null for unmatched records from Both Tables.

Left Outer Join: Returns matched and unmatched records only from table on Left Side.

Right Outer Join: Returns matched and unmatched records only from table on Right Side.

In-Short

Matched + Left Unmatched + Right Unmatched = Full Outer Join

Matched + Left Unmatched = Left Outer Join

Matched + Right Unmatched = Right Outer Join

Matched = Inner Join



Assuming you're joining on columns with no duplicates, which is a very common case:

  • An inner join of A and B gives the result of A intersect B, i.e. the inner part of a Venn diagram intersection.

  • An outer join of A and B gives the results of A union B, i.e. the outer parts of a Venn diagram union.

Examples

Suppose you have two tables, with a single column each, and data as follows:

A    B
-    -
1    3
2    4
3    5
4    6

Note that (1,2) are unique to A, (3,4) are common, and (5,6) are unique to B.

Inner join

An inner join using either of the equivalent queries gives the intersection of the two tables, i.e. the two rows they have in common.

select * from a INNER JOIN b on a.a = b.b;
select a.*, b.*  from a,b where a.a = b.b;

a | b
--+--
3 | 3
4 | 4

Left outer join

A left outer join will give all rows in A, plus any common rows in B.

select * from a LEFT OUTER JOIN b on a.a = b.b;
select a.*, b.*  from a,b where a.a = b.b(+);

a |  b
--+-----
1 | null
2 | null
3 |    3
4 |    4

Right outer join

A right outer join will give all rows in B, plus any common rows in A.

select * from a RIGHT OUTER JOIN b on a.a = b.b;
select a.*, b.*  from a,b where a.a(+) = b.b;

a    |  b
-----+----
3    |  3
4    |  4
null |  5
null |  6

Full outer join

A full outer join will give you the union of A and B, i.e. all the rows in A and all the rows in B. If something in A doesn't have a corresponding datum in B, then the B portion is null, and vice versa.

select * from a FULL OUTER JOIN b on a.a = b.b;

 a   |  b
-----+-----
   1 | null
   2 | null
   3 |    3
   4 |    4
null |    6
null |    5

I don't see much details about performance and optimizer in the other answers.

Sometimes it is good to know that only INNER JOIN is associative which means the optimizer has the most option to play with it. It can reorder the join order to make it faster keeping the same result. The optimizer can use the most join modes.

Generally it is a good practice to try to use INNER JOIN instead of the different kind of joins. (Of course if it is possible considering the expected result set.)

There are a couple of good examples and explanation here about this strange associative behavior:


I recommend Jeff's blog article. The best description I've ever seen, plus there is a visualization, e.g.:

Inner Join:

Full Outer Join:


In simple words :

Inner join -> Take ONLY common records from parent and child tables WHERE primary key of Parent table matches Foreign key in Child table.

Left join ->

pseudo code

1.Take All records from left Table
2.for(each record in right table,) {
    if(Records from left & right table matching on primary & foreign key){
       use their values as it is as result of join at the right side for 2nd table.
    } else {
       put value NULL values in that particular record as result of join at the right side for 2nd table.
    }
  }

Right join : Exactly opposite of left join . Put name of table in LEFT JOIN at right side in Right join , you get same output as LEFT JOIN.

Outer join : Show all records in Both tables No matter what. If records in Left table are not matching to right table based on Primary , Forieign key , use NULL value as result of join .

Example :

Lets assume now for 2 tables

1.employees , 2.phone_numbers_employees

employees : id , name 

phone_numbers_employees : id , phone_num , emp_id   

Here , employees table is Master table , phone_numbers_employees is child table(it contains emp_id as foreign key which connects employee.id so its child table.)

Inner joins

Take the records of 2 tables ONLY IF Primary key of employees table(its id) matches Foreign key of Child table phone_numbers_employees(emp_id).

So query would be :

SELECT e.id , e.name , p.phone_num FROM employees AS e INNER JOIN phone_numbers_employees AS p ON e.id = p.emp_id;

Here take only matching rows on primary key = foreign key as explained above.Here non matching rows on primary key = foreign key are skipped as result of join.

Left joins :

Left join retains all rows of the left table, regardless of whether there is a row that matches on the right table.

SELECT e.id , e.name , p.phone_num FROM employees AS e LEFT JOIN phone_numbers_employees AS p ON e.id = p.emp_id;

Outer joins :

SELECT e.id , e.name , p.phone_num FROM employees AS e OUTER JOIN phone_numbers_employees AS p ON e.id = p.emp_id;

Diagramatically it looks like :


In simple words:

An inner join retrieve the matched rows only.

Whereas an outer join retrieve the matched rows from one table and all rows in other table ....the result depends on which one you are using:

  • Left: Matched rows in the right table and all rows in the left table

  • Right: Matched rows in the left table and all rows in the right table or

  • Full: All rows in all tables. It doesn't matter if there is a match or not


The Venn diagrams don't really do it for me.

They don't show any distinction between a cross join and an inner join, for example, or more generally show any distinction between different types of join predicate or provide a framework for reasoning about how they will operate.

There is no substitute for understanding the logical processing and it is relatively straightforward to grasp anyway.

  1. Imagine a cross join.
  2. Evaluate the on clause against all rows from step 1 keeping those where the predicate evaluates to true
  3. (For outer joins only) add back in any outer rows that were lost in step 2.

(NB: In practice the query optimiser may find more efficient ways of executing the query than the purely logical description above but the final result must be the same)

I'll start off with an animated version of a full outer join. Further explanation follows.


Explanation

Source Tables

First start with a CROSS JOIN (AKA Cartesian Product). This does not have an ON clause and simply returns every combination of rows from the two tables.

SELECT A.Colour, B.Colour FROM A CROSS JOIN B

Inner and Outer joins have an "ON" clause predicate.

  • Inner Join. Evaluate the condition in the "ON" clause for all rows in the cross join result. If true return the joined row. Otherwise discard it.
  • Left Outer Join. Same as inner join then for any rows in the left table that did not match anything output these with NULL values for the right table columns.
  • Right Outer Join. Same as inner join then for any rows in the right table that did not match anything output these with NULL values for the left table columns.
  • Full Outer Join. Same as inner join then preserve left non matched rows as in left outer join and right non matching rows as per right outer join.

Some examples

SELECT A.Colour, B.Colour FROM A INNER JOIN B ON A.Colour = B.Colour

The above is the classic equi join.

Animated Version

SELECT A.Colour, B.Colour FROM A INNER JOIN B ON A.Colour NOT IN ('Green','Blue')

The inner join condition need not necessarily be an equality condition and it need not reference columns from both (or even either) of the tables. Evaluating A.Colour NOT IN ('Green','Blue') on each row of the cross join returns.

SELECT A.Colour, B.Colour FROM A INNER JOIN B ON 1 =1

The join condition evaluates to true for all rows in the cross join result so this is just the same as a cross join. I won't repeat the picture of the 16 rows again.

SELECT A.Colour, B.Colour FROM A LEFT OUTER JOIN B ON A.Colour = B.Colour

Outer Joins are logically evaluated in the same way as inner joins except that if a row from the left table (for a left join) does not join with any rows from the right hand table at all it is preserved in the result with NULL values for the right hand columns.

SELECT A.Colour, B.Colour FROM A LEFT OUTER JOIN B ON A.Colour = B.Colour WHERE B.Colour IS NULL

This simply restricts the previous result to only return the rows where B.Colour IS NULL. In this particular case these will be the rows that were preserved as they had no match in the right hand table and the query returns the single red row not matched in table B. This is known as an anti semi join.

It is important to select a column for the IS NULL test that is either not nullable or for which the join condition ensures that any NULL values will be excluded in order for this pattern to work correctly and avoid just bringing back rows which happen to have a NULL value for that column in addition to the un matched rows.

SELECT A.Colour, B.Colour FROM A RIGHT OUTER JOIN B ON A.Colour = B.Colour

Right outer joins act similarly to left outer joins except they preserve non matching rows from the right table and null extend the left hand columns.

SELECT A.Colour, B.Colour FROM A FULL OUTER JOIN B ON A.Colour = B.Colour

Full outer joins combine the behaviour of left and right joins and preserve the non matching rows from both the left and the right tables.

SELECT A.Colour, B.Colour FROM A FULL OUTER JOIN B ON 1 = 0

No rows in the cross join match the 1=0 predicate. All rows from both sides are preserved using normal outer join rules with NULL in the columns from the table on the other side.

SELECT COALESCE(A.Colour, B.Colour) AS Colour FROM A FULL OUTER JOIN B ON 1 = 0

With a minor amend to the preceding query one could simulate a UNION ALL of the two tables.

SELECT A.Colour, B.Colour FROM A LEFT OUTER JOIN B ON A.Colour = B.Colour WHERE B.Colour = 'Green'

Note that the WHERE clause (if present) logically runs after the join. One common error is to perform a left outer join and then include a WHERE clause with a condition on the right table that ends up excluding the non matching rows. The above ends up performing the outer join...

... And then the "Where" clause runs. NULL= 'Green' does not evaluate to true so the row preserved by the outer join ends up discarded (along with the blue one) effectively converting the join back to an inner one.

If the intention was to include only rows from B where Colour is Green and all rows from A regardless the correct syntax would be

SELECT A.Colour, B.Colour FROM A LEFT OUTER JOIN B ON A.Colour = B.Colour AND B.Colour = 'Green'

SQL Fiddle

See these examples run live at SQLFiddle.com.


The answer is in the meaning of each one, so in the results.

Note :
In SQLite there is no RIGHT OUTER JOIN or FULL OUTER JOIN.
And also in MySQL there is no FULL OUTER JOIN.

My answer is based on above Note.

When you have two tables like these:

--[table1]               --[table2]
id | name                id | name
---+-------              ---+-------
1  | a1                  1  | a2
2  | b1                  3  | b2

CROSS JOIN / OUTER JOIN :
You can have all of those tables data with CROSS JOIN or just with , like this:

SELECT * FROM table1, table2
--[OR]
SELECT * FROM table1 CROSS JOIN table2

--[Results:]
id | name | id | name 
---+------+----+------
1  | a1   | 1  | a2
1  | a1   | 3  | b2
2  | b1   | 1  | a2
2  | b1   | 3  | b2

INNER JOIN :
When you want to add a filter to above results based on a relation like table1.id = table2.id you can use INNER JOIN:

SELECT * FROM table1, table2 WHERE table1.id = table2.id
--[OR]
SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id

--[Results:]
id | name | id | name 
---+------+----+------
1  | a1   | 1  | a2

LEFT [OUTER] JOIN :
When you want to have all rows of one of tables in the above result -with same relation- you can use LEFT JOIN:
(For RIGHT JOIN just change place of tables)

SELECT * FROM table1, table2 WHERE table1.id = table2.id 
UNION ALL
SELECT *, Null, Null FROM table1 WHERE Not table1.id In (SELECT id FROM table2)
--[OR]
SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id

--[Results:]
id | name | id   | name 
---+------+------+------
1  | a1   | 1    | a2
2  | b1   | Null | Null

FULL OUTER JOIN :
When you also want to have all rows of the other table in your results you can use FULL OUTER JOIN:

SELECT * FROM table1, table2 WHERE table1.id = table2.id
UNION ALL
SELECT *, Null, Null FROM table1 WHERE Not table1.id In (SELECT id FROM table2)
UNION ALL
SELECT Null, Null, * FROM table2 WHERE Not table2.id In (SELECT id FROM table1)
--[OR] (recommended for SQLite)
SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id
UNION ALL
SELECT * FROM table2 LEFT JOIN table1 ON table2.id = table1.id
WHERE table1.id IS NULL
--[OR]
SELECT * FROM table1 FULL OUTER JOIN table2 On table1.id = table2.id

--[Results:]
id   | name | id   | name 
-----+------+------+------
1    | a1   | 1    | a2
2    | b1   | Null | Null
Null | Null | 3    | b2

Well, as your need you choose each one that covers your need ;).


The difference is in the way tables are joined if there are no common records.

JOIN is same as INNER JOIN and means to only show records common to both tables. Whether the records are common is determined by the fields in join clause.

For example:

SELECT * 
FROM t1
JOIN t2 on t1.ID = t2.ID

It means show only records where the same ID value exists in both tables.

LEFT JOIN is same as LEFT OUTER JOIN and means to show all records from left table (i.e. the one that precedes in SQL statement) regardless of the existence of matching records in the right table.

RIGHT JOIN is same as RIGHT OUTER JOIN and means opposite of LEFT JOIN, i.e. shows all records from the second (right) table and only matching records from first (left) table.


The difference is in the way tables are joined if there are no common records.

  • JOIN is same as INNER JOIN and means to only show records common to both tables. Whether the records are common is determined by the fields in join clause. For example:

    FROM t1
    JOIN t2 on t1.ID = t2.ID
    

    means show only records where the same ID value exists in both tables.

  • LEFT JOIN is same as LEFT OUTER JOIN and means to show all records from left table (i.e. the one that precedes in SQL statement) regardless of the existance of matching records in the right table.

  • RIGHT JOIN is same as RIGHT OUTER JOIN and means opposite of LEFT JOIN, i.e. shows all records from the second (right) table and only matching records from first (left) table.

Source: What's the difference between LEFT, RIGHT, INNER, OUTER, JOIN?


The precise algorithm for INNER JOIN, LEFT/RIGHT OUTER JOIN are as following:

  1. Take each row from the first table: a
  2. Consider all rows from second table beside it: (a, b[i])
  3. Evaluate the ON ... clause against each pair: ON( a, b[i] ) = true/false?
    • When the condition evaluates to true, return that combined row (a, b[i]).
    • When reach end of second table without any match, and this is an Outer Join then return a (virtual) pair using Null for all columns of other table: (a, Null) for LEFT outer join or (Null, b) for RIGHT outer join. This is to ensure all rows of first table exists in final results.

Note: the condition specified in ON clause could be anything, it is not required to use Primary Keys (and you don't need to always refer to Columns from both tables)! For example:

  • ... ON T1.title = T2.title AND T1.version < T2.version ( => see this post as a sample usage: Select only rows with max value on a column)
  • ... ON T1.y IS NULL
  • ... ON 1 = 0 (just as sample)


Note: Left Join = Left Outer Join, Right Join = Right Outer Join.


You use INNER JOIN to return all rows from both tables where there is a match. i.e. In the resulting table all the rows and columns will have values.

In OUTER JOIN the resulting table may have empty columns. Outer join may be either LEFT or RIGHT.

LEFT OUTER JOIN returns all the rows from the first table, even if there are no matches in the second table.

RIGHT OUTER JOIN returns all the rows from the second table, even if there are no matches in the first table.





outer-join