outer - sql join with where clause example




SQL join: where clause vs. on clause (10)

After reading it, this is not a duplicate of Explicit vs Implicit SQL Joins. The answer may be related (or even the same) but the question is different.


What is the difference and what should go in each?

If I understand the theory correctly, the query optimizer should be able to use both interchangeably.


This article clearly explains the difference. It also explains the "ON joined_condition vs WHERE joined_condition or joined_alias is null".

The WHERE clause filters both the left and the right side of the JOIN, while the ON clause will always filter the right side only.

  1. If you always want to fetch the left side rows and only JOIN if some condition matches then you should the ON clause.
  2. If you want to filter the product of joining both sides, then you should use the WHERE clause.

For an inner join, WHERE and ON can be used interchangeably. In fact, it's possible to use ON in a correlated subquery. For example:

update mytable
set myscore=100
where exists (
select 1 from table1
inner join table2
on (table2.key = mytable.key)
inner join table3
on (table3.key = table2.key and table3.key = table1.key)
...
)

This is (IMHO) utterly confusing to a human, and it's very easy to forget to link table1 to anything (because the "driver" table doesn't have an "on" clause), but it's legal.


In SQL, the 'WHERE' and 'ON' clause,are kind of Conditional Statemants, but the major difference between them are, the 'Where' Clause is used in Select/Update Statements for specifying the Conditions, whereas the 'ON' Clause is used in Joins, where it verifies or checks if the Records are Matched in the target and source tables, before the Tables are Joined

For Example: - 'WHERE'

SELECT * FROM employee WHERE employee_id=101

For Example: - 'ON'

*There are two tables employee and employee_details, the matching columns are employee_id.*

SELECT * FROM employee INNER JOIN employee_details ON employee.employee_id=employee_details.employee_id

Hope I have answered your Question.Revert back for clarifications.


In terms of the optimizer, it shouldn't make a difference whether you define your join clauses with ON or WHERE.

However, IMHO, I think it's much clearer to use the ON clause when performing joins. That way you have a specific section of you query that dictates how the join is handled versus intermixed with the rest of the WHERE clauses.


On INNER JOINs they are interchangeable, and the optimizer will rearrange them at will.

On OUTER JOINs, they are not necessarily interchangeable, depending on which side of the join they depend on.

I put them in either place depending on the readability.


On an inner join, they mean the same thing. However you will get different results in an outer join depending on if you put the join condition in the WHERE vs the ON clause. Take a look at this related question and this answer (by me).

I think it makes the most sense to be in the habit of always putting the join condition in the ON clause (unless it is an outer join and you actually do want it in the where clause) as it makes it clearer to anyone reading your query what conditions the tables are being joined on, and also it helps prevent the WHERE clause from being dozens of lines long.


There is great difference between where clause vs. on clause, when it comes to left join.

Here is example:

mysql> desc t1; 
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   |     | NULL    |       |
| fid   | int(11)     | NO   |     | NULL    |       |
| v     | varchar(20) | NO   |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+

There fid is id of table t2.

mysql> desc t2;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   |     | NULL    |       |
| v     | varchar(10) | NO   |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

Query on "on clause" :

mysql> SELECT * FROM `t1` left join t2 on fid = t2.id AND t1.v = 'K' 
    -> ;
+----+-----+---+------+------+
| id | fid | v | id   | v    |
+----+-----+---+------+------+
|  1 |   1 | H | NULL | NULL |
|  2 |   1 | B | NULL | NULL |
|  3 |   2 | H | NULL | NULL |
|  4 |   7 | K | NULL | NULL |
|  5 |   5 | L | NULL | NULL |
+----+-----+---+------+------+
5 rows in set (0.00 sec)

Query on "where clause":

mysql> SELECT * FROM `t1` left join t2 on fid = t2.id where t1.v = 'K';
+----+-----+---+------+------+
| id | fid | v | id   | v    |
+----+-----+---+------+------+
|  4 |   7 | K | NULL | NULL |
+----+-----+---+------+------+
1 row in set (0.00 sec)

It is clear that, the first query returns a record from t1 and its dependent row from t2, if any, for row t1.v = 'K'.

The second query returns rows from t1, but only for t1.v = 'K' will have any associated row with it.


They are not the same thing.

Consider these queries:

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345

and

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID 
    AND Orders.ID = 12345

The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.

With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.


this is my solution.

SELECT song_ID,songs.fullname, singers.fullname
FROM music JOIN songs ON songs.ID = music.song_ID  
JOIN singers ON singers.ID = music.singer_ID
GROUP BY songs.fullname

You must have the GROUP BY to get it to work.

Hope this help.


  • Does not matter for inner joins
  • Matters for outer joins

    a. WHERE clause: After joining. Records will be filtered after join has taken place.

    b. ON clause - Before joining. Records (from right table) will be filtered before joining. This may end up as null in the result (since OUTER join).



Example: Consider the below tables:

    1. documents:
     | id    | name        |
     --------|-------------|
     | 1     | Document1   |
     | 2     | Document2   |
     | 3     | Document3   |
     | 4     | Document4   |
     | 5     | Document5   |


    2. downloads:
     | id   | document_id   | username |
     |------|---------------|----------|
     | 1    | 1             | sandeep  |
     | 2    | 1             | simi     |
     | 3    | 2             | sandeep  |
     | 4    | 2             | reya     |
     | 5    | 3             | simi     |

a) Inside WHERE clause:

  SELECT documents.name, downloads.id
    FROM documents
    LEFT OUTER JOIN downloads
      ON documents.id = downloads.document_id
    WHERE username = 'sandeep'

 For above query the intermediate join table will look like this.

    | id(from documents) | name         | id (from downloads) | document_id | username |
    |--------------------|--------------|---------------------|-------------|----------|
    | 1                  | Document1    | 1                   | 1           | sandeep  |
    | 1                  | Document1    | 2                   | 1           | simi     |
    | 2                  | Document2    | 3                   | 2           | sandeep  |
    | 2                  | Document2    | 4                   | 2           | reya     |
    | 3                  | Document3    | 5                   | 3           | simi     |
    | 4                  | Document4    | NULL                | NULL        | NULL     |
    | 5                  | Document5    | NULL                | NULL        | NULL     |

  After applying the `WHERE` clause and selecting the listed attributes, the result will be: 

   | name         | id |
   |--------------|----|
   | Document1    | 1  |
   | Document2    | 3  | 

b) Inside JOIN clause

  SELECT documents.name, downloads.id
  FROM documents
    LEFT OUTER JOIN downloads
      ON documents.id = downloads.document_id
        AND username = 'sandeep'

For above query the intermediate join table will look like this.

    | id(from documents) | name         | id (from downloads) | document_id | username |
    |--------------------|--------------|---------------------|-------------|----------|
    | 1                  | Document1    | 1                   | 1           | sandeep  |
    | 2                  | Document2    | 3                   | 2           | sandeep  |
    | 3                  | Document3    | NULL                | NULL        | NULL     |
    | 4                  | Document4    | NULL                | NULL        | NULL     |
    | 5                  | Document5    | NULL                | NULL        | NULL     |

Notice how the rows in `documents` that did not match both the conditions are populated with `NULL` values.

After Selecting the listed attributes, the result will be: 

   | name       | id   |
   |------------|------|
   |  Document1 | 1    |
   |  Document2 | 3    | 
   |  Document3 | NULL |
   |  Document4 | NULL | 
   |  Document5 | NULL | 




on-clause