Channel: Planet PostgreSQL

Hans-Juergen Schoenig: Users and join enlightenments

While out on the road doing consulting, it is sometimes surprising to discover how little people actually know about how joins work. Outer joins in particular seem to be a mystery to most people.

Here is an example which might bring some enlightenment:

test=# CREATE TABLE a (id int4);
CREATE TABLE
test=# CREATE TABLE b (id int4);
CREATE TABLE
test=# INSERT INTO a VALUES (1), (2), (3);
INSERT 0 3
test=# INSERT INTO b VALUES (2), (3), (4);
INSERT 0 3

 

We just insert a couple of rows and see how things work out.

Let us start with a simple LEFT JOIN:

test=# SELECT * FROM a LEFT JOIN b ON (a.id = b.id);
 id | id 
----+----
  1 |   
  2 |  2
  3 |  3
(3 rows)

 

The goal here is to get all rows from the left-hand side and see which rows match on the right-hand side. Where there is no match, the “missing” value is filled with NULL.

One of the key misunderstandings is this magical ON-clause:

test=# SELECT * FROM a LEFT JOIN b ON (a.id = b.id AND b.id = 2);
 id | id 
----+----
  1 |   
  2 |  2
  3 |   
(3 rows)

 

To most people this result is surprising. It is important to notice that the ON clause only decides which rows on the right side count as a match; it never removes rows from the left side. The row for 3 therefore stays, and we get one more NULL value. This is pretty easy to see and comprehend if the use case is that simple, but in a query with a whole pile of outer joins it is not that trivial anymore.
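One way to convince yourself of this is to filter the right side first and join afterwards. The following is only a sketch; it assumes the tables a and b from above, and the alias b_filtered is just a name picked for illustration:

```sql
-- Assumes tables a and b as created and filled above.
-- Filter b first, then LEFT JOIN the remainder to a. For a LEFT JOIN,
-- this should give the same rows as
--   a LEFT JOIN b ON (a.id = b.id AND b.id = 2)
SELECT a.id, b_filtered.id
FROM a
LEFT JOIN (SELECT id FROM b WHERE id = 2) AS b_filtered
       ON (a.id = b_filtered.id);
-- expected rows: (1, NULL), (2, 2), (3, NULL)
```

Seen this way, the extra condition shrinks the set of candidates on the right side, while every row of a still shows up.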

The example I have just shown is not what people usually expect. Here is an alternative:

test=# SELECT * FROM a LEFT JOIN b ON (a.id = b.id) WHERE b.id = 2;
 id | id 
----+----
  2 |  2
(1 row)

 

The WHERE clause is applied after the join, to the complete joined rows. Every row in which b.id is NULL fails the test b.id = 2 and is thrown away, so the LEFT JOIN effectively becomes a plain inner join and the outer join is a useless operation. NOTE: This is a mistake which can commonly be observed; most people don’t get ON and WHERE right when it comes to outer joins.
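To make the point a bit more concrete, here is a small sketch, again assuming the tables a and b from above. The first query shows the inner join that the filtered LEFT JOIN boils down to; the second shows one case where a WHERE condition on the right side of a LEFT JOIN is actually intended, namely looking for rows without a partner:

```sql
-- Assumes tables a and b as created and filled above.
-- Same result as: a LEFT JOIN b ON (a.id = b.id) WHERE b.id = 2
SELECT * FROM a JOIN b ON (a.id = b.id) WHERE b.id = 2;
-- expected rows: (2, 2)

-- Here the NULL padding is used on purpose: keep only those rows
-- of a for which no matching row exists in b.
SELECT a.id FROM a LEFT JOIN b ON (a.id = b.id) WHERE b.id IS NULL;
-- expected rows: (1)
```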

FULL OUTER joins

FULL OUTER joins are the next big mystery:

test=# SELECT * FROM a FULL JOIN b ON (a.id = b.id);
 id | id 
----+----
  1 |   
  2 |  2
  3 |  3
    |  4
(4 rows)

 

We take all rows from both sides, pair up the ones that match, and add NULL values wherever a row on either side has no match. So far this is no surprise …

The problem usually starts when we try to make the ON clause a little bit more fancy:

test=# SELECT * FROM a FULL JOIN b ON (a.id = b.id AND b.id = 2);
 id | id 
----+----
  1 |   
  2 |  2
  3 |   
    |  4
    |  3
(5 rows)

 

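Most people would not expect “3” to show up on the right-hand side at all. Because of the extra condition, 3 on the left no longer matches 3 on the right, so both rows are moved to the NULL part of the join: one appears as (3, NULL) and the other as (NULL, 3). This is not too intuitive for most people. However, this is perfectly expected behavior.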
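It can help to spell out the three parts a FULL JOIN consists of: the matches, the leftovers on the left, and the leftovers on the right. The following is a hand-written sketch of that idea, assuming the tables a and b from above; it is meant as an illustration and not as a description of how PostgreSQL executes the join internally:

```sql
-- Assumes tables a and b as created and filled above.
-- Part 1: rows that satisfy the complete ON condition
SELECT a.id, b.id
FROM a JOIN b ON (a.id = b.id AND b.id = 2)
UNION ALL
-- Part 2: rows of a without any match, padded with NULL on the right
SELECT a.id, NULL
FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE a.id = b.id AND b.id = 2)
UNION ALL
-- Part 3: rows of b without any match, padded with NULL on the left
SELECT NULL, b.id
FROM b
WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.id = b.id AND b.id = 2);
-- expected rows (in no particular order):
-- (2, 2), (1, NULL), (3, NULL), (NULL, 3), (NULL, 4)
```

The value 3 fails the condition on both sides, which is why it turns up once in part 2 and once in part 3.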

If we filter in the WHERE clause we will again turn the outer join into a useless operation:

test=# SELECT * FROM a FULL JOIN b ON (a.id = b.id) WHERE b.id = 2;
 id | id 
----+----
  2 |  2
(1 row)

 

Maybe this will bring some enlightenment to some users out there.

 

