Deepak Mahto: Optimizing Performance in PostgreSQL: Join Column and ANY Filters

Recently on the Postgres Slack, I encountered an interesting performance issue involving a SQL query that joins two tables with an ANY filter applied to one of the tables.

The problematic SQL was similar to the following:

SELECT
    tbl1.col1
FROM
    tbl1
    INNER JOIN tbl2 ON tbl1.col1 = tbl2.col1
WHERE
    tbl2.col1 IN (1,2,3);

Table tbl1 is joined with tbl2 on the column col1 from their respective tables.

The filter condition is applied to tbl2.col1, the same column that is used to join with tbl1. Let's look at the problematic execution plan using mock tables.

--Tested on PostgreSQL 16.3
create table tbl1 as
select col1, col1::text as col2 , col1*0.999 as col3 
from generate_series(1,100) as col1;

create table tbl2 as
select col1, col1::text as col2 , col1*0.999 as col3 
from generate_series(1,10) as col1;

explain (analyze, buffers)
SELECT
    tbl1.col1
FROM
    tbl1
    INNER JOIN tbl2 ON tbl1.col1 = tbl2.col1
WHERE
    tbl2.col1 = ANY (ARRAY[1,2,3]);

Execution plan:

QUERY PLAN
----------------------------------------------
 Hash Join  (cost=1.18..3.58 rows=3 width=4) (actual time=1.354..1.404 rows=3 loops=1)
   Hash Cond: (tbl1.col1 = tbl2.col1)
   Buffers: shared hit=2
   ->  Seq Scan on tbl1  (cost=0.00..2.00 rows=100 width=4) (actual time=0.687..0.705 rows=100 loops=1)
         Buffers: shared hit=1
   ->  Hash  (cost=1.14..1.14 rows=3 width=4) (actual time=0.586..0.586 rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         Buffers: shared hit=1
         ->  Seq Scan on tbl2  (cost=0.00..1.14 rows=3 width=4) (actual time=0.034..0.041 rows=3 loops=1)
               Filter: (col1 = ANY ('{1,2,3}'::integer[]))
               Rows Removed by Filter: 7
               Buffers: shared hit=1
 Planning Time: 3.131 ms
 Execution Time: 2.049 ms
(14 rows)

Key Observations from Problematic Execution Plan

The filter ANY ('{1,2,3}'::integer[]) is not pushed down to the access path for tbl1, even though tbl1 is joined on the same column the filter is applied to.
The access method for tbl1 is therefore not influenced by the filter on tbl2.col1: the sequential scan on tbl1 returns all 100 rows.
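
To focus only on where the filter lands, the same query can be explained without cost or timing figures. This is a small sketch using EXPLAIN's standard COSTS option; a Filter line under the scan on tbl1 would indicate that the predicate had been pushed down to that table.

```sql
-- Sketch: print only the plan shape, without cost or timing figures.
-- Look for a "Filter:" line under the scan node of each table.
explain (costs off)
SELECT
    tbl1.col1
FROM
    tbl1
    INNER JOIN tbl2 ON tbl1.col1 = tbl2.col1
WHERE
    tbl2.col1 = ANY (ARRAY[1,2,3]);
```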

Let’s test whether the filter is pushed down for other kinds of conditions in the same SQL.

SQL 1 – Equality Filter tbl2.col1 = 1;

explain (analyze, buffers)
SELECT
    tbl1.col1
FROM
    tbl1
    INNER JOIN tbl2 ON tbl1.col1 = tbl2.col1
WHERE
    tbl2.col1 = 1;


QUERY PLAN
----------------------------------------------
 Nested Loop  (cost=0.00..3.38 rows=1 width=4) (actual time=0.144..0.168 rows=1 loops=1)
   Buffers: shared hit=2
   ->  Seq Scan on tbl1  (cost=0.00..2.25 rows=1 width=4) (actual time=0.115..0.134 rows=1 loops=1)
         Filter: (col1 = 1)
         Rows Removed by Filter: 99
         Buffers: shared hit=1
   ->  Seq Scan on tbl2  (cost=0.00..1.12 rows=1 width=4) (actual time=0.024..0.028 rows=1 loops=1)
         Filter: (col1 = 1)
         Rows Removed by Filter: 9
         Buffers: shared hit=1
 Planning Time: 1.343 ms
 Execution Time: 0.282 ms
(12 rows)

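The filter in the WHERE clause (tbl2.col1 = 1) is implicitly pushed to both tables, tbl1 and tbl2. Because the join condition equates tbl1.col1 with tbl2.col1 and tbl2.col1 equals a constant, the planner derives tbl1.col1 = 1 on its own.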

SQL 2 – Filter tbl2.col1 in (1,2)

explain (analyze, buffers)
SELECT
    tbl1.col1
FROM
    tbl1
    INNER JOIN tbl2 ON tbl1.col1 = tbl2.col1
WHERE
    tbl2.col1 in (1,2);

QUERY PLAN
----------------------------------------------
 Hash Join  (cost=1.15..3.54 rows=2 width=4) (actual time=0.183..0.213 rows=2 loops=1)
   Hash Cond: (tbl1.col1 = tbl2.col1)
   Buffers: shared hit=2
   ->  Seq Scan on tbl1  (cost=0.00..2.00 rows=100 width=4) (actual time=0.106..0.119 rows=100 loops=1)
         Buffers: shared hit=1
   ->  Hash  (cost=1.12..1.12 rows=2 width=4) (actual time=0.040..0.041 rows=2 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         Buffers: shared hit=1
         ->  Seq Scan on tbl2  (cost=0.00..1.12 rows=2 width=4) (actual time=0.016..0.019 rows=2 loops=1)
               Filter: (col1 = ANY ('{1,2}'::integer[]))
               Rows Removed by Filter: 8
               Buffers: shared hit=1
 Planning Time: 0.291 ms
 Execution Time: 0.285 ms
(14 rows)

The IN list is transformed into an ANY filter and is not pushed down to tbl1. It is applied only to tbl2.

An ANY or IN filter is not propagated to other tables that are joined on the filtered column.

Solution

Rewrite the SQL to apply the filter explicitly to the join column of each table.

SELECT
    tbl1.col1
FROM
    tbl1
    INNER JOIN tbl2 ON tbl1.col1 = tbl2.col1
WHERE
    tbl2.col1 = ANY (ARRAY[1,2,3])
    AND tbl1.col1 = ANY (ARRAY[1,2,3]); -- Newly added.

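After the change, the execution plan shows the filter applied to each table:
Filter: (col1 = ANY ('{1,2,3}'::integer[]))

One side effect is visible in the plan below: the join estimate drops to rows=1 while the actual count is 3, most likely because the planner treats the two filters as independent conditions.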

QUERY PLAN
----------------------------------------------
 Hash Join  (cost=1.18..3.57 rows=1 width=4) (actual time=0.313..0.353 rows=3 loops=1)
   Hash Cond: (tbl1.col1 = tbl2.col1)
   Buffers: shared hit=2
   ->  Seq Scan on tbl1  (cost=0.00..2.38 rows=3 width=4) (actual time=0.158..0.193 rows=3 loops=1)
         Filter: (col1 = ANY ('{1,2,3}'::integer[]))
         Rows Removed by Filter: 97
         Buffers: shared hit=1
   ->  Hash  (cost=1.14..1.14 rows=3 width=4) (actual time=0.065..0.065 rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         Buffers: shared hit=1
         ->  Seq Scan on tbl2  (cost=0.00..1.14 rows=3 width=4) (actual time=0.035..0.040 rows=3 loops=1)
               Filter: (col1 = ANY ('{1,2,3}'::integer[]))
               Rows Removed by Filter: 7
               Buffers: shared hit=1
 Planning Time: 0.447 ms
 Execution Time: 0.453 ms
(16 rows)
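
If repeating the value list in two places is a maintenance concern, one possible alternative is to turn the list into a row source and join both tables to it, so the values are written only once. This is a sketch, not something taken from the original discussion: the aliases f and u(v) are hypothetical names, and the DISTINCT is an assumption added to keep duplicate array values from duplicating result rows. Whether it produces a better plan should be checked with EXPLAIN on real data.

```sql
-- Sketch: write the filter values once and join both tables to them.
-- f and u(v) are hypothetical aliases. DISTINCT guards against
-- duplicate values in the array, which would otherwise repeat rows.
SELECT
    tbl1.col1
FROM
    (SELECT DISTINCT v AS col1 FROM unnest(ARRAY[1,2,3]) AS u(v)) AS f
    INNER JOIN tbl2 ON tbl2.col1 = f.col1
    INNER JOIN tbl1 ON tbl1.col1 = f.col1;
```

Both tables are now compared with f.col1 through plain equality, the same kind of condition that was propagated in the tbl2.col1 = 1 test.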

Ideally, Postgres would push the predicate down automatically when a filter is applied to a join column.
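
With only 100 rows, every plan here uses a sequential scan, so the mock tables cannot show how much the rewrite matters. A rough way to compare the two query forms on a larger data set is sketched below. The table tbl1_big, its row count, and the index name are illustrative assumptions, not part of the original test. The planner may pick different access paths depending on version and settings, so the two plans should be compared rather than assumed.

```sql
-- Hypothetical larger copy of tbl1 with an index on the join column.
-- tbl1_big and tbl1_big_col1_idx are illustrative names.
create table tbl1_big as
select col1, col1::text as col2, col1*0.999 as col3
from generate_series(1,1000000) as col1;

create index tbl1_big_col1_idx on tbl1_big (col1);

-- Refresh statistics so the estimates reflect the new data.
analyze tbl1_big;
analyze tbl2;

-- Variant 1: filter on tbl2 only.
explain (analyze, buffers)
SELECT
    tbl1_big.col1
FROM
    tbl1_big
    INNER JOIN tbl2 ON tbl1_big.col1 = tbl2.col1
WHERE
    tbl2.col1 = ANY (ARRAY[1,2,3]);

-- Variant 2: filter repeated on both sides of the join.
explain (analyze, buffers)
SELECT
    tbl1_big.col1
FROM
    tbl1_big
    INNER JOIN tbl2 ON tbl1_big.col1 = tbl2.col1
WHERE
    tbl2.col1 = ANY (ARRAY[1,2,3])
    AND tbl1_big.col1 = ANY (ARRAY[1,2,3]);
```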
