I recently had discussions with some folks about triggers in PostgreSQL. They had two main questions.
- What is the overhead of putting a trigger on a table?
- Should a trigger function be generic with IF statements to do different things for INSERT, UPDATE and DELETE?
So I created a simple test to verify some assumptions.
First, I created a simple table and made it UNLOGGED. I didn’t want the overhead of the WAL to possibly dwarf the timings of the triggers.
CREATE UNLOGGED TABLE trigger_test (
    key       serial PRIMARY KEY,
    value     varchar,
    insert_ts timestamp,
    update_ts timestamp
);
I then created two scripts to push through pgbench and get some timings.
INSERTS.pgbench
INSERT INTO trigger_test (value) VALUES ('hello');
UPDATES.pgbench
\set keys :scale
\setrandom key 1 :keys
UPDATE trigger_test SET value = 'HELLO' WHERE key = :key;
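A note on versions: the \setrandom meta-command was removed from pgbench in PostgreSQL 9.6, so the update script above only runs as written on older releases. A minimal equivalent for newer versions, assuming nothing else about the setup changes, uses the random() function inside \set:

```sql
-- UPDATES.pgbench, sketched for pgbench 9.6 or later.
-- :scale is filled in from the -s option on the pgbench command line.
\set keys :scale
-- Pick a uniformly distributed key between 1 and :keys (inclusive).
\set key random(1, :keys)
UPDATE trigger_test SET value = 'HELLO' WHERE key = :key;
```

Each meta-command has to sit on its own line; the UPDATE statement is unchanged.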
I ran these with the following pgbench commands:
pgbench -n -t 100000 -f INSERTS.pgbench postgres
pgbench -n -s 100000 -t 10000 -f UPDATES.pgbench postgres
The result is that I created 100,000 rows in the test table and then ran 10,000 updates against randomly chosen rows. I ran these commands several times, dropping and recreating the test table between iterations, and the average tps values I saw were:
Inserts: 4510 tps
Updates: 4349 tps
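For anyone repeating the test, the reset between iterations could look roughly like the following. This is a sketch of the drop-and-recreate step described above, not the exact commands I ran; the IF EXISTS clause is only there so the script also works on the first pass.

```sql
-- Start each iteration from an empty table.
-- Dropping the table also drops the serial sequence it owns, so keys start at 1 again.
DROP TABLE IF EXISTS trigger_test;

CREATE UNLOGGED TABLE trigger_test (
    key       serial PRIMARY KEY,
    value     varchar,
    insert_ts timestamp,
    update_ts timestamp
);
```

Keep in mind that dropping a table also drops any triggers defined on it (the trigger functions themselves survive), so the CREATE TRIGGER statements have to be rerun after each reset once triggers are part of the test.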
Then to get the overhead of a trigger, I created a trigger function that just returns. I then repeated the process of running the pgbench commands as before.
CREATE FUNCTION empty_trigger() RETURNS trigger AS $$
BEGIN
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER empty_trigger
    BEFORE INSERT OR UPDATE ON trigger_test
    FOR EACH ROW EXECUTE PROCEDURE empty_trigger();
The results with the empty trigger were:
Inserts: 4296 tps
Updates: 3988 tps
That works out to a 4.8% overhead for inserts and an 8.3% overhead for updates. I didn’t dig further into why the overhead of a trigger on an update appears to be almost twice that on an insert; I’ll leave that for a follow-up when I have more time. A 4%-8% overhead from placing a trigger on a table will likely go unnoticed in most real-world applications. The cost of what is executed inside the trigger function, however, can be noticeable, which leads to the next topic.
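The overhead here is the drop in throughput relative to the run without a trigger. As a rough illustration, the calculation can be done in SQL from the averages listed above. The column names are made up for this example, and because the inputs are rounded tps averages, the output may differ from the quoted percentages by a tenth of a point.

```sql
-- Overhead = (baseline tps - tps with trigger) / baseline tps, as a percentage.
SELECT label,
       baseline_tps,
       trigger_tps,
       round(100.0 * (baseline_tps - trigger_tps) / baseline_tps, 1) AS overhead_pct
FROM (VALUES ('inserts', 4510, 4296),
             ('updates', 4349, 3988)) AS t(label, baseline_tps, trigger_tps);
```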
I then wanted to see the overhead of having a single trigger function versus having separate trigger functions for inserts and updates.
For a single trigger function, I used the following:
CREATE FUNCTION single_trigger() RETURNS trigger AS $$
BEGIN
    IF (TG_OP = 'INSERT') THEN
        NEW.insert_ts = CURRENT_TIMESTAMP;
        RETURN NEW;
    ELSIF (TG_OP = 'UPDATE') THEN
        NEW.update_ts = CURRENT_TIMESTAMP;
        RETURN NEW;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER single_trigger
    BEFORE INSERT OR UPDATE ON trigger_test
    FOR EACH ROW EXECUTE PROCEDURE single_trigger();
And for separate trigger functions, I used:
CREATE FUNCTION insert_trigger() RETURNS trigger AS $$
BEGIN
    NEW.insert_ts = CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION update_trigger() RETURNS trigger AS $$
BEGIN
    NEW.update_ts = CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_trigger
    BEFORE INSERT ON trigger_test
    FOR EACH ROW EXECUTE PROCEDURE insert_trigger();

CREATE TRIGGER update_trigger
    BEFORE UPDATE ON trigger_test
    FOR EACH ROW EXECUTE PROCEDURE update_trigger();
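Because both variants attach to the same table, only one of them should be installed during a given run; otherwise every row would pass through both sets of triggers and the timings would be meaningless. A sketch of the teardown before switching from the single-function variant to the separate functions, assuming the trigger and function names used above:

```sql
-- Remove the single-function variant before installing the separate triggers.
DROP TRIGGER IF EXISTS single_trigger ON trigger_test;
DROP FUNCTION IF EXISTS single_trigger();

-- Optional sanity check: list the user-defined triggers still attached to the table.
SELECT tgname
FROM pg_trigger
WHERE tgrelid = 'trigger_test'::regclass
  AND NOT tgisinternal;
```

If the table is dropped and recreated between runs anyway, the DROP TRIGGER is redundant, but the function still has to be dropped or replaced separately.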
I then reran the same process as before to see the overhead.
Single Trigger Inserts: 3569 tps
Single Trigger Updates: 3450 tps
Separate Triggers Inserts: 3623 tps
Separate Triggers Updates: 3870 tps
It turns out that splitting the trigger function into separate functions does make a difference. For inserts, the single trigger function was only about 1.5% slower than the dedicated insert trigger, but for updates it was nearly 11% slower than the dedicated update trigger. This is most likely due to the update case being handled second in the trigger function. That’s another thing to dig into when there is time.
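One way to probe that guess would be to swap the order of the branches and rerun the same benchmark. The following is an untested sketch for that experiment; the name single_trigger_update_first is hypothetical and only used here to keep it apart from the original function.

```sql
-- Same logic as single_trigger(), but with the UPDATE branch checked first.
CREATE FUNCTION single_trigger_update_first() RETURNS trigger AS $$
BEGIN
    IF (TG_OP = 'UPDATE') THEN
        NEW.update_ts = CURRENT_TIMESTAMP;
        RETURN NEW;
    ELSIF (TG_OP = 'INSERT') THEN
        NEW.insert_ts = CURRENT_TIMESTAMP;
        RETURN NEW;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- Install this in place of single_trigger, not alongside it.
CREATE TRIGGER single_trigger_update_first
    BEFORE INSERT OR UPDATE ON trigger_test
    FOR EACH ROW EXECUTE PROCEDURE single_trigger_update_first();
```

If branch order is the cause, the update numbers should move closer to the separate-trigger result and the insert numbers should get slightly worse. If nothing changes, the explanation lies elsewhere.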