I cannot count the number of times I’ve written the same stupid code,
just to do something as simple as making sure the UPDATE
updates exactly one row.
Last time I complained about this in public was at our Christmas party at the office.
Normally nobody cares to listen to me when I’ve had a few beers and start talking about SQL stuff.
Next day I noticed a colleague had spent the hang-over-morning hacking on a patch.
What an awesome Christmas party, I thought!
That was quite some time ago and my code still looks ugly, so let’s complain some more and hope it helps.
Number of ugly statements in our codebase:
$ grep -r -E "(IF NOT FOUND THEN)|(RETURNING (1|TRUE) INTO STRICT)" . | wc -l
701
In a typical OLTP application, most UPDATE statements are meant to affect exactly one row.
If more than one row or no row at all would match, you want to throw an exception, return false,
or do whatever the proper behaviour might be in the situation.
This sounds like a trivial task, but it turns out there is no nice way of doing this in plpgsql.
Here are three different work-arounds to ensure the update went through:
IF NOT FOUND THEN
FOUND only tells you that at least one row was affected, so it could also mean more than one.
Still, it’s safe to assume exactly one row was updated if you are updating
using a primary key or any other unique column.
UPDATE Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0;
IF NOT FOUND THEN
    RAISE no_data_found;
END IF;
GET DIAGNOSTICS integer_var = ROW_COUNT;
If you are a bit more paranoid, you can check the exact row count,
and throw different exceptions depending on whether no rows were found
or more than one was found.
UPDATE Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0;
GET DIAGNOSTICS _RowCount = ROW_COUNT;
IF _RowCount = 0 THEN
    RAISE no_data_found;
ELSIF _RowCount > 1 THEN
    RAISE too_many_rows;
END IF;
RETURNING TRUE INTO STRICT _OK
This is the one I prefer. It guarantees exactly one row was affected,
and is at the same time a one-liner, or two lines if you count the DECLARE
of the boolean variable.
DECLARE
    _OK boolean;
...
UPDATE Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0 RETURNING TRUE INTO STRICT _OK;
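To see the STRICT variant in context, here is a minimal sketch of a complete function built around it. The table definition and the function name Process_Order are made up for illustration; only the UPDATE itself comes from the snippet above. The sketch relies on INTO STRICT raising NO_DATA_FOUND when no row comes back and TOO_MANY_ROWS when more than one does.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE Orders (
    OrderID   integer PRIMARY KEY,
    Processed integer NOT NULL DEFAULT 0
);

-- Hypothetical wrapper function; not part of any real codebase.
CREATE OR REPLACE FUNCTION Process_Order(_OrderID integer)
RETURNS boolean
LANGUAGE plpgsql
AS $$
DECLARE
    _OK boolean;
BEGIN
    -- INTO STRICT raises NO_DATA_FOUND if the UPDATE returns no row
    -- and TOO_MANY_ROWS if it returns more than one.
    UPDATE Orders
       SET Processed = 1
     WHERE OrderID = _OrderID
       AND Processed = 0
    RETURNING TRUE INTO STRICT _OK;

    RETURN _OK;
EXCEPTION
    WHEN no_data_found THEN
        -- No unprocessed order with this ID: report failure instead of raising.
        RETURN FALSE;
END;
$$;
```

Whether to turn NO_DATA_FOUND into a FALSE return value or let it propagate is a choice for the caller's situation; here TOO_MANY_ROWS is deliberately left uncaught, since it would point at a broken uniqueness assumption.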
However, none of them are pretty. Something this simple and this frequent should be part of the syntax of the language.
The proposed syntax in johto’s patch was:
UPDATE STRICT Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0;
Tom Lane suggested putting the STRICT keyword at the beginning instead,
avoiding conflicts with table names, which sounds like a good idea to me.
STRICT UPDATE Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0;
I really hope the community will find a way to fix this syntax bug,
without causing too much pain in terms of backwards compatibility issues for users.
I remember when we upgraded from 8.4 to 9.1, I struggled with changing all the code where
IN/OUT parameters were in conflict with column names,
as the behaviour changed to raise an error.
Users who didn’t want to invest time fixing their code could simply set plpgsql.variable_conflict
to use_variable in postgresql.conf to get back the old behaviour.
I think this was a very user-friendly and nice way to deal with the compatibility issue.
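For reference, the same choice can also be made per function rather than server-wide, using the #variable_conflict directive at the top of the function body. The sketch below is only an illustration: the function name Mark_Processed is made up, and it assumes an Orders table with OrderID and Processed columns like the one used in the examples above.

```sql
-- Hypothetical function whose parameter name collides with a column name.
CREATE OR REPLACE FUNCTION Mark_Processed(OrderID integer)
RETURNS void
LANGUAGE plpgsql
AS $$
#variable_conflict use_variable
BEGIN
    -- With use_variable, the unqualified OrderID on the right-hand side
    -- resolves to the parameter; Orders.OrderID is the column.
    UPDATE Orders
       SET Processed = 1
     WHERE Orders.OrderID = OrderID;
END;
$$;

-- The server-wide equivalent would be this line in postgresql.conf:
--   plpgsql.variable_conflict = use_variable
```

Without the directive, and with the default setting of error, the unqualified OrderID would be reported as ambiguous when the statement is executed.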
If the STRICT
syntax isn’t possible because of hypothetical compatibility issues,
perhaps it could be up to the user to turn it on or off, using a similar setting?
I would even like to go a bit further and suggest making the update of exactly one row the default in PL/pgSQL.
In the typically fewer cases where you want to update more than one row, you could explicitly say so with some other keyword,
such as ALL UPDATE
or whatever. This would make the code much cleaner, as you wouldn’t need to type STRICT
all over your code.
