In the previous episode: Alice, Bob and Carol were trying to simultaneously update the following row of the film table:
  id   |     title      | release_year
-------+----------------+--------------
 47478 | Seven  Samurai |         1956
Alice wanted to remove the extra space in the title, Bob was trying to change the title to the phonetic Japanese version and Carol was correcting the release year to 1954. To simplify the analysis we’ll now limit ourselves to Alice’s and Bob’s updates.
The Lost Update Problem
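If Alice’s statement executes first, Bob’s change will overwrite her update. Similarly, if Bob’s statement executes first, Alice’s change will overwrite his. Appropriately, this is conventionally known as the lost update problem. Updates like these, issued without checking what the row currently holds, are often called blind writes (and sometimes, more loosely, dirty writes, although strictly that term refers to overwriting another transaction’s uncommitted data). How can our Python user interface prevent this problem?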
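To make the problem concrete, here is a minimal sketch of two blind writes applied one after the other. It uses Python's built-in sqlite3 module and an in-memory table purely so that it runs anywhere; that choice, the cut-down table definition and the helper name `blind_update` are assumptions for illustration, not part of the actual application.

```python
import sqlite3

# In-memory stand-in for the film table (an assumption made for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE film (id INTEGER PRIMARY KEY, title TEXT, release_year INTEGER)"
)
conn.execute("INSERT INTO film VALUES (47478, 'Seven  Samurai', 1956)")
conn.commit()


def blind_update(conn, film_id, new_title):
    """Overwrite the title without checking what the row currently holds."""
    cur = conn.execute(
        "UPDATE film SET title = ? WHERE id = ?", (new_title, film_id)
    )
    conn.commit()
    return cur.rowcount


# Both users looked at the same original row, then wrote one after the other.
alice_rows = blind_update(conn, 47478, "Seven Samurai")
bob_rows = blind_update(conn, 47478, "Sichinin no Samurai")

final_title = conn.execute(
    "SELECT title FROM film WHERE id = 47478"
).fetchone()[0]
print(alice_rows, bob_rows, final_title)
```

Both calls report one row updated, yet only the second title survives: the first update is lost without either user being told.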
The traditional solution to the lost update problem is to use two-phase locking. You can use PostgreSQL’s psql application to verify how this works (I used the PROMPT1 variable to show which user is issuing the statements).
Alice’s session starts as follows:
alice@moviesdb=> begin;
BEGIN
alice@moviesdb=> select title from film where id = 47478;
     title
----------------
 Seven  Samurai
(1 row)
Bob’s session is identical but then he issues the UPDATE statement:
bob@moviesdb=> begin;
BEGIN
bob@moviesdb=> select title from film where id = 47478;
     title
----------------
 Seven  Samurai
(1 row)
bob@moviesdb=> update film set title = 'Sichinin no Samurai' where id = 47478;
UPDATE 1
When Alice tries her UPDATE, her session hangs:
alice@moviesdb=> update film set title = 'Seven Samurai' where id = 47478;
You can examine the situation from another psql session (I cheated and excluded that session’s data). I won’t try to explain (or understand) all this but you can see that Alice’s session is waiting due to an ungranted lock.
moviesdb=# select procpid, waiting, current_query from pg_stat_activity;
 procpid | waiting |                       current_query
---------+---------+-----------------------------------------------------------
   25747 | t       | update film set title = 'Seven Samurai' where id = 47478;
   25900 | f       | <IDLE> in transaction
(2 rows)

moviesdb=# select pid, relation::regclass, locktype, page, tuple, mode, granted
moviesdb-# from pg_locks order by pid, relation, locktype;
  pid  | relation  |   locktype    | page | tuple |       mode       | granted
-------+-----------+---------------+------+-------+------------------+---------
 25747 | film      | relation      |      |       | AccessShareLock  | t
 25747 | film      | relation      |      |       | RowExclusiveLock | t
 25747 | film      | tuple         |    0 |    37 | ExclusiveLock    | t
 25747 | film_pkey | relation      |      |       | AccessShareLock  | t
 25747 | film_pkey | relation      |      |       | RowExclusiveLock | t
 25747 |           | transactionid |      |       | ShareLock        | f
 25747 |           | transactionid |      |       | ExclusiveLock    | t
 25747 |           | virtualxid    |      |       | ExclusiveLock    | t
 25900 | film      | relation      |      |       | RowExclusiveLock | t
 25900 | film      | relation      |      |       | AccessShareLock  | t
 25900 | film_pkey | relation      |      |       | RowExclusiveLock | t
 25900 | film_pkey | relation      |      |       | AccessShareLock  | t
 25900 |           | transactionid |      |       | ExclusiveLock    | t
 25900 |           | virtualxid    |      |       | ExclusiveLock    | t
(14 rows)
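With fourteen rows to scan, the one ungranted lock is easy to miss. The following sketch picks it out in plain Python. The rows are a hand-copied subset of the pg_locks output above (with the page and tuple columns left out), and the `LockRow` tuple and `waiting_locks` helper are hypothetical names introduced only for this illustration; a real tool would fetch the rows through a database driver instead.

```python
from collections import namedtuple

# Hypothetical record type holding a subset of the pg_locks columns.
LockRow = namedtuple("LockRow", "pid relation locktype mode granted")

# A few rows copied by hand from the pg_locks output shown above.
rows = [
    LockRow(25747, "film", "tuple", "ExclusiveLock", True),
    LockRow(25747, None, "transactionid", "ShareLock", False),
    LockRow(25747, None, "transactionid", "ExclusiveLock", True),
    LockRow(25900, "film", "relation", "RowExclusiveLock", True),
    LockRow(25900, None, "transactionid", "ExclusiveLock", True),
]


def waiting_locks(rows):
    """Return the lock requests that have not been granted yet."""
    return [row for row in rows if not row.granted]


blocked = waiting_locks(rows)
for row in blocked:
    print(row.pid, row.locktype, row.mode)
```

The only ungranted request belongs to process 25747, the one that pg_stat_activity reports as waiting on Alice's UPDATE. It is a ShareLock on a transaction ID, which is consistent with her session waiting for another transaction (here, Bob's) to finish.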
Bob’s COMMIT releases his locks …
bob@moviesdb=> commit;
COMMIT
and Alice’s UPDATE now goes through:
UPDATE 1
alice@moviesdb=> commit;
COMMIT
alice@moviesdb=> select title from film where id = 47478;
     title
---------------
 Seven Samurai
(1 row)
Hey, what happened? Alice’s UPDATE overwrote Bob’s! Wasn’t that supposed to be prevented?
Here is the rub: if it is important for the application to update the row as it was presented to the user, then we need to add another qualification to the UPDATE, i.e., we need something like “and title = 'Seven  Samurai'” (the title as originally displayed, extra space included). We’ll discuss this in a future installment.
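As a rough preview of that idea, here is a minimal sketch of an UPDATE that carries the extra qualification and checks how many rows it changed. It again uses Python's built-in sqlite3 module with an in-memory table so that it is self-contained. The helper name `update_title_if_unchanged` is hypothetical, and a PostgreSQL driver would typically use a different placeholder style than the `?` shown here.

```python
import sqlite3


def update_title_if_unchanged(conn, film_id, seen_title, new_title):
    """Update the title only if the row still holds the title the user saw.

    Returns True if the row was updated, False if it had changed meanwhile.
    """
    cur = conn.execute(
        "UPDATE film SET title = ? WHERE id = ? AND title = ?",
        (new_title, film_id, seen_title),
    )
    conn.commit()
    return cur.rowcount == 1


# In-memory stand-in for the film table (an assumption made for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO film VALUES (47478, 'Seven  Samurai')")
conn.commit()

seen = "Seven  Samurai"  # the title both users were shown, extra space included

bob_ok = update_title_if_unchanged(conn, 47478, seen, "Sichinin no Samurai")
alice_ok = update_title_if_unchanged(conn, 47478, seen, "Seven Samurai")

final_title = conn.execute(
    "SELECT title FROM film WHERE id = 47478"
).fetchone()[0]
print(bob_ok, alice_ok, final_title)
```

In this sketch Bob's update succeeds and Alice's matches no row, so her application can tell that the title changed underneath her and can offer to reload it rather than silently overwriting Bob's work.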
