Channel: Planet PostgreSQL

Josh Berkus: Why you need to apply today's update real soon

So, another month, another PostgreSQL update.  This one is a lot more critical than most because it patches up to three data loss issues, depending on what version of PostgreSQL you're on and which features you're using.  Two kinds of users need to schedule an update downtime for this weekend, if at all possible:
  1. Users on version 9.3
  2. Users of binary replication
While all three data loss issues depend on timing, and are thus low-probability, all three are impossible to predict in advance and difficult to detect when they occur.  As such, they should be considered high-risk, and fixing them is urgent.  Of the three issues, the replication one affects 9.0, 9.1, 9.2 and 9.3; one of the VACUUM issues affects all versions; and the other affects only 9.3.

Annoyingly, you'll have to do some additional stuff after you update:
  • If using binary replication, you need to take a new base backup of each replica after updating.
  • You should run the following on each production database after updating:
SET vacuum_freeze_table_age = 0;
VACUUM; -- optionally, ANALYZE as well
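If you have many databases, one way to script the step above is to pass the setting through `PGOPTIONS` and let `vacuumdb` iterate over every database for you. This is a sketch, not the post's literal instructions; the connection details are examples you'd replace with your own:

```shell
# Force a full-table-scan VACUUM (and optionally ANALYZE) on every database
# in the cluster.  PGOPTIONS injects the per-session setting into each
# connection vacuumdb opens; adjust host/user for your environment.
PGOPTIONS="-c vacuum_freeze_table_age=0" \
    vacuumdb --all --analyze -h localhost -U postgres
```

This should be equivalent to connecting to each database and running the `SET` plus `VACUUM` by hand, since `PGOPTIONS` applies the setting to every session the tool opens.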

This second step is critical for users on 9.3, and a generally good idea for users on other versions. Note that, while VACUUM is a non-blocking operation, it can create a lot of IO, so it's a good idea to do this during a slow traffic period and monitor it for pile-ups.

More information about the replication issue is here.
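Re-cloning a replica generally means stopping it, clearing its data directory, and taking a fresh base backup from the primary. A minimal sketch with `pg_basebackup` follows; the hostname, data directory, and user are placeholders for your own setup, and `-X stream` assumes a version recent enough to support streaming the WAL during the backup:

```shell
# On the replica: stop the server, then re-clone from the primary.
# primary.example.com, /var/lib/pgsql/9.3/data, and the replication
# user are examples -- substitute your own values.
pg_basebackup -h primary.example.com -U replicator \
    -D /var/lib/pgsql/9.3/data -X stream -P
```

After the backup completes, restore your recovery configuration and restart the replica so it resumes streaming from the updated primary.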

So, how did this happen?  Well, all three issues were unexpected side effects of fixes we applied for other issues in earlier versions.  For example, the replication issue is the result of the combination of two independent fixes for failover issues, both of which needed to be fixed.  Since all of these issues depend on both timing and heavy traffic, and their effect is subtle (as in, a few missing or extra rows), none of our existing testing was capable of showing them.

If anyone wants to devise destruction tests to detect subtle data corruption issues in PostgreSQL before we release code -- if anyone can -- please offer to help!
