An ugly way to introspect database changes.

Monitoring Schema Changes via Last Commit Timestamp

A few days ago, a colleague of mine shown to me that a commercial database keeps track of last DDL change timestamp against database objects.
I began to mumble… is that possible in PostgreSQL? Of course it is, but what is the smartest way to achieve it?
I asked on the mailing list, because the first idea that came into my mind was to use commit timestamps.
Clearly, it is possible to implement something that can do the job using event triggers, that in short are triggers not attached to table tuples rather to database event like DDL commands. Great! And in fact, a very good explaination can be found here.
In this article, I present my first idea about using commit timestamps.
The system used for the test is PostgreSQL 13.4 running on Fedora Linux, with only myself connected to it (this simplifies following transactions). The idea is, in any case, general and easy enough to be used on busy systems.

Introduction to `pg_last_committed_xact()`

The special function pg_last_committed_xact() allows the database administrator (or an user) to get information about which transaction has committed last.
Let’s see this in action:

% psql -U luca -h miguel -c'select pg_last_committed_xact();'   testdb
ERROR:  could not get commit timestamp data
HINT:  Make sure the configuration parameter "track_commit_timestamp" is set.

First of all in order to get information about the committed transaction timestamps, there must be the option track_commit_timestamp configured.
Turning on and off the parameter will not provide historic data, that is even if you had the parameter on and then you turned off, you will not be able to access collected data.
Let’s turn on the parameter and see how it works. The track_commit_timestamp is a parameter with the postmaster context, and therefore requires a server restart!

% psql -U postgres -h miguel \-c'ALTER SYSTEM SET track_commit_timestamp to "on"; '\
       testdb
ALTER SYSTEM
% ssh luca@miguel 'sudo systemctl restart postgresql-13'

In the above I restarted a remote system via ssh, of course you are free to configure the parameter and restart the cluster with your preferred (or available) method.
It is now time to see which information we can get with track_commit_timestamp turned on.

testdb=>SELECTtxid_current();-[RECORD1]+-------------txid_current|380316302458testdb=>SELECT*FROMtxid_status(380316302457),pg_last_committed_xact();-[RECORD1]------------------------------txid_status|committedxid|2359180410timestamp|2021-11-2004:28:50.223275-05

Let’s dissect the above example:

txid_current() simulates a new transaction in one row, because the function gets a new xid (transaction identifier) even if not used for effective work;
txid_status() accepts a xid identifier and returns a string with the status of the transaction, and as shown, the fake transaction 380316302458 results in status committed;
pg_last_committed_xact() now is able to report both the xid and the timestamp at which the last transaction has committed, that is the transaction 380316302458 committed at 2021-11-20 04:28:50.223275-05.

Wait a minute: pg_last_committed_xact() states that the last committed transaction is 2359180410, not 380316302458. What is happening?
Wrap-around is on its way!
The above system has done a so called xid wraparound, that is normal situation in a long running PostgreSQL instance. What this means, is that txid_current() is resturning a bumped value that is, somehow, an absolute value. However, PostgreSQL “reasons” in terms of values modulo 2^32, therefore we must take into account this possible difference.
The above example therefore becomes:

testdb=>SELECTtxid_current()asxid_absolute,mod(txid_current(),pow(2,32)::bigint)asxid;-[RECORD1]+-------------xid_absolute|380316302460xid|2359180412testdb=>SELECT*FROMtxid_status(380316302460)asxid_abs_status,txid_status(2359180412)asxid_status,pg_last_committed_xact();-[RECORD1]--+------------------------------xid_abs_status|committedxid_status|xid|2359180412timestamp|2021-11-2004:34:54.531106-05

The above demonstrates that transactions 380316302460 and 2359180412 are the same, according to PostgreSQL. However, txid_status() requires an “absolute” xid number (note how the short transaction number does not report any status), while pg_last_committed_xact() reasons in terms of “running” numbers, i.e., the modulo ones.
There is another interesting function to keep in mind: pg_xact_commit_timestamp() that, given a transaction identifier, returns the known commit timestamp:

testdb=>SELECT*FROMpg_xact_commit_timestamp(2359180412::text::xid),pg_last_committed_xact();-[RECORD1]------------+------------------------------pg_xact_commit_timestamp|2021-11-2004:34:54.531106-05xid|2359180412timestamp|2021-11-2004:34:54.531106-05

As you can see, the timestamp for the same transaction is always the same. Note that a bigint requires a conversion to text before being translated into a xid.

Tracking DDL Commands

Every table in PostgreSQL has two hidden fields that track the transaction ranges: xmin indicates the transaction that created a tuple, while xmax indicates the transaction that invalidated the tuple. This is used in the MVCC (Multi Version Concurrency Control) machinery that I’m not going to discuss here, so trust that everything works just fine.
The keypoint here is: every table has fields that track the transaction that generated the tuple. This applies also to system catalogs, and in particular (with regard to this article) to pg_class.
Having stated that, and knowing that every time a DDL command applies, something is changed in the system catalogs, it is therefore possible to track when changes did happen on a particular database object or table.
Let’s see this in action:

testdb=>BEGIN;BEGINtestdb=>SELECTtxid_current()asxid_absolute,mod(txid_current(),pow(2,32)::bigint)asxid,current_timestamp;-[RECORD1]-----+------------------------------xid_absolute|380316302463xid|2359180415current_timestamp|2021-11-2005:11:56.343542-05testdb=>CREATETABLEddl_test(pkintgeneratedalwaysasidentity,ttext);CREATETABLEtestdb=>COMMIT;COMMIT

At timestamp 2021-11-20 05:11:56 the table ddl_test has been created. Since every DDL command in PostgreSQL is transactional, it is possible to track the transaction that committed such DDL (in the above example, 380316302463 alis 2359180415).
Let’s query pg_class to get information about last DDL commands on ddl_test table:

testdb=>SELECTage(xmin)astransaction_before,xminasit_was_transaction_number,pg_xact_commit_timestamp(xmin)asmodified_at,relnameastableFROMpg_classWHERErelkind='r'ANDrelname='ddl_test';-[RECORD1]-------------+------------------------------transaction_before|1it_was_transaction_number|2359180415modified_at|2021-11-2005:12:21.359126-05table|ddl_test

The above queries tells us that 1 transaction ago the transaction number 2359180415 modified the structure of ddl_test at timestamp 2021-11-20 05:12:21.359126-05**. <br/> **Everything seems fine except for the timestamp**: the transaction timestamp is not really the same as reported bypg_xact_commit_timestamp()`. The reason for this is that the moment a transaction commits is not the same as the transaction is consolidated, therefore there could some offset and lag. However, checking deeper we can see that data is coherent:

testdb=>SELECT*FROMpg_last_committed_xact();-[RECORD1]----------------------------xid|2359180415timestamp|2021-11-2005:12:21.359126-05

So this is a first ugly but pretty much unexpensive way to track changes to the table.

Let’s now add a column to the table, so to see if this machinery can work:

testdb=>BEGIN;BEGINtestdb=>SELECTtxid_current()asxid_absolute,mod(txid_current(),pow(2,32)::bigint)asxid,current_timestamp;-[RECORD1]-----+------------------------------xid_absolute|380316302464xid|2359180416current_timestamp|2021-11-2005:21:03.089031-05testdb=>ALTERTABLEddl_testADDCOLUMNtttext;ALTERTABLEtestdb=>COMMIT;COMMITtestdb=>SELECT*FROMpg_last_committed_xact();-[RECORD1]----------------------------xid|2359180416timestamp|2021-11-2005:21:32.376468-05

Transaction 2359180416 at timestamp 2021-11-20 05:21:32.376468-05 committed the ALTER TABLE. Let’s run again our query against pg_class:

testdb=>SELECTage(xmin)astransaction_before,xminasit_was_transaction_number,pg_xact_commit_timestamp(xmin)asmodified_at,relnameastableFROMpg_classWHERErelkind='r'ANDrelname='ddl_test';-[RECORD1]-------------+------------------------------transaction_before|1it_was_transaction_number|2359180416modified_at|2021-11-2005:21:32.376468-05table|ddl_test

Therefore we now know when the table was last touched by a DDL command.

Going Deeper: Introspection Against Columns

From the above we now know when a change happened to our table, but we don’t know which attribute has been changed. It is possible to push the same logic against other parts of the system catalog, for example pg_attribute that handles information about single table columns.
Here it the example applied to our demo table:

testdb=>SELECTxmin,attname,age(xmin),pg_xact_commit_timestamp(xmin)FROMpg_attributeWHEREattrelid='ddl_test'::regclass;xmin|attname|age|pg_xact_commit_timestamp------------+----------+-----+-------------------------------2359180415|tableoid|2|2021-11-2005:12:21.359126-052359180415|cmax|2|2021-11-2005:12:21.359126-052359180415|xmax|2|2021-11-2005:12:21.359126-052359180415|cmin|2|2021-11-2005:12:21.359126-052359180415|xmin|2|2021-11-2005:12:21.359126-052359180415|ctid|2|2021-11-2005:12:21.359126-052359180415|pk|2|2021-11-2005:12:21.359126-052359180415|t|2|2021-11-2005:12:21.359126-052359180416|tt|1|2021-11-2005:21:32.376468-05

All the columns except tt have been created by the very same transaction at the very same timestamp, while tt has been touched from another transation 11 minutes after.
The above is not very useful, so it is possible to improve sligthly the query into the following one:

testdb=>SELECTarray_agg(attname)ascolumns,current_timestamp-pg_xact_commit_timestamp(xmin)aswhenFROMpg_attributeWHEREattrelid='ddl_test'::regclassGROUPBYpg_xact_commit_timestamp(xmin);columns|when------------------------------------------+-----------------{tableoid,cmax,xmax,cmin,xmin,ctid,pk,t}|00:19:38.202794{tt}|00:10:27.185452

That reports all the column “touched” at the very same time and how many time has elapsed from the last change. For example, the column tt has been changed 10 minutes ago, while the other columns 19 minutes ago.
Let’s do more changes to our table and see what happen; please note that everything is executed in autocommit mode:

testdb=>ALTERTABLEddl_testADDCOLUMNttttext;ALTERTABLEtestdb=>ALTERTABLEddl_testALTERCOLUMNttSETDEFAULT'FizzBuzz';ALTERTABLEtestdb=>ALTERTABLEddl_testDROPCOLUMNt;ALTERTABLEtestdb=>SELECT*FROMpg_last_committed_xact();xid|timestamp------------+------------------------------2359180419|2021-11-2005:36:48.54285-05

If we inspect again pg_attribute we have:

testdb=>SELECTarray_agg(attname)ascolumns,current_timestamp-pg_xact_commit_timestamp(xmin)astime_ago,pg_xact_commit_timestamp(xmin)aswhenFROMpg_attributeWHEREattrelid='ddl_test'::regclassGROUPBYpg_xact_commit_timestamp(xmin);columns|time_ago|when----------------------------------------+-----------------+-------------------------------{tableoid,cmax,xmax,cmin,xmin,ctid,pk}|00:26:25.685984|2021-11-2005:12:21.359126-05{ttt}|00:04:08.244367|2021-11-2005:34:38.800743-05{tt}|00:02:31.791574|2021-11-2005:36:15.253536-05{........pg.dropped.2........}|00:01:58.50226|2021-11-2005:36:48.54285-05testdb=>SELECTage(xmin)astransaction_before,xminasit_was_transaction_number,pg_xact_commit_timestamp(xmin)asmodified_at,relnameastableFROMpg_classWHERErelkind='r'ANDrelname='ddl_test';-[RECORD1]-------------+------------------------------transaction_before|3it_was_transaction_number|2359180417modified_at|2021-11-2005:34:38.800743-05table|ddl_test

There are some interesting things in the above output. First of all, pg_class reports only the changes related to new attributes, not the dropped ones or the internally changed. On the other hand, pg_attribute reports information about every single attribute, including those changed in a “minor” mode (the SET DEFAULT for instance).
Please note how the dropped column (namely t) is no more visible, even if there is pg.dropped.2 that clearly refers to such column. In the above example it is easy enough: only one column has been dropped in a single user instance, however in a more concurrent system it is hard to get track about the information related to dropped attributes. For more information about the dropped columns, please see my previous article about why PostgreSQL does not reclaim disk space on column drop.

What about `VACUUM`?

The VACUUM FULL command totally rewrites a table, therefore this means that every information about transactions that have “touched” systsem catalogs are updated by a newer transaction. This does not mean that VACUUM is a transactional command, rather it happen to do a CREATE TABLE pretty much as we did manually.

testdb=>VACUUMFULLddl_test;VACUUMtestdb=>SELECTage(xmin)astransaction_before,xminasit_was_transaction_number,current_timestamp-pg_xact_commit_timestamp(xmin)asmodified_since,relnameastableFROMpg_classWHERErelkind='r'ANDrelname='ddl_test';transaction_before|it_was_transaction_number|modified_since|table--------------------+---------------------------+-----------------+----------1|2359180423|00:00:02.615678|ddl_test(1row)testdb=>SELECTarray_agg(attname)ascolumns,current_timestamp-pg_xact_commit_timestamp(xmin)aswhenFROMpg_attributeWHEREattrelid='ddl_test'::regclassGROUPBYpg_xact_commit_timestamp(xmin);columns|when----------------------------------------+-----------------{tableoid,cmax,xmax,cmin,xmin,ctid,pk}|00:50:03.972343{ttt}|00:27:46.530726{tt}|00:26:10.077933{........pg.dropped.2........}|00:25:36.788619

It is interesting to note an apparent inconsistency: the table has been modified 2 seconds ago while the columns have been touched between 25 and 50 minutes ago. How is that possible? Well, VACUUM FULL has rewritten the table but metadata about columns did not change.
In short, this is an indicator about VACUUM FULL execution: if the change time of a table is earlier than that of its columns probably vacuum ran. The correct way to know whenVACUUM FULL run is to inspect appropriate catalogs like pg_stat_user_tables. In any case, combining these information help understanding what happened into the system.

Let’s see something about VACUUM:

testdb=>SELECTage(xmin)astransaction_before,xminasit_was_transaction_number,current_timestamp-pg_xact_commit_timestamp(xmin)asmodified_since,relnameastableFROMpg_classWHERErelkind='r'ANDrelname='ddl_test';-[RECORD1]-------------+----------------transaction_before|11it_was_transaction_number|2359180423modified_since|00:20:41.953205table|ddl_testtestdb=>VACUUMddl_test;VACUUMtestdb=>SELECTage(xmin)astransaction_before,xminasit_was_transaction_number,current_timestamp-pg_xact_commit_timestamp(xmin)asmodified_since,relnameastableFROMpg_classWHERErelkind='r'ANDrelname='ddl_test';-[RECORD1]-------------+----------------transaction_before|11it_was_transaction_number|2359180423modified_since|00:20:51.272209table|ddl_test

The result is that pg_class is unchanged, with regard to the transaction that generated the tuple.
Why?
Since VACUUM is a command that cannot be run within a transaction, it cannot be considered in the described workflow, therefore it is like an invisible command (with regard to transactions).

What about `ANALYZE`?

Unlike VACUUM, the command ANALYZE can be run in a transaction, and this is clearly shown by the age result increasing by one:

testdb=>SELECTage(xmin)astransaction_before,xminasit_was_transaction_number,current_timestamp-pg_xact_commit_timestamp(xmin)asmodified_since,relnameastableFROMpg_classWHERErelkind='r'ANDrelname='ddl_test';-[RECORD1]-------------+----------------transaction_before|14it_was_transaction_number|2359180423modified_since|01:37:54.483495table|ddl_testtestdb=>ANALYZEddl_test;ANALYZEtestdb=>SELECTage(xmin)astransaction_before,xminasit_was_transaction_number,current_timestamp-pg_xact_commit_timestamp(xmin)asmodified_since,relnameastableFROMpg_classWHERErelkind='r'ANDrelname='ddl_test';-[RECORD1]-------------+----------------transaction_before|15it_was_transaction_number|2359180423modified_since|01:38:05.267443table|ddl_test

What is not changing in the above example is the transaction that generated the tuple in pg_class: it is always 2359180423, before and after the ANALYZE command (that did run in a transaction).
Why?
Well, ANALYZEhits another table: pg_statistic. Such table is the root of all statistical information like pg_stat_user_tables and friends, and is the one updated by ANALYZE. This can be clearly inspected with a similar query:

testdb=#SELECTxmin,age(xmin),staattnumFROMpg_statisticWHEREstarelid='ddl_test'::regclass;xmin|age|staattnum------------+-----+-----------2359180437|1|12359180437|1|32359180437|1|4

Please note that the query has been run as a superuser, because of the need of privileges. The result set is made of three rows because there are three “active” (i.e., not dropped) columns within the table, and all of them has been modified (from a statistic point of view) by ANALYZE, that ran in transaction 2359180437 that is now one transaction far (i.e., it was the previous transaction).

Conclusions

Keeping track of commit timestamps could be useful for database introspection, at least to get a glance at when things changed.
The same trick can also be used against regular table tuples, to get an idea of when a tuple appeared in that form in the table.
However, this is not a very good approach, and something much more complex can be built like using already mentioned event triggers.
But hey, this is PostgreSQL: you can extend it in pretty much any direction!

Luca Ferrari: Monitoring Schema Changes via Last Commit Timestamp

Monitoring Schema Changes via Last Commit Timestamp

Introduction to `pg_last_committed_xact()`

Tracking DDL Commands

Going Deeper: Introspection Against Columns

What about `VACUUM`?

What about `ANALYZE`?

Conclusions

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112

Monitoring Schema Changes via Last Commit Timestamp

Introduction to pg_last_committed_xact()

Tracking DDL Commands

Going Deeper: Introspection Against Columns

What about VACUUM?

What about ANALYZE?

Conclusions

Trending Articles

Introduction to `pg_last_committed_xact()`

What about `VACUUM`?

What about `ANALYZE`?