Channel: Planet PostgreSQL

Ibrar Ahmed: Multi-Master Replication Solutions for PostgreSQL

Due to the immense growth of data, scalability has become one of the hottest topics in the field of databases. Scalability can be achieved horizontally or vertically. Vertical scalability means adding more resources/hardware to an existing node to enhance the capability of the database to store and process more data, for example, adding a new processor, memory, or disk to an existing node. Every DBMS engine improves its vertical scalability by improving the locking/mutex mechanisms and the concurrency with which it can use the newly added resources more effectively. Database engines also provide configuration parameters, which help utilize the available hardware resources more effectively.

Because of the cost of hardware and the limits on adding new hardware to an existing node, it is not always possible to scale vertically. Therefore, horizontal scalability is required, which means adding more nodes to the existing network of nodes instead of enhancing the capability of an existing node.

Contrary to vertical scalability, horizontal scalability is challenging to implement. It requires more development effort and more work to set up. PostgreSQL provides quite a rich feature set for both vertical and horizontal scalability. It supports machines with multiple processors and large amounts of memory and provides configuration parameters to manage the use of these resources. The new parallelism features make PostgreSQL's vertical scalability more prominent, but it is not lacking in horizontal scalability either. Replication is the crucial pillar of horizontal scalability, and PostgreSQL supports unidirectional master-slave replication, which is enough for many use cases.

Key Concepts

Database Replication

Database replication copies data to other servers so that it is stored on more than one node. In that process, the database instance is transferred from one node to another and an exact copy is made. Data replication is used to improve data availability, a key feature of HA. Either a full database instance or only some frequently used or desired objects can be replicated to another server. Because replication provides multiple consistent copies of the database, it not only provides high availability but also improves performance.

Synchronous replication

There are two strategies for writing the data to the replica: synchronous and asynchronous. Synchronous replication means data is written to the master and the slave at the same time; in other words, "synchronous replication" means the commit waits for the write/flush on the remote side. Synchronous replication is used in high-end transactional environments with instant failover requirements.

Asynchronous replication

Asynchronous means data is written to the master first and then replicated to the slave. In the case of a crash, data loss can occur, but asynchronous replication adds very little overhead and is therefore acceptable in most cases. It does not overburden the master. Failover from primary to secondary takes longer than with synchronous replication.

In a nutshell, the main difference between synchronous and asynchronous replication is when the commit is acknowledged relative to the data being written to the master and the slave.
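
In PostgreSQL streaming replication, this choice is controlled by the synchronous_standby_names and synchronous_commit settings. A minimal sketch (assuming a standby is already streaming with application_name 'standby1'):

ALTER SYSTEM SET synchronous_standby_names = 'standby1';  -- commits now wait for standby1
SELECT pg_reload_conf();
-- resetting synchronous_standby_names to '' switches back to asynchronous replication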

Single Master Replication

Single-master replication means data is allowed to be modified only on a single node, and these modifications are replicated to one or more nodes. Updates and inserts are only possible on the master node. In that case, applications need to route write traffic to the master, which adds some complexity to the application. Because there is a single master, there is no chance of conflicts. Most of the time single-master replication is enough for an application because it is less complicated to configure and manage, but in some cases single-master replication is not enough and you need multi-master replication.

Multi-Master Replication 

Multi-master replication means there is more than one node that acts as a master node. Data is replicated between nodes, and updates and inserts can be performed on a group of master nodes. In that case, there are multiple copies of the data. The system is also responsible for resolving any conflicts that occur between concurrent changes. There are two main reasons to have multi-master replication: one is HA, and the second is performance. In most cases, some nodes are dedicated to intensive write operations, while other nodes are dedicated to reads or kept for failover.

Here are the pros and cons of multi-master replication:

Pros:

  • In case one master fails, another master is there to serve updates and inserts.
  • Master nodes are located in several different locations, so the chance of all masters failing is very minimal.
  • Data updates are possible on multiple servers.
  • The application does not need to route its traffic to only a single master.

Cons:

  • The main disadvantage of multi-master replication is its complexity.
  • Conflict resolution is very difficult because simultaneous writes on multiple nodes are possible.
  • Sometimes manual intervention is required in case of conflict.
  • There is a chance of data inconsistency.

As we have already discussed, single-master replication is enough in most cases and is highly recommended, but there are still some cases where multi-master replication is required. PostgreSQL has built-in single-master replication, but unfortunately there is no multi-master replication in mainline PostgreSQL. There are some multi-master replication solutions available; some of these come in the form of applications or extensions, and some are PostgreSQL forks. These forks have their own small communities and are mostly managed by a single company, not by the mainline PostgreSQL community.

These solutions fall into multiple categories: open-source or closed-source, proprietary, free, and paid.

  1. BDR (Bi-Directional Replication)
  2. xDB
  3. PostgreSQL-XL
  4. PostgreSQL-XC / PostgreSQL-XC2
  5. Rubyrep
  6. Bucardo

Here are some key features of these replication solutions.

1- BDR (Bi-Directional Replication)

BDR is a multi-master replication solution, and it has different versions. Early versions of BDR are open-source, but its latest version is closed-source. This solution is developed by 2ndQuadrant and is one of the most elegant multi-master solutions to date. BDR provides asynchronous multi-master logical replication, based on the PostgreSQL logical-decoding feature. Since BDR essentially replays transactions on the other nodes, the replay operation can fail if there is a conflict between the transaction being applied and a transaction that was committed on the receiving node.

2 – xDB

EnterpriseDB developed its own bi-directional replication solution in Java, called xDB. It is based on their own protocol. Because it is a closed-source solution, no design information is publicly known.

  • Developed and maintained by EnterpriseDB.
  • Developed in Java; people have complained about its performance.
  • Completely closed-source, proprietary software.
  • xDB Replication Server consists of multiple executable programs.
  • Failover time is not acceptable.
  • A user interface is available for configuring and maintaining the replication system.

3 – PostgreSQL XC/XC2

PostgreSQL-XC was developed by EnterpriseDB and NTT. It is a synchronous replication solution. Postgres-XC is an open-source project that provides a write-scalable, synchronous, symmetric, and transparent PostgreSQL cluster solution. I have not seen much development in PostgreSQL-XC from EnterpriseDB and NTT for many years; currently, Huawei is working on it. Some performance gain has been reported for OLAP workloads, but it is not suitable for high-TPS workloads.

4 – PostgreSQL XL

It is a fork of PostgreSQL-XC and is currently supported by 2ndQuadrant. It is well behind community PostgreSQL. As far as I know, it is based on PostgreSQL 10.6, which is not aligned with the latest PostgreSQL release, PostgreSQL 12. Since it is based on PostgreSQL-XC, it is very good for OLAP workloads, but not very suitable for high TPS.

Note: PostgreSQL XC/XC2/XL are all considered "PostgreSQL-derived software" and are not synchronized with the current development of PostgreSQL.

5 – Rubyrep

It is an asynchronous master-master replication solution developed by Arndt Lehmann. It claims the easiest configuration and setup, and it works across platforms including Windows. It always operates on two servers, which are referred to as "left" and "right" in Rubyrep terminology, so it is more accurate to call it a "2-master" setup rather than "multi-master".

  • Rubyrep can continuously replicate changes between the left and right databases.
  • Automatically sets up the necessary triggers, log tables, etc.
  • Automatically discovers newly added tables and synchronizes the table content.
  • Automatically reconfigures sequences to avoid duplicate key conflicts.
  • Tracks changes to primary key columns.
  • Can implement both master-slave and master-master replication.
  • Pre-built conflict resolution methods are available: left/right wins; earlier/later change wins.
  • Custom conflict resolution can be specified via Ruby code snippets.
  • Replication decisions can optionally be logged in the rubyrep event log table.

Note: This project has not been active, in terms of development, for the last three years.

6 – Bucardo

Bucardo is a trigger-based replication solution developed by Jon Jensen and Greg Sabino Mullane of End Point Corporation. Bucardo has been around for almost 20 years now and was originally designed as a "lazy" asynchronous solution that eventually replicates all changes. A Perl daemon listens for NOTIFY requests and acts on them. Changes happening on tables are recorded in a table (bucardo_delta), which notifies the daemon. The daemon notifies the controller, which spins up a "kid" to synchronize the table changes. If there is a conflict, standard or custom conflict handlers are used to sort it out.

  • Trigger-based replication
  • Conflict resolution policy
  • Dependency on Perl 5, DBI, DBD::Pg, DBIx::Safe.
  • Installation and configuration are complex.
  • Replication breaks often and is buggy.

Conclusion

In most cases single-master replication is enough, yet it has been observed that people configure multi-master replication and over-complicate their design. It is highly recommended to design the system to avoid multi-master replication and only use it where there is no way to design the system without it. There are two reasons: first, it makes the system overly complex and hard to debug, and second, you will not get any support from the PostgreSQL community, because there are no community-maintained multi-master replication solutions available.


Hans-Juergen Schoenig: Composite type performance issues in PostgreSQL

PostgreSQL is a really powerful database and offers many features to make SQL even more powerful. One of these impressive things is the concept of a composite data type. In PostgreSQL a column can be a fairly complex thing. This is especially important if you want to work with server side stored procedures or functions. However, there are some details people are usually not aware of when making use of stored procedures and composite types.

Composite data types in PostgreSQL

Before we dive into the main topic of this post, I want to give you a mini introduction to composite types in general:


test=# CREATE TYPE person AS (id int, name text, income numeric);
CREATE TYPE

I have created a simple data type to store persons. The beauty is that the composite type can be seen as one column:

test=# SELECT '(10, "hans", 500)'::person;
person
------------------
(10," hans",500)
(1 row)

However, it is also possible to break it up again and represent it as a set of fields.


test=# SELECT ('(10, "hans", 500)'::person).*;
id | name | income
----+-------+--------
10 | hans | 500
(1 row)

A composite type can be used as part of a table just like any other data type. Here is how it works:


test=# CREATE TABLE data (p person, gender char(1));
CREATE TABLE
test=# \d data
Table "public.data"
Column | Type | Collation | Nullable | Default
--------+--------------+-----------+----------+---------
p | person | | |
gender | character(1) | | |

As you can see the column type is “person”.
Armed with this kind of information we can focus our attention on performance. In PostgreSQL a composite type is often used in conjunction with stored procedures to abstract values passed to a function or to handle return values.
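
As a quick illustration of that pattern (a minimal sketch; make_person is a hypothetical helper, not part of the example above), a function can accept plain arguments and hand back the composite type as a single value:

CREATE FUNCTION make_person(i int, n text, inc numeric)
    RETURNS person
    AS $$ SELECT ROW(i, n, inc)::person $$
    LANGUAGE sql;

SELECT make_person(1, 'joe', 2000);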

Be careful with database performance

Why is that important? Let me create a table containing 3 million entries:


test=# CREATE TABLE x (id int);
CREATE TABLE
test=# INSERT INTO x SELECT *
FROM generate_series(1, 3000000);
INSERT 0 3000000
test=# vacuum ANALYZE ;
VACUUM

pgstattuple is an extension which is especially useful if you want to detect bloat in a table. It makes use of a composite data type to return data. Installing the extension is easy:


test=# CREATE EXTENSION pgstattuple;
CREATE EXTENSION

What we want to do next is to inspect the content of “x” and see the data (all fields). Here is what you can do:


test=# explain analyze SELECT (pgstattuple('x')).*;
QUERY PLAN
-------------------------------------------------------------------------------------------
Result (cost=0.00..0.03 rows=1 width=72) (actual time=1909.217..1909.219 rows=1 loops=1)
Planning Time: 0.016 ms
Execution Time: 1909.279 ms
(3 rows)

Wow, it took close to 2 seconds to generate the result. Why is that the case? Let us take a look at a second example:


test=# explain analyze SELECT * FROM pgstattuple('x');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Function Scan on pgstattuple (cost=0.00..0.01 rows=1 width=72) (actual time=212.056..212.057 rows=1 loops=1)
Planning Time: 0.019 ms
Execution Time: 212.093 ms
(3 rows)

Ooops? What happened? If we put the function call in the FROM clause, the database is significantly faster. The same is true if we use a subselect:


test=# explain analyze SELECT (y).* FROM (SELECT pgstattuple('x') ) AS y;
QUERY PLAN
-----------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=32) (actual time=209.666..209.666 rows=1 loops=1)
Planning Time: 0.034 ms
Execution Time: 209.698 ms
(3 rows)

Let us analyze the reasons for this behavior!

PostgreSQL: Expanding the FROM clause

The problem is that PostgreSQL expands the (pgstattuple('x')).* expression column by column. It actually turns it into …

…

(pgstattuple('x')).table_len,
(pgstattuple('x')).tuple_count,
(pgstattuple('x')).tuple_len,
(pgstattuple('x')).tuple_percent,

…

As you can see, the function is called more often in this case which of course explains the runtime difference. Therefore it makes a lot of sense to understand what is going on under the hood here. The performance improvement can be quite dramatic. We have seen a couple of cases in PostgreSQL support recently which could be related to this kind of behavior.
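
The same reasoning applies when the function has to be called for many rows: putting it in the FROM clause with LATERAL (a minimal sketch, reusing the table "x" from above) evaluates pgstattuple only once per input row instead of once per output column:

SELECT c.relname, s.*
FROM pg_class AS c,
     LATERAL pgstattuple(c.oid) AS s
WHERE c.relname = 'x';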

Finally …

If you want to know more about performance consider checking out my blog post about CREATE INDEX and parallelism.

The post Composite type performance issues in PostgreSQL appeared first on Cybertec.

Bruce Momjian: Controlling Connection Parameters Using Environment Variables

Libpq is the Postgres connection library used by almost every non-JDBC application. It allows many connection parameters, which can be specified on the command line or embedded in applications:

$ psql -h myhost.com -d mydb
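
The same connection can also be described purely with environment variables; a minimal sketch using the standard PGHOST and PGDATABASE variables:

$ export PGHOST=myhost.com
$ export PGDATABASE=mydb
$ psql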

Continue Reading »

Craig Kerstiens: Control Runaway Postgres Queries With Statement Timeout

Most queries against a database are short lived. Whether you're inserting a new record or querying for a list of upcoming tasks for a user, you're not typically aggregating millions of records or sending back thousands of rows to the end user. A typical short lived query in Postgres can easily be accomplished in a few milliseconds or less. For the typical application, this means a well tuned production Postgres database is capable of easily running thousands or up to hundreds of thousands of queries per second on a beefy instance. 
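
A minimal sketch of how statement_timeout is typically applied (the role name below is hypothetical):

SET statement_timeout = '30s';                           -- cap statements in this session
ALTER ROLE app_readonly SET statement_timeout = '30s';   -- or make it the default for a role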

Magnus Hagander: Repartitioning with logical replication in PostgreSQL 13

So, you have a partitioned table. And you want to change your mind. Re-partitioning is "easy" if you can take downtime -- just create a new table with a new name and copy all the data over. But what if we want to try to do it without downtime? Logical replication enhancements in PostgreSQL 13 bring us some new options for this!

But first a disclaimer -- this is definitely not pretty! And does not take into consideration things like foreign keys and similar. But sometimes a quick hack can be the best hack.

So let's go!
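
To give a flavor of the building blocks (a minimal sketch only, not the exact recipe from this post; the table and object names are made up, and it assumes the re-partitioned copy lives where a subscription can reach the original), PostgreSQL 13 lets a partitioned table be published through its root:

-- on the source
CREATE PUBLICATION repart_pub FOR TABLE measurements
    WITH (publish_via_partition_root = true);

-- on the target, after creating measurements with the new partitioning scheme
CREATE SUBSCRIPTION repart_sub
    CONNECTION 'host=source-host dbname=app'
    PUBLICATION repart_pub;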

Semab Tariq: How to use AdaBoost Machine Learning model with 2UDA – PostgreSQL and Orange (Part 6)

This article gives a step by step guide to utilizing Machine Learning capabilities with 2UDA. In this article, we will use examples of Animals to predict whether they are Mammals, Birds, Fish or Insects. Software versions We are going to use 2UDA version 11.6-1 to implement the Machine Learning model. This version 11.6-1 combines: PostgreSQL […]

movead li: Transactions in PostgreSQL and their mechanism

Transactions are the most basic concept of a database. Using the begin and end commands in PostgreSQL, you can start and commit a transaction; this is the most common kind of PostgreSQL transaction. In addition, there are subtransaction, multi-transaction, and 2PC transaction concepts in PostgreSQL. In this blog, I will demonstrate the scenarios in which these PostgreSQL transactions appear and how the kernel implements them.

Normal Transaction

When you use the PostgreSQL client to connect to the PostgreSQL server, transaction auto-commit is enabled by default; that is to say, every DML statement executed will automatically complete the commit process. You can use the \set AUTOCOMMIT off command to turn off auto-commit, or you can use the begin command to open a transaction block.

\echo :AUTOCOMMIT
create table t1(i int);
insert into t1 values(1);
rollback;
select * from t1;
\set AUTOCOMMIT off
\echo :AUTOCOMMIT
insert into t1 values(2);
rollback;
select * from t1;
insert into t1 values(22);
commit;
select * from t1;
commit;
\set AUTOCOMMIT on
\echo :AUTOCOMMIT
begin;
insert into t1 values(3);
rollback;
select * from t1;
begin;
insert into t1 values(33);
commit;
select * from t1;
postgres=# \echo :AUTOCOMMIT
on
postgres=# create table t1(i int);
CREATE TABLE
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# rollback;
2020-06-08 15:15:50.261 CST [29689] WARNING:  there is no transaction in progress
WARNING:  there is no transaction in progress
ROLLBACK
postgres=# select * from t1;
 i
---
 1
(1 row)
postgres=# -- When auto-commit is on, the DML statement we executed has already been committed, and rollback has no impact on us
postgres=# \set AUTOCOMMIT off
postgres=# \echo :AUTOCOMMIT
off
postgres=# insert into t1 values(2);
INSERT 0 1
postgres=# rollback;
ROLLBACK
postgres=# select * from t1;
 i
---
 1
(1 row)
postgres=# insert into t1 values(22);
INSERT 0 1
postgres=# commit;
COMMIT
postgres=# select * from t1;
 i
----
  1
 22
(2 rows)
postgres=# -- When auto-commit is off, every DML statement we execute is rolled back or committed explicitly
postgres=# commit;
COMMIT
postgres=# \set AUTOCOMMIT on
postgres=# \echo :AUTOCOMMIT
on
postgres=# begin;
BEGIN
postgres=# insert into t1 values(3);
INSERT 0 1
postgres=# rollback;
ROLLBACK
postgres=# select * from t1;
 i
----
  1
 22
(2 rows)
postgres=# begin;
BEGIN
postgres=# insert into t1 values(33);
INSERT 0 1
postgres=# commit;
COMMIT
postgres=# select * from t1;
 i
----
  1
 22
 33
(3 rows)
postgres=# -- In a begin transaction block, the DML statements we execute can be rolled back or committed

Now we use the pageinspect tool to look at the data in the t1 table.

postgres=# select lp,t_xmin,t_xmax,t_ctid,t_infomask,t_data from heap_page_items(get_raw_page('t1',0));
 lp | t_xmin | t_xmax | t_ctid | t_infomask |   t_data
----+--------+--------+--------+------------+------------
  1 |    625 |      0 | (0,1)  |       2304 | \x01000000
  2 |    626 |      0 | (0,2)  |       2560 | \x02000000
  3 |    627 |      0 | (0,3)  |       2304 | \x16000000
  4 |    628 |      0 | (0,4)  |       2560 | \x03000000
  5 |    629 |      0 | (0,5)  |       2304 | \x21000000
(5 rows)

postgres=# select ctid,xmin,* from t1;
 ctid  | xmin | i
-------+------+----
 (0,1) |  625 |  1
 (0,3) |  627 | 22
 (0,5) |  629 | 33
(3 rows)

Through the pageinspect tool, you can see that the t1 table holds 5 rows, but the query result shows only 3 rows. According to the operations above, transaction 626 and transaction 628 were rolled back. Therefore, the MVCC mechanism of PostgreSQL checks the commit status of transactions 626 and 628, and the two records with t_ctid=(0,2) and t_ctid=(0,4) are not visible. We will not discuss the whole MVCC mechanism here; our focus is on how PostgreSQL obtains the commit status of each transaction.
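
Since PostgreSQL 10 you can also check this from SQL with txid_status(), which reports whether a recent transaction was committed, aborted, or is still in progress; for the session above, a query like this would show 626 as aborted and 627 as committed:

postgres=# select txid_status(626), txid_status(627);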

There is a pg_xact folder in the pgdata directory of PostgreSQL:

movead@movead-PC:/h2/data/pg_xact$ ll
-rw-------+ 1 movead movead 8192 6月  8 15:23 0000
movead@movead-PC:/h2/data/pg_xact$

The files here record the commit status of transactions. Two bits record the status of one transaction, so one byte can record four transactions. The size of each file is specified as 32*BLCKSZ in the kernel. As shown above, the '0000' file can hold transactions between 1 and 32*BLCKSZ*4. Therefore, PostgreSQL can get the commit status of a transaction from the pg_xact directory.
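
For example, with the default BLCKSZ of 8192, a single pg_xact segment covers 32 * 8192 * 4 = 1,048,576 transactions.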

Subtransaction

Subtransactions appear together with savepoints or with functions that have an exception block. Let's use a savepoint as an example:

postgres=# truncate t1;
TRUNCATE TABLE
postgres=# select txid_current();
 txid_current
--------------
        10495
(1 row)
postgres=# begin;
BEGIN
postgres=# select txid_current();
 txid_current
--------------
        10496
(1 row)
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# savepoint s1;
SAVEPOINT
postgres=# insert into t1 values(2);
INSERT 0 1
postgres=# savepoint s2;
SAVEPOINT
postgres=# insert into t1 values(3);
INSERT 0 1
postgres=# savepoint s3;
SAVEPOINT
postgres=# insert into t1 values(4);
INSERT 0 1
postgres=# select txid_current();
 txid_current
--------------
        10496
(1 row)
postgres=# commit;
COMMIT
postgres=# select ctid,xmin,* from t1;
 ctid  | xmin  | i
-------+-------+---
 (0,1) | 10496 | 1
 (0,2) | 10497 | 2
 (0,3) | 10498 | 3
 (0,4) | 10499 | 4
(4 rows)

We can see that the four records we inserted in the same transaction block have different transaction IDs: the first transaction, 10496, is the parent transaction, and 10497, 10498, and 10499 are subtransactions. A subtransaction is also a special kind of ordinary transaction; it has a commit status and writes that status to the pg_xact directory. After the parent transaction commits or aborts, these subtransactions behave the same as normal transactions.

The difference is the commit process: the parent and all child transactions should be marked as committed or aborted together (subtransactions that were rolled back in the middle are not counted as children here). To ensure the atomicity of the whole transaction, PostgreSQL uses the following commit sequence for subtransactions:

  1. Mark the subtransaction status as sub-committed (TRANSACTION_STATUS_SUB_COMMITTED).
  2. Mark the parent transaction status as committed (TRANSACTION_STATUS_COMMITTED).
  3. Mark the subtransaction status as committed (TRANSACTION_STATUS_COMMITTED).

The method to determine whether a subtransaction is committed is as follows:

If the subtransaction's status shown in the pg_xact directory is sub-committed, you need to look up the parent transaction ID in the pg_subtrans directory and judge the commit status of the child transaction according to the commit status of the parent transaction. If the subtransaction's status shown in the pg_xact directory is 'committed', then the subtransaction is committed. Otherwise, the subtransaction is uncommitted.

Finally, let’s take a look at the pg_subtrans directory that records the correspondence between subtransactions and parent transactions:

movead@movead-PC:/h2/data/pg_subtrans$ ll
-rw-------+ 1 movead movead 49152 6月  8 17:33 0000
movead@movead-PC:/h2/data/pg_subtrans$

In this file, 4 bytes record the parent transaction ID for every subtransaction. Here, too, the size of each file is specified as 32*BLCKSZ in the kernel, so each such file can hold 32*BLCKSZ/4 transactions (65,536 with the default BLCKSZ of 8192).

Multi-transaction

Multi-transactions occur because of row-level locks: when multiple sessions take row-level locks on the same row, a multi-transaction appears. Below we use an example to demonstrate how a multi-transaction emerges.

-- Create a test table, insert data, and use the pageinspect tool to view the data layout on disk
postgres=# create table t1(i int, j text);
CREATE TABLE
postgres=# insert into t1 values(1,'PostgreSQL');
INSERT 0 1
postgres=# insert into t1 values(2,'Postgres');
INSERT 0 1
postgres=# insert into t1 values(3,'pg');
INSERT 0 1
postgres=# select t_ctid,t_infomask2,t_infomask,t_xmin,t_xmax,t_data from heap_page_items(get_raw_page('t1',0));
 t_ctid | t_infomask2 | t_infomask | t_xmin | t_xmax |              t_data
--------+-------------+------------+--------+--------+----------------------------------
 (0,1)  |           2 |       2050 |    500 |      0 | \x0100000017506f737467726553514c
 (0,2)  |           2 |       2050 |    501 |      0 | \x0200000013506f737467726573
 (0,3)  |           2 |       2050 |    502 |      0 | \x03000000077067
(3 rows)
-- Open a transaction block in one session, then lock a row
postgres=# begin;
BEGIN
postgres=# select * from t1 where i = 2 for share;
 i |    j
---+----------
 2 | Postgres
(1 row)
postgres=# select t_ctid,t_infomask2,t_infomask,t_xmin,t_xmax,t_data from heap_page_items(get_raw_page('t1',0));
 t_ctid | t_infomask2 | t_infomask | t_xmin | t_xmax |              t_data
--------+-------------+------------+--------+--------+----------------------------------
 (0,1)  |           2 |       2306 |    500 |      0 | \x0100000017506f737467726553514c
 (0,2)  |           2 |        466 |    501 |    504 | \x0200000013506f737467726573
 (0,3)  |           2 |       2306 |    502 |      0 | \x03000000077067
(3 rows)
postgres=# select txid_current();
 txid_current
--------------
          504

Here we can see that the t_xmax of t_ctid=(0,2) has changed.

-- In another session, do the same
postgres=# begin;
BEGIN
postgres=# select * from t1 where i = 2 for share;
 i |    j
---+----------
 2 | Postgres
(1 row)
postgres=# select txid_current();
 txid_current
--------------
          505
(1 row)
postgres=# select t_ctid,t_infomask2,t_infomask,t_xmin,t_xmax,t_data from heap_page_items(get_raw_page('t1',0));
 t_ctid | t_infomask2 | t_infomask | t_xmin | t_xmax |              t_data
--------+-------------+------------+--------+--------+----------------------------------
 (0,1)  |           2 |       2306 |    500 |      0 | \x0100000017506f737467726553514c
 (0,2)  |           2 |       4562 |    501 |      2 | \x0200000013506f737467726573
 (0,3)  |           2 |       2306 |    502 |      0 | \x03000000077067
(3 rows)

We find that t_xmax of the row with t_ctid=(0,2) has become 2. This 2 is a multi-transaction ID, indicating that multiple transactions (504, 505) have locked this row; whether multi-transaction 2 is committed depends on the t_infomask value and on the commit status of (504, 505). Let's explore how the 2 and (504, 505) are connected in PostgreSQL.

movead@movead-PC:/h2/data/pg_multixact$ tree .
.
├── members
│   └── 0000
└── offsets
    └── 0000

2 directories, 2 files
movead@movead-PC:/h2/data/pg_multixact$

The files shown above exist in the data directory; they store the mapping between each multi-transaction and its corresponding list of transactions. The attributes of the transactions in that list determine the commit status of the multi-transaction.

2PC Transaction

The two-phase commit (2PC) transaction is a necessary condition for implementing distributed commits. Similarly, the following will demonstrate what a 2PC transaction is and how PostgreSQL implements a 2PC transaction.

postgres=# begin;
BEGIN
postgres=# insert into t1 values(1,'PostgreSQL');
INSERT 0 1
postgres=# select * from t1;
 i |     j
---+------------
 1 | PostgreSQL
(1 row)
postgres=# select t_ctid,t_infomask2,t_infomask,t_xmin,t_xmax,t_data from heap_page_items(get_raw_page('t1',0));
 t_ctid | t_infomask2 | t_infomask | t_xmin | t_xmax |              t_data
--------+-------------+------------+--------+--------+----------------------------------
 (0,1)  |           2 |       2050 |    537 |      0 | \x0100000017506f737467726553514c
(1 row)
postgres=# prepare transaction 'test_2pc_trans';
PREPARE TRANSACTION
postgres=# select * from t1;
 i | j
---+---
(0 rows)
postgres=# commit prepared 'test_2pc_trans';
COMMIT PREPARED
postgres=# select * from t1;
 i |     j
---+------------
 1 | PostgreSQL
(1 row)

Execute the above SQL in the same session: prepare transaction 'test_2pc_trans' is the first-phase commit, and commit prepared 'test_2pc_trans' is the second-phase commit.

After the first-phase commit, we have exited the transaction block but the transaction has not been fully committed, so we cannot see the data inserted in the transaction block. After the second-phase commit, the 2PC transaction is complete, so the data is visible again.

The 2PC transaction is also a special kind of ordinary transaction. The commit status of a 2PC transaction is judged the same way as for a normal transaction, except that the 2PC transaction has a 'self-protection mechanism': a 2PC transaction can survive independently of the session connection, and even a database restart cannot affect it. In order to implement this protection mechanism, PostgreSQL creates a file to save the transaction data of long-lived 2PC transactions:

movead@movead-PC:/h2/data/pg_twophase$ ll
-rw-------+ 1 movead movead 252 6月  9 16:17 00000219
movead@movead-PC:/h2/data/pg_twophase$

The above file 00000219 is the storage file created for the 2PC transaction 537 (0x219). If PostgreSQL restarts, it will load all 2PC transactions from the pg_twophase directory into memory.
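
While a 2PC transaction is waiting between the two phases, it is also visible in the pg_prepared_xacts system view; for the example above, this query would show transaction 537 with the gid 'test_2pc_trans' until it is committed or rolled back:

postgres=# select transaction, gid, prepared from pg_prepared_xacts;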

Summary

In this blog, I recorded, from a relatively simple perspective, how PostgreSQL normal transactions, subtransactions, multi-transactions, and 2PC transactions are used, the characteristics of each transaction type, and how their status is stored.

Movead Li is a kernel developer at Highgo Software. Since joining Highgo Software in 2016, Movead has spent most of his time researching the Postgres code and is good at 'Write Ahead Log' and 'Database Backup And Recovery'. Based on this experience, Movead has released two open-source tools for the Postgres database. One is Walminer, which can analyze history WAL files to SQL. The other is pg_lightool, which can do single-table or block recovery based on a base backup and WAL files, or WAL files only.

Now he has joined the HighGo community team and hopes to make more contributions to the community in the future.

The post Transactions in PostgreSQL and their mechanism appeared first on Highgo Software Inc..

Luca Ferrari: Running pgbackrest on FreeBSD

pgbackrest is an amazing backup solution for PostgreSQL; quite frankly, it is my favourite. And now it fully supports FreeBSD too!

Running pgbackrest on FreeBSD

pgbackrest is an amazing tool for backup and recovery of a PostgreSQL database. Quite frankly, it is my favourite backup solution because it is reliable, fast and supports a lot of interesting features including retention policies and encryption.
I have already written about some problems in running pgbackrest on FreeBSD, and the problems were not related to the application itself, but rather to the compilation process.
I’m really glad that now pgbackrest fully supports non-Linux platforms, including FreeBSD, thanks to the changes in the compilation approach. It is therefore a simple process to get pgbackrest installed on your FreeBSD machine!

Installing pgbackrest on FreeBSD

In order to see how simple it is now to install pgbackrest on FreeBSD, let's download the latest stable release, 2.27, and install it. The only caveat is that the project needs to be compiled with GNU make, which means you have to type gmake instead of the usual make:

% wget https://github.com/pgbackrest/pgbackrest/archive/release/2.27.tar.gz
% tar xzvf 2.27.tar.gz
% cd pgbackrest-release-2.27

% cd src
% ./configure --prefix=/usr/local/pgbackrest
% gmake
% sudo gmake install

I’ve decided to install it on a specific path, /usr/local/pgbackrest, just to avoid messing with other binaries, but you can install it in the default FreeBSD location /usr/local/. If everything was successful, you can then proceed to test the program:

% export PATH=/usr/local/pgbackrest/bin:$PATH

% pgbackrest
pgBackRest 2.27 - General help

Usage:
    pgbackrest [options] [command]

Commands:
    archive-get     Get a WAL segment from the archive.
    archive-push    Push a WAL segment to the archive.
    backup          Backup a database cluster.
    check           Check the configuration.
    expire          Expire backups that exceed retention.
    help            Get help.
    info            Retrieve information about backups.
    restore         Restore a database cluster.
    stanza-create   Create the required stanza data.
    stanza-delete   Delete a stanza.
    stanza-upgrade  Upgrade a stanza.
    start           Allow pgBackRest processes to run.
    stop            Stop pgBackRest processes from running.
    version         Get version.

Use 'pgbackrest help [command]' for more information.

Great! Installing on FreeBSD is now really simple!

Some recent history about pgbackrest

In the last few months the project was deeply improved, and I'm not going to quote the whole release history here. However, there are two major aspects that I found really interesting.

Autoconf

As you probably have noted in the above installation example, pgbackrest now uses autoconf to understand how to correctly configure the project for the hosting operating system. Autoconf was introduced in the previous year as a reaction to a pull request I opened to compile on FreeBSD.

Migrating to C

pgbackrest was initially developed mainly in Perl, with little parts written in C to deal with performances and internals of PostgreSQL WAL files format.
As of January 2020, with release 2.21, the whole codebase is in C. Well, this is not fully true, since the testing and documentation parts are still written in Perl, at least to my understanding, but the whole pgbackrest production code is now in C.
The fact that the application is now written in C makes a clear distinction between pgbackrest and other similar backup solutions, which instead take advantage of existing tools and act as "glue" between small pieces. Moreover, it means that the backup, and most notably the restore, can run at full speed.

My little messy contribution

A long time ago… I tried to contribute to a requested feature that sounded very easy to implement, and of course it was not!
Since version 2.25 there is the --dry-run flag for the expire command:

Add --dry-run option to the expire command. Use dry-run to see which backups/archive would be removed by the expire command without actually removing anything. (Contributed by Cynthia Shang, Luca Ferrari. Reviewed by David Steele. Suggested by Marc Cousin.)

Unluckily, I was unable to complete the effort because I could not get the testing system to work, and it was my fault: I underestimated the problem. But there are two very good pieces of news about this:

  • the project provided me very quick, polite, and constant support in trying to fix my issues;
  • they required me to test my changes instead of doing the testing themselves.

Why is the above good news? First of all, other projects are not so reactive when new contributions come in, and I think this is very important for the project's health. Second, testing a feature means that the project will not introduce regressions, and forcing every developer to test their own changes is a very good habit.

Conclusions

I have already used pgbackrest on FreeBSD, but now that it *natively* supports this platform I believe that the project will attract more and more users. Moreover, now that all the code has been converted to C, the already excellent performance will become even more impressive.

pgbackrest is definitely my backup solution of choice, and not only for its features, but also for the clean and rigorous way the project is maintained and improved.


Bruce Momjian: Connect Parameter Specification Options

I have previously covered the importance of libpq and environment variables. While you can specify discrete connection command-line parameters and environment variables, there is a catch-all setting that allows connection options to be specified in a single string, e.g.:

$ psql -d test
psql (13devel)
Type "help" for help.
 
test=> \q
 
$ psql --dbname test
psql (13devel)
Type "help" for help.
 
test=> \q
 
$ psql 'dbname=test'
psql (13devel)
Type "help" for help.
 
test=> \q
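
Libpq also accepts the same information as a connection URI, so this is equivalent:

$ psql 'postgresql:///test'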

Continue Reading »

Shaun M. Thomas: PG Phriday: 10 Things Postgres Could Improve – Part 1

Postgres is a database software engine with a lot to love, and certainly a non-zero amount to hate. Rather than approaching the topic from a glib and clickbaity angle, let’s see what’s really lurking in those dark recesses, and how we may potentially address them. Though the original blog post calling out Postgres’ faults gave […]

Paul Ramsey: Developers Diary 2

Have you ever watched a team of five-year-olds play soccer? The way the mass of children chases the ball around in a group? I think programmers do that too.

Get the ball!

There’s something about working on a problem together that is so much more rewarding than working separately; we cannot help but get drawn into other people’s problems. There’s a lot of gratification to be had in finding a solution to a shared difficulty!

Even better, different people bring different perspectives to a problem, and illuminate different areas of improvement.

Maximum Inscribed Circle

A couple months ago, my colleague Martin Davis committed a pair of new routines into JTS, to calculate the largest circles that can fit inside a polygon or in a collection of geometries.

We want to bring all the algorithmic goodness of JTS to PostGIS, so I took up the first step, and ported “maximum inscribed circle” to GEOS and to PostGIS.

When I ported the GEOS test cases, I turned up some odd performance problems. The calculation seemed to be taking inordinately long for larger inputs. What was going on?

The “maximum inscribed circle” algorithm leans heavily on a routine called IndexedFacetDistance to calculate distances between polygon boundaries and candidate circle-centers while converging on the “maximum inscribed circle”. If that routine is slow, the whole algorithm will be slow.

Dan Baston, who originally ported the “IndexedFacetDistance” class, got interested and started looking at some test cases of his own.

He found he could improve his old implementation using better memory management that he’d learned in the meantime. He also found some short-circuits to envelope distance calculation that improved performance quite a bit.

In fact, they improved performance so much that Martin ported them back to JTS, where he found that for some cases he could log a 10x performance improvement in distance calculations.

There’s something alchemical about the whole thing.

  • There was a bunch of long-standing code nobody was looking at.
  • I ported an unrelated algorithm which exercised that code.
  • I wrote a test case and reported some profiling information.
  • Other folks with more knowledge were intrigued.
  • They fed their knowledge back and forth and developed more tests.
  • Improvements were found that made everything faster.

I did nothing except shine a light in a dark hole, and everyone else got very excited and things happened.

Toast Caching Redux

In a similar vein, as I described in my last diary entry, a long-standing performance issue in PostGIS was the repeated reading of large geometries during spatial joins.

Much of the problem was solved by dropping a very small “TOAST cache” into the process by which PostGIS reads geometries in functions frequently used in spatial joins.

I was so happy with the improvement the TOAST cache provided that I just stopped. Fortunately, my fellow PostGIS community member Raúl Marín was more stubborn.

Having seen my commit of the TOAST cache, and having done some work in other caching parts of PostGIS, he took up the challenge and integrated the TOAST cache with the existing index caches.

The integrated system now uses TOAST identifiers to note identical repeated inputs and avoid both unnecessary reads off disk and unnecessary checks of the index cache.

The result is that, for spatial joins over large objects, PostGIS 3.1 will be as much as 60x faster than the performance in PostGIS 3.0.

I prepared a demo for a bid proposal this week and found that an example query that took 800ms on my laptop took a full minute on the beefy 16-core demo server. What had I done wrong? Ah! My laptop is running the latest PostGIS code (which will become 3.1) while the cloud server was running PostGIS 2.4. Mystery solved!

Port, Port, Port

I may have mentioned that I’m not a very good programmer.

My current task is definitely exercising my imposter syndrome: porting Martin’s new overlay code from JTS to GEOS.

I knew it would take a long time, and I knew it would be a challenge; but knowing and experiencing are quite different things.

The challenges, as I’ve experienced them are:

  • Moving from Java’s garbage collected memory model to C++’s managed memory model means that I have to understand the object life-cycle which is implicit in Java and make it explicit in C++, all while avoiding accidentally introducing a lot of memory churn and data copying into the GEOS port. Porting isn’t a simple matter of transcribing and papering over syntactic idiom, it involves first understanding the actual JTS algorithms.
  • The age of the GEOS code base, and number of contributors over time, mean that there are a huge number of different patterns to potentially follow in trying to make a “consistent” port to GEOS. Porting isn’t a matter of blank-slate implementation of the JTS code – the ported GEOS code has to slot into the existing GEOS layout. So I have to spend a lot of time learning how previous implementations chose to handle life cycles and call patterns (pass reference, or pointer? yes. Return value? or void return and output parameter? also yes.)
  • My lack of C++ idiom means I spend an excessive amount of time looking up core functions and methods associated with them. This is the only place I’ve felt myself measurably get better over the past weeks.

I’m still only just getting started, having ported some core data structures, and little pieces of dependencies that the overlay needs. The reward will be a hugely improved overlay code for GEOS and thus PostGIS, but I anticipate the debugging stage of the port will take quite a while, even when the code is largely complete.

Wish me luck, I’m going to need it!

If you would like to test the new JTS overlay code, it resides on this branch.
If you would like to watch me suffer as I work on the port, the GEOS branch is here.

Claire Giordano: Release notes for Citus 9.3, the extension that scales out Postgres horizontally

Our latest release to the Citus open source extension to Postgres is Citus 9.3.

If you’re a regular reader of the Citus Blog, you already know Citus transforms Postgres into a distributed database, distributing your data and SQL queries across multiple servers. This post—heavily inspired by the internal release notes that lead engineer Marco Slot circulated internally—is all about what’s new & notable in Citus 9.3.

And if you’re chomping at the bit to get started and try out Citus open source, just go straight to downloading the Citus open source packages for 9.3. Or head over to the Citus documentation to learn more.

Because Citus 9.3 improves our SQL support for window functions in Citus, we decided to add a few “windows” to the racecar in the Citus 9.3 release graphic below.

Citus 9.3 racecar graphic now has “windows”, because of the window function support added in Citus 9.3.

For those who prefer bullets, a summary of all things new in Citus 9.3

Citus 9.3 builds on top of all the HTAP performance goodness in Citus 9.2 and brings you advanced support for distributed SQL, operational improvements, and things that make it easier to migrate from single-node Postgres to a distributed Citus cluster.

Before we dive into what’s new in Citus 9.3, here’s a quick overview of the major themes:

  • ADVANCED DISTRIBUTED SQL SUPPORT
    • Full support for Window functions (enabling more advanced analytics use cases)
    • Improved shard pruning
    • INSERT..SELECT with sequences
    • Support for reference tables on the Citus coordinator
  • OPERATIONAL IMPROVEMENTS
    • Adaptive connection management
    • Propagation of ALTER ROLE across the cluster
    • Faster, low-memory pg_dump
    • Local data truncation function

Window functions

Window functions are a powerful SQL feature that enable you to run algorithms (do transformations) on top of your Postgres query results, in relationship to the current row. Window function support for cross-shard queries had become one of the top feature requests from our Citus analytics users.

Prior to Citus 9.3, Postgres window functions were always supported for router (e.g. single tenant) queries in Citus—and we also supported window functions in SQL queries across shards that used PARTITION BY distribution_column.

The good news: As of Citus 9.3, Citus now has full support for Postgres window functions, to enable more advanced analytics use cases.

If you’re not yet familiar with Postgres window functions, here’s a simple example. The SQL query below uses window functions (via the OVER syntax) to answer 3 questions:

  • for the current person, whose birthday is right before that person’s birthday?
  • how many people have a birthday in the same year as this person?
  • how many people have a birthday in the same month as this person?
SELECT name,
       birth_date,
       lag(name) OVER (ORDER BY extract(doy FROM birth_date)) AS previous_birthday,
       count(*) OVER (PARTITION BY extract(year FROM birth_date)) AS same_year,
       count(*) OVER (PARTITION BY extract(month FROM birth_date)) AS same_month
FROM birthdays
ORDER BY name;

Marco Slot reminded me: another way to understand window functions is to think about the Postgres planner, and exactly where in the process the window functions are handled.

This sketch from Julia Evans is a super useful way to visualize the order that things run in a PostgreSQL query. You can see in Julia’s zine that the SELECT is not at the beginning but rather, the SELECT is run after doing a GROUP BY.

Tweet: SQL queries run in this order
Source: This tweet from Julia Evans, @b0rk on Twitter.

For those of you who work with analytics, Postgres window functions can be incredibly powerful because you don’t have to write a whole new algorithm or rewrite your SQL query to use CTEs (common table expressions).

Improved shard pruning

We received an amazing open source contribution on the Citus github repo from one of our Citus enterprise users who wanted their Postgres queries to be faster. Many of their SQL queries only need to access a small subset of shards, but when the WHERE clause involves expressions involving OR, some of their SQL queries were still querying all shards.

By expanding the shard pruning logic, our Citus customer (thank you, Markus) has made the shard pruning logic work with arbitrary Boolean expressions. So now, these types of Postgres queries will go to fewer shards (in this case, to just 2 or 3 shards instead of 32). The net result: faster query performance.

INSERT..SELECT with sequences

One of the most powerful commands in the Citus extension to Postgres is INSERT..SELECT, because it can be used for parallel, distributed data transformations that run as one distributed transaction.

Citus 9.3 enables inserting into a distributed table with a sequence, which was one of the few previous limitations of INSERT..SELECT on distributed tables.
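
A minimal sketch of what this enables (the table and column names are hypothetical, and events is assumed to be an existing distributed table on the same column):

CREATE TABLE event_rollups (
    rollup_id bigserial,
    event_type text,
    event_count bigint
);
SELECT create_distributed_table('event_rollups', 'event_type');

-- runs as one parallel, distributed transformation; rollup_id is filled from the sequence
INSERT INTO event_rollups (event_type, event_count)
SELECT event_type, count(*) FROM events GROUP BY event_type;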

Support for reference tables on the Citus coordinator node

To take advantage of distributed tables in Citus, it’s important that your SQL queries can be routed or parallelized along the distribution column. (If you’re still learning about how sharding in Citus works, Joe’s documentation on choosing the distribution column is a good place to start.)

Anyway, the need to route queries along the distribution column (sometimes called the “distribution key” or the “sharding key”) means that if you’re migrating an application originally built on single-node PostgreSQL over to Citus, you might need to make a few data model and query changes.

One of the ways Citus 9.3 simplifies migrations from single-node Postgres to a distributed Citus cluster is by improving support for using a mixture of different table types in Citus: reference tables, distributed tables, and local tables (sometimes referred to as “standard Postgres tables.”)

With Citus 9.3, reference tables can now be JOINed with local tables on the Citus coordinator.

This was kind of sort of possible before, but it has become a lot better in Citus 9.3: you can now keep the reference table on the Citus coordinator node and JOIN between a local table and the reference table on the coordinator itself.

And if your application reads from the reference table on the Citus coordinator node, there won’t be a roundtrip to any of the Citus worker nodes, because you’ll be able to read directly from the coordinator (unless you actually want to round robin to balance the load across the cluster.)

In the example SQL below, after adding the coordinator to the Citus metadata, JOINs between local tables and reference tables just work. (And yes, we’re taking steps to remove references to “master” from Citus and to rename master_add_node to something else. Naming matters, stay tuned.)

-- Add coordinator to Citus metadata on the coordinator
SELECT master_add_node('10.0.0.4', 5432, groupid := 0);

-- Joins between local tables and reference tables just work
BEGIN;
INSERT INTO local_table VALUES (1, 2);
SELECT * FROM local_table JOIN reference_table ON (x = a);
END;

An example of how reference tables on the Citus coordinator can help

But what if you have tables like the ones in the schema below, in which:

  • clicks joins with ads
  • ads joins with publishers
  • ads joins with campaigns

Prior to Citus 9.3, for the example I explain above, you would have had to make all the tables either a distributed table or a reference table in order to enable these joins, like this:

Diagram 1: Table Schema for this scenario, prior to Citus 9.3

But now with Citus 9.3, because you can now add a reference table onto the Citus coordinator node, you can JOIN local tables (aka standard Postgres tables) on the Citus coordinator with Citus reference tables.

Imagine if the clicks table is the only really big table in this database. Maybe the size and scale of the clicks table is the reason you’re considering Citus, and maybe the clicks table only needs to JOIN with the ads table.

If you make the ads table a reference table, then you can JOIN all the shards in the distributed clicks table with the ads reference table. And maybe everything else is not that big and we can just keep the rest of the tables as local tables on the Citus coordinator node. This way, any Postgres query that hits those local tables, any DDL, well, we don’t have to change anything, because the query is still being handled by Postgres, by the Citus coordinator acting as a Postgres server.

Diagram 2: Table Schema for this scenario, enabled by Citus 9.3 and ability to put a reference table on coordinator

One piece we don’t have yet is support for foreign keys between the reference table and the local tables, but we are considering making that feature available in the future. (Do you think that would be useful? You can always reach us on our Citus public slack.)

Interesting side-effect: distributed tables can sit entirely on Citus coordinator node

One interesting side effect of this new Citus 9.3 feature: you can now have distributed tables that sit entirely on the coordinator, so you can add the coordinator to the metadata and create a distributed table where all the shards are on the coordinator and the Postgres queries will just work. Just think how useful that can be for testing, since you can now run your test against that single Citus coordinator node.

Adaptive connection management (a super awesome operational improvement)

Those of you who already use Citus to scale out Postgres realize that Citus does much more than just shard your data across the nodes in a database cluster. Citus also distributes the SQL queries themselves across the relevant shards and to the worker nodes in the database cluster.

The way Citus does this: Citus parallelizes your distributed SQL queries across multiple cores per worker node, by opening multiple Postgres connections to each worker during the SQL query execution, to query the shards on each Citus worker node in parallel.

Unlike the built-in parallel query mechanism in PostgreSQL, Citus continues to parallelize queries under high concurrency. However, connections in PostgreSQL are a scarce resource and prior to Citus 9.3, distributed SQL queries could fail when the coordinator exceeded the Postgres connection limit on the Citus worker nodes.

The good news is, in Citus 9.3, the Citus adaptive executor now tracks and limits the number of outgoing connections to each worker (configurable using citus.max_shared_pool_size) in a way that achieves fairness between SQL queries, to avoid any kind of starvation issues.

Effectively, the connection management in the Citus adaptive executor is now adaptive to both the type of PostgreSQL query it is running, as well as the load on the system. And if you have a lot of analytical SQL queries running at the same time, the adaptive connection management in Citus will now magically keep working without needing to set up PgBouncer.
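
The limit is an ordinary GUC, so (as a sketch, with an arbitrary value) you can cap the coordinator's outgoing connections per worker like this:

ALTER SYSTEM SET citus.max_shared_pool_size = 80;
SELECT pg_reload_conf();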

Propagation of ALTER ROLE across the cluster

In Citus 9.3, Citus now automatically propagates commands like ALTER ROLE current_user SET..TO.. to all current and new workers. This gives you a way to reliably set configurations across all nodes in the cluster. (N.B. You do need to be the database superuser to take advantage of this new feature.)
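
For example (run as superuser on the coordinator), a per-role setting like the one below now reaches every current and future worker automatically:

ALTER ROLE current_user SET work_mem TO '256MB';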

Faster, low-memory pg_dump for Citus

Citus stores the results of PostgreSQL queries on shards in a tuple store (an in-memory data structure that overflows into a file) on the Citus coordinator. The PostgreSQL executor can then process these results on the Citus coordinator as if they came from a regular Postgres table.

However, prior to Citus 9.3, the approach of writing to a tuple store sometimes caused issues when using pg_dump to get the entire table’s contents, since the full distributed table might not fit on the Citus coordinator’s disk.

In Citus 9.3, we changed the implementation of the COPY <table> TO STDOUT command, in order to stream tuples directly from the workers to the client without storing them.

Bottom line, if you need to pg_dump your Citus distributed database, now you can stream your Citus database to your client directly, without using a lot of memory or storage on the Citus coordinator node.

Local data truncation function

When calling create_distributed_table on a table on the coordinator that already contains data, the data is copied into newly created shards, but for technical reasons it is not immediately removed from the local table on the Citus coordinator node.

The presence of the local data could later cause issues, but the procedure for removing the old data was prone to mistakes and sometimes led to operational issues. In Citus 9.3, we introduced a new function, truncate_local_data_after_distributing_table, to make it easier to clean up the local data—which saves space and makes sure you won’t run into situations where you cannot create a Postgres constraint because the local data does not match it.
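
Usage is a single call after distributing the table (the table and column names here are hypothetical):

SELECT create_distributed_table('page_views', 'site_id');
SELECT truncate_local_data_after_distributing_table('page_views');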

Helping you scale out Postgres with Citus 9.3

By the time I publish this post, our Citus distributed engineering team will be well on their way to Citus 9.4. I can’t wait to find out what our Postgres team will bring to us next.

With the advanced support for distributed SQL, operational improvements, and things that make it easier to migrate from single-node Postgres to a distributed Citus cluster—well, with Citus 9.3, your window functions will just work, you don’t have to worry about deciding whether every table needs to be distributed or not, you can do pg_dumps more easily, adaptive connection management will improve your operational experience… In short, we think Citus 9.3 is pretty darn awesome. I hope you do, too.

Peter Eisentraut: Understanding user management in PgBouncer

PgBouncer is a popular connection proxy and pooler for PostgreSQL. As PgBouncer presents a PostgreSQL protocol interface to client applications, it also handles client authentication. For that, it maintains its own directory of users and passwords. That is sometimes a source of confusion, so in this blog post I want to try to describe how […]

Andreas 'ads' Scherbaum: Tomas Vondra

PostgreSQL Person of the Week Interview with Tomas Vondra: My name is Tomas Vondra, I live in Prague, and I’m a PostgreSQL user, developer, contributor and committer. I work for 2ndQuadrant, one of the companies contributing to PostgreSQL and providing services related to it, and I’m also involved in the local PostgreSQL community in various ways. Aside from that I do have various sports-related hobbies - cycling for example.

Bruce Momjian: Controlling Server Variables at Connection Time

I have recently covered the importance of libpq environment variables and connection specification options. While most libpq options control how to connect to the Postgres server, there is one special option that can change variables on the server you connect to, e.g.:

$ psql 'options=-cwork_mem=100MB dbname=test'
psql (13devel)
Type "help" for help.
 
test=> SHOW work_mem;
 work_mem
----------
 100MB
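
The same thing can be done through libpq's PGOPTIONS environment variable, which feeds the options parameter:

$ PGOPTIONS='-c work_mem=100MB' psql -d test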

Continue Reading »

Pavel Stehule: mandown - markdown terminal viewer

My extensions and applications use the markdown format for documentation. There is a nice new viewer - mandown - that allows viewing it in the terminal.

https://github.com/Titor8115/mandown

Haroon .: RESTful CRUD API using PostgreSQL and Spring Boot – Part 2

Overview This article is an extended version atop of the previous article which was a kickstart for building an application using Spring Boot and PostgreSQL. There is no internal feature supported by Java which offers mapping between class objects and database tables; therefore, we use Object Relational Model (ORM). JPA is an abstraction above JDBC […]

Luca Ferrari: PostgreSQL 11 Server Side Programming Errata Corrige

A reader provided feedback about a wrong listing.

PostgreSQL 11 Server Side Programming Errata Corrige

I have already written about how my first book on PostgreSQL, named PostgreSQL 11 Server Side Programming Quick Start Guide, gained more attention.

Gaining attention also means that readers could find out problems and errors, and this is good (to me)!

The first problem that has been reported to me is described here, so that if you are reading the book can better understand and deal with the problem.



Listing 8 on Chapter 3

Listing 8 in chapter 3 is wrong; in particular, it is the very same listing as Listing 13 later in the chapter. The problem is that the shown Listing 8 does not include a variable, namely file_type, that is referenced in the text.
Therefore, if you are dealing with that particular example, please note that the correct listing is available in the official GitHub repository.



I’m really sorry about the misplaced listing; I hope this helps make the book more readable.

Kirk Roybal: Oracle to PostgreSQL: Basic Architecture

This article provides the Oracle database administrator with equivalent PostgreSQL architecture knowledge. The process is a bit loose, but it is sufficient to bootstrap the concepts that are transferable and identify the ones that are not.