Channel: Planet PostgreSQL

Regina Obe: PostGIS 2.5.1


The PostGIS development team is pleased to provide bug fix release 2.5.1 for the 2.5 stable branch.

Although this release will work for PostgreSQL 9.4 through PostgreSQL 11, to take full advantage of what PostGIS 2.5 offers, you should be running PostgreSQL 11 and GEOS 3.7.0.

Best served with PostgreSQL 11.1 and pgRouting 2.6.1.

WARNING: If compiling with PostgreSQL+JIT, LLVM >= 6 is required.
Supported PostgreSQL versions for this release are: PostgreSQL 9.4 - PostgreSQL 11.
GEOS >= 3.5 is required.

2.5.1


Adrien Nayrat: PostgreSQL and heap-only-tuples updates - part 2

$
0
0
Here is a series of articles that will focus on a new feature in version 11. During the development of this version, a feature caught my attention. It can be found in the release notes: https://www.postgresql.org/docs/11/static/release-11.html

"Allow heap-only-tuple (HOT) updates for expression indexes when the values of the expressions are unchanged (Konstantin Knizhnik)"

I admit that this is not very explicit, and this feature requires some knowledge about how Postgres works, which I will try to explain through several articles:

Hans-Juergen Schoenig: PostgreSQL: Implicit vs. explicit joins

$
0
0

If you happen to be an SQL developer, you will know that joins are really at the core of the language. Joins come in various flavors: Inner joins, left joins, full joins, natural joins, self joins, semi-joins, lateral joins, and so on. However, one of the most important distinctions is the difference between implicit and explicit joins. Over the years, flame wars have been fought over this issue. Still, not many people know what is really going on. Therefore my post might help to shed some light on the situation.

 

Using implicit joins

Before we dig into practical examples, it is necessary to create some tables that we can later use to perform our joins:

test=# CREATE TABLE a (id int, aid int);
CREATE TABLE
test=# CREATE TABLE b (id int, bid int);
CREATE TABLE

In the next step some rows are added to those tables:


test=# INSERT INTO a
         VALUES (1, 1), (2, 2), (3, 3);
INSERT 0 3
test=# INSERT INTO b
         VALUES (2, 2), (3, 3), (4, 4);
INSERT 0 3

An implicit join is the simplest way to join data. The following example shows an implicit join:

test=# SELECT *
       FROM  a, b
       WHERE a.id = b.id;
 id | aid | id | bid
----+-----+----+-----
  2 |   2 |  2 |   2
  3 |   3 |  3 |   3
(2 rows)

In this case, all tables are listed in the FROM clause and are later connected in the WHERE clause. In my experience, an implicit join is the most common way to connect two tables. However, my observation might be heavily biased, because an implicit join is the way I tend to write things in my daily work.

Using explicit joins

The following example shows an explicit join. Some people prefer the explicit join syntax over implicit joins because of readability or for whatever other reason:

test=# SELECT *
       FROM a JOIN b
            ON (aid = bid);
 id | aid | id | bid
----+-----+----+-----
  2 |   2 |  2 |   2
  3 |   3 |  3 |   3
(2 rows)

In this case tables are connected directly using an ON-clause. The ON-clause simply contains the conditions we want to use to join those tables together.

Explicit joins support two types of syntax constructs: ON clauses and USING clauses. An ON clause is perfect in case you want to connect different columns with each other. A USING clause is different: it has the same meaning, but it takes a plain list of column names that must exist with the same name in both tables. Passing an expression instead of a column list results in a syntax error:

test=# SELECT *
       FROM a JOIN b
            USING (aid = bid);
ERROR: syntax error at or near "="
LINE 1: SELECT * FROM a JOIN b USING (aid = bid);

USING is often used to connect keys with each other, as shown in the next example:

test=# SELECT *
       FROM a JOIN b USING (id);
 id | aid | bid
----+-----+-----
  2 |   2 |   2
  3 |   3 |   3
(2 rows)

Both of my tables have a column called “id”, so it is possible to use USING here. Keep in mind that USING is mostly syntactic sugar: there is no deeper meaning here.

Often an explicit join is used to connect not just two, but more tables. To show how that works, I have added one more table:

test=# CREATE TABLE c (id int, cid int);
CREATE TABLE

Let us add some data to this table:

test=# INSERT INTO c VALUES (3, 3), (4, 4), (5, 5);
INSERT 0 3

To perform such a join, just add additional JOIN ... USING (or ON) clauses to the statement. Here is an example:

test=# SELECT *
       FROM a INNER JOIN b USING (id)
            JOIN c USING (id);
 id | aid | bid | cid
----+-----+-----+-----
  3 |   3 |   3 |   3
(1 row)

Of course the same can be done with an implicit join:


test=# SELECT *
       FROM  a, b, c
       WHERE a.id = b.id
             AND b.id = c.id;
 id | aid | id | bid | id | cid
----+-----+----+-----+----+-----
  3 |   3 |  3 |   3 |  3 |   3
(1 row)

However, as you can see, there is a small difference. Check the number of columns returned by the two queries: the implicit join returns more. The “id” column shows up once per table in this case, because the explicit join written with USING merges the join column into a single output column, while the implicit join keeps each table's copy.

The column list is of course a nasty detail, because in a real application it is always better to explicitly list all columns anyway. However, this little detail should be kept in mind.

join_collapse_limit: What the optimizer does

When I am on the road working as PostgreSQL consultant or PostgreSQL support guy, people often ask if there is a performance difference between implicit and explicit joins. The answer is: “Usually not”. Let us take a look at the following statement:

test=# explain SELECT * FROM a INNER JOIN b USING (id);
                          QUERY PLAN
-----------------------------------------------------------------
Merge Join (cost=317.01..711.38 rows=25538 width=12)
   Merge Cond: (a.id = b.id)
   -> Sort (cost=158.51..164.16 rows=2260 width=8)
      Sort Key: a.id
      -> Seq Scan on a (cost=0.00..32.60 rows=2260 width=8)
   -> Sort (cost=158.51..164.16 rows=2260 width=8)
      Sort Key: b.id
      -> Seq Scan on b (cost=0.00..32.60 rows=2260 width=8)
(8 rows)

The explicit join produces essentially the same plan as the implicit join shown below; the only difference is the output width, because the explicit join uses USING (id) and therefore returns the merged id column only once:

test=# explain SELECT * FROM a, b WHERE a.id = b.id;
                           QUERY PLAN
-----------------------------------------------------------------
Merge Join (cost=317.01..711.38 rows=25538 width=16)
   Merge Cond: (a.id = b.id)
    -> Sort (cost=158.51..164.16 rows=2260 width=8)
       Sort Key: a.id
       -> Seq Scan on a (cost=0.00..32.60 rows=2260 width=8)
    -> Sort (cost=158.51..164.16 rows=2260 width=8)
       Sort Key: b.id
       -> Seq Scan on b (cost=0.00..32.60 rows=2260 width=8)
(8 rows)

So in the majority of cases, an implicit join does exactly the same thing as an explicit join.

However, this is not always the case. In PostgreSQL there is a variable called join_collapse_limit:

test=# SHOW join_collapse_limit;
 join_collapse_limit
---------------------
 8
(1 row)

What does it mean? For explicit joins, the planner will collapse and freely reorder up to join_collapse_limit items, regardless of the join order you have used inside the query; the optimizer simply reorders the joins in whatever way looks most promising. But if you keep adding joins, the ones exceeding join_collapse_limit will be planned exactly the way you have put them into the query. As you can easily imagine, we are already talking about fairly complicated queries: joining 9 or more tables is quite a lot and beyond the typical operation in most cases.

There is another parameter called from_collapse_limit that does the same thing for implicit joins and has the same default value. If a query lists more than from_collapse_limit tables in its FROM clause, the ones exceeding the limit will not be reordered, but joined in the order they appear in the statement.
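Setting join_collapse_limit to 1 is the documented way to force the planner to honor exactly the join order written in the query, which can be handy when debugging a plan. A minimal sketch (output omitted, as the resulting plan depends on your data):

test=# SET join_collapse_limit TO 1;
SET
test=# EXPLAIN SELECT *
       FROM a JOIN b USING (id)
            JOIN c USING (id);

With this setting, a is joined to b first, and the result is then joined to c, exactly as written.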

For the typical, “normal” query the performance and the execution plans stay the same and it makes no difference which type of join you prefer.

If you want to read more about joins, consider reading some of our other blog posts: https://www.cybertec-postgresql.com/en/time-in-postgresql-outer-joins/

The post PostgreSQL: Implicit vs. explicit joins appeared first on Cybertec.

Magnus Hagander: PGConf.EU 2018 - the biggest one yet!


It's now almost a month since PGConf.EU 2018 in Lisbon. PGConf.EU 2018 was the biggest PGConf.EU ever, and as far as I know the biggest PostgreSQL community conference in the world! So it's time to share some of the statistics and feedback.

I'll start with some attendee statistics:

  • 451 registered attendees
  • 2 no-shows
  • 449 actually present attendees

Of these 451 registrations, 47 were sponsor tickets, some of which were used by the sponsors themselves, and some were given away to their customers and partners. Another 4 sponsor tickets went unused.

Another 52 were speakers.

This year we had more cancellations than usual, but thanks to having a waitlist for the conference, we managed to re-fill all those spaces before the event started.

Jobin Augustine: Installing and Configuring JIT in PostgreSQL 11

JIT with PostgreSQL

Just-in-time (JIT) compilation of SQL statements is one of the highlighted features in PostgreSQL 11. There is great excitement in the community because of the many claims of up to a 30% jump in performance. Not all queries and workloads get the benefit of JIT compilation, so you may want to test your workload against this new feature.

However, it is important to have a general understanding of what it does and where we can expect the performance gains. Installing PostgreSQL 11 with the new JIT compilation feature requires a few extra steps and packages. The time and effort needed to figure this out shouldn't be a reason to shy away from trying this cutting-edge feature and testing a workload against it. This blog post is for those who want to try it.

What is JIT and What it does in PostgreSQL

Normal SQL execution in any DBMS software is similar to how an interpreted language handles source code: no machine code gets generated from your SQL statement. But we all know how dramatic the performance gains can be from JIT compilation and execution of the machine code it generates; we saw the magic Google's V8 engine did for the JavaScript language. The quest to do something similar for SQL statements has been around for quite some time, but it is a challenging task.

It is challenging because we don't have the source code (the SQL statement) ready within the PostgreSQL server. The source code that needs to undergo JIT has to come from client connections, and there can be expressions/functions with different numbers of arguments, dealing with tables with different numbers and types of columns.

Generally, a computer program won't get modified at this level while it is running, so branch predictions are possible. The unpredictable and dynamic nature of SQL statements coming from client connections and hitting the database from time to time leaves no scope for prediction or compilation in advance. That means the JIT compiler has to kick in every time the database gets an SQL statement. For this reason, PostgreSQL needs a compiler infrastructure like LLVM continuously available behind the scenes. Even though there were a couple of other options, the main developer of this feature (Andres Freund) had strong reasons why LLVM was the right choice.

In PostgreSQL 11, the JIT feature currently does the following (the related settings are sketched right after this list):

  1. Accelerated expression evaluation: expressions in WHERE clauses, target lists, aggregates and projections
  2. Tuple deforming: converting the on-disk tuple image to the corresponding in-memory representation
  3. In-lining: bodies of small custom functions, operators and user-defined data types are inlined into the expressions that use them
  4. Optimization: compiler optimizations provided by LLVM are used to prepare optimized machine code
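The behavior is controlled by a handful of jit-related settings. A quick way to list them on a PostgreSQL 11 server (just a sketch; the exact set of parameters may vary between builds):

postgres=# SELECT name, setting FROM pg_settings WHERE name LIKE 'jit%';

This includes parameters such as jit, jit_above_cost, jit_inline_above_cost and jit_optimize_above_cost, which control whether, and above what estimated query cost, compilation, inlining and optimization kick in.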

In this blog, we are going to see how to install PostgreSQL with JIT. Just like regular PostgreSQL installations, we have two options:

  1. Get PostgreSQL from the packages in the PGDG repository
  2. Build PostgreSQL from source

Option 1. Install from PGDG repository.

Compiling from source requires us to install all compilers and tools. We might want to avoid this for various reasons. Installing packages from a PGDG repository is straightforward. On production systems or in a container, you might want to install only the bare minimum required packages; additional packages you don't really use are always a security concern. Distributions like Ubuntu provide more recent versions of libraries and tool-sets in their default repos. However, distributions like CentOS / RHEL are quite conservative: their priority is stability and proven servers rather than cutting-edge features. So this section of the post is mostly relevant for CentOS 7 / RHEL 7.

Here are the steps for a bare minimum installation of PostgreSQL with the JIT feature on CentOS 7.

Step 1. Install PGDG repo and Install PostgreSQL server package.

This is usually the bare minimum installation if we don’t need the JIT feature.

sudo yum install https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-7-x86_64/pgdg-centos11-11-2.noarch.rpm
sudo yum install postgresql11-server

At this stage, we can initialize the data directory and start the service if we don’t need JIT:

sudo /usr/pgsql-11/bin/postgresql*-setup initdb
sudo systemctl start postgresql-11

Step 2. Install EPEL repository

sudo yum install epel-release

Step 3. Install package for PostgreSQL with llvmjit

sudo yum install postgresql11-llvmjit

Since we have already added the EPEL repository, the dependency can now be resolved by YUM, which pulls and installs the necessary packages from EPEL. The installation message lists them:

...
Installing:
postgresql11-llvmjit      x86_64     11.1-1PGDG.rhel7     pgdg11    9.0 M
Installing for dependencies:
llvm5.0                   x86_64     5.0.1-7.el7          epel      2.6 M
llvm5.0-libs              x86_64     5.0.1-7.el7          epel      13 M
...

As we can see, two packages, llvm5.0 and llvm5.0-libs, get installed as dependencies.

Note for Ubuntu users:

As we already mentioned, the repositories of recent Ubuntu versions contain recent versions of the LLVM libraries. For example, the Ubuntu 16.04 LTS repo contains libllvm6.0 by default. Moreover, the PostgreSQL server package is not split into a separate package for JIT-related files, so a default installation of PostgreSQL 11 gets you the JIT feature as well.

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
sudo apt install postgresql-11

Option 2. Building from Source

The primary means of distributing PostgreSQL is the source code. Building a minimal PostgreSQL instance requires just a C compiler, but building with the JIT option requires a few more things. One of the challenges you can run into is build errors caused by older versions of LLVM and Clang present on the system.

Step 1. Download PostgreSQL source tarball and unpack

Tarballs are available in the repository. We can grab and unpack the latest:

curl -LO https://ftp.postgresql.org/pub/source/v11.0/postgresql-11.0.tar.bz2
tar -xvf postgresql-11.0.tar.bz2

Step 2.  Get SCL Repository and Install toolset

Recent versions of LLVM, Clang and GCC are available in SCL. We can get everything in one go:

sudo yum install centos-release-scl
sudo yum install llvm-toolset-7 llvm-toolset-7-llvm-devel.x86_64

Now you can set or edit your PATH so that all the new tools are picked up. I prefer to put that into my profile file:

PATH=/opt/rh/devtoolset-7/root/usr/bin/:/opt/rh/llvm-toolset-7/root/usr/bin/:$PATH

Alternatively, we can open a shell with SCL enabled:

scl enable devtoolset-7 llvm-toolset-7 bash

We should attempt to compile the source from a shell with all these paths set.

Step 3. Install Additional libraries/tools

Based on the configuration options you want, this list may change. Consider this as a sample for demonstration purposes:

sudo yum install  readline-devel zlib-devel libxml2-devel openssl-devel

Step 4. Configure with the --with-llvm option and make

Now we should be able to configure and make with our preferred options. The JIT feature will be available if the --with-llvm option is specified. For this demonstration, I am using an installation directory under my home (/home/postgres/pg11):
./configure --prefix=/home/postgres/pg11 --with-openssl --with-libxml --with-zlib --with-llvm
make
make install

Enabling JIT

You may observe that there is a new directory named bitcode under PostgreSQL's lib folder. It contains lots of files with a .bc extension: these are pre-generated bitcode files for LLVM, used to facilitate features like in-lining.

By default, the JIT feature is disabled in PostgreSQL 11. If you want to test it, you may have to enable the parameter jit:
postgres=# ALTER SYSTEM SET jit=on;
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)
postgres=# show jit;
 jit
-----
 on
(1 row)

By default, most simple queries won't use JIT because of the cost: JIT only kicks in once the estimated query cost is high. If we just want to test whether JIT is properly configured, we can tell PostgreSQL that this cost threshold is very low by adjusting the jit_above_cost parameter. However, we should keep in mind that this may actually make things slower. Let me show a quick example:

postgres=# SET jit_above_cost=5;
SET
postgres=# create table t1 (id int);
CREATE TABLE
postgres=# insert into t1 (SELECT (random()*100)::int FROM generate_series(1,800000) as g);
INSERT 0 800000
postgres=# analyze t1;
ANALYZE
postgres=# explain select sum(id) from t1;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=8706.88..8706.89 rows=1 width=8)
   ->  Gather  (cost=8706.67..8706.88 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=7706.67..7706.68 rows=1 width=8)
               ->  Parallel Seq Scan on t1  (cost=0.00..6873.33 rows=333333 width=4)
 JIT:
   Functions: 6
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(8 rows)

As we can see in the above example, a separate JIT section comes up in the explain plan.

We expect JIT compilation to make a difference in complex analytical queries, because the overhead of JIT compilation is only compensated if the generated code runs for long enough. Here is a simple aggregate query for demonstration (I know this is not a complex query, and not the perfect example for demonstrating the JIT feature):

postgres=# EXPLAIN ANALYZE SELECT COMPANY_ID,
      SUM(SHARES) TOT_SHARES,
      SUM(SHARES* RATE) TOT_INVEST,
      MIN(SHARES* RATE) MIN_TRADE,
      MAX(SHARES* RATE) MAX_TRADE,
      SUM(SHARES* RATE * 0.002) BROKERAGE
FROM TRADING
GROUP BY COMPANY_ID;
                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=757298.72..758741.91 rows=5005 width=138) (actual time=16992.290..17011.395 rows=5000 loops=1)
   Group Key: company_id
   ->  Gather Merge  (cost=757298.72..758466.64 rows=10010 width=138) (actual time=16992.270..16996.919 rows=15000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=756298.70..756311.21 rows=5005 width=138) (actual time=16983.900..16984.356 rows=5000 loops=3)
               Sort Key: company_id
               Sort Method: quicksort  Memory: 1521kB
               Worker 0:  Sort Method: quicksort  Memory: 1521kB
               Worker 1:  Sort Method: quicksort  Memory: 1521kB
               ->  Partial HashAggregate  (cost=755916.09..755991.16 rows=5005 width=138) (actual time=16975.997..16981.354 rows=5000 loops=3)
                     Group Key: company_id
                     ->  Parallel Seq Scan on trading  (cost=0.00..287163.65 rows=12500065 width=12) (actual time=0.032..1075.833 rows=10000000 loops=3)
 Planning Time: 0.073 ms
 Execution Time: 17013.116 ms
(15 rows)

We can switch on the JIT parameter at the session level and retry the same query:

postgres=# SET JIT=ON;
SET
postgres=# EXPLAIN ANALYZE SELECT COMPANY_ID,
      SUM(SHARES) TOT_SHARES,
      SUM(SHARES* RATE) TOT_INVEST,
      MIN(SHARES* RATE) MIN_TRADE,
      MAX(SHARES* RATE) MAX_TRADE,
      SUM(SHARES* RATE * 0.002) BROKERAGE
FROM TRADING
GROUP BY COMPANY_ID;
                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=757298.72..758741.91 rows=5005 width=138) (actual time=15672.809..15690.901 rows=5000 loops=1)
   Group Key: company_id
   ->  Gather Merge  (cost=757298.72..758466.64 rows=10010 width=138) (actual time=15672.781..15678.736 rows=15000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=756298.70..756311.21 rows=5005 width=138) (actual time=15661.144..15661.638 rows=5000 loops=3)
               Sort Key: company_id
               Sort Method: quicksort  Memory: 1521kB
               Worker 0:  Sort Method: quicksort  Memory: 1521kB
               Worker 1:  Sort Method: quicksort  Memory: 1521kB
               ->  Partial HashAggregate  (cost=755916.09..755991.16 rows=5005 width=138) (actual time=15653.390..15658.581 rows=5000 loops=3)
                     Group Key: company_id
                     ->  Parallel Seq Scan on trading  (cost=0.00..287163.65 rows=12500065 width=12) (actual time=0.039..1084.820 rows=10000000 loops=3)
 Planning Time: 0.072 ms
 JIT:
   Functions: 28
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 5.844 ms, Inlining 137.226 ms, Optimization 201.152 ms, Emission 125.022 ms, Total 469.244 ms
 Execution Time: 15696.092 ms
(19 rows)

Here we see a 7.7% improvement in performance. I executed this several times and found that the performance gain is consistently 7-8% for this simple query (which takes 15 seconds to execute). The gains are higher for queries with more calculations/expressions.

Summary

It is fairly simple to install and configure JIT with PostgreSQL, as demonstrated above. One point we would like to highlight is that installing the JIT packages and enabling the JIT feature can be done online, while the database is up and running, because all JIT-related parameters are dynamic in nature. Parameter changes can be loaded with a SIGHUP signal or with SELECT pg_reload_conf() run by a superuser. If it is not helping our workload, we can turn it off anytime. Nothing stops you from trying it in a non-production environment. We might not see a gain for small and simple queries that take little time to execute, because the overhead of JIT compilation can become larger than the cost of simply executing the SQL statement. But we should expect a good gain in OLAP workloads with complex queries that run for a longer duration.
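For example, switching the feature off cluster-wide without a restart uses the same standard commands shown earlier (a minimal sketch; a plain SET jit=off would affect only the current session):

postgres=# ALTER SYSTEM SET jit=off;
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)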

Richard Yen: PgBouncer Pro Tip: Use auth_user


Introduction

Anyone running a database in a production environment with over a hundred users should seriously consider employing a connection pooler to keep resource usage under control. PgBouncer is one such tool, and it’s great because it’s lightweight and yet has a handful of nifty features for DBAs that have very specific needs.

One of these nifty features that I want to share is the auth_user and auth_query combo, which serves as an alternative to the default authentication process that uses userlist.txt. “What's wrong with userlist.txt?” you may ask. For starters, it makes user/role administration a little tricky. Every time you add a new user to PG, you need to add it to userlist.txt in PgBouncer. And every time you change a password, you have to change it in userlist.txt as well. Multiply that by the 30+ servers you're managing, and you've got a sysadmin's nightmare on your hands. With auth_user and auth_query, you can centralize the password management and take one item off your checklist.

What’s auth_user?

In the [databases] section of your pgbouncer.ini, you would typically specify a user= and password= with which PgBouncer connects to the Postgres database. If left blank, the user/password are taken from the connection string (i.e., psql -U <username> <database>). When this happens, PgBouncer looks up the provided username/password in userlist.txt to verify that the credentials are correct, and then the username/password are sent to Postgres for an actual database login.

When auth_user is provided, PgBouncer will still read in credentials from the connection string, but instead of comparing against userlist.txt, it logs in to Postgres with the specified auth_user (preferably a non-superuser) and runs auth_query to pull the corresponding md5 password hash for the desired user. The validation is performed at this point, and if correct, the specified user is allowed to log in.

An Example

Assuming Postgres is installed and running, you can get the auth_user and auth_query combo running with the following steps:

  1. Create a Postgres user to use as auth_user
  2. Create the user/password lookup function in Postgres
  3. Configure pgbouncer.ini

Create a Postgres user to use as auth_user

In your terminal, run psql -c "CREATE ROLE myauthuser WITH LOGIN PASSWORD 'abc123'" to create myauthuser. Note that myauthuser should be an unprivileged user, with no GRANTs to read/write any tables; it needs the LOGIN attribute, but it is used strictly for assisting with PgBouncer authentication.

For the purposes of this example, we’ll also have a database user called mydbuser, which can be created with:

CREATE ROLE mydbuser WITH LOGIN PASSWORD 'mysecretpassword';
GRANT SELECT ON emp TO mydbuser;

Create the user/password lookup function in Postgres

In your psql prompt, create a function that will be used by myauthuser to perform the user/password lookup:

CREATE OR REPLACE FUNCTION user_search(uname TEXT) RETURNS TABLE (usename name, passwd text) as
$$
  SELECT usename, passwd FROM pg_shadow WHERE usename=$1;
$$
LANGUAGE sql SECURITY DEFINER;

As mentioned in the documentation, the SECURITY DEFINER clause enables the non-privileged myauthuser to view the contents of pg_shadow, which would otherwise be limited to only admin users.
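As an optional hardening step (a sketch, not required for the setup to work), you can make sure that only myauthuser is allowed to execute this function:

REVOKE ALL ON FUNCTION user_search(TEXT) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION user_search(TEXT) TO myauthuser;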

Configure pgbouncer.ini

Configure your [databases] section with an alias, like:

[databases]
foodb = host=db1 dbname=edb auth_user=myauthuser

Then, configure auth_query in the [pgbouncer] section with:

auth_query = SELECT usename, passwd FROM user_search($1)

Let ‘er rip!

Spin up PgBouncer and try logging in:

PGPASSWORD=thewrongpassword psql -h 127.0.0.1 -p 6432 -U mydbuser -Atc 'SELECT '\''success'\''' foodb
psql: ERROR:  Auth failed
PGPASSWORD=mysecretpassword psql -h 127.0.0.1 -p 6432 -U mydbuser -Atc 'SELECT '\''success'\''' foodb
success

As you can see, providing the wrong password for mydbuser caused the check against the password hash from pg_shadow to fail, and the user was prevented from logging in. The subsequent psql call used the correct password and logged in successfully.

Some Caveats

I’ve seen a few customers try to implement this, and one of the common mistakes I’ve seen is the failure to set pg_hba.conf properly in Postgres. Bear in mind that once the provided credentials are validated, PgBouncer will attempt to log in with the specified user. Therefore, if your auth_user is myauthuser and you’ve got a pg_hba.conf with host all myauthuser 127.0.0.1/32 md5, but you want to ultimately login with mydbuser, you won’t be able to do so because there’s no pg_hba.conf entry for mydbuser, and you’ll probably see something like this:

server login failed: FATAL no pg_hba.conf entry for host "127.0.0.1", user "mydbuser", database "edb", SSL off

Also, make sure auth_type is not set to trust in pgbouncer.ini. Instead, if you want trust at all, set it in pg_hba.conf for auth_user and clamp it down to only the IP(s) that will be running PgBouncer. Set auth_type to md5 so that your login attempt will be challenged with a password request.
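A hypothetical pg_hba.conf fragment matching that advice might look like this (10.0.0.5 stands in for the host running PgBouncer; adjust to your environment):

# the PgBouncer host may connect as the lookup user without a password
host    all    myauthuser    10.0.0.5/32    trust
# regular users, including mydbuser, must authenticate with md5
host    all    mydbuser      10.0.0.5/32    md5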

Enjoy!

Regina Obe: PostGIS 2.2.8 EOL

Brian Fehrle: Cloud Backup Options for PostgreSQL


As with any other component of a business, databases are extremely important to its inner workings.

Whether it’s the core of the business or just another component, databases should be backed up regularly, and stored in safe locations for possible future recovery.

Should I Backup To The Cloud?

A general rule is to have at least 3 copies of anything of value and to store those backups in different locations. Backups on the same drive are useless if the drive itself dies, same host backups are also at risk if the host goes down, and same building backups are also in danger if the building burns down (drastic and unlikely, but possible).

Cloud backups offer an easy solution for the need of off-site backups without having to spin up new hardware in a secondary location. There are many different cloud services that offer backup storage, and choosing the right one will depend on backup needs, size requirements, cost, and security.

The benefits of having cloud backups are many, but mainly revolve around having these backups stored in a different location than the main database, allowing us to have a safety net in the case of a disaster recovery. While we won’t go into detail about how to set up each of these backup options, we will explore some different ideas and configurations for backups.

There are some downsides to storing backups in the cloud, starting with the transfer. If the backups for the database are extremely large, it could take a long time to do the actual upload, and could even have increased costs if the cloud service charges for bandwidth transfer. Compression is highly suggested to keep time and costs low.

Security could be another concern with hosting backups in the cloud, as some companies have strict guidelines for where their data is stored. If security is a concern, backups can be encrypted before exporting them to a cloud hosting service.
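As a sketch, a dump could be symmetrically encrypted with gpg before upload (the file name matches the pg_dump example further below):

gpg --symmetric --cipher-algo AES256 -o severalnines.dmp.gpg severalnines.dmp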

Cloud Backup Options

There are several different ways to create database backups for PostgreSQL, and depending on the type of backup, recovery time, size, and infrastructure options will vary. Since many of the cloud storage solutions are simply storage with different API front ends, any clever backup solution can be created with a bit of scripting.

Snapshot Backups

Snapshots are backups that have a copy of the PostgreSQL database at a specific point in time. These backups are created either by using pg_dump, which simply dumps the database to a single file, or by copying the base data directory for PostgreSQL. Either of these can be compressed, copied to other drives and servers, and copied to the desired cloud storage option.

Using pg_dump with compression

pg_dump -Fc severalnines > severalnines.dmp

Data directory backup with compression

psql -c "SELECT pg_start_backup('Starting Backup')"
tar -czvf severalnines.data.tar.gz data
psql -c "SELECT pg_stop_backup();"

Storing these backups in the cloud depends on which service is used, here are some options.

Amazon S3

With Amazon’s AWS platform, S3 is a data storage service that can be used to store database backups. While backups can be uploaded through the web interface, the Amazon CLI (Command Line Interface) can be used to do it from the command line and through backup automation scripts. Information about the AWS CLI can be found here. If backups are to be kept for a very long time, and recovery time isn’t a concern, backups can be transferred to Amazon’s Glacier service, providing much cheaper long term storage.

aws s3 cp severalnines.dmp s3://severalninesbucket/backups

Amazon also has different regions for their services all around the world. Even though they have a good uptime history, spreading copies of backups across multiple regions increases disaster recovery options, and lowers chances of losing valuable data.

Microsoft Azure Storage

Microsoft’s cloud platform, Azure, has storage options with their own command line interface, information can be found here.

az storage blob upload --container-name severalnines --file severalnines.dmp --name severalnines_backup

Any other modern cloud storage services should offer similar tools to copy backups to their cloud servers, consult their documentation for details.

Standby Backups

Sometimes backups themselves can be extremely large even if compressed, and uploading a daily or weekly backup to a cloud service could be out of the question due to bandwidth speeds and/or costs. So getting a backup into the cloud for safekeeping is much harder.

One way to do this is to have a warm or hot standby running in a cloud-based Virtual Machine, such as an Amazon EC2 instance. The standby is an exact copy of the master database, and the only data sent to the cloud instance is the stream of changes, rather than repeated copies of the whole database. This requires transferring the whole database once up front, but after that only the changes need to be transferred.

But is a standby server actually a backup? If the master database goes down, the standby can be turned into the master and applications redirected to it, however, if the goal is to have backups for a certain point in time over the past week / months, this won’t work out.

To fix this, several things can be done. The standby itself can be configured with a delay, applying changes only once they are a day old, for example. Another option is to create backups in one of the traditional ways (pg_dump, data directory copy) on the cloud standby, meaning these backups won't need to be transferred over the external network since they are created on the cloud machine itself. In-network transfers are usually quicker and cheaper.
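The delay itself can be configured with the recovery_min_apply_delay parameter (available since PostgreSQL 9.4). A hypothetical recovery.conf fragment for a standby that applies changes one day late might look like this (host and user are placeholders):

standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=replicator'
recovery_min_apply_delay = '1d'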

ClusterControl Backups And The Cloud

Severalnines created ClusterControl, a database management system that helps manage many different databases including PostgreSQL. It’s an ultimate toolbox for any Database or System Administrator to have complete control and visibility of their databases, and includes very handy backup features.

With ClusterControl, backups of PostgreSQL databases can easily be managed, scheduled, and set up to automatically copy the resulting backups to cloud storage services, including Amazon S3, Microsoft Azure, and Google Cloud. This removes the need to script custom tools to upload backups to the cloud, and it provides a nice user interface for managing backups in general.

Backing up our databases should always happen, and storing the backups in second, third, and fourth locations is a very good and common practice. Throwing in a cloud option increases disaster recovery options and adds yet another layer of stability for a business, where in many cases if the database disappears, the company disappears. Exploring cloud backup options today can eliminate disaster tomorrow.


Regina Obe: PostGIS 2.3.8, 2.4.6


The PostGIS development team is pleased to provide bug fix releases 2.3.8 and 2.4.6 for the 2.3 and 2.4 stable branches.


Pavel Stehule: plpgsql_check can be used as profiler

Today I integrated profiling functionality into plpgsql_check. Once you enable profiling, you don't need to configure anything else.
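Enabling the profiler for the current session and producing something to profile should look roughly like this (a sketch; the plpgsql_check.profiler setting name is taken from the extension's README, and fx is the example function whose profile is queried below):
postgres=# CREATE EXTENSION IF NOT EXISTS plpgsql_check;
postgres=# SET plpgsql_check.profiler TO on;
postgres=# SELECT fx(100);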
postgres=# select lineno, avg_time, source from plpgsql_profiler_function_tb('fx(int)');
┌────────┬──────────┬────────────────────────────────────────────────────────────────┐
│ lineno │ avg_time │ source                                                         │
╞════════╪══════════╪════════════════════════════════════════════════════════════════╡
│      1 │          │                                                                │
│      2 │          │ declare result int = 0;                                        │
│      3 │    0.075 │ begin                                                          │
│      4 │    0.202 │ for i in 1..$1 loop                                            │
│      5 │    0.005 │ select result + i into result; select result + i into result; │
│      6 │          │ end loop;                                                      │
│      7 │        0 │ return result;                                                 │
│      8 │          │ end;                                                           │
└────────┴──────────┴────────────────────────────────────────────────────────────────┘
(9 rows)
In this case, the function profile is stored in session memory, and when the session is closed, the profile is lost.

It is also possible to load plpgsql_check via the shared_preload_libraries config option. In this case, the profile is stored in shared memory and is "pseudo" persistent: it is cleaned when a profile reset is requested or when PostgreSQL is restarted.

There is another good PL/pgSQL profiler. I designed the integrated plpgsql_check profiler because I want to collect different runtime data, and I want to use this profiler for calculating test coverage. Moreover, this profiler can be used without any special PostgreSQL configuration, which can be useful in cases where there is no possibility to restart the server.

Adrien Nayrat: PostgreSQL and heap-only-tuples updates - part 3

Here is a series of articles that will focus on a new feature in version 11. During the development of this version, a feature caught my attention. It can be found in the release notes: https://www.postgresql.org/docs/11/static/release-11.html

"Allow heap-only-tuple (HOT) updates for expression indexes when the values of the expressions are unchanged (Konstantin Knizhnik)"

I admit that this is not very explicit, and this feature requires some knowledge about how Postgres works, which I will try to explain through several articles:

Bruce Momjian: First Wins, Last Wins, Huh?


Someone recently pointed out an odd behavior in Postgres's configuration files. Specifically, they mentioned that the last setting for a variable in postgresql.conf is the one that is honored, while the first matching connection line in pg_hba.conf is honored. They are both configuration files in the cluster's data directory, but they behave differently. It is clear why they behave differently: the order of lines in pg_hba.conf is significant, and more specific lines can be placed before more general lines (see the use of reject lines). Still, it can be confusing, so I wanted to point it out.
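As a hypothetical illustration of that ordering, consider this pg_hba.conf fragment, where the first matching line wins, so the reject entry blocks the role baduser even though the later, more general line would allow it:

host    all    baduser    0.0.0.0/0    reject
host    all    all        0.0.0.0/0    md5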

Stefan Fercot: PostgreSQL 12 preview - recovery.conf disappears


PostgreSQL needs some infrastructure changes to have a more dynamic reconfiguration around recovery, e.g. to change the primary_conninfo at runtime.

The first step, mostly to avoid having to duplicate the GUC logic, results in the following patch.


On 25th of November 2018, Peter Eisentraut committed Integrate recovery.conf into postgresql.conf:

recovery.conf settings are now set in postgresql.conf (or other GUC
sources).  Currently, all the affected settings are PGC_POSTMASTER;
this could be refined in the future case by case.

Recovery is now initiated by a file recovery.signal.  Standby mode is
initiated by a file standby.signal.  The standby_mode setting is
gone.  If a recovery.conf file is found, an error is issued.

The trigger_file setting has been renamed to promote_trigger_file as
part of the move.

The documentation chapter "Recovery Configuration" has been integrated
into "Server Configuration".

pg_basebackup -R now appends settings to postgresql.auto.conf and
creates a standby.signal file.

Author: Fujii Masao <masao.fujii@gmail.com>
Author: Simon Riggs <simon@2ndquadrant.com>
Author: Abhijit Menon-Sen <ams@2ndquadrant.com>
Author: Sergei Kornilov <sk@zsrv.org>
Discussion: https://www.postgresql.org/message-id/flat/607741529606767@web3g.yandex.ru/

Let’s compare a simple example between PostgreSQL 11 and 12.


Initialize replication on v11

With a default postgresql11-server installation on CentOS 7, let’s start archiving on our primary server:

$ mkdir /var/lib/pgsql/11/archives
$ echo "archive_mode = 'on'" >> /var/lib/pgsql/11/data/postgresql.conf
$ echo "archive_command = 'cp %p /var/lib/pgsql/11/archives/%f'" \
>> /var/lib/pgsql/11/data/postgresql.conf
# systemctl start postgresql-11.service

Check that the archiver process is running:

$ psql -c "SELECT pg_switch_wal();"
 pg_switch_wal 
---------------
 0/16AC7D0
(1 row)

$ ps -ef |grep postgres|grep archiver 
... postgres: archiver last was 000000010000000000000001

$ ls -l /var/lib/pgsql/11/archives/
total 16384
-rw-------. 1 postgres postgres 16777216 Nov 26 09:30 000000010000000000000001

Create a base copy for our secondary server:

$ pg_basebackup --pgdata=/var/lib/pgsql/11/replicated_data -P
24502/24502 kB (100%), 1/1 tablespace

Configure the recovery.conf file as follows:

$ cat recovery.conf 
standby_mode = 'on'
primary_conninfo = 'port=5432'
restore_command = 'cp /var/lib/pgsql/11/archives/%f %p'
recovery_target_timeline = 'latest'

Change the default port and start:

$ echo 'port = 5433' >> /var/lib/pgsql/11/replicated_data/postgresql.conf
$ /usr/pgsql-11/bin/pg_ctl -D /var/lib/pgsql/11/replicated_data/ start

If the replication setup is correct, you should see those processes:

postgres 10950     1  ... /usr/pgsql-11/bin/postmaster -D /var/lib/pgsql/11/data/
postgres 10958 10950  ... postgres: archiver   last was 000000010000000000000004
postgres 11595 10950  ... postgres: walsender postgres [local] streaming 0/5000140
...
postgres 11586     1  ... /usr/pgsql-11/bin/postgres -D /var/lib/pgsql/11/replicated_data
postgres 11588 11586  ... postgres: startup   recovering 000000010000000000000005
postgres 11594 11586  ... postgres: walreceiver   streaming 0/5000140

We now have a local 2-node cluster working with Streaming Replication and archive recovery as a safety net.

To stop the cluster:

$ /usr/pgsql-11/bin/pg_ctl -D /var/lib/pgsql/11/replicated_data stop
# systemctl stop postgresql-11.service 

Recovery.conf explanation

These parameters are important:

  • standby_mode

Specifies whether to start the PostgreSQL server as a standby. If this parameter is on, the server will not stop recovery when the end of archived WAL is reached, but will keep trying to continue recovery by fetching new WAL segments using restore_command and/or by connecting to the primary server as specified by the primary_conninfo setting.

  • primary_conninfo

Specifies a connection string to be used for the standby server to connect with the primary.

  • restore_command

The local shell command to execute to retrieve an archived segment of the WAL file series. This parameter is required for archive recovery, but optional for streaming replication.

  • recovery_target_timeline

Specifies recovering into a particular timeline. The default is to recover along the same timeline that was current when the base backup was taken. Setting this to latest recovers to the latest timeline found in the archive, which is useful in a standby server.

For complete information, you can refer to the official documentation on standby-settings, archive-recovery-settings and recovery-target-settings.


Same example with v12

Get PostgreSQL sources and build the v12 version for this specific commit:

# mkdir /opt/build_postgresql
# cd /opt/build_postgresql/
# git clone git://git.postgresql.org/git/postgresql.git
# cd postgresql/
# git checkout 2dedf4d9a899b36d1a8ed29be5efbd1b31a8fe85
# ./configure
# make
# make install
$ /usr/local/pgsql/bin/initdb -D /var/lib/pgsql/12/data_build

Configure the archiver process:

$ mkdir /var/lib/pgsql/12/archives
$ echo "archive_mode = 'on'" >> /var/lib/pgsql/12/data_build/postgresql.conf
$ echo "archive_command = 'cp %p /var/lib/pgsql/12/archives/%f'" \
>> /var/lib/pgsql/12/data_build/postgresql.conf
$ echo "unix_socket_directories = '/var/run/postgresql, /tmp'" \
>> /var/lib/pgsql/12/data_build/postgresql.conf
$ /usr/local/pgsql/bin/pg_ctl -D /var/lib/pgsql/12/data_build start

Check:

$ psql -c "SELECT pg_switch_wal();"
 pg_switch_wal 
---------------
 0/20000B0
(1 row)

$ ps -ef |grep postgres|grep archiver 
... postgres: archiver   last was 000000010000000000000002

Create the base copy:

$ pg_basebackup --pgdata=/var/lib/pgsql/12/replicated_data -P
24534/24534 kB (100%), 1/1 tablespace
$ echo 'port = 5433' >> /var/lib/pgsql/12/replicated_data/postgresql.conf

Here comes the part modified by this patch. Most of the parameters for the recovery and standby mode are now in the main configuration file.

$ echo "primary_conninfo = 'port=5432'" \
>> /var/lib/pgsql/12/replicated_data/postgresql.conf
$ echo "restore_command = 'cp /var/lib/pgsql/12/archives/%f %p'" \
>> /var/lib/pgsql/12/replicated_data/postgresql.conf
$ echo "recovery_target_timeline = 'latest'" \
>> /var/lib/pgsql/12/replicated_data/postgresql.conf

If you simply want to start a recovery process (e.g. restore a backup), you need to create a file named recovery.signal in the data directory.

Here, we want to set up a standby server, so we need a file named standby.signal.

$ touch /var/lib/pgsql/12/replicated_data/standby.signal
$ /usr/local/pgsql/bin/pg_ctl -D /var/lib/pgsql/12/replicated_data start

If the replication is correctly set up, you should see these processes:

postgres  9033     1  ... /usr/local/pgsql/bin/postgres -D /var/lib/pgsql/12/data_build  
postgres  9039  9033  ... postgres: archiver   last was 000000010000000000000006
postgres 11608  9033  ... postgres: walsender postgres [local] streaming 0/7000060
...
postgres 11599     1  ... /usr/local/pgsql/bin/postgres -D /var/lib/pgsql/12/replicated_data
postgres 11600 11599  ... postgres: startup   recovering 000000010000000000000007 
postgres 11607 11599  ... postgres: walreceiver   streaming 0/7000060

To stop the cluster:

$ /usr/local/pgsql/bin/pg_ctl -D /var/lib/pgsql/12/replicated_data stop
$ /usr/local/pgsql/bin/pg_ctl -D /var/lib/pgsql/12/data_build stop

pg_basebackup behavior

In version 11, the -R / --write-recovery-conf option writes the recovery.conf file. This patch changes that behavior: the settings are appended to postgresql.auto.conf instead.

This part is actually raising some concerns in the community. For example, after a complete recovery, the recovery.conf file used to be renamed to recovery.done. This will not be the case anymore.
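Based on the commit message quoted above, a v12 base backup taken with -R should now create a standby.signal file and append the connection settings to postgresql.auto.conf. A quick way to check with the build used in this post (the target directory name is arbitrary):

$ /usr/local/pgsql/bin/pg_basebackup -R --pgdata=/var/lib/pgsql/12/replicated_data_r -P
$ ls /var/lib/pgsql/12/replicated_data_r/standby.signal
$ cat /var/lib/pgsql/12/replicated_data_r/postgresql.auto.conf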


Conclusion

This commit can have a big impact on your backup tools and procedures.

Until the official v12 release, a lot of changes may still happen on this topic.

The best advice is, as always, to have an attentive look at the release notes.

Joshua Drake: Breaking down the walls of exclusivity

When you are considering a conference about Postgres, you should pick one that is focused on building the community. PostgresConf is all about building the community, and we even captured it on video!


PostgresConf embraces a holistic view of what community is. We want everyone to feel welcome and encouraged to give back to PostgreSQL.org. However, that is not the only opportunity for you to give back to the Postgres community. We all have different talents and some of those don't extend to writing patches or Docbook XML. 

Giving back

When considering who is part of the community and who is contributing to the community, we want to introduce you to a couple of fantastic organizers of our conference: Debra Cerda and Viral Shah. Some in the community will know Debra. She has been in the community for years and is one of the primary organizers of Austin Postgres.

Debra Cerda
Debra is our Attendee and Speaker Liaison as well as our Volunteer Coordinator. She is also a key asset in the development and performance of our Career Fair.

Viral Shah
Viral is our on-site logistics lead and is part of the volunteer acquisition team. It is Viral that works with the hotel using a fine tooth comb to make sure everything is on target, on budget, and executed with extreme efficiency.

Without her amazing attention to detail and dedication to service, we wouldn't be able to deliver the level of conference our community has come to expect from PostgresConf.

Building relationships

There are a lot of reasons to go to a conference. You may be looking for education on a topic, a sales lead, or possibly just to experience a central location of top talent, products, and services. All of these reasons are awesome, but we find that the most important reason is to build relationships. The following are two exceptional examples of community projects.

Our first example is ZomboDB. No, they are not a sponsor (yet!) but they have a fantastic Open Source extension to Postgres that integrates Elasticsearch into Postgres. 

Our second ecosystem community member is an entity that most have heard of at this point; TimescaleDB. It too is a fantastic showing of what is possible when you combine brilliance with the extensibility of Postgres.

What is notable about these two mentions is that they represent what we would call the "Professional Community." Recently ZomboDB wanted to bounce some ideas off a PostgreSQL hacker regarding the Index Access Method API. We at PostgresConf were able to facilitate an introduction to Timescale, and a couple of amazing minds ended up chewing the fat on their respective projects. It's relationships such as these that enable the community to grow and offer the best opportunities possible.


Part of the community

Join the Professional user and ecosystem community for Postgres today! You can start by submitting a presentation to the upcoming PostgresConf 2019 being held March 18th - 22nd, 2019 at the Sheraton Times Square.





Tomas Vondra: Sequential UUID Generators


UUIDs are a popular identifier data type – they are unpredictable, and/or globally unique (or at least very unlikely to collide) and quite easy to generate. Traditional primary keys based on sequences won’t give you any of that, which makes them unsuitable for public identifiers, and UUIDs solve that pretty naturally.

But there are disadvantages too – they may make the access patterns much more random compared to traditional sequential identifiers, cause WAL write amplification etc. So let’s look at an extension generating “sequential” UUIDs, and how it can reduce the negative consequences of using UUIDs.

Let's assume we're inserting rows into a table with a UUID primary key (so there's a unique index), and the UUIDs are generated as random values. In the table the rows may simply be appended at the end, which is very cheap. But what about the index? For indexes, ordering matters, so the database has little choice about where to insert the new item: it has to go into a particular place in the index. As the UUID values are generated randomly, the location will be random, with uniform distribution over all index pages.

This is unfortunate, as it works against adaptive cache management algorithms – there is no set of “frequently” accessed pages that we could keep in memory. If the index is larger than memory, the cache hit ratio (both for page cache and shared buffers) is doomed to be poor. And for small indexes, you probably don’t care that much.

Furthermore, this random write access pattern inflates the amount of generated WAL, due to having to perform full-page writes every time a page is modified for the first time after a checkpoint. (There is a feedback loop, as FPIs increase the amount of WAL, triggering checkpoints more often – which then results in more FPIs generated, …)

Of course, UUIDs influence read patterns too. Applications typically access a fairly limited subset of recent data. For example, an e-commerce site mostly cares about orders from the last couple of days, rarely accessing data beyond this time window. That works fairly naturally with sequential identifiers (the records will have good locality both in the table and in the index), but the random nature of UUIDs contradicts that, resulting in a poor cache hit ratio.

These issues (particularly the write amplification) are a common problems when using UUIDs, and are discussed on our mailing lists from time to time. See for example Re: uuid, COMB uuid, distributed farms, Indexes on UUID – Fragmentation Issue or Sequential UUID Generation.

Making UUIDs sequential (a bit)

So, how can we improve the situation while keeping as much of the UUID advantages as possible? If we stick to perfectly random UUID values, there’s little we can do. So what if we abandoned the perfect randomness, and instead made the UUIDs a little bit sequential?

Note: This is not an entirely new idea, of course. It’s pretty much what’s described as COMB on the wikipedia UUID page, and closely described by Jimmy Nilsson in The Cost of GUIDs as Primary Keys. MSSQL implements a variant of this as newsequentialid (calling UuidCreateSequential internally). The solution implemented in the sequential-uuids extension and presented here is a variant tailored for PostgreSQL.

For example, a (very) naive UUID generator might generate a 64-bit value from a sequence and append an additional 64 random bits. Or we might use a 32-bit unix timestamp and append 96 random bits. Those are steps in the right direction, replacing the random access pattern with a sequential one. But there are a couple of flaws here.

Firstly, while the values are quite unpredictable (64 random bits make guessing quite hard), quite a bit of information leaks thanks to the sequential part. Either we can determine in what order the values were generated, or when the values were generated. Secondly, perfectly sequential patterns have disadvantages too – for example if you delete historical data, you may end up with quite a bit of wasted space in the index because no new items will be routed to those mostly-empty index pages.

So what sequential-uuids extension does is a bit more elaborate. Instead of generating a perfectly sequential prefix, the value is sequential for a while, but also wraps around once in a while. The wrapping eliminates the predictability (it’s no longer possible to decide in which order two UUIDs were generated or when), and increases the amount of randomness in the UUID (because the prefix is shorter).

When using a sequence (much like nextval), the prefix value increments regularly after a certain number of UUIDs have been generated, and then eventually wraps around after a certain number of blocks. For example, with a 2-byte prefix (i.e. 64k possible prefixes) incremented after every 256 generated UUIDs, the prefix wraps around after every 16M generated UUIDs (64k x 256). So the prefix is computed something like this:

prefix := (nextval('s') / block_size) % block_count

and the prefix length depends on block_count. With 64k blocks we only need 2 bytes, leaving the remaining 112 bits random.

For the timestamp-based UUID generator, the prefix is incremented regularly, e.g. every 60 seconds (or any arbitrary interval, depending on your application). After a certain number of such increments, the prefix wraps around and starts from scratch. The prefix formula is similar to the sequence-based one:

prefix := (EXTRACT(epoch FROM clock_timestamp()) / interval_length) % block_count

With 64k blocks we again only need 2 bytes (just like for the sequence-based generator), leaving the remaining 112 bits random. And the prefix wraps around every ~45 days (65536 x 60 seconds is roughly 45.5 days).

The extension implements this as two simple SQL-callable functions:

  • uuid_sequence_nextval(sequence, block_size, block_count)
  • uuid_time_nextval(interval_length, block_count)

See the README for more details.
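A minimal usage sketch, assuming the extension installs under the name sequential_uuids and keeps the signatures listed above:

CREATE EXTENSION sequential_uuids;
CREATE SEQUENCE s;
CREATE TABLE t (
    id uuid PRIMARY KEY DEFAULT uuid_sequence_nextval('s'::regclass, 256, 65536)
);
INSERT INTO t VALUES (DEFAULT);

Every 256 generated values the 2-byte prefix increments, so consecutive inserts land close to each other in the index instead of being scattered across all index pages.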

Benchmark

Time for a little benchmark, demonstrating how these sequential UUIDs improve the access patterns. I've used a system with a fairly weak storage subsystem (3 x 7200 rpm SATA drives in RAID); this is intentional, as it makes the change in I/O pattern obvious. The test was a very simple INSERT into a table with a single UUID column, with a UNIQUE index on it. And I've done it with three dataset sizes: small (empty table), medium (fits into RAM) and large (exceeds RAM).

The tests compare four UUID generators:

  • random: uuid_generate_v4()
  • time: uuid_time_nextval(interval_length := 60, block_count := 65536)
  • sequence(256): uuid_sequence_nextval(s, block_size := 256, block_count := 65536)
  • sequence(64k): uuid_sequence_nextval(s, block_size := 65536, block_count := 65536)

And the overall results (transactions per second) in a 4-hour run look like this:

[Chart: overall throughput (transactions per second) for each generator, by dataset size]

For the small scale there is almost no difference – the data fits into shared buffers (4GB) so there is no I/O trashing due to data being continuously evicted.

For the medium scale (larger than shared buffers, still fits into RAM) this changes quite a bit and the random UUID generator drops to less than 30% of the original throughput, while the other generators pretty much maintain the performance.

And on the largest scale the random UUID generator throughput drops even further, while the time and sequence(64k) generators keep the same throughput as on the small scale. The sequence(256) generator drops to about 1/2 the throughput – this happens because it increments the prefix very often and ends up touching far more index pages than sequence(64k), as I'll show later. But it's still much faster than random, thanks to the mostly sequential behavior.

WAL write amplification / small scale

As mentioned, one of the problems with random UUIDs is the write amplification caused by the many full-page images (FPIs) written into WAL. So let's see how that looks on the smallest dataset. The blue bar represents the amount of WAL (in megabytes) taken up by FPIs, while the orange bar represents the amount of WAL occupied by regular WAL records.

Sequential UUID Generators

The difference is obvious, although the change in the total amount of WAL generated is not huge (and the impact on throughput is negligible, because the dataset fits into shared buffers and so there's no pressure / forced eviction). And while the storage is weak, it handles sequential writes just fine, particularly at this volume (3.5GB over 4 hours).

WAL write amplification / medium dataset

On the medium scale (where the random throughput dropped to less than 30% compared to the small scale) the difference is much clearer.

With the random generator, the test generated more than 20GB of WAL, the vast majority of that due to FPIs. The time and sequence(256) generators produced only about 2.5GB of WAL, and sequence(64k) about 5GB.

But wait – this chart is actually quite misleading, because it compares data from tests with very different throughput (260 tps vs. 840 tps). After scaling the numbers to the same throughput as random, the chart looks like this:

Sequential UUID Generators

That is – the difference is even more significant, nicely illustrating the WAL write amplification issue with random UUIDs.

Large dataset

On the large data set it looks quite similar to medium scale, although this time the random generator produced less WAL than sequence(256).

Sequential UUID Generators

But again, that chart ignores the fact that the different generators have very different throughput (random achieved only about 20% compared to the two other generators), and if we scale it to the same throughput it looks like this:

How is it possible that sequence(256) produces more WAL than the random generator (20GB vs. 13GB), yet achieves about 3x the throughput? One reason is that the amount of WAL does not really matter much here – SATA disks handle sequential writes pretty well, and 20GB over 4 hours is next to nothing. What matters much more is the write pattern for the data files themselves (particularly the index), and that's much more sequential with sequence(256).

Cache hit ratio

The amount of WAL generated nicely illustrates the write amplification, and the impact on the write pattern in general. But what about the read access pattern? One metric we can look at is the cache hit ratio. For shared buffers, it looks like this:

Sequential UUID Generators

The best-performing generators achieve a ~99% cache hit ratio, which is excellent. On the large data set, the random generator drops to only about 85%, while sequence(256) keeps about 93% – not great, but still much better than 85%.
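
If you want to check this on your own system, the shared-buffers hit ratio for the current database can be computed from pg_stat_database (a generic query, not tied to this benchmark):

SELECT datname,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS hit_pct
FROM pg_stat_database
WHERE datname = current_database();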

Another interesting statistic is the number of distinct index blocks mentioned in the WAL stream. On the largest scale it looks like this:

Sequential UUID Generators

The chart may look a bit surprising, because the sequence(256) generator accesses more than twice as many distinct blocks as the random generator, while achieving 3x the performance at the same time. But this is (again) explained by the access pattern being much more sequential with sequence(256).

Choosing parameters

Both the uuid_sequence_nextval and uuid_time_nextval functions have parameters, so it's natural to ask how to pick the right values. For example, in the tests presented in this post, uuid_sequence_nextval performed much better with block_size set to 64k (compared to 256). So why not just use 64k all the time?

The answer is: it depends. 64k may work for busier systems, but for others it may be way too high. Moreover, you need to pick both parameters (block size and number of blocks) at the same time. Ask a couple of questions: how often should the generator wrap to 0, and how many UUIDs does that mean? Then pick a number of blocks high enough to slice the index into sufficiently small chunks, and pick the block size to match the desired wrap interval. For example, with 64k blocks, block_size = 256 means a wrap-around roughly every 16M UUIDs, while block_size = 64k pushes that to roughly 4 billion.

Summary

As illustrated by the tests, sequential UUID generators significantly reduce write amplification and make the I/O patterns far more sequential; they may also improve the read access pattern. Will this help your application? Hard to say – you'll have to give it a try. Chances are your database schema is more complicated and uses various other indexes, so maybe the issues due to UUIDs are fairly negligible in the bigger picture.


Avinash Kumar: Newly-Released PostgreSQL Minor Versions: Time to Update!

PostgreSQL Minor Versions

In this blog post we'll look at what the newly released PostgreSQL minor versions contain, and why you probably want to apply these updates to your current installations.

You might already have seen that the updates for supported PostgreSQL versions were released on November 8, 2018. PostgreSQL releases minor versions with several bug fixes and feature enhancements each quarter. An important point to note is that PostgreSQL 9.3 got its final minor version release (9.3.25) this quarter, and is no longer supported.

We always recommend that you keep your PostgreSQL databases updated to the latest minor versions. Applying a minor release requires a restart after installing the updated binaries. The following is the sequence of steps you should follow to upgrade to the latest minor versions:

  1. Shut down the PostgreSQL database server
  2. Install the updated binaries
  3. Restart your PostgreSQL database server

Most times, in a master-slave (replication) setup, you can choose to update the minor versions in a rolling fashion: perform the update on one server after another, rather than on all of them at once. Rolling updates avoid losing both reads and writes at the same time. On each individual server, however, we recommend that you shut down, update and restart it in one go while performing the update.

One of the most important fixes is a security fix: CVE-2018-16850. The bug allowed an attacker with CREATE privilege on some non-temporary schema, or TRIGGER privilege on some table, to create a malicious trigger that, when dumped and restored using pg_dump/pg_restore, would result in additional SQL statements being executed. This applies to PostgreSQL versions 10 and 11.

Before proceeding further, let’s look at the list of minor versions released this quarter.

  • PostgreSQL 11.1
  • PostgreSQL 10.6
  • PostgreSQL 9.6.11
  • PostgreSQL 9.5.15
  • PostgreSQL 9.4.20
  • PostgreSQL 9.3.25

Now, let us look at the benefits you should see by updating to the latest minor versions.

PostgreSQL 11.1

PostgreSQL 11.0 was released on October 18, 2018. You might want to look at our blog post with our first take on PostgreSQL 11. The new minor release, PostgreSQL 11.1, arrives just 21 days later and brings some interesting functionality and fixes, as seen here. The following is a small list of fixes that you might find interesting:

  • Ability to create child indexes of partitioned tables in another tablespace.
  • NULL handling in parallel hashed multi-batch left joins.
  • Fix to the strictness logic that incorrectly ignored rows whose ORDER BY values were null.
  • Disable the recheck_on_update index option (it is now forced off for all indexes), as the feature is not ready yet and needs more work.
  • Prevent creation of a partition in a trigger attached to its parent table.
  • Disallow the pg_read_all_stats role from executing pg_stat_statements_reset(), as that role is only meant for monitoring. You must run ALTER EXTENSION pg_stat_statements UPDATE … for this to take effect.

PostgreSQL 10.6

There are some common fixes that were applied to PostgreSQL 11.1 and PostgreSQL 10.6. You can find PostgreSQL 10.6 release details here. Some of the fixes applied to PostgreSQL 10.6 are in common with other supported PostgreSQL versions, as highlighted below:

  • Disallow the pg_read_all_stats role from executing pg_stat_statements_reset(), as that role is only meant for monitoring. (PostgreSQL 11.1 and 10.6)
  • Avoid pushing sub-SELECTs containing window functions, LIMIT, or OFFSET to parallel workers, since different workers could otherwise produce different answers due to row-ordering variations. (PostgreSQL 9.6.11 and 10.6)
  • Fixed the WAL file recycling logic that might make a standby fail to remove WAL files that should be removed. (PostgreSQL 9.5.15, 9.6.11 and 10.6)
  • Handling of commit-timestamp tracking during recovery has been fixed to avoid recovery failures while trying to fetch the commit timestamp for a transaction that did not record it. (PostgreSQL 9.5.15, 9.6.11 and 10.6)
  • Applied the fix that ensures that the background workers are stopped properly when the postmaster receives a fast-shutdown request before completing the database startup. (PostgreSQL 9.5.15, 9.6.11 and 10.6)
  • Fixed unnecessary failures or slow connections when multiple target host names are used in libpq, by performing DNS lookups one at a time rather than all at once. (PostgreSQL 10.6)

Following is a list of some common fixes applied to PostgreSQL 9.6.11, PostgreSQL 9.5.15, PostgreSQL 9.4.20 and PostgreSQL 9.3.25:

  • Avoid O(N^2) slowdown in regular expression match/split functions on long strings.
  • Fix mis-execution of SubPlans when the outer query is being scanned backward.
  • Fix failure of UPDATE/DELETE … WHERE CURRENT OF … after rewinding the referenced cursor.
  • Fix EvalPlanQual to avoid crashes or wrong answers in concurrent updates, when the code contains an uncorrelated sub-SELECT inside a CASE construct.
  • Memory leak in repeated SP-GiST index scans has been fixed.
  • Ensure that hot standby processes use the correct WAL consistency point to prevent possible misbehavior after reaching a consistent database state during WAL replay.
  • Fix possible inconsistency in pg_dump’s sorting of dissimilar object names.
  • Ensure that pg_restore will schema-qualify the table name when emitting DISABLE/ENABLE TRIGGER commands when a restore is run using a restrictive search_path.
  • Fix pg_upgrade to handle event triggers in extensions correctly.
  • Fix pg_upgrade’s cluster state check to work correctly on a standby server.
  • Fix build problems on macOS 10.14.

Now that you understand the added fixes to existing PostgreSQL versions, we recommend that you test and update your PostgreSQL databases with the new minor versions (if you haven’t already).

If you are currently running your databases on PostgreSQL 9.3.x or earlier, we recommend that you prepare a plan to upgrade your PostgreSQL databases to a supported version as soon as possible. Please subscribe to our blog so that you hear about the various options for upgrading your PostgreSQL databases to a supported major version.

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – Integrate recovery.conf into postgresql.conf

Craig Kerstiens: How Postgres is more than a relational database: Extensions


Postgres has been a great database for decades now, and has really come into its own in the last ten years. Databases more broadly have gotten plenty of attention as well. First we had NoSQL, which started with document databases and key/value stores; then there was NewSQL, which expanded things to distributed and graph databases; and all of these models, from document to distributed to relational, were not mutually exclusive. Postgres itself went from being simply a relational database (which already had geospatial capabilities) to a multi-modal database by adding support for JSONB.

But to me the most exciting part about Postgres isn't how it continues to advance itself; rather, it is how Postgres has shifted from simply a relational database to more of a data platform. The largest driver for this shift is Postgres extensions. Postgres extensions, in simplified terms, are lower-level APIs that exist within Postgres and allow you to change or extend its functionality. These extension hooks allow Postgres to be adapted for new use cases without requiring upstream changes to the core database. This is a win in two ways:

  1. The Postgres core can continue to move at a very safe and stable pace, ensuring a solid foundation and not risking your data.
  2. Extensions themselves can move quickly to explore new areas without the same review process or release cycle, allowing them to be agile in how they evolve.

Okay, plug-ins and frameworks aren't new when it comes to software, so what is so great about extensions for Postgres? Well, they may not be new to software, but they're not new to Postgres either: Postgres has had extensions for as long as I can remember. In Postgres 9.1 we saw new syntax that made it easy to CREATE EXTENSION, and since that time the ecosystem around them has grown. We have a full directory of extensions at PGXN. Some older forks that were based on older Postgres versions are actively working on catching up to a modern release, presumably to become pure extensions. By being a pure extension you're able to stay current with Postgres versions without heavy rebasing for each new release. Now the things you can do with extensions are as powerful as ever, so much so that Citus' distributed database support is built on top of this extension framework.

Extensions in the real world

Extensions generally fall into a few broad categories, though in reality there are no hard boundaries on what an extension can and cannot be:

  • Custom data types
  • Monitoring
  • New use cases/functionality
  • Foreign data wrappers

Need a new datatype? There is an extension for it

Postgres is very open when it comes to adding support for new data types. We actually saw document support in Postgres way, way, way back with the XML data type. Then we saw hstore, in the form of an extension, which gave us a key-value store directly in Postgres. Now we have a native datatype in JSONB with rich JSON document support, so extensions aren't needed for that case, but there are still others where they can be extremely useful. UUID itself is a native data type, but you can get a number of functions that help with generating UUIDs directly within the database by using uuid-ossp.
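
For example, installing uuid-ossp gives you several generator functions right away (the quotes below are needed because of the dash in the extension name):

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

SELECT uuid_generate_v4();   -- fully random UUID
SELECT uuid_generate_v1();   -- timestamp- and MAC-address-based UUID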

Let's not stop with the basic ones: how crazy can we get? How about HyperLogLog? HyperLogLog is based on a research paper from Google. It implements K-minimum value, bit observable patterns, stochastic averaging, and harmonic averaging. If you're like me and had to google pretty much every one of those things, the simple way to explain it is this: HyperLogLog is great for approximate distinct counts that can be stored in very little disk space and then composed over time to find intersections/unions of uniques across various buckets. If you're building a web analytics tool or an ad network, HyperLogLog may end up being your best friend. You've also got other approximation datatypes like TopN, which is great for building leaderboards.
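
Here is a sketch of what that can look like with the postgresql-hll extension; the page_views table and its columns are made up for this example, so check the extension's docs for the full API:

CREATE EXTENSION hll;

-- one HyperLogLog sketch per day, built from a hypothetical page_views table
CREATE TABLE daily_uniques (
    day   date PRIMARY KEY,
    users hll
);

INSERT INTO daily_uniques
SELECT seen_at::date, hll_add_agg(hll_hash_integer(user_id))
FROM page_views
GROUP BY 1;

-- approximate number of distinct users over the last 7 days
SELECT hll_cardinality(hll_union_agg(users))
FROM daily_uniques
WHERE day > current_date - 7;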

Who doesn’t want more insights?

As an application developer, understanding what is going on with your database can be painful. Postgres already has a number of tools, such as EXPLAIN, which will give you insights about query plans, but extensions can give you an extra leg up. One of the most valuable is a monitoring extension that lets you quickly see how often a query was run and how long it took, in aggregate and on average: pg_stat_statements. pg_stat_statements essentially parameterizes queries, so your WHERE conditions are normalized away and you can tell which queries make the most sense to optimize with indexes.
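
Using it is straightforward once pg_stat_statements is added to shared_preload_libraries and the server is restarted; the column names below are the PostgreSQL 11-era ones (newer releases renamed some of them):

CREATE EXTENSION pg_stat_statements;

SELECT query,
       calls,
       total_time,
       mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;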

Speaking of indexes, one option is to go and add an index to everything, but you may want to be a little more methodical than that. HypoPG is an extension that gives you insights into what performance would look like if you hypothetically added indexes. Yes, it can tell you, all on its own, how to index your database.
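
A quick sketch with HypoPG (the orders table and column are made up; a plain EXPLAIN, without ANALYZE, is enough to see whether the planner would use the hypothetical index):

CREATE EXTENSION hypopg;

-- create a hypothetical index: no disk writes, no locks
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)');

-- the planner now considers it
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- drop all hypothetical indexes when done
SELECT hypopg_reset();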

Postgres: A relational database but also not

Datatypes and monitoring/insights both make Postgres better, but at the end of the day it is still a relational database with them. Where extensions start to get really fun is when they change what Postgres is capable of. PostGIS is one of the larger and older extensions, and it turns Postgres into the world's most advanced open-source geospatial database. PostGIS comes with new datatypes as well as operators. The short takeaway is that if you want to do anything geospatial, it can help.
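
A tiny taste, using the geography type (the coordinates here are arbitrary):

CREATE EXTENSION postgis;

-- distance in meters between two lon/lat points
SELECT ST_Distance(
    ST_SetSRID(ST_MakePoint(-122.3321, 47.6062), 4326)::geography,
    ST_SetSRID(ST_MakePoint(-71.0589, 42.3601), 4326)::geography
);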

Maybe geospatial isn't your thing, and instead you're dealing with time-series log data. If you need easy partitioning of time-series data, so that you can archive old data or query only a recent subset, then pg_partman has your back. It enhances the built-in time-series partitioning that recently arrived in Postgres, automatically creating new partitions as well as removing old ones.
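
pg_partman's create_parent() helper automates partition management on top of the built-in declarative partitioning, which on its own looks roughly like this (a sketch with made-up table names; see the pg_partman docs for the helper's exact arguments):

-- native declarative range partitioning on a timestamp column
CREATE TABLE events (
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2018_11 PARTITION OF events
    FOR VALUES FROM ('2018-11-01') TO ('2018-12-01');

-- archiving old data then becomes dropping (or detaching) a partition
DROP TABLE events_2018_11;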

Then there is scaling out. Postgres is great, but what happens when you outgrow the limits of what a single node can do? Citus transforms Postgres into a distributed, horizontally scalable database. Under the covers it shards your data and uses multiple executors to route queries accordingly. Meanwhile, to your application, the entire distributed setup appears as a single-node Postgres database. You get all the benefits of sharding without the work.
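
From SQL, sharding a table with Citus boils down to one function call (this assumes a Citus coordinator and workers are already set up; the table and column here are made up):

CREATE EXTENSION citus;

CREATE TABLE page_hits (
    tenant_id bigint NOT NULL,
    hit_time  timestamptz NOT NULL,
    url       text
);

-- distribute (shard) the table across the worker nodes by tenant_id
SELECT create_distributed_table('page_hits', 'tenant_id');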

This class of extensions truly takes Postgres into new territory. Whether it's becoming a geospatial, time-series, or distributed database, these extensions move Postgres out of its traditional relational realm.

What if you have data not in Postgres?

Okay, I know what you're thinking: why wouldn't you have data in Postgres? But somewhere along the line, before you came along, someone threw data into Redis or Mongo or MySQL or just about any other database. Foreign data wrappers are a unique class of extension all on their own that allow you to connect from within Postgres directly to some other data source. They are extremely useful when you're working to join a disparate data source with your system-of-record Postgres data, and can save a lot of time over elaborate ETL jobs while getting the job done just as well. A word of caution, though, against using them in a customer-facing production workflow, as they're often not the most performant mechanism.
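
To give a concrete flavor, here is what wiring up the in-core postgres_fdw looks like; the server name, credentials and remote table are placeholders:

CREATE EXTENSION postgres_fdw;

CREATE SERVER legacy_db
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'legacy.example.com', dbname 'legacy', port '5432');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER legacy_db
    OPTIONS (user 'reporting', password 'secret');

CREATE SCHEMA legacy;
IMPORT FOREIGN SCHEMA public
    FROM SERVER legacy_db INTO legacy;

-- remote tables can now be queried and joined like local ones
SELECT count(*) FROM legacy.customers;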

If you can dream it Postgres extensions can do it

Postgres continues to get better and better with each new release. The support for new index types, consistently improving performance, and in general a richer feature set have resulted in a great database. But the ecosystem of extensions around it makes it into something truly unique, unlike any other database. If there is a secret weapon to Postgres' success, it's not Postgres itself; it's how extensions make it even more.

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – Add CSV table output mode in psql.

Pavel Stehule: PostgreSQL 12 - psql - csv output

After some years and a long discussion, the psql console has a great new feature: CSV output (implemented by Daniel Vérité).

Usage is very simple: just use the --csv option.

[pavel@nemesis postgresql.master]$ psql --csv -c "select * from pg_namespace limit 10" postgres
oid,nspname,nspowner,nspacl
99,pg_toast,10,
10295,pg_temp_1,10,
10296,pg_toast_temp_1,10,
11,pg_catalog,10,"{postgres=UC/postgres,=U/postgres}"
2200,public,10,"{postgres=UC/postgres,=UC/postgres}"
11575,information_schema,10,"{postgres=UC/postgres,=U/postgres}"