
Avinash Kumar: PostgreSQL Upgrade Using pg_dump/pg_restore


In this blog post, we will explore pg_dump/pg_restore, one of the most commonly used options for performing a PostgreSQL upgrade. It is important to understand the scenarios under which the pg_dump and pg_restore utilities will be helpful.

This post is the second of our Upgrading or Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions series where we’ll be exploring different methods available to upgrade your PostgreSQL databases.

About pg_dump

pg_dump is a utility to perform a backup of a single database. You cannot back up multiple databases unless you do so using separate commands in parallel. If your upgrade plan needs global objects to be copied over, pg_dump needs to be supplemented by pg_dumpall. To know more about pg_dumpall, you may refer to our previous blog post.

pg_dump formats

pg_dump can produce dumps in multiple formats – plain text and custom format – each with its own advantages. When you use pg_dump with the custom format (-Fc), you must use pg_restore to restore the dump.
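For example, a minimal sketch (the paths, ports, and database name here are placeholders, not taken from this post): take a custom-format dump with the newer binaries and restore it with pg_restore, optionally in parallel.

/usr/lib/postgresql/11/bin/pg_dump -Fc -d mydb -p 5432 -f /tmp/mydb.dump
/usr/lib/postgresql/11/bin/pg_restore -d mydb -p 5433 -j 4 /tmp/mydb.dump

The -j option lets pg_restore load data and build indexes using several parallel jobs, which can shorten the upgrade window considerably.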

If the dump is taken using a plain-text format, pg_dump generates a script file of multiple SQL commands. It can be restored using psql.

A custom format dump, however, is compressed and is not human-readable.

A dump taken in plain text format may be slightly larger in size when compared to a custom format dump.

At times, you may wish to perform schema changes in your target PostgreSQL database before restore, for example, table partitioning. Or you may wish to restore only a selected list of objects from a dump file.

In such cases, you cannot restore a selected list of tables from a plain-format dump of a database. If you take the database dump in custom format, you can use pg_restore, which lets you choose a specific set of tables for restoration.
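As a sketch (the dump file and table names are hypothetical), you can first list the contents of a custom-format dump and then restore a single table from it:

/usr/lib/postgresql/11/bin/pg_restore -l /tmp/mydb.dump
/usr/lib/postgresql/11/bin/pg_restore -d mydb -t mytable /tmp/mydb.dump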

Steps involved in upgrade

The most important point to remember is that both the dump and the restore should be performed using the latest binaries. For example, if we need to migrate from version 9.3 to version 11, we should use the pg_dump binary of PostgreSQL 11 to connect to the 9.3 server.

When a server is equipped with two different versions of binaries, it is good practice to specify the full path of the pg_dump from the latest version, as follows:

/usr/lib/postgresql/11/bin/pg_dump <connection_info_of_source_system> <options>

Getting the global dumps

In PostgreSQL, users/roles are global to the database cluster, and the same user can have privileges on objects in different databases. These are called “globals” because they apply to all the databases within the instance. Creating the globals on the target system at the earliest opportunity is very important, because the rest of the DDL may contain GRANTs to these users/roles. It is good practice to dump the globals into a file, and to examine the file, before importing it into the destination system. This can be achieved using the following command:

/usr/lib/postgresql/11/bin/pg_dumpall -g -p 5432 > /tmp/globals_only.sql

Since this produces a plain SQL dump file, it can be fed to psql connected to the destination server. If no modifications are required, the globals can be piped directly to the destination server using the command in the next example.
/usr/lib/postgresql/11/bin/pg_dumpall -g <source_connection_info> | psql <destination_connection_info>

The above command works for an upgrade on a local server. You can add an additional -h argument for the hostname in the <destination_connection_info> if you are performing an upgrade to a remote database server.

Schema Only Dumps

The next stage of the migration involves the creation of schema objects. At this point, you might want to move different database objects to different tablespaces, and partition a few of the tables. If such schema modifications are part of the plan, then we should extract the schema definition to a plain text file. Here’s an example command that can be used to achieve this:

/usr/lib/postgresql/11/bin/pg_dump -s -d databasename -p 5432 > /tmp/schema_only.sql

In general, the majority of the database objects won’t need any modifications. In such cases, it is good practice to dump the schema objects directly into the destination database using a pipe, with a command similar to this:
/usr/lib/postgresql/11/bin/pg_dump -s -d databasename <source_connection> | psql -d database <destination_connection>

Once all the schema objects are created, we can drop just those objects that need modification and recreate them with their modified definitions.
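For instance, assuming a hypothetical sales table that should become partitioned on the target (the object names are made up for illustration), the plain table created from the schema-only dump can be dropped and recreated with the new definition before its data is loaded:

DROP TABLE sales;

CREATE TABLE sales (
    sale_id   bigint,
    sale_date date NOT NULL,
    amount    numeric
) PARTITION BY RANGE (sale_date);

CREATE TABLE sales_2019 PARTITION OF sales
    FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');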

Copying data

This is the stage when the majority of the data transfers between the database servers. If there is good bandwidth between source and destination, we should look to achieve maximum parallelism at this stage. In many situations, we could analyze the foreign key dependency hierarchy and import data in parallel batches for a group of tables. Data-only copying is possible using the -a or --data-only flag of pg_dump.
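A data-only dump of the whole database can be piped straight into the destination, for example (connection details are placeholders, as elsewhere in this post):

/usr/lib/postgresql/11/bin/pg_dump -a -d databasename <source_connection_info> | psql -d databasename <destination_connection_info>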

Copying the data of individual tables

You might have to incorporate schema changes as part of an upgrade. In this case, you can copy the data of a few tables individually. We provide an example here:

/usr/lib/postgresql/11/bin/pg_dump <sourcedb_connection_info> -d <database> -a -t schema.tablename | psql <destinationdb_connection_info> <databasename>

There could be special situations where you need to append only a partial selection of the data. This happens especially with time-series data. In such cases, you can use COPY commands with a WHERE clause to extract and import specific data. You can see this in the following example:

/usr/lib/postgresql/11/bin/psql <sourcedb_connection_info> -c "COPY (SELECT * FROM <table> WHERE <filter condition>) TO STDOUT" > /tmp/selected_table_data.sql

Summary

pg_dump/pg_restore may be useful if you need to perform a faster upgrade of a PostgreSQL server, with a modified schema and bloat-free relations. To see more about this method in action, please subscribe to our webinar here.




Paul Ramsey: GeoJSON Features from PostGIS


Every once in a while, someone comes to me and says:

Sure, it’s handy to use ST_AsGeoJSON to convert a geometry into a JSON equivalent, but all the web clients out there like to receive full GeoJSON Features and I end up writing boilerplate to convert database rows into GeoJSON. Also, the only solution I can find on the web is scary and complex. Why don’t you have a row_to_geojson function?

And the answer (still) is that working with rows is fiddly and I don’t really feel like it.

However! It turns out that, with the tools for JSON manipulation already in PostgreSQL and a little scripting, it’s possible to make a passable function to do the work.

Start with a simple table.

DROP TABLE IF EXISTS mytable;
CREATE TABLE mytable (
  pk SERIAL PRIMARY KEY,
  name TEXT,
  size DOUBLE PRECISION,
  geom GEOMETRY
);
INSERT INTO mytable (name, size, geom)
VALUES ('Peter', 1.0, 'POINT(2 34)'),
       ('Paul', 2.0, 'POINT(5 67)');

You can convert any row into a JSON structure using the to_jsonb() function.

SELECT to_jsonb(mytable.*) FROM mytable;

 {"pk": 1, "geom": "010100000000000000000000400000000000004140", "name": "Peter", "size": 1}
 {"pk": 2, "geom": "010100000000000000000014400000000000C05040", "name": "Paul", "size": 2}

That’s actually all the information we need to create a GeoJSON feature, it just needs to be re-arranged. So let’s make a little utility function to re-arrange it.

CREATE OR REPLACE FUNCTION rowjsonb_to_geojson(
  rowjsonb JSONB,
  geom_column TEXT DEFAULT 'geom')
RETURNS TEXT AS
$$
DECLARE
  json_props jsonb;
  json_geom jsonb;
  json_type jsonb;
BEGIN
  IF NOT rowjsonb ? geom_column THEN
    RAISE EXCEPTION 'geometry column ''%'' is missing', geom_column;
  END IF;
  json_geom := ST_AsGeoJSON((rowjsonb ->> geom_column)::geometry)::jsonb;
  json_geom := jsonb_build_object('geometry', json_geom);
  json_props := jsonb_build_object('properties', rowjsonb - geom_column);
  json_type := jsonb_build_object('type', 'Feature');
  RETURN (json_type || json_geom || json_props)::text;
END;
$$
LANGUAGE 'plpgsql' IMMUTABLE STRICT;

Voila! Now we can turn any relation into a proper GeoJSON “Feature” with just one(ish) function call.

SELECT rowjsonb_to_geojson(to_jsonb(mytable.*)) FROM mytable;                         

 {"type": "Feature", "geometry": {"type": "Point", "coordinates": [2, 34]}, "properties": {"pk": 1, "name": "Peter", "size": 1}}
 {"type": "Feature", "geometry": {"type": "Point", "coordinates": [5, 67]}, "properties": {"pk": 2, "name": "Paul", "size": 2}}

Postscript

You might be wondering why I made my function take a jsonb input instead of a record, for a perfect row_to_geojson analogue to row_to_json. The answer is that the PL/pgSQL planner caches types, including the materialized types of the record parameter, on the first evaluation, which makes it impossible to use the same function for multiple tables. This is “too bad (tm)”, but fortunately there is an easy workaround: just convert the input to jsonb using to_jsonb() before calling our function.

Daniel Vérité: Text search: a custom dictionary to avoid long words


The full text search is based on transforming the initial text into a tsvector. For example:

test=> select to_tsvector('english', 'This text is being processed.');
     to_tsvector      
----------------------
 'process':5 'text':2

This result is a sorted list of lexemes, with their relative positions in the initial text, obtained by this process:

Raw text => Parser => Dictionaries (configurable) => tsvector

When there is enough data, we tend to index these vectors with a GIN or GIST index to speed up text search queries.

In SQL we can inspect the intermediate results of this process with the ts_debug function:

test=> select * from ts_debug('english', 'This text is being processed.');
   alias   |   description   |   token   |  dictionaries  |  dictionary  |  lexemes  
-----------+-----------------+-----------+----------------+--------------+-----------
 asciiword | Word, all ASCII | This      | {english_stem} | english_stem | {}
 blank     | Space symbols   |           | {}             |              | 
 asciiword | Word, all ASCII | text      | {english_stem} | english_stem | {text}
 blank     | Space symbols   |           | {}             |              | 
 asciiword | Word, all ASCII | is        | {english_stem} | english_stem | {}
 blank     | Space symbols   |           | {}             |              | 
 asciiword | Word, all ASCII | being     | {english_stem} | english_stem | {}
 blank     | Space symbols   |           | {}             |              | 
 asciiword | Word, all ASCII | processed | {english_stem} | english_stem | {process}
 blank     | Space symbols   | .         | {}             |              | 

The parser breaks down the text into tokens (token column), each token being associated with a type (alias and description columns). Then, depending on their types, these tokens are submitted as input to the dictionaries mapped to those types, which may produce one lexeme, several, or zero to eliminate the term from the output vector.

In the above example, spaces and punctuation are eliminated because they are not mapped to any dictionary; common terms (“this”, “is”, “being”) are eliminated as stop words by the english_stem dictionary; “text” is kept untouched, and “processed” is stemmed as “process” by english_stem.

What about undesirable pieces of text?

Sometimes the initial text is not “clean” in the sense that it contains noise that we’d rather not index. For instance, when indexing mail messages, badly formatted messages may have base64 contents that slip into text parts. When these contents correspond to attached files, they can be pretty big. Looking at how this kind of data gets transformed into lexemes, here’s what we can see:

=# \x
=# select * from  ts_debug('english', 'Q29uc2lzdGVuY3kgYWxzbyBtZWFucyB0aGF0IHdoZW4gQWxpY2UgYW5kIEJvYiBhcmUgcnVubmlu');
-[ RECORD 1 ]+-------------------------------------------------------------------------------
alias        | numword
description  | Word, letters and digits
token        | Q29uc2lzdGVuY3kgYWxzbyBtZWFucyB0aGF0IHdoZW4gQWxpY2UgYW5kIEJvYiBhcmUgcnVubmlu
dictionaries | {simple}
dictionary   | simple
lexemes      | {q29uc2lzdgvuy3kgywxzbybtzwfucyb0agf0ihdozw4gqwxpy2ugyw5kiejvyibhcmugcnvubmlu}

So this text is parsed as a single numword token, from which a single long lexeme is produced, since numword tokens are associated with the simple dictionary, which just downcases the word. That brings up a question: how do we avoid that kind of term in the vector?

One idea is to consider that very long tokens are uninteresting for search, and as such they can just be eliminated. Even though there are really long words in some languages, such as the German Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz with its 63 characters(!), we can imagine setting a length limit over which words are ignored, just like stop words are ignored.

Filtering by length

It’s not difficult to create a dictionary to filter out long words. A text search dictionary takes the form of two functions in C, along with a few SQL declarative statements to create it and assign it to a text configuration.

In the PostgreSQL source code, there are several examples of dictionaries that can be used as models:

  • dict_simple, a built-in dictionary, part of core.
  • dict_int, a contrib module to filter out numbers longer than a given number of digits, so very close to what we want in this post.
  • dict_xsyn, a contrib module to add lexemes to the output that are not in the original text but are synonyms of words in the text.
  • unaccent, a contrib module providing first a callable SQL function to remove accents, but also exposing it as a filtering dictionary, which means that the lexemes produced are passed to the next dictionary in the chain.

A new dictionary project can be started by pretty much copy-pasting one of these examples, since much of the code is declarative stuff that you don’t want to write from scratch.

There are two C functions to produce:

  • An INIT function that receives the configuration parameters of the dictionary. We can use this function to accept our maximum length as a parameter, instead of hard-coding it in the source.

  • A LEXIZE function that takes a token’s text as input and needs to produce zero, one or several lexemes corresponding to that piece of text. The function also indicates whether these lexemes are to be passed to the rest of the chain of dictionaries. In our case, we want to eliminate the token if it’s longer than our limit, or pass it through unchanged.

Let’s call this dictionary dictmaxlen and length its parameter. Following the model of contrib modules, we can package its code and declarations in a Postgres extension.

The declarations actually create a dictionary template rather than a dictionary, if we want to use the correct terminology. A dictionary is instantiated from a template with CREATE TEXT SEARCH DICTIONARY (TEMPLATE = ...) with the values for the parameters.

Here are the SQL declarations for the functions and the template:

CREATE FUNCTION dictmaxlen_init(internal)
  RETURNS internal
  AS 'MODULE_PATHNAME'
  LANGUAGE C STRICT;

CREATE FUNCTION dictmaxlen_lexize(internal, internal, internal, internal)
  RETURNS internal
  AS 'MODULE_PATHNAME'
  LANGUAGE C STRICT;

CREATE TEXT SEARCH TEMPLATE dictmaxlen_template (
  LEXIZE = dictmaxlen_lexize,
  INIT = dictmaxlen_init
);

The only specific bit here is the name dictmaxlen, otherwise any other dictionary would have the same declarations.

C functions

Dictionary initialization (called when instantiating)

Datum
dictmaxlen_init(PG_FUNCTION_ARGS)
{
    List       *options = (List *) PG_GETARG_POINTER(0);
    DictMaxLen *d;
    ListCell   *l;

    d = (DictMaxLen *) palloc0(sizeof(DictMaxLen));
    d->maxlen = 50;             /* 50 chars by default */

    foreach(l, options)
    {
        DefElem *defel = (DefElem *) lfirst(l);

        if (strcmp(defel->defname, "length") == 0)
            d->maxlen = atoi(defGetString(defel));
        else
        {
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                     errmsg("unrecognized dictionary parameter: \"%s\"",
                            defel->defname)));
        }
    }
    PG_RETURN_POINTER(d);
}

Generation of lexems

Datum
dictmaxlen_lexize(PG_FUNCTION_ARGS)
{
    DictMaxLen *d = (DictMaxLen *) PG_GETARG_POINTER(0);
    char       *token = (char *) PG_GETARG_POINTER(1);
    int         byte_length = PG_GETARG_INT32(2);

    if (pg_mbstrlen_with_len(token, byte_length) > d->maxlen)
    {
        /*
         * If the word is longer than our limit, return an array without
         * any lexeme.
         */
        TSLexeme *res = palloc0(sizeof(TSLexeme));

        PG_RETURN_POINTER(res);
    }
    else
    {
        /* If the word is short, pass it unmodified */
        PG_RETURN_POINTER(NULL);
    }
}

Encapsulating the code into an extension

As for most contrib modules, it’s convenient to package this code as an extension to distribute and deploy it easily.

Creating an extension is relatively easy with PGXS, which is part of the Postgres toolbox (for Linux distributions, it’s often provided in a development package, such as postgresql-server-dev-11 for Debian).

An extension needs a control file. The following stanza will do the job (filename: dict_maxlen.control):

# dict_maxlen extension
comment = 'text search template for a dictionary filtering out long words'
default_version = '1.0'
module_pathname = '$libdir/dict_maxlen'
relocatable = true

Thanks to PGXS, we can use a simplified Makefile that will transparently include the declarations with the paths and libraries involved in the build. Here is a ready-to-use Makefile that can build our simple extension with make && make install:

EXTENSION = dict_maxlen
EXTVERSION = 1.0
PG_CONFIG = pg_config

MODULE_big = dict_maxlen
OBJS = dict_maxlen.o

DATA = $(wildcard *.sql)

PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

Usage

Once the extension is compiled and installed, we can instantiate it in a database and create the dictionary:

CREATE EXTENSION dict_maxlen;

CREATE TEXT SEARCH DICTIONARY dictmaxlen (
  TEMPLATE = dictmaxlen_template,
  LENGTH = 40 -- for example
);

Now this dictionary needs to be mapped to tokens (the output of the parser), with ALTER TEXT SEARCH CONFIGURATION ... ALTER MAPPING. In the example above we saw that the kind of token produced with the base64 content was numword, but we may also want to map our dictionary to tokens containing only letters: word and asciiword, or to any other of the 23 kinds of tokens that the parser can generate as of Postgres 11.

With psql this mapping can be visualized with \dF+. For english, the defaults are:

postgres=> \dF+ english
Text search configuration "pg_catalog.english"
Parser: "pg_catalog.default"
      Token      | Dictionaries 
-----------------+--------------
 asciihword      | english_stem
 asciiword       | english_stem
 email           | simple
 file            | simple
 float           | simple
 host            | simple
 hword           | english_stem
 hword_asciipart | english_stem
 hword_numpart   | simple
 hword_part      | english_stem
 int             | simple
 numhword        | simple
 numword         | simple
 sfloat          | simple
 uint            | simple
 url             | simple
 url_path        | simple
 version         | simple
 word            | english_stem

To avoid messing up with the built-in english configuration, let’s derive a new specific text search configuration with its own mappings:

CREATE TEXT SEARCH CONFIGURATION mytsconf (COPY = pg_catalog.english);

ALTER TEXT SEARCH CONFIGURATION mytsconf
  ALTER MAPPING FOR asciiword, word
  WITH dictmaxlen, english_stem;

ALTER TEXT SEARCH CONFIGURATION mytsconf
  ALTER MAPPING FOR numword
  WITH dictmaxlen, simple;

Now let’s check the new configuration with psql:

=# \dF+ mytsconf
Text search configuration "public.mytsconf"
Parser: "pg_catalog.default"
      Token      |      Dictionaries      
-----------------+------------------------
 asciihword      | english_stem
 asciiword       | dictmaxlen,english_stem
 email           | simple
 file            | simple
 float           | simple
 host            | simple
 hword           | english_stem
 hword_asciipart | english_stem
 hword_numpart   | simple
 hword_part      | english_stem
 int             | simple
 numhword        | simple
 numword         | dictmaxlen,simple
 sfloat          | simple
 uint            | simple
 url             | simple
 url_path        | simple
 version         | simple
 word            | dictmaxlen,english_stem

And check what happens now to a numword over 40 characters:

=# select to_tsvector('mytsconf', 'A long word: Q29uc2lzdGVuY3kgYWxzbyBtZWFucyB0aGF0IHdoZW4');
    to_tsvector
-------------------
 'long':2 'word':3

Et voilà! The undesirable token has been left out, as expected.

We may use this mytsconf configuration by passing it as an explicit first argument to text search functions, but it could also be set as a default.

For the duration of the session:

SET default_text_search_config TO 'mytsconf';

For the database (durably):

ALTER DATABASE dbname SET default_text_search_config TO 'mytsconf';

The source code for this example is available at github.

Ibrar Ahmed: PostgreSQL: Access ClickHouse, One of the Fastest Column DBMSs, With clickhousedb_fdw



Database management systems are meant to house data but, occasionally, they may need to talk with another DBMS, for example to access an external server which may be hosting a different DBMS. With heterogeneous environments becoming more and more common, a bridge between the servers is established. We call this bridge a “Foreign Data Wrapper” (FDW). PostgreSQL completed its support of SQL/MED (SQL Management of External Data) with release 9.3 in 2013. A foreign data wrapper is a shared library that is loaded by a PostgreSQL server. It enables the creation of foreign tables in PostgreSQL that act as proxies for another data source.

When you query a foreign table, Postgres passes the request to the associated foreign data wrapper. The FDW creates the connection and retrieves or updates the data in the external data store. Since the PostgreSQL planner is involved in this process as well, it may perform certain operations, like aggregates or joins, on the data retrieved from the data source. I cover some of these later in this post.

ClickHouse Database

ClickHouse is an open source, column-based database management system which claims to be 100–1,000x faster than traditional approaches, capable of processing more than a billion rows in less than a second.

clickhousedb_fdw

clickhousedb_fdw is an open source project – GPLv2 licensed – from Percona. Here’s the link for GitHub project repository:

https://github.com/Percona-Lab/clickhousedb_fdw

It is an FDW for ClickHouse that allows you to SELECT from, and INSERT INTO, a ClickHouse database from within a PostgreSQL v11 server.

The FDW supports advanced features like aggregate pushdown and joins pushdown. These significantly improve performance by utilizing the remote server’s resources for these resource intensive operations.

If you would like to follow this post and try the FDW between Postgres and ClickHouse, you can download and set up the ontime dataset for ClickHouse. After following the instructions, test that you have the desired data. The ClickHouse client is a CLI for the ClickHouse database.

Prepare Data for ClickHouse

Now that the data is ready in ClickHouse, the next step is to set up PostgreSQL. We need to create a ClickHouse foreign server, user mapping, and foreign tables.

Install the clickhousedb_fdw extension

There are manual ways to install clickhousedb_fdw, but it also uses PostgreSQL’s coolest extension install feature: by just entering a SQL command you can use the extension:

CREATE EXTENSION clickhousedb_fdw;

CREATE SERVER clickhouse_svr FOREIGN DATA WRAPPER clickhousedb_fdw
OPTIONS(dbname 'test_database', driver '/usr/lib/libclickhouseodbc.so');

CREATE USER MAPPING FOR CURRENT_USER SERVER clickhouse_svr;

CREATE FOREIGN TABLE clickhouse_tbl_ontime (  "Year" Int,  "Quarter" Int8,  "Month" Int8,  "DayofMonth" Int8,  "DayOfWeek" Int8,  "FlightDate" Date,  "UniqueCarrier" Varchar(7),  "AirlineID" Int,  "Carrier" Varchar(2),  "TailNum" text,  "FlightNum" text,  "OriginAirportID" Int,  "OriginAirportSeqID" Int,  "OriginCityMarketID" Int,  "Origin" Varchar(5),  "OriginCityName" text,  "OriginState" Varchar(2),  "OriginStateFips" text,  "OriginStateName" text,  "OriginWac" Int,  "DestAirportID" Int,  "DestAirportSeqID" Int,  "DestCityMarketID" Int,  "Dest" Varchar(5),  "DestCityName" text,  "DestState" Varchar(2),  "DestStateFips" text,  "DestStateName" text,  "DestWac" Int,  "CRSDepTime" Int,  "DepTime" Int,  "DepDelay" Int,  "DepDelayMinutes" Int,  "DepDel15" Int,  "DepartureDelayGroups" text,  "DepTimeBlk" text,  "TaxiOut" Int,  "WheelsOff" Int,  "WheelsOn" Int,  "TaxiIn" Int,  "CRSArrTime" Int,  "ArrTime" Int,  "ArrDelay" Int,  "ArrDelayMinutes" Int,  "ArrDel15" Int,  "ArrivalDelayGroups" Int,  "ArrTimeBlk" text,  "Cancelled" Int8,  "CancellationCode" Varchar(1),  "Diverted" Int8,  "CRSElapsedTime" Int,  "ActualElapsedTime" Int,  "AirTime" Int,  "Flights" Int,  "Distance" Int,  "DistanceGroup" Int8,  "CarrierDelay" Int,  "WeatherDelay" Int,  "NASDelay" Int,  "SecurityDelay" Int,  "LateAircraftDelay" Int,  "FirstDepTime" text,  "TotalAddGTime" text,  "LongestAddGTime" text,  "DivAirportLandings" text,  "DivReachedDest" text,  "DivActualElapsedTime" text,  "DivArrDelay" text,  "DivDistance" text,  "Div1Airport" text,  "Div1AirportID" Int,  "Div1AirportSeqID" Int,  "Div1WheelsOn" text,  "Div1TotalGTime" text,  "Div1LongestGTime" text,  "Div1WheelsOff" text,  "Div1TailNum" text,  "Div2Airport" text,  "Div2AirportID" Int,  "Div2AirportSeqID" Int,  "Div2WheelsOn" text,  "Div2TotalGTime" text,  "Div2LongestGTime" text,"Div2WheelsOff" text,  "Div2TailNum" text,  "Div3Airport" text,  "Div3AirportID" Int,  "Div3AirportSeqID" Int,  "Div3WheelsOn" text,  "Div3TotalGTime" text,  "Div3LongestGTime" text,  "Div3WheelsOff" text,  "Div3TailNum" text,  "Div4Airport" text,  "Div4AirportID" Int,  "Div4AirportSeqID" Int,  "Div4WheelsOn" text,  "Div4TotalGTime" text,  "Div4LongestGTime" text,  "Div4WheelsOff" text,  "Div4TailNum" text,  "Div5Airport" text,  "Div5AirportID" Int,  "Div5AirportSeqID" Int,  "Div5WheelsOn" text,  "Div5TotalGTime" text,  "Div5LongestGTime" text,  "Div5WheelsOff" text,  "Div5TailNum" text) server clickhouse_svr options(table_name 'ontime');

postgres=# SELECT a."Year", c1/c2 as Value FROM ( select "Year", count(*)*1000 as c1          
           FROM clickhouse_tbl_ontime          
           WHERE "DepDelay">10 GROUP BY "Year") a                        
           INNER JOIN (select "Year", count(*) as c2 from clickhouse_tbl_ontime          
           GROUP BY "Year" ) b on a."Year"=b."Year" LIMIT 3;
Year |   value    
------+------------
1987 |        199
1988 | 5202096000
1989 | 5041199000
(3 rows)

Performance Features

PostgreSQL has improved foreign data wrapper processing by adding the pushdown feature. Pushdown improves performance significantly, as the processing of data takes place earlier in the processing chain. Pushdown abilities include:

  • Operator and function Pushdown
  • Predicate Pushdown
  • Aggregate Pushdown
  • Join Pushdown

Operator and function Pushdown

Functions and operators are sent to ClickHouse instead of being calculated and filtered at the PostgreSQL end.

postgres=# EXPLAIN VERBOSE SELECT avg("DepDelay") FROM clickhouse_tbl_ontime WHERE "DepDelay" <10; 
           Foreign Scan  (cost=1.00..-1.00 rows=1000 width=32) Output: (avg("DepDelay"))  
           Relations: Aggregate on (clickhouse_tbl_ontime)  
           Remote SQL: SELECT avg("DepDelay") FROM "default".ontime WHERE (("DepDelay" < 10))(4 rows)

Predicate Pushdown

Instead of filtering the data at the PostgreSQL end, clickhousedb_fdw sends the predicate to the ClickHouse database.

postgres=# EXPLAIN VERBOSE SELECT "Year" FROM clickhouse_tbl_ontime WHERE "Year"=1989;                                  
           Foreign Scan on public.clickhouse_tbl_ontime  Output: "Year"  
           Remote SQL: SELECT "Year" FROM "default".ontime WHERE (("Year" = 1989)

Aggregate Pushdown

Aggregate pushdown is a new feature of PostgreSQL FDWs. There are currently very few foreign data wrappers that support aggregate pushdown – clickhousedb_fdw is one of them. The planner decides which aggregates are pushed down and which aren’t. Here is an example:

postgres=# EXPLAIN VERBOSE SELECT count(*) FROM clickhouse_tbl_ontime;
          Foreign Scan (cost=1.00..-1.00 rows=1000 width=8)
          Output: (count(*)) Relations: Aggregate on (clickhouse_tbl_ontime)
          Remote SQL: SELECT count(*) FROM "default".ontime

Join Pushdown

Again, this is a new feature in PostgreSQL FDW, and our clickhousedb_fdw also supports join push down. Here’s an example of that.

postgres=# EXPLAIN VERBOSE SELECT a."Year"
                           FROM clickhouse_tbl_ontime a
                           LEFT JOIN clickhouse_tbl_ontime b ON a."Year" = b."Year";
        Foreign Scan (cost=1.00..-1.00 rows=1000 width=50);
        Output: a."Year" Relations: (clickhouse_tbl_ontime a) LEFT JOIN (clickhouse_tbl_ontime b)
        Remote SQL: SELECT r1."Year" FROM "default".ontime r1 ALL LEFT JOIN "default".ontime r2 ON (((r1."Year" = r2."Year")))

Percona’s support for PostgreSQL

As part of our commitment to being unbiased champions of the open source database eco-system, Percona offers support for PostgreSQL – you can read more about that here. And as you can see, as part of our support commitment, we’re now developing our own open source PostgreSQL projects such as the clickhousedb_fdw. Subscribe to the blog to be amongst the first to know of PostgreSQL and other open source projects from Percona.

As an author of the new clickhousedb_fdw – as well as other FDWs – I’d be really happy to hear about your use cases and your experience of using this feature.



Andreas 'ads' Scherbaum: PostgreSQL Europe Community User Group Recognition Guidelines


Over the past months, a great number of PostgreSQL User Groups and Meetups showed up all over Europe. It’s good to see that interest in PostgreSQL is growing!

Some of the user groups approached the PostgreSQL Europe board, and asked for support. Mostly for swag, but also for sending speakers, or other kind of support. We are happy to help!

In order to handle all of these requests, the PostgreSQL Europe board created a set of guidelines for user group meetings. The current version can be found on the PostgreSQL Europe website, under “Community”, and then “Community User Group Recognition Guidelines”. User groups which approach the PostgreSQL Europe board for support are expected to comply with these guidelines. Every user group is self-certified under these guidelines. If you have reason to believe that a self-certified status for a user group is not correct, please contact the PostgreSQL Europe board under “Contact”.

Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 12 – REINDEX CONCURRENTLY

On 29th of March 2019, Peter Eisentraut committed patch: REINDEX CONCURRENTLY   This adds the CONCURRENTLY option to the REINDEX command. A REINDEX CONCURRENTLY on a specific index creates a new index (like CREATE INDEX CONCURRENTLY), then renames the old index away and the new index in place and adjusts the dependencies, and then drops … Continue reading "Waiting for PostgreSQL 12 – REINDEX CONCURRENTLY"

Craig Kerstiens: A health checkup playbook for your Postgres database


I talk with a lot of folks that set their database up, start working with it, and then are surprised by issues that suddenly crop up out of nowhere. The reality is, so many don’t want to have to be a DBA; instead, you would rather build features and just have the database work. But the truth is that a database is a living, breathing thing. As the data itself changes, the right way to query it changes, and so does the way the database behaves. Making sure your database is healthy and performing at its maximum level doesn’t require a giant overhaul constantly. In fact, you can probably view it much like how you approach personal health. Regular check-ups allow you to make small but important adjustments without having to make dramatic, life-altering changes to keep you on the right path.

After years of running and managing literally millions of Postgres databases, here’s my breakdown of what your regular Postgres health check should look like. Consider running this on a monthly basis so you can make small tweaks and adjustments and avoid drastic changes.

Cache rules everything around me

For many applications, not all the data is accessed all the time. Instead, certain datasets are accessed for some period of time, and then the data you’re accessing changes. Postgres is in fact quite good at keeping frequently accessed data in memory.

Your cache hit ratio tells you how often your data is served from memory vs. having to go to disk. Serving from memory vs. going to disk will be orders of magnitude faster, thus the more you can keep in memory the better. Of course you could provision an instance with as much memory as you have data, but you don’t necessarily have to. Instead, watching your cache hit ratio and ensuring it is at 99% is a good metric for proper performance.

You can monitor your cache hit ratio with:

SELECT
  sum(heap_blks_read) as heap_read,
  sum(heap_blks_hit) as heap_hit,
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM pg_statio_user_tables;

Be careful of dead tuples

Under the covers Postgres is essentially a giant append only log. When you write data it appends to the log, when you update data it marks the old record as invalid and writes a new one, when you delete data it just marks it invalid. Later Postgres comes through and vacuums those dead records (also known as tuples).

All those unvacuumed dead tuples are what is known as bloat. Bloat can slow down other writes and create other issues. Paying attention to your bloat and when it is getting out of hand can be key for tuning vacuum on your database.
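Before running the full bloat estimation query below, a quick first look at which tables are accumulating dead tuples is available from the statistics collector (a minimal sketch; what counts as “too many” depends on your workload):

SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;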

WITH constants AS (
  SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 4 AS ma
), bloat_info AS (
  SELECT
    ma, bs, schemaname, tablename,
    (datawidth + (hdr + ma - (case when hdr % ma = 0 THEN ma ELSE hdr % ma END)))::numeric AS datahdr,
    (maxfracsum * (nullhdr + ma - (case when nullhdr % ma = 0 THEN ma ELSE nullhdr % ma END))) AS nullhdr2
  FROM (
    SELECT
      schemaname, tablename, hdr, ma, bs,
      SUM((1 - null_frac) * avg_width) AS datawidth,
      MAX(null_frac) AS maxfracsum,
      hdr + (
        SELECT 1 + count(*) / 8
        FROM pg_stats s2
        WHERE null_frac <> 0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
      ) AS nullhdr
    FROM pg_stats s, constants
    GROUP BY 1, 2, 3, 4, 5
  ) AS foo
), table_bloat AS (
  SELECT
    schemaname, tablename, cc.relpages, bs,
    CEIL((cc.reltuples * ((datahdr + ma -
      (CASE WHEN datahdr % ma = 0 THEN ma ELSE datahdr % ma END)) + nullhdr2 + 4)) / (bs - 20::float)) AS otta
  FROM bloat_info
  JOIN pg_class cc ON cc.relname = bloat_info.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = bloat_info.schemaname AND nn.nspname <> 'information_schema'
), index_bloat AS (
  SELECT
    schemaname, tablename, bs,
    COALESCE(c2.relname, '?') AS iname, COALESCE(c2.reltuples, 0) AS ituples, COALESCE(c2.relpages, 0) AS ipages,
    COALESCE(CEIL((c2.reltuples * (datahdr - 12)) / (bs - 20::float)), 0) AS iotta -- very rough approximation, assumes all cols
  FROM bloat_info
  JOIN pg_class cc ON cc.relname = bloat_info.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = bloat_info.schemaname AND nn.nspname <> 'information_schema'
  JOIN pg_index i ON indrelid = cc.oid
  JOIN pg_class c2 ON c2.oid = i.indexrelid
)
SELECT
  type, schemaname, object_name, bloat, pg_size_pretty(raw_waste) as waste
FROM (
  SELECT
    'table' as type,
    schemaname,
    tablename as object_name,
    ROUND(CASE WHEN otta = 0 THEN 0.0 ELSE table_bloat.relpages / otta::numeric END, 1) AS bloat,
    CASE WHEN relpages < otta THEN '0' ELSE (bs * (table_bloat.relpages - otta)::bigint)::bigint END AS raw_waste
  FROM table_bloat
  UNION
  SELECT
    'index' as type,
    schemaname,
    tablename || '::' || iname as object_name,
    ROUND(CASE WHEN iotta = 0 OR ipages = 0 THEN 0.0 ELSE ipages / iotta::numeric END, 1) AS bloat,
    CASE WHEN ipages < iotta THEN '0' ELSE (bs * (ipages - iotta))::bigint END AS raw_waste
  FROM index_bloat
) bloat_summary
ORDER BY raw_waste DESC, bloat DESC

Query courtesy of Heroku’s pg-extras

Over optimizing is a thing

We always want our database to be performant, so in order to do that we keep things in memory/cache (see earlier) and we index things so we don’t have to scan everything on disk. But there is a trade-off when it comes to indexing your database. Each index the system has to maintain will slow down your write throughput on the database. This is fine when you do need to speed up queries, as long as they’re being utilized. If you added an index years ago, but something within your application changed and you no longer need it, it’s best to remove it.

Postgres makes it simple to query for unused indexes, so you can easily give yourself back some performance by removing them:

SELECT
  schemaname || '.' || relname AS table,
  indexrelname AS index,
  pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
  idx_scan as index_scans
FROM pg_stat_user_indexes ui
JOIN pg_index i ON ui.indexrelid = i.indexrelid
WHERE NOT indisunique AND idx_scan < 50 AND pg_relation_size(relid) > 5 * 8192
ORDER BY pg_relation_size(i.indexrelid) / nullif(idx_scan, 0) DESC NULLS FIRST,
  pg_relation_size(i.indexrelid) DESC;

Check in on your query performance

In an earlier blog post we talked about how useful pg_stat_statements is for monitoring your database query performance. It records a lot of valuable stats about which queries are run, how fast they return, how many times they’re run, etc. Checking in on this set of queries regularly can tell you where it is best to add indexes or optimize your application so your query calls may not be so excessive.

Thanks to a HN commenter on our earlier post we have a great query that is easy to tweak to show different views based on all that data:

SELECT query,
       calls,
       total_time,
       total_time / calls as time_per,
       stddev_time,
       rows,
       rows / calls as rows_per,
       100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
WHERE query not similar to '%pg_%'
and calls > 500
--ORDER BY calls
--ORDER BY total_time
order by time_per
--ORDER BY rows_per
DESC
LIMIT 20;

An apple a day…

A regular health check for your database saves you a lot of time in the long run. It allows you to gradually maintain and improve things without huge re-writes. I’d personally recommend a monthly or bimonthly check-in on all of the above to ensure things are in a good state.

Andrew Dunstan: Where and when you need a root.crt file


This is something people seem to get confused about quite often. A root.crt file is used to validate a TLS (a.k.a. SSL) certificate presented by the other end of a connection. It is usually the public certificate of the Certificate Authority (CA) that signed the presented certificate, and is used to validate that signature. If a non-root CA was used to sign the other end’s TLS certificate, the root.crt file must contain at least the root of the CA chain, and enough other elements of the chain that together with the certificate can connect the root to the signing CA.

In the simple and most common case where client certificates are not being used, only the client needs a root.crt file, to validate the server’s TLS certificate, if using 'verify-ca' or 'verify-full' ssl mode. The server doesn’t need and can’t use a root.crt file when client certificates are not being used.

On the other hand, if you are using client certificates, the server will also need a root.crt file to validate the client certificates. There is no requirement that the same root.crt be used for both sides. It would be perfectly possible for the server’s certificate to be signed by one CA and the client certificates by another.

If more than one CA is used in a certain context, i.e. if the client connects to servers with certificates signed by more than one CA, or if the server accepts connections from clients with certificates signed by more than one CA, then the certificates of all the CAs can be placed in the root.crt file, one after the other. The connection will succeed as long as one of the certificates (or certificate chains) in the file is that of the relevant signing authority.
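As an illustration (the host name and file paths here are hypothetical, not from the original post), a client that wants to validate the server's certificate points libpq at its root.crt and uses a verifying SSL mode:

psql "host=db.example.com dbname=appdb user=app sslmode=verify-full sslrootcert=/path/to/root.crt"

On the server side, a root.crt for validating client certificates only comes into play when client certificates are actually in use, configured via ssl_ca_file in postgresql.conf, for example ssl_ca_file = 'root.crt'.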


Raghavendra Rao: Install PL/Java 1.5.2 in PostgreSQL 11

PostgreSQL 11 includes several procedural languages with the base distribution: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. In addition, there are a number of procedural languages that are developed and maintained outside the core PostgreSQL Distribution like PL/Java (Java), PL/Lua (Lua), PL/R (R), PL/sh (Unix Shell), and PL/v8 (JavaScript). In this post, we are going to see...

Devrim GÜNDÜZ: End of the naming game: The PostgreSQL project changes its name

I started using PostgreSQL around September 1998. The first problem I had was pronouncing it, and even putting the right capital letters in the right places.

Was it PostGreSQL? PostgresSQL? PoStGreSQL? PostgreySQL?

Recently Craig also mentioned the same problem.

Starting today, the PostgreSQL Global Development Group (abbreviated as PGDG) announced that the project will be written as PostgresQL. This will solve the problems (hopefully), and will also help us to drop the "QL" in 2024. Starting with v12, all packages will also "provide" postgresXY as the package name, for a smooth change in 2024. Meanwhile, as of today, the project will accept "Postgre" as an alias for those who did not want to learn the name of the software they are using. I heard rumours that they also say "Orac" or "SQ Serv" or "MyS", so we will now be free to drop the SQL in our name, too.

Thanks to everyone who made this change. This was a real blocker for the community, and it will also help newbies in the PostgreSQL Facebook group -- they will now be free to use "Postgre" from now on.

Regina Obe: SQL Server on Linux

Sebastian Insausti: How to Deploy Highly Available PostgreSQL with Single Endpoint for WordPress


WordPress is open source software you can use to create your website, blog, or application. There are many designs and features/plugins to add to your WordPress installation. WordPress is free software; however, there are many commercial plugins to improve it, depending on your requirements.

WordPress makes it easy for you to manage your content and it’s really flexible. Create drafts, schedule publication, and look at your post revisions. Make your content public or private, and secure posts and pages with a password.

To run WordPress you should have at least PHP version 5.2.4+, MySQL version 5.0+ (or MariaDB), and Apache or Nginx. Some of these versions have reached EOL and you may expose your site to security vulnerabilities, so you should install the latest version available according to your environment.

As we can see, currently WordPress only supports the MySQL and MariaDB database engines. WPPG is a plugin based on the PG4WP plugin that gives you the possibility to install and use WordPress with a PostgreSQL database as a backend. It works by replacing calls to MySQL-specific functions with generic calls that map them to other database functions, and by rewriting SQL queries on the fly when needed.

For this blog, we'll install one application server with WordPress 5.1.1 and HAProxy 1.5.18 on the same server, and two PostgreSQL 11 database nodes (master-standby). The operating system on all nodes will be CentOS 7. For the database and load balancer deployment we'll use the ClusterControl system.

This is a basic environment. You can improve it by adding more high availability features as you can see here. So, let’s start.

Database Deployment

First, we need to install our PostgreSQL database. For this, we’ll assume you have ClusterControl installed.

To perform a deployment from ClusterControl, simply select the option “Deploy” and follow the instructions that appear.

When selecting PostgreSQL, we must specify User, Key or Password and port to connect by SSH to our servers. We also need a name for our new cluster and if we want ClusterControl to install the corresponding software and configurations for us.

After setting up the SSH access information, we must define the database user, version and datadir (optional). We can also specify which repository to use.

In the next step, we need to add our servers to the cluster that we are going to create.

When adding our servers, we can enter IP or hostname.

In the last step, we can choose if our replication will be Synchronous or Asynchronous.

We can monitor the status of the creation of our new cluster from the ClusterControl activity monitor.

Once the task is finished, we can see our cluster in the main ClusterControl screen.

Once we have our cluster created, we can perform several tasks on it, like adding a load balancer (HAProxy) or a new replica.


Load Balancer Deployment

To perform a load balancer deployment, in this case, HAProxy, select the option “Add Load Balancer” in the cluster actions and fill the asked information.

We only need to add IP/Name, port, policy and the nodes we are going to use. By default, HAProxy is configured by ClusterControl with two different ports, one read-write and one read-only. In the read-write port, only the master is UP. In case of failure, ClusterControl will promote the most advanced slave and it’ll change the HAProxy configuration to enable the new master and disable the old one. In this way, we’ll have automatic failover in case of failure.

If we followed the previous steps, we should have the following topology:

So, we have a single endpoint created in the Application Server with HAProxy. Now, we can use this endpoint in the application as a localhost connection.

WordPress Installation

Let’s install WordPress on our Application Server and configure it to connect to the PostgreSQL database by using the local HAProxy port 3307.

First, install the packages required on the Application Server.

$ yum install httpd php php-mysql php-pgsql postgresql
$ systemctl start httpd && systemctl enable httpd

Download the latest WordPress version and move it to the apache document root.

$ wget https://wordpress.org/latest.tar.gz
$ tar zxf latest.tar.gz
$ mv wordpress /var/www/html/

Download the WPPG plugin and move it into the wordpress plugins directory.

$ wget https://downloads.wordpress.org/plugin/wppg.1.0.1.zip
$ unzip wppg.1.0.1.zip
$ mv wppg /var/www/html/wordpress/wp-content/plugins/

Copy the db.php file to the wp-content directory. Then, edit it and change the 'PG4WP_ROOT' path:

$ cp /var/www/html/wordpress/wp-content/plugins/wppg/pg4wp/db.php /var/www/html/wordpress/wp-content/
$ vi /var/www/html/wordpress/wp-content/db.php
define( 'PG4WP_ROOT', ABSPATH.'wp-content/plugins/wppg/pg4wp');

Rename the wp-config.php and change the database information:

$ mv /var/www/html/wordpress/wp-config-sample.php /var/www/html/wordpress/wp-config.php
$ vi /var/www/html/wordpress/wp-config.php
define( 'DB_NAME', 'wordpressdb' );
define( 'DB_USER', 'wordpress' );
define( 'DB_PASSWORD', 'wpPassword' );
define( 'DB_HOST', 'localhost:3307' );

Then, we need to create the database and the application user in the PostgreSQL database. On the master node:

$ postgres=# CREATE DATABASE wordpressdb;
CREATE DATABASE
$ postgres=# CREATE USER wordpress WITH PASSWORD 'wpPassword';
CREATE ROLE
$ postgres=# GRANT ALL PRIVILEGES ON DATABASE wordpressdb TO wordpress;
GRANT

And edit the pg_hba.conf file to allow the connection from the Application Server.

$ vi /var/lib/pgsql/11/data/pg_hba.conf
host  all  all  192.168.100.153/24  md5
$ systemctl reload postgresql-11

Make sure you can access it from the Application Server:

$ psql -hlocalhost -p3307 -Uwordpress wordpressdb
Password for user wordpress:
psql (9.2.24, server 11.2)
WARNING: psql version 9.2, server version 11.0.
         Some psql features might not work.
Type "help" for help.
wordpressdb=>

Now, go to the install.php in the web browser, in our case, the IP Address for the Application Server is 192.168.100.153, so, we go to:

http://192.168.100.153/wordpress/wp-admin/install.php

Add the Site Title, Username and Password to access the admin section, and your email address.

Finally, go to Plugins -> Installed Plugins and activate the WPPG plugin.

Conclusion

Now, we have WordPress running with PostgreSQL by using a single endpoint. We can monitor our cluster activity on ClusterControl checking the different metrics, dashboards or many performance and management features.

There are different ways to implement WordPress with PostgreSQL. It could be by using a different plugin, or by installing WordPress as usual and adding the plugin later, but in any case, as we mentioned, PostgreSQL is not officially supported by WordPress, so we must perform an exhaustive testing process if we want to use this topology in production.

Koichi Suzuki: Postgres-XL and global MVCC


Back to the PG

I’m very excited to become a 2ndQuadrant member. I was involved in PostgreSQL activities in the NTT group (a leading Japanese ICT company, see here and here), including log shipping replication and PostgreSQL scale-out solutions such as Postgres-XC and Postgres-XL.

At NTT I had several chances to work very closely with 2ndQuadrant. After three years of involvement in deep learning and the use of accelerators in various applications, I’m now back in the PostgreSQL world. And I’m still very interested in PostgreSQL scale-out solutions and in applying PostgreSQL to large scale analytic workloads as well as stream analytics.

Here, I’d like to begin with a discussion of a scaled-out, parallel, distributed database with full transaction capabilities, including atomic visibility.

ACID property, distributed database and atomic visibility

Traditionally, providing the full ACID properties is the responsibility of each database server. In scale-out solutions, the database consists of multiple servers, and the two-phase commit protocol (2PC) is used to provide ACID properties when updating multiple servers.

Although 2PC provides write consistency among multiple servers, we should note that it does not provide atomic visibility, which ensures that a global transaction’s updates become visible to all other transactions at the same time. In the two-phase commit protocol, each server receives “COMMIT” at a different point in time, and this can make a partial transaction update visible to others.

How Postgres-XL works, global transaction management

Postgres-XL provides atomic visibility of global transactions, and this is essentially what the GTM (global transaction manager) does. The GTM helps share the snapshot among all the transactions so that any such partial COMMIT is not visible until all the COMMITs are successful and the GTM updates the snapshot. Note that the GTM-proxy is used to reduce the interaction between each server and the GTM by copying the same snapshot to different local transactions.

This is very similar to what standalone PostgreSQL does with MVCC. It is somewhat centralized and limits XL’s ability to run in a geographically distributed environment. For example, suppose we have a distributed database with servers in Europe and the US, and the GTM in London. In such an environment, servers in the US cannot neglect the latency between the servers and the GTM. Even in a local environment, the number of transactions per second is limited by the maximum number of interactions per second over the network.

Towards full distribution

My ongoing interest is whether we can make globally atomic reads really distributed, in other words, without any centralized facility. This is very important for configuring geo-distributed parallel databases and improving their performance limits. I’d like to continue this topic in a series of blog posts for discussion.

 

Tatsuo Ishii: Statement level load balancing

In the previous article I wrote about one of the new features of the upcoming Pgpool-II 4.1.
This time I would like to introduce the "statement level load balancing" feature of 4.1.

Pgpool-II can distribute read queries among PostgreSQL backend nodes. This makes it possible to design a scale-out cluster using PostgreSQL. The particular database node used for distributing read queries is determined at the session level: when a client connects to Pgpool-II. This is the so-called "session level load balancing". For example, if a client connects to Pgpool-II and the load balance node is node 1 (we assume that this is a streaming replication standby node), then any read query will be distributed to the primary (master) node and the load balance node (in this case node 1, the standby node). The distribution ratio is determined by the "backend weight" parameters in the Pgpool-II configuration file (usually named "pgpool.conf"), typically "backend_weight0" and "backend_weight1", corresponding to node 0 and node 1 respectively.

This is fine as long as clients connect to Pgpool-II, issue some queries, and disconnect, since the next time a client connects to Pgpool-II, a different load balance node may be chosen according to the backend weight parameters.

However, if your client already has a connection pooling feature, this approach (session level load balancing) might be a problem, since the selection of the load balance node is performed only once, when the connection pool from the client to Pgpool-II is created.

The statement level load balancing feature was created to solve this problem. Unlike session level load balancing, the load balance node is determined each time a new query is issued. The new parameter for this is "statement_level_load_balance". If it is set to on, the feature is enabled (the parameter can be changed by reloading pgpool.conf), as in the sketch below.
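A minimal pgpool.conf sketch (the weights are just example values, not taken from the article) might look like this:

statement_level_load_balance = on
backend_weight0 = 0.5
backend_weight1 = 0.5

After editing, reload Pgpool-II so the new setting takes effect.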

At first "select_cnt" is 0, which means no SELECTs were issued.

test=# show pool_nodes;

node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | /tmp | 11002 | up | 0.500000 | primary | 0 | true | 0 | 2019-04-02 15:36:58
1 | /tmp | 11003 | up | 0.500000 | standby | 0 | false | 0 | 2019-04-02 15:36:58
(2 rows)


Let's issue a SELECT.
 
test=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | /tmp | 11002 | up | 0.500000 | primary | 1 | true | 0 | 2019-04-02 15:36:58
1 | /tmp | 11003 | up | 0.500000 | standby | 0 | false | 0 | 2019-04-02 15:36:58
(2 rows)


Now the select_cnt of node 0 is 1, which means the SELECT was sent to node 0. Also, please note that the "load_balance_node" column of node 0 is "true", which means node 0 was chosen as the load balance node for the last query.

Ok, let's issue another SELECT:

test=# select 2;
?column?
----------
2
(1 row)

test=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | /tmp | 11002 | up | 0.500000 | primary | 1 | false | 0 | 2019-04-02 15:36:58
1 | /tmp | 11003 | up | 0.500000 | standby | 1 | true | 0 | 2019-04-02 15:36:58
(2 rows)


Now the load_balance_node is changed to node 1, and the select_cnt of node 1 becomes 1. This is how the statement level load balancing works.

Alexander Sosna: HowTo: Central and semantic logging for PostgreSQL

Today it is no longer necessary to argue why central logging makes sense or is even necessary. Most medium-sized companies now have a central logging system or are just introducing it. Once the infrastructure has been created, it must be used sensibly and efficiently! Especially as an...

Dave Conlin: Postgres indexes for absolute beginners


Indexes are really important for Postgres performance, but they’re often misunderstood and misapplied. This post aims to give you a good grounding in indexes to avoid a lot of beginner mistakes.

Step one: understand what you want to achieve

Because indexes are such a powerful tool, a new index is often viewed as “the answer” to whatever performance problems people are experiencing. Wading straight in and creating an index for every sequential scan in sight is the simplest thing to do, but indexes have costs as well as benefits.

Not only do indexes take up memory, they raise the cost of writing to the table in question. Any speed-up an index may provide for reads isn’t free — it’s offset by more work to keep the index up to date when the data in the table change. So an unused index isn’t just useless — it’s actively harmful to your database’s performance.

First, take the time to understand which bits of your query are running slowly (use the query plan), make a hypothesis as to why they’re slow, and then validate that hypothesis by attempting to speed them up.
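
For example, before deciding on an index you might look at the plan for the slow part of the query; a hedged sketch (the table and filter are made up):

EXPLAIN (ANALYZE, BUFFERS)
SELECT title
FROM books
WHERE author = 'Ursula K. Le Guin';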

In order to understand when the answer might be an index, it’s important to understand the difference between sequential scans and index scans in Postgres.

Sequential scans

Sequential scans are the simplest, most obvious way of reading data from a table. Postgres jumps to the first block (“page”) that holds rows for the table in question and reads in all the data, row by row, page by page, and passes it on.

Sequential scans can get a bit of a bad rap. One of the things we often hear from people when we ask them about their current performance analysis is “the first thing I do is look for sequential scans”.

It’s true that using an index on a table can make a big difference to query performance, but it’s also true that if your query just needs to get all of the data from a table in an unordered mass of rows, then things aren’t going to get any more efficient than just reading those rows in directly from consecutive pages.

Index scans

An index is just a data structure which holds references to rows in a table, stored in a way that makes it easy to find them based on the row’s values in the indexed column(s).

An index scan reads through the index, using it to quickly look up which entries match the conditions it’s looking for, and returns them in the order they’re stored in the index.

Postgres then follows these references to look up the data in the rows from the table in the heap, where it would have found them if it had done a sequential scan.

When indexes are useful — and when they’re not

This two-step process is why, even if the planner could use an index, it doesn’t always choose to. Sure, an index saves you loads of time if it cuts out an expensive sort, or allows you to only read two rows from the table instead of two million. But if you want most of the rows from a table in no particular order, then using an index just introduces an unnecessary extra step and makes Postgres read the pages the table is stored in out of order.

Mmm… lychees. Photo by Nik Ramzi Nik Hassan

Think of it like reading a book. If you want to read the whole thing, then the most efficient choice is to open it at the first page, start reading, and stop when you reach the end — a sequential scan. If you only want to read what the author has to say about lychees, then the best thing to do is to look up “lychee” in the book’s index, then turn to the pages it lists¹.

There are two common cases when indexes can make a big difference to query performance:

The first case is when a query is looking for a small subset of the rows in the indexed table. This can happen when you filter the rows using a WHERE clause, or when you are performing a JOIN on a relatively small number of rows. In these situations, Postgres can use the index to quickly find the row(s) it’s looking for, then read them in.

The second case is when the query needs rows from the table in the order that the index has stored them in — or the reverse. Postgres can scan through the index, collecting references to the rows it needs in the order it needs them in, so that when it reads the rows they will already be sorted.
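
As a sketch, using the hypothetical books table from the examples further down, these two queries illustrate the cases above:

-- case 1: a selective filter; an index on author lets Postgres fetch just these rows
SELECT * FROM books WHERE author = 'Terry Pratchett';

-- case 2: an ordering an index can supply directly, avoiding a separate sort
SELECT * FROM books ORDER BY total_pages DESC LIMIT 10;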

In pgMustard, we recommend looking at adding an index when we see either a sequential scan which discards a high proportion of the rows it reads, or a sort. In both cases, though, we only suggest improvements to the longer-running operations in the query — it’s important to focus on the parts of the query that will make a real difference to the overall running time.

Different index types

The most common index type by far, and the Postgres default, is the B-tree (a balanced tree, not a binary tree). There’s a reason for this, and it’s the same as the reason that 90% of the guides, tips and advice out there focus on B-trees.

If your query revolves around filtering data in a way that can be considered “sorting” it, i.e. using the operators <, >, >=, <= and =, then there’s a reasonable probability that a B-tree index will be a good fit.

Since a large proportion of database queries are either equality checks or ordering comparisons on numerical or date types, B-trees suit the majority of cases.

If you want to do something “unusual” (a common example is spatial or geometric data, using queries like “the nearest point to” or “within this locus”), then you should look into Postgres’s other index types, such as GiST and GIN, but for everyday queries like ORDER BY price DESC or WHERE id = 416, a B-tree will usually be good enough.
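
A hedged sketch of such an “unusual” case, assuming a places table with a point-typed location column:

-- a GiST index supports geometric operators that a B-tree cannot
CREATE INDEX places_location_idx ON places USING gist (location);

-- for example, the ten places nearest the origin
SELECT * FROM places ORDER BY location <-> point '(0,0)' LIMIT 10;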

How to create an index

Once you’ve decided what you need, creating an index on the column you want to query on is simple. If I want to index my books table by the author column, it’s as simple as

CREATE INDEX author_idx ON books (author);

You can also use an expression, wrapped in parentheses, instead of a column. For example, I love reading a good thick fantasy book, but I always skip past the songs — there’s nothing I hate more than when Raistlin’s nefarious exploits are interrupted by some Gil-Galad type nonsense. I’m always searching my books table to find long books, with as few songs as possible. So I want an index on the ultimate fantasy book metric:

CREATE INDEX length_idx ON books ((total_pages - song_pages));
Gil-Galad was an elven king. Photo by Jeremy Bishop

Note the double parentheses — this isn’t a typo! One set of parentheses wraps the list (more on this next) of things you’re indexing on, and the other wraps the expression we’re indexing on to say “treat this as one thing”.

Often, when you create an index to support a sort order, you want to sort rows by more than one column. No problem: you can create a multi-column index by listing more than one column in the creation statement. You can also specify the direction in which each column should be sorted.

For example, if my antipathy towards songs in fantasy novels decreases, I might only want to use the amount of songs as a tie-breaker between equally-long books. In that case, I could create an index sorting the books on total_pages in descending order (biggest first), then break any ties using song_pages in ascending order (smallest first):

CREATE INDEX sort_idx ON books (total_pages DESC, song_pages ASC);

An important fact to remember is that Postgres can scan an index backwards or forwards, so if I have the index above, there’s no need for me also to create an index on (total_pages ASC, song_pages DESC), if for some reason I want to find short books that are full of songs.
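
For example, the reverse ordering is served by a backward scan of sort_idx, with no extra index needed:

SELECT * FROM books ORDER BY total_pages ASC, song_pages DESC;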

Next time we’ll cover the other uses — and pitfalls — of multi-column indexes, but they’re perfect for when the main purpose of the index is to pre-calculate a complicated sort order.

Wrapping up

Hopefully this has given you a better understanding of when indexes can help you out. As always, benchmarking and testing is key, but it’s good to know the basics.

¹Unless it turns out to be a book about lychees, of course. As with a lot of elements of query planning, row estimates are really important.


Postgres indexes for absolute beginners was originally published in pgMustard on Medium, where people are continuing the conversation by highlighting and responding to this story.

Doug Hunley: Enhancing Your PostgreSQL 10 Security with the CIS Benchmark


Crunchy Data has recently announced an update to the CIS PostgreSQL Benchmark by the Center for Internet Security, a nonprofit organization that provides publications around standards and best practices for securing technology systems. This newly published CIS PostgreSQL 10 Benchmark joins the existing CIS Benchmarks for PostgreSQL 9.5 and 9.6 while continuing to build upon Crunchy Data's efforts with the PostgreSQL Security Technical Implementation Guide (PostgreSQL STIG).

What is a CIS Benchmark?

As mentioned in an earlier blog post, a CIS Benchmark is a set of guidelines and best practices for securely configuring a target system. The benchmark contains a series of recommendations that help test the security of the system: some of the recommendations are "scored" (where a top score of 100 is the best), while others are provided to establish best practices for security.

Magnus Hagander: When a vulnerability is not a vulnerability


Recently, references to a "new PostgreSQL vulnerability" have been circulating on social media (and maybe elsewhere). It has even got its own CVE entry. The origin appears to be a blog post from Trustwave.

So is this actually a vulnerability? (Hint: it's not) Let's see:

Laurenz Albe: count(*) made fast

count(*) in a children's rhyme
© Laurenz Albe 2019

 

It is a frequent complaint that count(*) is so slow on PostgreSQL.

In this article I want to explore the options you have to get your result as fast as possible.

Why is count(*) so slow?

Most people have no trouble understanding that the following is slow:

SELECT count(*)
FROM /* complicated query */;

After all, it is a complicated query, and PostgreSQL has to calculate the result before it knows how many rows it will contain.

But many people are appalled if the following is slow:

SELECT count(*) FROM large_table;

Yet if you think again, the above still holds true: PostgreSQL has to calculate the result set before it can count it. Since there is no “magical row count” stored in a table (like it is in MySQL’s MyISAM), the only way to count the rows is to go through them.

So count(*) will normally perform a sequential scan of the table, which can be quite expensive.

Is the “*” in count(*) the problem?

The “*” in SELECT * FROM ... is expanded to all columns. Consequently, many people think that using count(*) is inefficient and should be written as count(id) or count(1) instead. But the “*” in count(*) is quite different: it just means “row” and is not expanded at all.

Writing count(1) is the same as count(*), but count(id) is something different: It will only count the rows where id IS NOT NULL, since most aggregates ignore NULL values.

So there is nothing to be gained by avoiding the “*”.
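
A small demonstration of the difference, using a throwaway table (names made up):

CREATE TEMP TABLE t (id integer);
INSERT INTO t VALUES (1), (2), (NULL);

-- count(*) and count(1) count all three rows; count(id) skips the NULL
SELECT count(*), count(1), count(id) FROM t;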

Using an index only scan

It is tempting to scan a small index rather than the whole table to count the number of rows.
However, this is not so simple in PostgreSQL because of its multi-version concurrency control strategy. Each row version (“tuple”) contains the information to which database snapshot it is visible. But this information is not (redundantly) stored in the indexes. So it usually isn’t enough to count the entries in an index, because PostgreSQL has to visit the table entry (“heap tuple”) to make sure an index entry is visible.

To mitigate this problem, PostgreSQL has introduced the visibility map, a data structure that stores if all tuples in a table block are visible to everybody or not.
If most table blocks are all-visible, an index scan doesn’t need to visit the heap tuple often to determine visibility. Such an index scan is called “index only scan”, and with that it is often faster to scan the index to count the rows.

Now it is VACUUM that maintains the visibility map, so make sure that autovacuum runs often enough on the table if you want to use a small index to speed up count(*).
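
A hedged sketch of what this looks like in practice, assuming mytable has an indexable id column (whether you actually get an index only scan depends on how much of the table is marked all-visible):

CREATE INDEX mytable_id_idx ON mytable (id);

VACUUM mytable;   -- updates the visibility map

EXPLAIN SELECT count(*) FROM mytable;
-- the plan should ideally show an Index Only Scan instead of a Seq Scan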

Using an aggregate table

I wrote above that PostgreSQL does not store the row count in the table.

Maintaining such a row count would be an overhead that every data modification has to pay for a benefit that no other query can reap. This would be a bad bargain. Moreover, since different queries can see different row versions, the counter would have to be versioned as well.

But there is nothing that keeps you from implementing such a row counter yourself.
Suppose you want to keep track of the number of rows in the table mytable. You can do that as follows:

START TRANSACTION;

CREATE TABLE mytable_count(c bigint);

CREATE FUNCTION mytable_count() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   IF TG_OP = 'INSERT' THEN
      UPDATE mytable_count SET c = c + 1;

      RETURN NEW;
   ELSIF TG_OP = 'DELETE' THEN
      UPDATE mytable_count SET c = c - 1;

      RETURN OLD;
   ELSE
      UPDATE mytable_count SET c = 0;

      RETURN NULL;
   END IF;
END;$$;

CREATE TRIGGER mytable_count_mod AFTER INSERT OR DELETE ON mytable
   FOR EACH ROW EXECUTE PROCEDURE mytable_count();

-- TRUNCATE triggers must be FOR EACH STATEMENT
CREATE TRIGGER mytable_count_trunc AFTER TRUNCATE ON mytable
   FOR EACH STATEMENT EXECUTE PROCEDURE mytable_count();

-- initialize the counter table
INSERT INTO mytable_count
   SELECT count(*) FROM mytable;

COMMIT;

We do everything in a single transaction so that no data modifications by concurrent transactions can be “lost” due to race conditions.
This is guaranteed because CREATE TRIGGER locks the table in SHARE ROW EXCLUSIVE mode, which prevents all concurrent modifications.
The down side is of course that all concurrent data modifications have to wait until the SELECT count(*) is done.

This provides us with a really fast alternative to count(*), but at the price of slowing down all data modifications on the table.
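
Getting the count is then just a lookup in the single-row counter table:

SELECT c FROM mytable_count;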

Even though this counter table might receive a lot of updates, there is no danger of “table bloat” because these will all be HOT updates.

Do you really need count(*)?

Sometimes the best solution is to look for an alternative.

Often an approximation is good enough and you don’t need the exact count. In that case you can use the estimate that PostgreSQL uses for query planning:

SELECT reltuples::bigint
FROM pg_catalog.pg_class
WHERE relname = 'mytable';

This value is updated by both autovacuum and autoanalyze, so it should never be much more than 10% off. You can reduce autovacuum_analyze_scale_factor for that table so that autoanalyze runs more often there.
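
For example, a per-table setting like this (the value is only an illustration) makes autoanalyze kick in after roughly 1% of the rows have changed:

ALTER TABLE mytable SET (autovacuum_analyze_scale_factor = 0.01);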

Estimating query result counts

Up to now, we have investigated how to speed up counting the rows of a table.

But sometimes you want to know how many rows a SELECT statement will return without actually running it.

Obviously the only way to get an exact answer to this is to execute the query. But if an estimate is good enough, you can use PostgreSQL’s optimizer to get it for you.

The following simple function uses dynamic SQL and EXPLAIN to get the execution plan for the query passed as argument and returns the row count estimate:

CREATE FUNCTION row_estimator(query text) RETURNS bigint
   LANGUAGE plpgsql AS
$$DECLARE
   plan jsonb;
BEGIN
   EXECUTE 'EXPLAIN (FORMAT JSON) ' || query INTO plan;

   RETURN (plan->0->'Plan'->>'Plan Rows')::bigint;
END;$$;
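
A quick usage sketch (the table and WHERE clause are made up):

SELECT row_estimator('SELECT * FROM mytable WHERE id < 1000');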

Do not use this function to process untrusted SQL statements, since it is by nature vulnerable to SQL injection.

The post count(*) made fast appeared first on Cybertec.

Luca Ferrari: Estimating row count from explain output...in Perl!


After having read the interesting post by Laurenz Albe on how to use EXPLAIN to get a quick estimate of a query count, I decided to implement the same feature in Perl.

Estimating row count from explain output…in Perl!

At the end of his blog post, Laurenz Albe shows how to use a quick and dirty function to estimate the number of rows returned by an arbitrary query.

While I don’t believe it is often a good idea to judge the size of a query result by the optimizer’s guesses, the approach is interesting. Laurenz shows how to exploit the JSON format and its query facilities to extract data from the EXPLAIN output, so why not use Perl to crunch the textual data?

So here is a simple implementation to extract the estimate in Perl:

CREATE OR REPLACE FUNCTION plperl_row_estimate(query text)
RETURNS BIGINT
AS $PERL$
   my ($query) = @_;
   return 0 if (! $query);

   $query = sprintf "EXPLAIN (FORMAT YAML) %s", $query;
   elog(DEBUG, "Estimating from [$query]");

   my @estimated_rows = map  { s/Plan Rows:\s+(\d+)$/$1/; $_ }
                        grep { $_ =~ /Plan Rows:/ }
                        split("\n", spi_exec_query($query)->{rows}[0]->{"QUERY PLAN"});

   return 0 if (!
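
Usage mirrors the SQL version above; for instance (table name made up):

SELECT plperl_row_estimate('SELECT * FROM mytable WHERE id < 1000');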