Joshua Drake: Thanks to EDB
Josh Berkus: Changing PostgreSQL Version Numbering
9 . 5 . 3
Major1 . Major2 . Minor
That is, the second number is the "major version" number, reflecting our annual release. The third number is the update release number, reflecting cumulative patch releases. Therefore "9.5.3" is the third update to version 9.5.
The problem is the first number, in that we have no clear criteria for when to advance it. Historically, we've advanced it because of major milestones in feature development: crash-proofing for 7.0, the Windows port for 8.0, and in-core replication for 9.0. However, as PostgreSQL's feature set matures, it has become less and less clear which milestones should be considered "first digit" releases. The result is arguments about version numbering on the mailing lists every year, which waste time and irritate developers.
As a result, the PostgreSQL Project is proposing a version numbering change, to the following:
10 . 2
Major . Minor
Thus "10.2" would be the second update release for major version 10. The version we release in 2017 would be "10" (instead of 10.0), and the version we release in 2018 will be "11".
The idea is that this will both put an end to the annual arguments and end the need to explain to users that going from 9.5 to 9.6 is really a major version upgrade requiring downtime.
Obviously, there is potential for breakage of a lot of tools, scripts, automation, packaging and more in this. That's one reason we're discussing this now, almost a year before 10 beta is due to come out.
The reason for this blog post is that I'm looking for feedback on what this version number change will break for you. Particularly, I want to hear from driver authors, automation engineers, cloud owners, application stack owners, and other folks who are "downstream" of PostgreSQL. Please let us know what technical problems this will cause for you, and how difficult it will be to resolve them in the next nine months.
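One mitigation worth noting for downstream code, offered here as a sketch rather than project guidance: compare against server_version_num instead of parsing the display version string.

-- server_version_num is already a single integer (90503 for 9.5.3), so
-- numeric comparisons keep working no matter how many parts the display
-- version has; a hypothetical 10.2 would presumably report 100002.
SELECT current_setting('server_version_num')::int AS version_num,
       current_setting('server_version_num')::int >= 90500 AS at_least_9_5;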
We are not, at this point, interested in comments on how you feel about the version change or alternate version naming schemes. That discussion has already happened, at length. You can read it here, here, and here, as well as at the developer meeting.
Places to provide feedback:
- comments on this blog
- posts on the pgsql-hackers or pgsql-advocacy mailing lists
- PostgreSQL Facebook group
Thanks for any feedback you can provide.
Note that the next release of PostgreSQL, due later this year, will be "9.6" regardless. We're deciding what we do after that.
US PostgreSQL Association: Unsung Heroes: Steve Atkins
Continuing my Building a Better community series, I contacted Steve Atkins. There was an interesting twist to this particular Unsung Hero. I initially contacted lluad on #postgresql (irc.freenode.net) due to his continuous and untiring efforts to help people on channel. What I didn't know is that Steve Atkins is actually lluad. Read on for some interesting bits.
How do you use PostgreSQL?
I use PostgreSQL almost any time I need to store or process structured data.
Oleg Bartunov: Refined search
It is better to search with a wide query and use ranking (computed against the refined query) to sort the results.
Example:
SELECT title, ts_rank_cd(fts, to_tsquery('english', 'x-ray & russian')) AS rank
  FROM apod
 WHERE fts @@ to_tsquery('english', 'x-ray & russian')
 ORDER BY rank DESC
 LIMIT 5;

                  title                  |   rank
-----------------------------------------+-----------
 The High Energy Heart Of The Milky Way  | 0.0240938
(1 row)

SELECT title, ts_rank_cd(fts, to_tsquery('english', 'x-ray & russian')) AS rank
  FROM apod
 WHERE fts @@ to_tsquery('english', 'x-ray')
 ORDER BY rank DESC
 LIMIT 5;

                  title                  |   rank
-----------------------------------------+-----------
 The High Energy Heart Of The Milky Way  | 0.0240938
 X-Ray Jet From Centaurus A              | 0
 Barnacle Bill And Sojourner             | 0
 The Crab Nebula in X-Rays               | 0
 M27: The Dumbbell Nebula                | 0
(5 rows)
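For context, the example assumes an apod table with a precomputed tsvector column fts and a GIN index on it; a minimal sketch of that setup (the body column name is an assumption) might be:

-- Precompute the tsvector once and index it, so the @@ searches above are fast.
ALTER TABLE apod ADD COLUMN fts tsvector;
UPDATE apod SET fts = to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''));
CREATE INDEX apod_fts_idx ON apod USING gin (fts);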
Vasilis Ventirozos: Repairing clog corruptions
A replica and backups should always be in place, and the server should be properly monitored. Unfortunately, this server was not managed by us, so none of the above was in place.
At first, I saw entries in the logs like:
FUN TIMES :)
I started by zeroing out all the missing clog files that I found referenced in the logs:
dd if=/dev/zero of=/var/db/pgdata/pg_clog/0114 bs=256k count=1
dd if=/dev/zero of=/var/db/pgdata/pg_clog/00D1 bs=256k count=1
dd if=/dev/zero of=/var/db/pgdata/pg_clog/0106 bs=256k count=1
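If you need to work out which pg_clog segment a given transaction ID lives in, the segment name is just the XID divided by the number of transactions per 256 kB file, written as zero-padded hex. A rough sketch, assuming the default 8 kB block size and a made-up XID:

-- 8192 bytes/page * 4 transactions/byte * 32 pages/segment
-- = 1048576 transactions per clog file.
SELECT upper(lpad(to_hex(1234567890 / 1048576), 4, '0')) AS clog_segment;
-- returns 0499, i.e. the file pg_clog/0499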
Pallavi Sontakke: How Postgres-XL is tested
The purpose of this blog is to explain the process of Quality Analysis that Postgres-XL goes through internally at 2ndQuadrant. Here, I describe the bare minimum tests that each release goes through, in addition to the other tests carried out by many 2ndQuadrant and other community members.
- Regression tests are carried out to ensure that the defect fixes or enhancements to PostgreSQL have not affected Postgres-XL.
We want to keep up-to-date with PostgreSQL features and performance enhancements. Also, many tools may work only with newer PostgreSQL releases. So, we merge each PostgreSQL minor release in a timely fashion. When merging the regression tests therein, we need to continuously and consciously verify whether the new features are fully supported in Postgres-XL. Sometimes there is a gap between the expected outputs of PostgreSQL and Postgres-XL, which has to be tracked.
- Functional tests are done to validate that the functionalities of Postgres-XL are working as per business requirements. We ensure that all features are functioning as they are expected to.
Initially we created the ‘Major differences and limitations’ test module. The Postgres-XL project (initially Postgres-XC) was based on PostgreSQL 9.2. It picked up speed again during the PostgreSQL 9.5 development timeframe. Due to this, there is a known gap of features that Postgres-XL would like to support, but does not currently. We keep track of these with our xl_* functional tests. These tests cover limitations like materialized views, event triggers, foreign data wrappers, etc. On the other hand, they also cover positive functional tests for features like BRIN, logical decoding of WAL data, jsonb, etc.
Additionally, each time a new feature or enhancement is added, we keep adding tests to validate the functionality.
- Usability tests are performed to validate the ease with which the user interfaces can be used.
We have cluster-setup-utility tests. Creating a Postgres-XL cluster manually requires quite a few steps. We have automated these for ease of use with the pgxc_ctl utility. For simple prototype use, we have added a ‘prepare minimal’ mode. For seasoned users, we have added a ‘prepare empty’ mode, where they can provision node by node for their specific use. We have automated TAP tests for this utility.
- Recovery tests are done to test how well Postgres-XL is able to recover from crashes, hardware failures and other similar problems.
Since Postgres-XL is deployed as a cluster, we realize the importance of data consistency across node crashes. We have crash-recovery test scripts that crash/kill nodes and bring them up again. In parallel sessions, we keep making database changes with SQL statements, transactions or prepared transactions. We verify that nodes (or their configured standbys) come up fine. We perform data sanity checks to verify proper recovery.
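As a rough illustration of the kind of check involved (not the actual test scripts; the table name and transaction id are made up), a prepared transaction left open across a crash should still be visible and committable after restart:

-- requires max_prepared_transactions > 0
BEGIN;
INSERT INTO crash_test (id, payload) VALUES (1, 'before crash');
PREPARE TRANSACTION 'crash_tx_1';

-- ... crash and restart the node here ...

SELECT gid FROM pg_prepared_xacts;   -- 'crash_tx_1' should still be listed
COMMIT PREPARED 'crash_tx_1';
SELECT count(*) FROM crash_test;     -- the row must now be visible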
- Bug tracking is done for all known/reported bugs in our internal tracking system.
- Future steps
We are looking into using SQLsmith to generate random queries to detect further problems in Postgres-XL.
Also, we are in the process of setting up a continuous integration server for automated builds, deployments and tests of the Postgres-XL project.
Scott Mead: Helpful PostgreSQL Logging and Defaults
I use PostgreSQL every single day. I develop against multiple versions of the database (4 versions running in parallel as I write), I have apps that use it, and I do my daily DBA-ing. The biggest question I get from newbies and veterans alike is: “What are your PostgreSQL defaults?”
If you follow postgres, you already know that the default configuration (postgresql.conf) is very conservative from a resource (CPU, IOPS, Disk Space) perspective. Over the last 10 years, we [the community] have developed some straightforward and easy to understand formulas that will help you tune… shared_buffers for example. The item that always gets left out though is logging. As a developer, I’m always looking for ways to see “How the database is answering the questions I ask”. When I get a new postgres instance set up, I have a personal (somewhere in the cobwebs) checklist that I run. Some are based on the purpose of the deployment (shared_buffers, work_mem, etc…), some are things that I always set. Aside from memory, the biggest set of “standard” items I set are all related to logging. I’m big on monitoring (my pg_stat_activity patch was accepted back for 9.2) and having the right detail presented to me is important.
TL;DR
Set up logging so that we can see exactly what is happening and when. In addition to setting up the log files, you can pick up a great log analyzer (like pgBadger or pgCluu) that will break all of this down into a gorgeous, easy-to-use report.
Scott’s logging defaults
logging_collector = on
log_filename = ‘postgresql-%a.log’
This sets the name of the actual file that log messages will be written to (in your shiny, new pg_log / log_destination directory). The %a means that you’ll see Mon, Tue, Wed, Thu, Fri, Sat, Sun. The patterns are based on standard strftime escapes (man page). The reason that I like using %a is that you get auto-rotation of the log files: you will keep one week’s worth of logs, and when you rotate 7 days later, you won’t be creating a huge number of log files.
Note: Depending on my requirements, I will adjust this for production. If I have any special retention policy that I need to abide by, I’ll make the filename: postgresql-YYYY-MM-DD-HH24mmSS.log (log_filename = ‘postgresql-%Y-%m-%d-%H%M%S.log’ ). The trouble with this is that you’ll need to deal with log cleanup yourself (cron to archive logs + 30 days old … ).
log_truncate_on_rotation=on
This essentially says “when I switch log files, if a log already exists with that name, truncate the existing one and write new logs to an empty file”.
For example, on Monday, May 15th, we wrote a log file:
postgresql-Mon.log
Now, at midnight on Monday, May 22nd, postgres is going to rotate its log back to:
postgresql-Mon.log
This is our data from last week. If you leave log_truncate_on_rotation = off (the default), then postgres will append to that log file. That means, in December, you’ll have data for every Monday throughout the year in the same file. If you set this to on, it’ll nuke the old data and give you only data from the most recent Monday. If you need to keep log files for longer than 7 days, I recommend you use a more complete name for your log files (see log_filename above).
log_line_prefix = ‘%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ‘
The log_line_prefix controls what every single line in the log file looks like. My personal favorite here is actually stolen directly from pgBadger’s suggested configuration. pgBadger is an amazing tool for parsing and analyzing the postgres log files. If you set the log_line_prefix like this, pgBadger can provide incredible detail about “what happened where, who did it, and when they did it”. Just to show you the difference….
default log file error message:
LOG:  database system was shut down at 2016-05-19 11:40:57 EDT
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
ERROR:  column "create_error_for_fun" does not exist at character 8
STATEMENT:  select create_error_for_fun;
With a rich log_line_prefix
2016-05-19 11:46:22 EDT [14913]: [1-1] user=,db=,app=,client= LOG:  database system was shut down at 2016-05-19 11:46:21 EDT
2016-05-19 11:46:22 EDT [14913]: [2-1] user=,db=,app=,client= LOG:  MultiXact member wraparound protections are now enabled
2016-05-19 11:46:22 EDT [14911]: [3-1] user=,db=,app=,client= LOG:  database system is ready to accept connections
2016-05-19 11:46:22 EDT [14917]: [1-1] user=,db=,app=,client= LOG:  autovacuum launcher started
2016-05-19 11:46:27 EDT [14921]: [1-1] user=postgres,db=postgres,app=psql,client=[local] ERROR:  column "create_error_for_fun" does not exist at character 8
2016-05-19 11:46:27 EDT [14921]: [2-1] user=postgres,db=postgres,app=psql,client=[local] STATEMENT:  select create_error_for_fun;
Now, I know who, what, where, why and when.
log_checkpoints = on
Like any database, postgres is going to use the disks. In order to deal with I/O efficiently, postgres (like many other databases) uses something called a checkpoint to synchronize its memory to those disks. Checkpoints occur periodically based on a number of things (load, configuration, etc…). The thing to keep in mind is that a checkpoint will use disk I/O; the busier the database, the more it requires. Setting this to on means that you know without a doubt when a checkpoint occurred. It’s the same ol’ story: “Every once in a while, I get long-running queries, different ones each time!”… “I’m seeing a spike in IOPS and I don’t know why!” … “Sometimes my data load gets bogged down for some reason!” … etc…
This very well could be due to a large checkpoint occurring. Since it’s based on load / configuration / time, it’s critical that the server write a log of when checkpoints occurred so that you’re not left in the dark. There’s also useful information in these logs about how postgres is behaving (it can even help you tune your memory settings).
2016-05-19 12:01:26 EDT [15274]: [1-1] user=,db=,app=,client= LOG:  checkpoint starting: immediate force wait
2016-05-19 12:01:26 EDT [15274]: [2-1] user=,db=,app=,client= LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s
(NB: There was some line-wrap here, I’ve manually entered carriage returns. The second line prints as one, long line)
It’s important to note that this is going to up the volume of your logs. But, it’s minimal and the benefits far outweigh the few extra bytes needed to write the message. (pgBadger will parse these up and give you a nice, clear picture of your checkpoint behavior).
log_autovacuum_min_duration = 0
(The default is log_autovacuum_min_duration = -1, which disables this logging.) This setting isn’t just a boolean on/off. Essentially, you are telling postgres: “When an autovacuum runs for x milliseconds or longer, write a message to the log”. Setting this to 0 (zero) means that you will log all autovacuum operations to the log file.
For the uninitiated, autovacuum is essentially a background process that does garbage collection in postgres. If you’re just starting out, what you really need to know is the following:
- It’s critical that autovacuum stay enabled
- autovacuum is another background process that uses IOPS
Because autovacuum is necessary and uses IOPS, it’s critical that you know what it’s doing and when. Just like log_checkpoints (above), autovacuum runs are based on load (thresholds on update / delete velocities on each table). This means that vacuum can kick off at virtually any time.
2016-05-19 12:32:25 EDT [16040]: [4-1] user=,db=,app=,client= LOG:  automatic vacuum of table "postgres.public.pgbench_branches": index scans: 1
    pages: 0 removed, 12 remain
    tuples: 423 removed, 107 remain, 3 are dead but not yet removable
    buffer usage: 52 hits, 1 misses, 1 dirtied
    avg read rate: 7.455 MB/s, avg write rate: 7.455 MB/s
    system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
2016-05-19 12:32:25 EDT [16040]: [5-1] user=,db=,app=,client= LOG:  automatic analyze of table "postgres.public.pgbench_branches" system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
2016-05-19 12:32:27 EDT [16040]: [6-1] user=,db=,app=,client= LOG:  automatic analyze of table "postgres.public.pgbench_history" system usage: CPU 0.01s/0.13u sec elapsed 1.70 sec
log_temp_files = 0
To be as efficient as possible, postgres tries to do everything it can in memory. Sometimes, you just run out of that fickle resource. Postgres has a built-in ‘swap-like’ system that will employ temp files within the data directory to deal with the issue. If you’ve spent any time around disks (especially spinning rust), you’ll know that swap can cause some serious performance issues. Just like checkpoints and autovacuum, temp files are going to happen automatically. Unlike those other two processes, they only occur if the queries you are running need the temp space. From a developer’s perspective, I want to know if the process that I’m engineering is going to use temp. From a DBA’s perspective, I want to know if the dang developers did something that needs temp space (more likely, my dang maintenance jobs are using it). To help in tuning your queries and maintenance processes, log your temp files. It’ll tell you what size temp file was needed and which query caused it:
2016-05-19 12:31:20 EDT [15967]: [1-1] user=postgres,db=postgres,app=pgbench,client=[local] LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp15967.0", size 200204288
2016-05-19 12:31:20 EDT [15967]: [2-1] user=postgres,db=postgres,app=pgbench,client=[local] STATEMENT:  alter table pgbench_accounts add primary key (aid)
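If you want to see this in action, a quick sketch (using the pgbench tables from the log above) is to shrink work_mem in a session and sort something big; the spill shows up in the log exactly as above:

-- Force a sort to spill to disk so log_temp_files fires.
SET work_mem = '1MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT aid FROM pgbench_accounts ORDER BY abalance;
-- Look for "Sort Method: external merge  Disk: ..." in the plan output,
-- and a matching "temporary file" line in the server log.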
log_lock_waits = on
Databases are servicing lots of clients all trying to do very similar work against the same set of data. This can cause contention (it’s the nature of the beast). log_lock_waits lets you see where your contention is. It will give you detailed, specific information about what waits occurred and the context in which they occurred.
2016-05-19 13:14:50 EDT [17094]: [1-1] user=postgres,db=postgres,app=psql,client=[local] LOG:  process 17094 still waiting for RowExclusiveLock on relation 16847 of database 12403 after 1000.794 ms at character 13
2016-05-19 13:14:50 EDT [17094]: [2-1] user=postgres,db=postgres,app=psql,client=[local] DETAIL:  Process holding the lock: 17086. Wait queue: 17094.
2016-05-19 13:14:50 EDT [17094]: [3-1] user=postgres,db=postgres,app=psql,client=[local] STATEMENT:  delete from pgbench_tellers ;
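You can reproduce a message like that with two sessions; a minimal sketch, again using the pgbench tables:

-- Session 1: grab a row lock and sit on it.
BEGIN;
UPDATE pgbench_tellers SET tbalance = tbalance + 1 WHERE tid = 1;

-- Session 2: this blocks; after deadlock_timeout (1s by default)
-- log_lock_waits writes the "still waiting for ... lock" message.
DELETE FROM pgbench_tellers;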
These are what I call ‘reasonable defaults’ for logging in postgres. Again, these are the settings that I configure every time I set up a new cluster, whether it’s for dev / test / toy / prod.
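If you’d rather not hand-edit postgresql.conf, here is a sketch of applying the same defaults with ALTER SYSTEM (available on 9.4 and newer; logging_collector still needs a restart, the rest only a reload):

ALTER SYSTEM SET logging_collector = on;              -- requires a restart
ALTER SYSTEM SET log_filename = 'postgresql-%a.log';
ALTER SYSTEM SET log_truncate_on_rotation = on;
ALTER SYSTEM SET log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ';
ALTER SYSTEM SET log_checkpoints = on;
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
ALTER SYSTEM SET log_temp_files = 0;
ALTER SYSTEM SET log_lock_waits = on;
SELECT pg_reload_conf();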
Happy querying!
Tomas Vondra: Auditing Users and Roles in PostgreSQL
One of the services we offer are security reviews (or audits, if you want), covering a range of areas related to security. It may be a bit surprising, but a topic that often yields the most serious issues is roles and privileges. Perhaps the reason roles and privileges are such a frequent source of issues is that they seem quite simple and similar to things engineers are familiar with (e.g. the Unix system of users and groups), but it turns out there are a few key differences with major consequences.
The other parts are either very straightforward and understandable even for sysadmins without much PostgreSQL experience (e.g. authentication config in pg_hba.conf), or the engineers recognize the complexity and take their time to familiarize themselves with the details (a good example of this is Row Level Security).
That is not to say there are no interesting topics (e.g. how to use RLS with application-level users), but I'll leave that for another blog post, as this one is about roles and privileges.
So let's look at roles and privileges a bit closer …
Owner is a small superuser
When it comes to roles, the initial checks are mostly expected. The role should not be a superuser (as superusers simply bypass various checks), and in general should not have any excessive privileges (e.g. CREATEDB, CREATEROLE and so on).
But it also should not own the database objects (tables, functions, …), since owners can simply grant themselves arbitrary privileges on the objects they own, which turns them into small superusers.
Consider the following example, where we attempt to protect the table from the owner by revoking all the privileges from that role:
db=# CREATE USER u;
CREATE ROLE
db=# SELECT rolsuper FROM pg_roles WHERE rolname = 'u';
 rolsuper
----------
 f
(1 row)
db=# \c 'user=u dbname=db'
You are now connected to database "db" as user "u".
So we have created a user who is not a superuser, and we have connected using that account (that's the slightly cryptic psql command). Let's create a table (so the user is an owner) and restrict our own access to it:
db=> CREATE TABLE t (id INT);
CREATE TABLE
db=> REVOKE ALL ON t FROM u;
REVOKE
db=> SELECT * FROM t;
ERROR:  permission denied for relation t
So that works, right? Well, the problem is a user with access to SQL (e.g. an “attacker” that discovered a SQL injection vulnerability) can do this:
db=> GRANT ALL ON t TO u;
GRANT
db=> select * from t;
 id
----
(0 rows)
The owner can simply grant all privileges back to himself, defeating the whole privilege system. A single SQL injection vulnerability and it's game over. Another issue with owners is that they are not subject to RLS by default, although that can be fixed with a simple ALTER TABLE ... FORCE ROW LEVEL SECURITY.
In any case, this should be a clear hint that the application should use a dedicated role (or multiple roles), not owning any of the objects.
BTW, users are often surprised when I mention that we can grant privileges on individual columns, e.g. allow SELECT on a subset of columns, UPDATE on a different subset of columns, and so on.
When combined with SECURITY DEFINER functions, this is a great way to restrict access to columns the application should not access directly, while still allowing special operations. For example, it shouldn't be possible to select all passwords (even if hashed) or e-mails, but it should be possible to verify a password or an e-mail. SECURITY DEFINER functions are great for that, but sadly it's one of those powerful yet severely underused features :-(
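As a rough sketch of that pattern (the table, column and role names are made up for illustration, and crypt() assumes the pgcrypto extension):

-- The app role may read and update a few columns, but never sees the hash.
GRANT SELECT (id, login, email), UPDATE (email) ON accounts TO app_user;

-- Password verification goes through a SECURITY DEFINER function owned by
-- a role that can read the password_hash column.
CREATE FUNCTION check_password(p_login text, p_password text)
RETURNS boolean AS $$
    SELECT password_hash = crypt(p_password, password_hash)
      FROM accounts
     WHERE login = p_login;
$$ LANGUAGE sql SECURITY DEFINER SET search_path = public;

REVOKE ALL ON FUNCTION check_password(text, text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION check_password(text, text) TO app_user;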
Role inheritance
Let's assume you have a role that owns the objects, and a separate role used by the application. In fact, if you have a sufficiently complex application, chances are you've split it into multiple parts, perhaps segregated into schemas, and each module uses a separate set of roles (owner + application, possibly more).
This gives you the ability to create application roles covering only part of the application: e.g. the administration panel needs access to all modules, while a public web interface only needs read-only access to a small subset of modules.
CREATE ROLE module_users;     -- full access to user info
CREATE ROLE module_users_ro;  -- limited access to user info (register/verify)
CREATE ROLE module_posts;     -- full access to blog posts
CREATE ROLE module_posts_ro;  -- read-only access to blog posts

... roles for additional modules ...

CREATE USER admin_user        -- full access
    IN ROLE module_users, module_posts;

CREATE USER web_user          -- limited access
    IN ROLE module_users_ro, module_posts_ro;
In other words, roles may be seen as groups and used for making the privileges easier to manage. There are two aspects that make this different from Unix-like groups: it's possible to use a multi-level hierarchy of roles (while Unix groups are flat), and inheritance (we'll get to that in a minute).
The above scheme works just fine, but only if you keep the connections for the two users (admin_user and web_user) separate. With a small number of users (modules, applications) that's manageable, as you can maintain separate connection pools, but as the number of connection pools grows it ceases to serve the purpose. But can we use a single connection pool and keep the benefit of separate users?
Well, yes. We can create another user role for the connection pool and grant it membership in the existing user roles (admin_user and web_user).
CREATE USER pool_user IN ROLE admin_user, web_user
This seems a bit strange, because the new user becomes a member of the admin_user and web_user roles (users are just roles with the LOGIN privilege), effectively inheriting all their privileges. Wasn't the whole point to use roles with limited privileges?
Let me introduce the SET ROLE command, which can be used to switch the session to an arbitrary role the user is a member of. So as the pool_user user is a member of both the admin_user and web_user roles, the connection pool or application may use this:
SET ROLE admin_user
to switch it to “full” privileges for the admin interface, or
SET ROLE web_user
when the connection is intended for the website.
These commands are akin to dropping privileges in Unix. The init scripts are executed as root, but you really don't want to run all the services as root, so the init script does something like sudo -u or chpst to switch to an unprivileged user.
But wait, we can actually do the opposite. We can start with “no privileges” by default, all we need to do is create the role like this:
CREATE USER pool_user NOINHERIT IN ROLE admin_user, web_user
The user is still a member of the two roles (and so can switch to them using SET ROLE), but inherits no privileges from them. This has the benefit that if the pool or application fails to do the SET ROLE, it will fail due to lack of privileges on the database objects (instead of silently proceeding with full privileges). So instead of starting with full privileges and eventually dropping most of them, with NOINHERIT we start with no privileges and then acquire a limited subset of them.
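A quick sketch of what that looks like in practice (the posts table is made up; the roles come from the earlier example):

-- Connected as pool_user (created with NOINHERIT): no privileges yet.
SELECT * FROM posts;    -- ERROR: permission denied for relation posts
SET ROLE web_user;
SELECT * FROM posts;    -- works, with web_user's read-only privileges
RESET ROLE;             -- back to the unprivileged pool_user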
But why am I wasting time explaining all this SET ROLE and INHERIT or NOINHERIT stuff? Well, it has implications for testing.
Note: You have to trust the pool/application to actually execute the SET ROLE command with the right target role, and the user must not be able to execute custom SQL on the connection (because then it's just a matter of RESET ROLE to gain the full privileges, or SET ROLE to switch to another role). If that's not the case, the shared connection pool is not a path forward for you.
Testing roles
Pretty much no one tests privileges. Or, to be more accurate, everyone tests the positive case implicitly, because if you don't get the necessary privileges the application breaks down. But only very few people verify that there are no unnecessary/unexpected privileges.
The most straightforward way to test absence of privileges (user has no access) might be to walk through all existing objects (tables, columns) and try all compatible privileges. But that’s obviously a lot of combinations and a lot of additional schema-specific work (data types, constraints, …).
Luckily, PostgreSQL provides a collection of useful functions for exactly this purpose (showing just table-related ones, there are additional functions for other object types):
has_any_column_privilege(...)
has_column_privilege(...)
has_table_privilege(...)
So, for example, it's trivial to check which roles have the INSERT privilege on a given table:
SELECT rolname FROM pg_roles WHERE has_table_privilege(rolname, 'table', 'INSERT')
or listing tables accessible by a given role:
SELECT oid, relname FROM pg_class WHERE has_table_privilege('user', oid, 'INSERT')
And similarly for other privileges and object types. The testing seems fairly trivial – simply run a bunch of queries for the application users, check that the result matches expectation and we’re done.
Note: It's also possible to use the information_schema, e.g. table_privileges, which essentially just runs a query with has_table_privilege and formats the output nicely.
Except there's a small catch: inheritance. It works just fine as long as the role inherits privileges through membership, but as soon as there's a NOINHERIT somewhere, those privileges will not be considered when checking access (both in the functions and in information_schema). Which makes sense, because the current user does not currently have the privileges, but can gain them easily using SET ROLE.
But of course, PostgreSQL also includes the pg_has_role() function, so we can merge the privileges from all the roles, for example like this:
SELECT DISTINCT relname
  FROM pg_roles CROSS JOIN pg_class
 WHERE pg_has_role('user', rolname, 'MEMBER')
   AND has_table_privilege(rolname, pg_class.oid, 'SELECT')
Making this properly testable requires more work (to handle additional object types and applicable privileges), but you get the idea.
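One way to make it testable is to compare the computed privileges against an expected list and fail on anything extra; a sketch (the role, table names and expected rows are made up):

-- Any row returned here is an unexpected privilege for web_user.
WITH actual AS (
    SELECT c.relname::text AS relname, p.priv
      FROM pg_class c
     CROSS JOIN (VALUES ('SELECT'), ('INSERT'), ('UPDATE'), ('DELETE')) AS p(priv)
     CROSS JOIN pg_roles r
     WHERE c.relkind = 'r'
       AND c.relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
       AND pg_has_role('web_user', r.rolname, 'MEMBER')
       AND has_table_privilege(r.rolname, c.oid, p.priv)
)
SELECT * FROM actual
EXCEPT
SELECT * FROM (VALUES ('posts', 'SELECT'),
                      ('users', 'SELECT')) AS expected(relname, priv);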
Summary
Let me briefly summarize this blog post:
- separate the owner and application user – Don't use a single role for both things.
- consider using SET ROLE – Either start with the privileges and drop them (INHERIT), or start with none and acquire them (NOINHERIT).
- test the expected privileges – Ideally run this as part of regular unit tests if possible.
- keep it simple – It's definitely better to have a simple hierarchy of roles you understand.
Shaun M. Thomas: PG Phriday: Trusty Table Tiers
I always advocate breaking up large Postgres tables for a few reasons. Beyond query performance concerns, maintaining one monolithic structure is always more time consuming and consequently more dangerous. The time required to create a dozen small indexes may be slightly longer than for a single larger one, but we can treat the smaller indexes as incremental. If we want to rebuild, add more indexes, or fix any corruption, why advocate an all-or-nothing proposition? Deleting from one large table will be positively glacial compared to simply dropping an entire expired partition. The list just goes on and on.
On the other hand, partitioning in Postgres can be pretty intimidating. There are so many manual steps involved, that it’s easy to just kick the can down the road and tackle the problem later, or not at all. Extensions like the excellent pg_partman remove much of the pain involved in wrangling an army of partitions, and we strongly suggest using some kind of tool-kit instead of reinventing the wheel.
The main limitation with most existing partition management libraries is that they never deviate from the examples listed in the Postgres documentation. It’s always: create inherited tables, add redirection triggers, automate, rinse, repeat. In most cases, this is exactly the right approach. Unfortunately triggers are slow, and especially in an OLTP context, this can introduce sufficient overhead that partitions are avoided entirely.
Well, there is another way to do partitioning that's almost never mentioned. The idea is to actually utilize the base table as a storage target and, in lieu of triggers, schedule data movement during low-volume time periods. The primary benefit to this is that there's no more trigger overhead. It also means we can poll the base table itself for recent data with the ONLY clause. This is a massive win for extremely active tables, and the reason tab_tier was born.
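To illustrate, with the sensor_log table defined just below, reading only the recent, still-unpartitioned rows is as simple as:

-- Only the base table is scanned; the monthly partitions are skipped entirely.
SELECT count(*)
  FROM ONLY sensor_log
 WHERE reading_date > CURRENT_DATE - INTERVAL '1 day';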
Let’s create some data for testing this out:
CREATE TABLE sensor_log (
  id           INT PRIMARY KEY,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL
);

INSERT INTO sensor_log (id, location, reading, reading_date)
SELECT s.id, s.id % 1000, s.id % 100,
       CURRENT_DATE - ((s.id * 10) || 's')::INTERVAL
  FROM generate_series(1, 5000000) s(id);

CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);

ANALYZE sensor_log;
Now we have 5-million rows in a table with a defined date column that’s a perfect candidate for partitioning. The way this data is currently distributed, we have content going back to late 2014. Imagine in this scenario we don’t need this much live information at all times. So we decide to keep one week of logs for active use, and relegate everything else into some kind of monthly partition.
This is how all of that would look in tab_tier:
CREATE EXTENSION tab_tier;

SELECT tab_tier.register_tier_root('public', 'sensor_log', 'reading_date');

UPDATE tab_tier.tier_root
   SET root_retain = '1 week'::INTERVAL,
       part_period = '1 month'::INTERVAL
 WHERE root_schema = 'public'
   AND root_table = 'sensor_log';

SELECT tab_tier.bootstrap_tier_parts('public', 'sensor_log');

\dt

                 List of relations
 Schema |          Name          | Type  |  Owner
--------+------------------------+-------+----------
 public | sensor_log             | table | postgres
 public | sensor_log_part_201410 | table | postgres
 public | sensor_log_part_201411 | table | postgres
 public | sensor_log_part_201412 | table | postgres
 public | sensor_log_part_201501 | table | postgres
 public | sensor_log_part_201502 | table | postgres
 public | sensor_log_part_201503 | table | postgres
 public | sensor_log_part_201504 | table | postgres
 public | sensor_log_part_201505 | table | postgres
 public | sensor_log_part_201506 | table | postgres
 public | sensor_log_part_201507 | table | postgres
 public | sensor_log_part_201508 | table | postgres
 public | sensor_log_part_201509 | table | postgres
 public | sensor_log_part_201510 | table | postgres
 public | sensor_log_part_201511 | table | postgres
 public | sensor_log_part_201512 | table | postgres
 public | sensor_log_part_201601 | table | postgres
 public | sensor_log_part_201602 | table | postgres
 public | sensor_log_part_201603 | table | postgres
 public | sensor_log_part_201604 | table | postgres
 public | sensor_log_part_201605 | table | postgres
Taking this piece by piece, the first thing we did after creating the extension itself was to call the register_tier_root function. This officially tells tab_tier about the table, and creates a record with configuration elements we can tweak. And that's exactly what we do by setting the primary retention window and the partition size. Creating all of the partitions manually is pointless, so we also invoke bootstrap_tier_parts. Its job is to check the range of dates currently represented in the table, and create all of the partitions necessary to store it.
What did not happen here is any data movement. This goes back to our original concern regarding maintenance. Some tables may be several GB or even TB in size, and moving all of that data as one gargantuan operation would be a really bad idea. Instead, tab_tier provides the migrate_tier_data function to relocate data for a specific partition.
With a bit of clever SQL, we can even generate a script for it:
COPY (
  SELECT 'SELECT tab_tier.migrate_tier_data(''public'', ''sensor_log'', ''' ||
         REPLACE(part_table, 'sensor_log_part_', '') || ''');' AS part_name
    FROM tab_tier.tier_part
    JOIN tab_tier.tier_root USING (tier_root_id)
   WHERE root_schema = 'public'
     AND root_table = 'sensor_log'
   ORDER BY part_table
) TO '/tmp/move_parts.sql';

\i /tmp/move_parts.sql

SELECT COUNT(*) FROM ONLY sensor_log;

 count
-------
 60480

SELECT COUNT(*) FROM sensor_log_part_201504;

 count
--------
 259200
Following some debugging notices, all of our data has moved to the appropriate partition. We verified that by checking the base table and a randomly chosen partition for record counts. At this point, the table is now ready for regular maintenance. In this case “maintenance” means regularly calling the cap_tier_partitions and migrate_all_tiers functions. The first ensures target partitions always exist, and the second moves any pending data to a waiting partition for all tables we’ve registered.
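In practice that boils down to a couple of scheduled calls during a quiet window; a sketch, assuming the no-argument form the descriptions suggest:

-- e.g. from a nightly cron job via psql
SELECT tab_tier.cap_tier_partitions();  -- make sure upcoming partitions exist
SELECT tab_tier.migrate_all_tiers();    -- move data older than root_retain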
And that’s it. We’re completely done with this table. If we stopped here, we could be secure in the knowledge we no longer have to worry about some gigantic monolith ruining our day some time in the future. But that’s not how tab_tier got its name. One or two levels does not a tier make; the real “secret sauce” is its support for long term storage.
One thing we didn’t really cover, and most partition systems never even consider, is that partitioning is only half of the story. On an extremely active system, having months or years of data just sitting around is generally frowned upon. The mere presence of older data might encourage using it, transforming our finely tuned OLTP engine into a mixed-workload wreck. One or two queries against those archives, and suddenly our cache is tainted and everything is considerably slower.
We need to move that data off of the system, and there are quite a few ways to do that. Some might use ETL scripts or systems like Talend to accomplish that goal. Or we can just use tab_tier and a Postgres foreign table. Let’s now dictate that only six months of archives should ever exist on the primary server. Given that constraint, this is how we could proceed:
-- Do this on some kind of archive server

CREATE USER arc_user PASSWORD 'PasswordsAreLame';

CREATE TABLE sensor_log (
  id           INT PRIMARY KEY,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL,
  snapshot_dt  TIMESTAMP WITHOUT TIME ZONE
);

GRANT ALL ON sensor_log TO arc_user;

-- Back on the data source...

UPDATE tab_tier.tier_root
   SET lts_threshold = '6 months'::INTERVAL,
       lts_target = 'public.sensor_log_archive'
 WHERE root_schema = 'public'
   AND root_table = 'sensor_log';

CREATE EXTENSION postgres_fdw;

CREATE USER arc_user PASSWORD 'PasswordsAreLame';
GRANT tab_tier_role TO arc_user;
GRANT ALL ON ALL TABLES IN SCHEMA public TO tab_tier_role;

CREATE SERVER arc_srv
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (dbname 'postgres', host 'archive-host');

CREATE USER MAPPING FOR arc_user
  SERVER arc_srv
  OPTIONS (user 'arc_user', password 'PasswordsAreLame');

CREATE FOREIGN TABLE sensor_log_archive (
  id           INT,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL,
  snapshot_dt  TIMESTAMP WITHOUT TIME ZONE
)
SERVER arc_srv
OPTIONS (table_name 'sensor_log');

GRANT INSERT ON sensor_log_archive TO tab_tier_role;

-- Connect as arc_user, then run this:

SELECT tab_tier.archive_tier('public', 'sensor_log');

SELECT COUNT(*) FROM sensor_log_archive;

  count
---------
 3263360
Whew! That was a lot of work. Maybe a future version of tab_tier should provide a wrapper for that. In any case, all we did was set up a foreign table on a remote server, create a separate user to handle the data movement, and tell tab_tier about our six month threshold for long term storage, and the target table itself.
Using a foreign table isn’t required here, since the target can be any kind of table, but isn’t that the whole point of this exercise? The cool thing about Postgres foreign data wrappers is that we could have used any of them. In this case we’re just moving data to another remote Postgres instance, but we could have dumped everything into Cassandra or Hadoop instead. Take that, subspace!
For those who noticed all of the ridiculous GRANT statements, please remember this is only for demonstration purposes. A real system would probably use ALTER DEFAULT PRIVILEGES to give tab_tier_role more limited control over a specific schema and tables specifically designed for archival. The extension doesn’t add its own privileges—even to tables it creates—in case controls are tightly locked down. We don’t want to hijack any carefully laid down security. Instead tab_tier just propagates any ACLs it finds on root tables to new partitions.
This is the same reason we ran the archive_tier (or archive_all_tiers) routine as a different user. Since we’re using a foreign user mapping, we want to limit data-leak potential by isolating the movement process from the table owner or a superuser. We recommend using this approach for any foreign table usage whenever possible.
With all of that out of the way, we still need to clean up. We archived all of the partition content, but the partitions themselves are still sitting around and gathering dust. Let’s fix that by running one final step as the owner of sensor_log or any superuser:
SELECT part_table
  FROM tab_tier.tier_part
 WHERE is_archived;

       part_table
------------------------
 sensor_log_part_201410
 sensor_log_part_201411
 sensor_log_part_201412
 sensor_log_part_201501
 sensor_log_part_201502
 sensor_log_part_201503
 sensor_log_part_201504
 sensor_log_part_201505
 sensor_log_part_201506
 sensor_log_part_201507
 sensor_log_part_201508
 sensor_log_part_201509
 sensor_log_part_201510

SELECT tab_tier.drop_archived_tiers();

SELECT COUNT(*) FROM sensor_log_archive;

  count
---------
 1736640
During the archival process itself, tab_tier marks the related metadata so archived tables will no longer be used in any of the data movement functions. It also makes them an easy target for removal with a maintenance function. We can see that everything worked, as a large portion of our data is no longer part of the sensor_log inheritance tree. Now the archived data is securely located on another system that’s probably geared more toward OLAP use, or some incomprehensible Hive we don’t have to worry about.
I, for one, welcome our incomprehensible Hive overlords.
REGINA OBE: pgRouting 2.2.3 released with support for PostgreSQL 9.6beta1
pgRouting 2.2.3 was released last week. The main change is that this version now supports PostgreSQL 9.6. Many thanks to Vicky Vergara for working through the issues with PostgreSQL 9.6 and getting it to work. Vicky has also been doing a good chunk of the coding (a lot of Boost refactoring and integrating more Boost features), testing, and documentation in pgRouting, osm2pgrouting, and QGIS pgRoutingLayer in general for pgRouting 2.1, 2.2, and upcoming 2.3. We are very indebted to her for her hard work.
If you are a Windows user testing the waters of PostgreSQL 9.6beta1, we have pgRouting 2.2.3 binaries and PostGIS 2.3.0dev binaries at http://postgis.net/windows_downloads.
Continue reading "pgRouting 2.2.3 released with support for PostgreSQL 9.6beta1"
Károly Nagy: Postgresql server fails to start in recovery with systemd
I ran into an issue while trying to set up a simple test system for myself to move from CentOS 6 to 7. When I was about to start a standby slave, systemd reported a PostgreSQL startup timeout and stopped it, although it was running perfectly fine doing recovery.
Issue
In /var/log/messages, systemd reports a failure:
systemd: postgresql-9.5.service start operation timed out. Terminating.
systemd: Failed to start PostgreSQL 9.5 database server.
systemd: Unit postgresql-9.5.service entered failed state.
systemd: postgresql-9.5.service failed.
Meanwhile, in the PostgreSQL log:
[1342]: [3-1] host=,user=,db=,tx=0,vtx= LOG:  received smart shutdown request
[1383]: [3-1] host=,user=,db=,tx=0,vtx= LOG:  shutting down
[1383]: [4-1] host=,user=,db=,tx=0,vtx= LOG:  database system is shut down
Cause
In the systemd service file, postgresql is started with the -w flag, which means “wait until the operation completes”. Hence systemd fails after the configured timeout.
ExecStart=/usr/pgsql-9.5/bin/pg_ctl start -D ${PGDATA} -s -w -t 300
Fix
Change the -w flag to -W in the systemd service file (/usr/lib/systemd/system/postgresql-9.5.service) and reload the daemon.
systemctl daemon-reload
service postgresql-9.5 start
Hopefully this will save you a couple of minutes of debugging.
Craig Ringer: PostgreSQL-based application performance: latency and hidden delays
Goldfields Pipeline, by SeanMac (Wikimedia Commons)
If you’re trying to optimise the performance of your PostgreSQL-based application you’re probably focusing on the usual tools: EXPLAIN (BUFFERS, ANALYZE), pg_stat_statements, auto_explain, log_min_duration_statement, etc.
Maybe you’re looking into lock contention with log_lock_waits, monitoring your checkpoint performance, etc too.
But did you think about network latency? Gamers know about network latency, but did you think it mattered for your application server?
Latency matters
Typical client/server round-trip network latencies can range from 0.01ms (localhost) through the ~0.5ms of a switched network, 5ms of WiFi, 20ms of ADSL, 300ms of intercontinental routing, and even more for things like satellite and WWAN links.
A trivial SELECT can take in the order of 0.1ms to execute server-side. A trivial INSERT can take 0.5ms.
Every time your application runs a query it has to wait for the server to respond with success/failure and possibly a result set, query metadata, etc. This incurs at least one network round trip delay.
When you’re working with small, simple queries network latency can be significant relative to the execution time of your queries if your database isn’t on the same host as your application.
Many applications, particularly ORMs, are very prone to running lots of quite simple queries. For example, if your Hibernate app is fetching an entity with a lazily fetched @OneToMany relationship to 1000 child items, it’s probably going to do 1001 queries thanks to the n+1 select problem, if not more. That means it’s probably spending 1000 times your network round-trip latency just waiting. You can left join fetch to avoid that… but then you transfer the parent entity 1000 times in the join and have to deduplicate it.
Similarly, if you’re populating the database from an ORM, you’re probably doing hundreds of thousands of trivial INSERTs… and waiting after each and every one for the server to confirm it’s OK.
It’s easy to focus on query execution time and try to optimise that, but there’s only so much you can do with a trivial INSERT INTO ... VALUES .... Drop some indexes and constraints, make sure it’s batched into a transaction, and you’re pretty much done.
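One thing you can do today, with no driver changes required, is to cut the number of round trips by sending multi-row inserts inside a single transaction; a sketch (the table name is made up):

BEGIN;
-- one round trip carries many rows instead of one
INSERT INTO items (id, name) VALUES
    (1, 'first'),
    (2, 'second'),
    (3, 'third');
-- ... more multi-row INSERTs ...
COMMIT;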
What about getting rid of all the network waits? Even on a LAN they start to add up over thousands of queries.
COPY
One way to avoid latency is to use COPY. To use PostgreSQL’s COPY support your application or driver has to produce a CSV-like set of rows and stream them to the server in a continuous sequence. Or the server can be asked to send your application a CSV-like stream.
Either way, the app can’t interleave a COPY with other queries, and copy-inserts must be loaded directly into a destination table. A common approach is to COPY into a temporary table, then from there do an INSERT INTO ... SELECT ..., UPDATE ... FROM ..., DELETE FROM ... USING ..., etc. to use the copied data to modify the main tables in a single operation.
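A minimal sketch of that pattern (the target table name is assumed):

-- Stage the rows with one streamed COPY, then apply them in a single statement.
CREATE TEMP TABLE staging (LIKE target_table INCLUDING DEFAULTS);
COPY staging FROM STDIN (FORMAT csv);
-- ... stream the CSV rows from the client here ...
INSERT INTO target_table
SELECT * FROM staging
ON CONFLICT DO NOTHING;   -- 9.5+, optional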
That’s handy if you’re writing your own SQL directly, but many application frameworks and ORMs don’t support it, plus it can only directly replace simple INSERTs. Your application, framework or client driver has to deal with the conversion to the special representation needed by COPY, look up any required type metadata itself, etc.
(Notable drivers that do support COPY include libpq, PgJDBC, psycopg2, and the Pg gem… but not necessarily the frameworks and ORMs built on top of them.)
PgJDBC – batch mode
PostgreSQL’s JDBC driver has a solution for this problem. It relies on support present in PostgreSQL servers since 8.4 and on the JDBC API’s batching features to send a batch of queries to the server then wait only once for confirmation that the entire batch ran OK.
Well, in theory. In reality some implementation challenges limit this so that batches can only be done in chunks of a few hundred queries at best. The driver can also only run queries that return result rows in batched chunks if it can figure out how big the results will be ahead of time. Despite those limitations, use of Statement.executeBatch() can offer a huge performance boost to applications that are doing tasks like bulk data loading into remote database instances.
Because it’s a standard API it can be used by applications that work across multiple database engines. Hibernate, for example, can use JDBC batching though it doesn’t do so by default.
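As a hedged illustration of what that looks like at the JDBC level, the sketch below batches a parameterised INSERT; the items table and connection details are made up for the example.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchInsertSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "postgres", "secret");
        conn.setAutoCommit(false);

        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO items (id, name) VALUES (?, ?)")) {
            for (int i = 0; i < 10000; i++) {
                ps.setInt(1, i);
                ps.setString(2, "item " + i);
                ps.addBatch();            // queue the statement locally instead of sending it now
                if (i % 500 == 499) {
                    ps.executeBatch();    // send a chunk and wait once, not once per row
                }
            }
            ps.executeBatch();            // send whatever is left
        }
        conn.commit();
        conn.close();
    }
}

For Hibernate, the hibernate.jdbc.batch_size configuration property is the usual switch for turning this behaviour on.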
libpq and batching
Most (all?) other PostgreSQL drivers have no support for batching. PgJDBC implements the PostgreSQL protocol completely independently, whereas most other drivers internally use the C library libpq that’s supplied as part of PostgreSQL.
libpq does not support batching. It does have an asynchronous non-blocking API, but the client can still only have one query “in flight” at a time. It must wait until the results of that query are received before it can send another.
The PostgreSQL server supports batching just fine, and PgJDBC uses it already. So I’ve written batch support for libpq and submitted it as a candidate for the next PostgreSQL version. Since it only changes the client, if accepted it’ll still speed things up when connecting to older servers.
I’d be really interested in feedback from authors and advanced users of libpq-based client drivers and developers of libpq-based applications. The patch applies fine on top of PostgreSQL 9.6beta1 if you want to try it out. The documentation is detailed and there’s a comprehensive example program.
Performance
I thought a hosted database service like RDS or Heroku Postgres would be a good example of where this kind of functionality would be useful. In particular, accessing them from outside their own networks really shows how much latency can hurt.
At ~320ms network latency:
- 500 inserts without batching: 167.0s
- 500 inserts with batching: 1.2s
… which is over 120x faster.
You won’t usually be running your app over an intercontinental link between the app server and the database, but this serves to highlight the impact of latency. Even over a unix socket to localhost I saw over a 50% performance improvement for 10000 inserts.
Batching in existing apps
It is unfortunately not possible to automatically enable batching for existing applications. Apps have to use a slightly different interface where they send a series of queries and only then ask for the results.
It should be fairly simple to adapt apps that already use the asynchronous libpq interface, especially if they use non-blocking mode and a select()/poll()/epoll()/WaitForMultipleObjectsEx loop. Apps that use the synchronous libpq interfaces will require more changes.
Batching in other client drivers
Similarly, client drivers, frameworks and ORMs will generally need interface and internal changes to permit the use of batching. If they’re already using an event loop and non-blocking I/O they should be fairly simple to modify.
I’d love to see Python, Ruby, etc users able to access this functionality, so I’m curious to see who’s interested. Imagine being able to do this:
import psycopg2
conn = psycopg2.connect(...)
cur = conn.cursor()

# this is just an idea, this code does not work with psycopg2:
futures = [ cur.async_execute(sql) for sql in my_queries ]
for future in futures:
    result = future.result # waits if result not ready yet
    ... process the result ...
conn.commit()
Asynchronous batched execution doesn’t have to be complicated at the client level.
COPY is fastest
Where practical, clients should still favour COPY. Here are some results from my laptop:
inserting 1000000 rows batched, unbatched and with COPY

batch insert elapsed:      23.715315s
sequential insert elapsed: 36.150162s
COPY elapsed:              1.743593s
Done.
Batching the work provides a surprisingly large performance boost even on a local unix socket connection…. but COPY leaves both individual insert approaches far behind it in the dust.
Use COPY.
The image
The image for this post is of the Goldfields Water Supply Scheme pipeline from Mundaring Weir near Perth in Western Australia to the inland (desert) goldfields. It’s relevant because it took so long to finish and was under such intense criticism that its designer and main proponent, C. Y. O’Connor, committed suicide 12 months before it was commissioned. Locally, people often (incorrectly) say that he died after the pipeline was built, when no water initially flowed – it had taken so long that everyone assumed the project had failed. Then, weeks later, out the water poured.
Daniel Pocock: PostBooks, PostgreSQL and pgDay.ch talk
PostBooks 4.9.5 was recently released and the packages for Debian (including jessie-backports), Ubuntu and Fedora have been updated.
Postbooks at pgDay.ch in Rapperswil, Switzerland
pgDay.ch is coming on Friday, 24 June. It is at the HSR Hochschule für Technik Rapperswil, at the eastern end of Lake Zurich.
I'll be making a presentation about Postbooks in the business track at 11:00.
Getting started with accounting using free, open source software
If you are not currently using a double-entry accounting system or if you are looking to move to a system that is based on completely free, open source software, please see my comparison of free, open source accounting software.
Free and open source solutions offer significant advantages: flexibility, the freedom for businesses to choose any programmer to modify the code, and standard features such as SQL back-ends, multi-user support and multi-currency support. These are all things that proprietary vendors charge extra money for.
Accounting software is the lowest common denominator in the world of business software, so people keen on the success of free and open source software may find that encouraging businesses to use one of these solutions is a great way to lay a foundation where other free software solutions can thrive.
PostBooks new web and mobile front end
xTuple, the team behind Postbooks, has been busy developing a new Web and Mobile front-end for their ERP, CRM and accounting suite, powered by the same PostgreSQL backend as the Linux desktop client.
More help is needed to create official packages of the JavaScript dependencies before the Web and Mobile solution itself can be packaged.
Pavel Stehule: plpgsql_check 1.0.5 released
https://manager.pgxn.org/distributions/plpgsql_check/1.0.5
https://github.com/okbob/plpgsql_check/releases/tag/v1.0.5
Raghavendra Rao: Ways to access Oracle Database in PostgreSQL
Below are a few methods for connecting to an Oracle database from PostgreSQL:
- Using ODBC Driver
- Using Foreign Data Wrappers
- Using Oracle Call Interface (OCI) Driver
Using ODBC Driver
Install unixODBC Driver
tar -xvf unixODBC-2.3.4.tar.gz
cd unixODBC-2.3.4/
./configure --sysconfdir=/etc
make
make install
Install Oracle ODBC Driver
rpm -ivh oracle-instantclient11.2-basic-11.2.0.4.0-1.x86_64.rpm
rpm -ivh oracle-instantclient11.2-odbc-11.2.0.4.0-1.x86_64.rpm
rpm -ivh oracle-instantclient11.2-devel-11.2.0.4.0-1.x86_64.rpm
Binary/Libraries location: /usr/lib/oracle/11.2/client64
Install ODBC-Link
tar -zxvf ODBC-Link-1.0.4.tar.gz
cd ODBC-Link-1.0.4
export PATH=/opt/PostgreSQL/9.5/bin:$PATH
which pg_config
make USE_PGXS=1
make USE_PGXS=1 install
Libraries and SQL files location: /opt/PostgreSQL/9.5/share/postgresql/contrib
Installation will create an ODBC-Link module SQL file in the $PGHOME/contrib directory. Load the SQL file, which will create a schema named "odbclink" with the necessary functions in it.
psql -p 5432 -d oratest -U postgres -f /opt/PostgreSQL/9.5/share/postgresql/contrib/odbclink.sql
At this point, we have installed the unixODBC driver, the Oracle ODBC driver and the ODBC-Link module for PostgreSQL. As a first step, we need to create a DSN using the Oracle ODBC driver.
Edit the /etc/odbcinst.ini file and add the driver definition:
## Driver for Oracle
[MyOracle]
Description =ODBC for oracle
Driver =/usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
UsageCount=1
FileUsage = 1
Driver Logging = 7
Edit the /etc/odbc.ini file and create the DSN with the driver mentioned in /etc/odbcinst.ini:
## Host: pg.raghav-node1.com, PORT: 1521
## Oracle Instance Name: ORA11G, Username: mmruser, Password: mmruser
## ODBC Data source: Ora
[Ora]
Description = myoracledb database
Driver = MyOracle
Trace = yes
TraceFile = /tmp/odbc_oracle.log
Database = //pg.raghav-node1.com:1521/ORA11G
UserID = mmruser
Password = mmruser
Port = 1521
After creating the DSN, load all the Oracle and unixODBC driver libraries by setting environment variables, and test connectivity using the OS command-line tools "dltest" and "isql":
[root@172.16.210.161 ~]# export ORACLE_HOME=/usr/lib/oracle/11.2/client64
[root@172.16.210.161 ~]# export LD_LIBRARY_PATH=/usr/local/unixODBC-2.3.4/lib:/usr/lib/oracle/11.2/client64/lib
[root@172.16.210.161 ~]# export ODBCINI=/etc/odbc.ini
[root@172.16.210.161 ~]# export ODBCSYSINI=/etc/
[root@172.16.210.161 ~]# export TWO_TASK=//pg.raghav-node1.com:1521/ORA11G
[root@172.16.210.161 ~]# dltest /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
SUCCESS: Loaded /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
[root@172.16.210.161 ~]# isql ora -v
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL>
Now, set the same environment variables for the postgres user so the libraries are loaded, and restart the PostgreSQL cluster for them to take effect. Then connect to PostgreSQL and call the odbclink functions to connect to the Oracle database.
[root@172.16.210.163 ~]# su - postgres
[postgres@172.16.210.163 ~]$ export ORACLE_HOME=/usr/lib/oracle/11.2/client64
[postgres@172.16.210.163 ~]$ export LD_LIBRARY_PATH=/usr/local/unixODBC-2.3.4/lib:/usr/lib/oracle/11.2/client64/lib
[postgres@172.16.210.163 ~]$ export ODBCINI=/etc/odbc.ini
[postgres@172.16.210.163 ~]$ export ODBCSYSINI=/etc/
[postgres@172.16.210.163 ~]$ export TWO_TASK=//pg.raghav-node1.com:1521/ORA11G
[postgres@172.16.210.163 ~]$ dltest /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
SUCCESS: Loaded /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
[postgres@172.16.210.163 ~]$ /opt/PostgreSQL/9.5/bin/pg_ctl -D /opt/PostgreSQL/9.5/data/ stop -mf
[postgres@172.16.210.163 ~]$ /opt/PostgreSQL/9.5/bin/pg_ctl -D /opt/PostgreSQL/9.5/data/ start
[postgres@172.16.210.163 ~]$ psql
psql.bin (9.5.2)
Type "help" for help.
postgres=# select odbclink.connect('DSN=Ora');
connect
---------
1
(1 row)
Cool, right? For retrieving and manipulating data, refer to the ODBC-Link README file.
Using Foreign Data Wrappers
SQL/MED (Management of External Data) is an extension to the SQL standard that allows managing external data stored outside the database. SQL/MED provides two components: foreign data wrappers and datalinks. PostgreSQL introduced Foreign Data Wrappers (FDW) in version 9.1 with read-only support and added write support for this part of the SQL standard in version 9.3. Today the implementation has a good number of features around it, and many varieties of FDW are available to access different remote SQL databases.
oracle_fdw provides an easy and efficient way to access an Oracle database, and in my opinion it is one of the coolest methods for accessing a remote database. To compile oracle_fdw with PostgreSQL 9.5, we need the Oracle Instant Client libraries and pg_config set in PATH. We can use the same Oracle Instant Client libraries that we used for ODBC-Link. Let's see how it works.
First, set the environment variables for the Oracle Instant Client (OIC) libraries and pg_config:
export PATH=/opt/PostgreSQL/9.5/bin:$PATH
export ORACLE_HOME=/usr/lib/oracle/11.2/client64
export LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib
Unzip the oracle_fdw module and compile it with PostgreSQL 9.5:
unzip oracle_fdw-1.4.0.zip
cd oracle_fdw-1.4.0/
make
make install
Now switch to the 'postgres' user, restart the cluster so that it loads the Oracle Instant Client libraries required by the oracle_fdw extension, and create the extension inside the database:
[postgres@172.16.210.161 9.5]$ export ORACLE_HOME=/usr/lib/oracle/11.2/client64/lib
[postgres@172.16.210.161 9.5]$ export LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib:$LD_LIBRARY_PATH
[postgres@172.16.210.161 9.5]$ /opt/PostgreSQL/9.5/bin/pg_ctl -D /opt/PostgreSQL/9.5/data/ stop -mf
[postgres@172.16.210.161 9.5]$ /opt/PostgreSQL/9.5/bin/pg_ctl -D /opt/PostgreSQL/9.5/data/ start
[postgres@172.16.210.161 9.5]$ psql
Password:
psql.bin (9.5.2)
Type "help" for help.
postgres=# create extension oracle_fdw;
CREATE EXTENSION
Now you can access the Oracle database.
postgres=# CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw OPTIONS (dbserver '//pg.raghav-node1.com/ORA11G');
CREATE SERVER
postgres=# GRANT USAGE ON FOREIGN SERVER oradb TO postgres;
GRANT
postgres=# CREATE USER MAPPING FOR postgres SERVER oradb OPTIONS (user 'scott', password 'tiger');
CREATE USER MAPPING
postgres=#
postgres=# CREATE FOREIGN TABLE oratab (ecode integer,name char(30)) SERVER oradb OPTIONS(schema 'SCOTT',table 'EMP');
CREATE FOREIGN TABLE
postgres=# select * from oratab limit 3;
ecode | name
-------+--------------------------------
7369 | SMITH
7499 | ALLEN
7521 | WARD
(3 rows)
Using Oracle Call Interface (OCI) Drivers
Oracle Call Interface (OCI) is a type-2 driver freely available on the Oracle site that allows clients to connect to an Oracle database. EDB Postgres Advanced Server (also called EPAS), a proprietary product, has a built-in OCI-based database link module called dblink_ora, which connects to the Oracle database using the Oracle OCI drivers. All you have to do to use the dblink_ora module is install EPAS (installation not covered here) and tell EPAS where it can find the Oracle OCI driver libraries. We can make use of the same Oracle Instant Client by specifying its library location in the LD_LIBRARY_PATH environment variable and restarting the EPAS cluster for the change to take effect.
First, switch to the "enterprisedb" user, load the libraries, and restart the cluster. That's all; we are ready to access the Oracle database.
[enterprisedb@172.16.210.129 ~]$ export LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib
[enterprisedb@172.16.210.129 bin]$ /opt/PostgresPlus/9.5AS/bin/pg_ctl -D /opt/PostgresPlus/9.5AS/data/ restart
Note: EPAS connects to the Oracle database using the Oracle Instant Client library "libclntsh.so". If the library is not present in the Oracle client library location, create a symbolic link named libclntsh.so pointing to libclntsh.so.version.number. Refer to the documentation.
[enterprisedb@172.16.210.129 bin]$ psql
psql.bin (9.5.0.5)
Type "help" for help.
edb=# select dblink_ora_connect('oraconn','localhost','edbora','edbuser','edbuser',1521);
dblink_ora_connect
--------------------
OK
(1 row)
In the example, dblink_ora_connect establishes a connection to an Oracle database with the user-specified connection information. Later, using the link name ('oraconn' in my case), we can perform operations like SELECT, INSERT, DELETE, UPDATE and COPY using the dblink_ora* functions. All the functions are described in the EnterpriseDB documentation here.
All of the above methods can be very handy if you are working on migration projects. Hope this is helpful. Thank you.
--Raghav
Magnus Hagander: www.postgresql.org is now https only
We've just flipped the switch on www.postgresql.org to be served on https only. This has been done for a number of reasons:
- In response to popular request
- Google, and possibly other search engines, have started to give higher scores to sites on https, and we were previously redirecting accesses to cleartext
- Simplification of the site code, which now doesn't have to keep track of which pages need to be secure and which do not
- Prevention of evil things like WiFi hotspot providers injecting ads or javascript into the pages
We have not yet enabled HTTP Strict Transport Security, but will do so in a couple of days once we have verified all functionality. We have also not enabled HTTP/2 yet; this will probably come at a future date.
Please help us out with testing this, and let us know if you find something that's not working, by emailing the pgsql-www mailinglist.
There are still some other postgresql.org websites that are not available over https, and we will be working on those as well over the coming weeks or months.
Umair Shahid: Using Hibernate Query Language (HQL) with PostgreSQL
In my previous blog, I talked about using Java arrays to talk to PostgreSQL arrays. This blog is going to go one step further in using Java against the database. Hibernate is an ORM implementation available to use with PostgreSQL. Here we discuss its query language, HQL.
The syntax for HQL is very close to that of SQL, so anyone who knows SQL should be able to ramp up very quickly. The major difference is that rather than addressing tables and columns, HQL deals in objects and their properties. Essentially, it is a complete object oriented language to query your database using Java objects and their properties. As opposed to SQL, HQL understands inheritance, polymorphism, & association. Code written in HQL is translated by Hibernate to SQL at runtime and executed against the PostgreSQL database.
An important point to note here is that references to objects and their properties in HQL are case-sensitive; all other constructs are case-insensitive.
Why Use HQL?
The main driver to using HQL would be database portability. Because its implementation is designed to be database agnostic, if your application uses HQL for querying the database, you can interchange the underlying database by making simple changes to the configuration XML file. As opposed to native SQL, the actual code will remain largely unchanged if your application starts talking to a different database.
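As a hedged sketch of what that portability looks like in practice, the snippet below sets the PostgreSQL-specific pieces programmatically (these properties normally live in the configuration XML file); the connection details are placeholders, not values from this article.

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class PortableConfigSketch {
    public static SessionFactory buildSessionFactory() {
        Configuration cfg = new Configuration()
            // Only these database-specific settings change when you swap the backend;
            // the HQL in the application stays the same.
            .setProperty("hibernate.dialect", "org.hibernate.dialect.PostgreSQL9Dialect")
            .setProperty("hibernate.connection.driver_class", "org.postgresql.Driver")
            .setProperty("hibernate.connection.url", "jdbc:postgresql://localhost/postgres")
            .setProperty("hibernate.connection.username", "postgres")
            .setProperty("hibernate.connection.password", "secret");
        return cfg.buildSessionFactory();
    }
}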
Prominent Features
A complete list of features implemented by HQL can be found on their website. Here, we present examples of some basic and salient features that will help you get going with HQL. These examples use a table named ‘largecities’ that lists the 10 largest metropolitan areas of the world. The table definition and data are:
postgres=# \d largecities
        Table "public.largecities"
 Column |          Type          | Modifiers
--------+------------------------+-----------
 rank   | integer                | not null
 name   | character varying(255) |
Indexes:
    "largecities_pkey" PRIMARY KEY, btree (rank)
postgres=# select * from largecities;
 rank |    name
------+-------------
    1 | Tokyo
    2 | Seoul
    3 | Shanghai
    4 | Guangzhou
    5 | Karachi
    6 | Delhi
    7 | Mexico City
    8 | Beijing
    9 | Lagos
   10 | Sao Paulo
(10 rows)
HQL works with a class that this table is mapped to in order to create objects in memory with its data. The class is defined as:
@Entity
public class LargeCities {
    @Id
    private int rank;
    private String name;

    public int getRank() { return rank; }
    public String getName() { return name; }
    public void setRank(int rank) { this.rank = rank; }
    public void setName(String name) { this.name = name; }
}
Notice the @Entity and @Id annotations, which declare the class ‘LargeCities’ as an entity and the property ‘rank’ as its identifier.
The FROM Clause
The FROM clause is used if you want to load all rows of the table as objects in memory. The sample code given below retrieves all rows from table ‘largecities’ and lists out the data from objects to stdout.
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("FROM LargeCities");
    List<LargeCities> cities = (List<LargeCities>)query.list();
    session.close();
    for (LargeCities c : cities)
        System.out.println(c.getRank() + " " + c.getName());
} catch (Exception e) {
    System.out.println(e.getMessage());
}
Note that ‘LargeCities’ referred to in the HQL query is not the ‘largecities’ table but rather the ‘LargeCities’ class. This is the object oriented nature of HQL.
Output from the above program is as follows:
1 Tokyo
2 Seoul
3 Shanghai
4 Guangzhou
5 Karachi
6 Delhi
7 Mexico City
8 Beijing
9 Lagos
10 Sao Paulo
The WHERE Clause
There can be instances where you would want to specify a filter on the objects you want to see. Taking the above example forward, you might want to see just the top 5 largest metropolitans in the world. A WHERE clause can help you achieve that as follows:
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("FROM LargeCities WHERE rank < 6");
    List<LargeCities> cities = (List<LargeCities>)query.list();
    session.close();
    for (LargeCities c : cities)
        System.out.println(c.getRank() + " " + c.getName());
} catch (Exception e) {
    System.out.println(e.getMessage());
}
Output from the above code is:
1 Tokyo
2 Seoul
3 Shanghai
4 Guangzhou
5 Karachi
The SELECT Clause
The plain FROM query above retrieves all columns from the table as properties of the object in Java. There are instances where you would want to retrieve only selected properties rather than all of them. In such a case, you can specify a SELECT clause that identifies the precise columns you want to retrieve.
The code below selects just the city name for retrieval. Note that, because it is now just one column being retrieved, Hibernate loads the result as a list of Strings rather than a list of LargeCities objects.
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("SELECT name FROM LargeCities");
    List<String> cities = (List<String>)query.list();
    session.close();
    for (String c : cities)
        System.out.println(c);
} catch (Exception e) {
    System.out.println(e.getMessage());
}
The output of this code is:
Tokyo
Seoul
Shanghai
Guangzhou
Karachi
Delhi
Mexico City
Beijing
Lagos
Sao Paulo
Named Parameters
Much like prepared statements, you can have named parameters through which you can use variables to assign values to HQL queries at runtime. The following example uses a named parameter to find out the rank of ‘Beijing’.
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("SELECT rank FROM LargeCities WHERE name = :city_name");
    query.setParameter("city_name", "Beijing");
    List<Integer> rank = (List<Integer>)query.list();
    session.getTransaction().commit();
    session.close();
    for (Integer c : rank)
        System.out.println("Rank is: " + c.toString());
} catch (Exception e) {
    System.out.println(e.getMessage());
}
Output for this code:
Rank is: 8
Pagination
When programming, numerous scenarios present themselves where code is required to be processed in chunks or pages. The process is called pagination of data and HQL provides a mechanism to handle that with a combination of setFirstResult and setMaxResults, methods of the Query interface. As the names suggest, setFirstResult allows you to specify which record should be the starting point for record retrieval while setMaxResults allows you to specify the maximum number of records to retrieve. This combination is very helpful in Java or in web apps where a large result set is shown split into pages and the user has the ability to specify the page size.
The following code breaks up our ‘largecities’ examples into 2 pages and retrieves data for them.
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("FROM LargeCities");
    query.setFirstResult(0);
    query.setMaxResults(5);
    List<LargeCities> cities = (List<LargeCities>)query.list();
    System.out.println("*** Page 1 ***");
    for (LargeCities c : cities)
        System.out.println(c.getRank() + " " + c.getName());
    query.setFirstResult(5);
    cities = (List<LargeCities>)query.list();
    System.out.println("*** Page 2 ***");
    for (LargeCities c : cities)
        System.out.println(c.getRank() + " " + c.getName());
    session.close();
} catch (Exception e) {
    System.out.println(e.getMessage());
}
An important point to keep in mind here is that Hibernate usually does pagination in memory rather than at the database query level. This means that for large data sets, it might be more efficient to use cursors, temp tables, or some other construct for pagination.
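One hedged alternative, sketched below using the LargeCities mapping from earlier, is to stream rows through Hibernate's ScrollableResults instead of paging them in memory; whether this actually results in a server-side cursor depends on the driver's fetch-size and transaction settings.

import org.hibernate.Query;
import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;

public class ScrollSketch {
    public static void printAll(Session session) {
        session.beginTransaction();                      // keep a transaction open while streaming
        Query query = session.createQuery("FROM LargeCities");
        query.setFetchSize(50);                          // hint: fetch rows from the server in chunks
        ScrollableResults rows = query.scroll(ScrollMode.FORWARD_ONLY);
        while (rows.next()) {
            LargeCities c = (LargeCities) rows.get(0);   // one entity at a time, not a full list
            System.out.println(c.getRank() + " " + c.getName());
        }
        rows.close();
        session.getTransaction().commit();
    }
}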
Other Features
A comprehensive list of features is available on the Hibernate website, but a few more worth mentioning here are (a short sketch of two of them follows the list):
- UPDATE Clause
- DELETE Clause
- INSERT Clause
- JOINs
- Aggregate Methods
- avg
- count
- max
- min
- sum
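As a small, hedged illustration of a couple of these features (an aggregate and a bulk UPDATE), using the LargeCities entity from above; the query text is invented for the example:

import org.hibernate.Query;
import org.hibernate.Session;

public class OtherFeaturesSketch {
    public static void demo(Session session) {
        session.beginTransaction();

        // Aggregate method: count the mapped entities.
        Query countQuery = session.createQuery("SELECT count(c) FROM LargeCities c");
        Long total = (Long) countQuery.uniqueResult();
        System.out.println("Number of cities: " + total);

        // Bulk UPDATE clause: executeUpdate() returns the number of affected rows.
        Query update = session.createQuery(
                "UPDATE LargeCities SET name = upper(name) WHERE rank > :r");
        update.setParameter("r", 5);
        System.out.println("Updated rows: " + update.executeUpdate());

        session.getTransaction().commit();
    }
}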
Drawbacks of Using HQL
HQL gives its users a lot of flexibility and a rich set of options to use while talking to a database. The flexibility does come at a price, however. Because HQL is designed to be generic and largely database-agnostic, you should watch out for the following when using it.
- At times, you will need to use advanced features and functions that are specific to PostgreSQL. As an example, you might want to harness the power of the newly introduced JSONB data type, or use window functions to analyze your data. Because HQL tries to be as generic as possible, you will need to fall back to native SQL in order to use such features (a sketch follows this list).
- Because of the way left joins are designed, if you are joining an object to another table / object in a one-to-many or many-to-many format, you can potentially get duplicate data. This problem is exacerbated in case of cascading left joins and HQL has to preserve references to these duplicates, essentially ending up transferring a lot of duplicate data. This has the potential to significantly impact performance.
- Because HQL does the object-relational mapping itself, you don’t get full control over how and what data gets fetched. One such infamous issue is the N+1 problem. Although you can find workarounds within HQL, identifying the problem can at times get very tricky.
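For the first point, here is a hedged sketch of dropping down to native SQL from the same session, using a window function over the largecities table from the earlier examples:

import java.util.List;
import org.hibernate.SQLQuery;
import org.hibernate.Session;

public class NativeQuerySketch {
    public static void demo(Session session) {
        session.beginTransaction();
        // createSQLQuery() sends the text to PostgreSQL as-is, so database-specific
        // features such as window functions (or JSONB operators) are available.
        SQLQuery query = session.createSQLQuery(
                "SELECT name, row_number() OVER (ORDER BY rank DESC) AS rn FROM largecities");
        List<Object[]> rows = (List<Object[]>) query.list();
        for (Object[] row : rows)
            System.out.println(row[0] + " -> " + row[1]);
        session.getTransaction().commit();
    }
}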
Pavel Stehule: Orafce 3.3.0 was released
Nikolay Shaplov: postgres: reloption ALTER INDEX bug
For example, if you do this for a bloom index:
alter index bloomidx set ( length=15 );
postgres will run this successfully and change the value of the reloptions attribute in pg_class, but the bloom index will work incorrectly afterwards.
And there is no way to forbid this from inside an extension.
I think I would add a flag to the reloption descriptor that tells whether it is allowed to change that reloption using ALTER INDEX or not.