Joshua Drake: Thanks to EDB
Josh Berkus: Changing PostgreSQL Version Numbering
9 . 5 . 3
Major1 . Major2 . Minor
That is, the second number is the "major version" number, reflecting our annual release. The third number is the update release number, reflecting cumulative patch releases. Therefore "9.5.3" is the third update to version 9.5.
The problem is the first number, in that we have no clear criteria for when to advance it. Historically, we've advanced it because of major milestones in feature development: crash-proofing for 7.0, the Windows port for 8.0, and in-core replication for 9.0. However, as PostgreSQL's feature set matures, it has become less and less clear which milestones should be considered "first digit" releases. The result is arguments about version numbering on the mailing lists every year, which waste time and irritate developers.
As a result, the PostgreSQL Project is proposing a version numbering change, to the following:
10 . 2
Major . Minor
Thus "10.2" would be the second update release for major version 10. The version we release in 2017 would be "10" (instead of 10.0), and the version we release in 2018 will be "11".
The idea is that this will both put an end to the annual arguments and end the need to explain to users that going from 9.5 to 9.6 is really a major version upgrade requiring downtime.
Obviously, there is potential for breakage of a lot of tools, scripts, automation, packaging and more in this. That's one reason we're discussing this now, almost a year before 10 beta is due to come out.
The reason for this blog post is that I'm looking for feedback on what this version number change will break for you. Particularly, I want to hear from driver authors, automation engineers, cloud owners, application stack owners, and other folks who are "downstream" of PostgreSQL. Please let us know what technical problems this will cause for you, and how difficult it will be to resolve them in the next nine months.
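One mitigation worth noting for downstream code, offered here as a sketch rather than project guidance: compare against server_version_num instead of parsing the display version string.

-- server_version_num is already a single integer (90503 for 9.5.3), so
-- numeric comparisons keep working no matter how many parts the display
-- version has; a hypothetical 10.2 would presumably report 100002.
SELECT current_setting('server_version_num')::int AS version_num,
       current_setting('server_version_num')::int >= 90500 AS at_least_9_5;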
We are not, at this point, interested in comments on how you feel about the version change or alternate version naming schemes. That discussion has already happened, at length. You can read it here, here, and here, as well as at the developer meeting.
Places to provide feedback:
- comments on this blog
- posts on the pgsql-hackers or pgsql-advocacy mailing lists
- PostgreSQL Facebook group
Thanks for any feedback you can provide.
Note that the next release of PostgreSQL, due later this year, will be "9.6" regardless. We're deciding what we do after that.
US PostgreSQL Association: Unsung Heroes: Steve Atkins
Continuing my Building a Better community series, I contacted Steve Atkins. There was an interesting twist to this particular Unsung Hero. I initially contacted lluad on #postgresql (irc.freenode.net) due to his continuous and untiring efforts to help people on channel. What I didn't know is that Steve Atkins is actually lluad. Read on for some interesting bits.
How do you use PostgreSQL?
I use PostgreSQL almost any time I need to store or process structured data.
Oleg Bartunov: Refined search
It is better to search with a wide query and use ranking (computed against the refined query) to sort the results.
Example:
SELECT title, ts_rank_cd(fts, to_tsquery('english', 'x-ray & russian')) AS rank
  FROM apod
 WHERE fts @@ to_tsquery('english', 'x-ray & russian')
 ORDER BY rank DESC
 LIMIT 5;

                  title                  |   rank
-----------------------------------------+-----------
 The High Energy Heart Of The Milky Way  | 0.0240938
(1 row)

SELECT title, ts_rank_cd(fts, to_tsquery('english', 'x-ray & russian')) AS rank
  FROM apod
 WHERE fts @@ to_tsquery('english', 'x-ray')
 ORDER BY rank DESC
 LIMIT 5;

                  title                  |   rank
-----------------------------------------+-----------
 The High Energy Heart Of The Milky Way  | 0.0240938
 X-Ray Jet From Centaurus A              | 0
 Barnacle Bill And Sojourner             | 0
 The Crab Nebula in X-Rays               | 0
 M27: The Dumbbell Nebula                | 0
(5 rows)
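For context, the example assumes an apod table with a precomputed tsvector column fts and a GIN index on it; a minimal sketch of that setup (the body column name is an assumption) might be:

-- Precompute the tsvector once and index it, so the @@ searches above are fast.
ALTER TABLE apod ADD COLUMN fts tsvector;
UPDATE apod SET fts = to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''));
CREATE INDEX apod_fts_idx ON apod USING gin (fts);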
Vasilis Ventirozos: Repairing clog corruptions
A replica and backups should always be in place, and the server should be properly monitored. Unfortunately, this server was not managed by us, so none of the above was in place.
At first, I saw entries in the logs like:
FUN TIMES :)
I started by zeroing out all the missing clog files that I found referenced in the logs:
dd if=/dev/zero of=/var/db/pgdata/pg_clog/0114 bs=256k count=1
dd if=/dev/zero of=/var/db/pgdata/pg_clog/00D1 bs=256k count=1
dd if=/dev/zero of=/var/db/pgdata/pg_clog/0106 bs=256k count=1
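If you need to work out which pg_clog segment a given transaction ID lives in, the segment name is just the XID divided by the number of transactions per 256 kB file, written as zero-padded hex. A rough sketch, assuming the default 8 kB block size and a made-up XID:

-- 8192 bytes/page * 4 transactions/byte * 32 pages/segment
-- = 1048576 transactions per clog file.
SELECT upper(lpad(to_hex(1234567890 / 1048576), 4, '0')) AS clog_segment;
-- returns 0499, i.e. the file pg_clog/0499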
Pallavi Sontakke: How Postgres-XL is tested
The purpose of this blog is to explain the process of Quality Analysis that Postgres-XL goes through internally at 2ndQuadrant. Here, I describe the bare minimum tests that each release goes through, in addition to the other tests carried out by many 2ndQuadrant and other community members.
- Regression tests are carried out to ensure that the defect fixes or enhancements to PostgreSQL have not affected Postgres-XL.
We want to keep up-to-date with PostgreSQL features and performance enhancements. Also, many tools may work only with newer PostgreSQL releases. So, we merge each PostgreSQL minor release in a timely fashion. When merging the regression tests therein, we need to continuously and consciously verify whether the new features are fully supported in Postgres-XL. Sometimes there is a gap between the expected outputs of PostgreSQL and Postgres-XL, which has to be tracked.
- Functional tests are done to validate that the functionalities of Postgres-XL are working as per business requirements. We ensure that all features are functioning as they are expected to.
Initially we created the ‘Major differences and limitations’ test module. The Postgres-XL project (initially Postgres-XC) was based on PostgreSQL 9.2. It picked up speed again during the PostgreSQL 9.5 development timeframe. Due to this, there is a known gap of features that Postgres-XL would like to support, but does not currently. We keep track of these with our xl_* functional tests. These tests cover limitations like materialized views, event triggers, foreign data wrappers, etc. On the other hand, they also cover positive functional tests for features like BRIN, logical decoding of WAL data, jsonb, etc.
Additionally, each time a new feature or enhancement is added, we keep adding tests to validate the functionality.
- Usability tests are performed to validate the ease with which the user interfaces can be used.
We have cluster-setup-utility tests. Creating a Postgres-XL cluster manually requires quite a few steps. We have automated these for ease of use with the pgxc_ctl utility. For simple prototype use, we have added a ‘prepare minimal’ mode. For seasoned users, we have added a ‘prepare empty’ mode, where they can provision node by node for their specific use. We have automated TAP tests for this utility.
- Recovery tests are done to test how well Postgres-XL is able to recover from crashes, hardware failures and other similar problems.
Since Postgres-XL is deployed as a cluster, we realize the importance of data consistency across node crashes. We have crash-recovery test scripts that crash/kill nodes and bring them up again. In parallel sessions, we keep making database changes with SQL statements, transactions or prepared transactions. We verify that nodes (or their configured standbys) come up fine. We perform data sanity checks to verify proper recovery.
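As a rough illustration of the kind of check involved (not the actual test scripts; the table name and transaction id are made up), a prepared transaction left open across a crash should still be visible and committable after restart:

-- requires max_prepared_transactions > 0
BEGIN;
INSERT INTO crash_test (id, payload) VALUES (1, 'before crash');
PREPARE TRANSACTION 'crash_tx_1';

-- ... crash and restart the node here ...

SELECT gid FROM pg_prepared_xacts;   -- 'crash_tx_1' should still be listed
COMMIT PREPARED 'crash_tx_1';
SELECT count(*) FROM crash_test;     -- the row must now be visible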
- Bug tracking is done for all known/reported bugs in our internal tracking system.
- Future steps
We are looking into using SQLsmith to generate random queries to detect further problems in Postgres-XL.
Also, we are in the process of setting up a continuous integration server for automated builds, deployments and tests of the Postgres-XL project.
Scott Mead: Helpful PostgreSQL Logging and Defaults
I use PostgreSQL every single day. I develop against multiple versions of the database (4 versions running in parallel as I write), I have apps that use it, and I do my daily DBA-ing. The biggest question I get from newbies and veterans alike is: “What are your PostgreSQL defaults?”
If you follow postgres, you already know that the default configuration (postgresql.conf) is very conservative from a resource (CPU, IOPS, Disk Space) perspective. Over the last 10 years, we [the community] have developed some straightforward and easy to understand formulas that will help you tune… shared_buffers for example. The item that always gets left out though is logging. As a developer, I’m always looking for ways to see “How the database is answering the questions I ask”. When I get a new postgres instance set up, I have a personal (somewhere in the cobwebs) checklist that I run. Some are based on the purpose of the deployment (shared_buffers, work_mem, etc…), some are things that I always set. Aside from memory, the biggest set of “standard” items I set are all related to logging. I’m big on monitoring (my pg_stat_activity patch was accepted back for 9.2) and having the right detail presented to me is important.
TL;DR
Set up logging so that we can see exactly what is happening and when. In addition to setting up the log files, you can pick up a great log analyzer (like pgBadger or pgCluu) that will break all of this down into a gorgeous, easy-to-use report.
Scott’s logging defaults
logging_collector = on
log_filename = ‘postgresql-%a.log’
This sets the name of the actual file that log messages will be written to (in your shiny, new pg_log / log_destination directory). The %a means that you’ll see Mon, Tue, Wed, Thu, Fri, Sat, Sun. The patterns are based on standard strftime escapes (man page). The reason that I like using %a is that you get auto-rotation of the log files: you will keep one week’s worth of logs, and when you rotate 7 days later, you won’t be creating a huge number of log files.
Note: Depending on my requirements, I will adjust this for production. If I have any special retention policy that I need to abide by, I’ll make the filename: postgresql-YYYY-MM-DD-HH24mmSS.log (log_filename = ‘postgresql-%Y-%m-%d-%H%M%S.log’ ). The trouble with this is that you’ll need to deal with log cleanup yourself (cron to archive logs + 30 days old … ).
log_truncate_on_rotation=on
This essentially says “when I switch log files, if a log already exists with that name, truncate the existing one and write new logs to an empty file”.
For example, on Monday, May 15th, we wrote a log file:
postgresql-Mon.log
Now, at midnight on Monday, May 22nd, postgres is going to rotate its log back to:
postgresql-Mon.log
This is our data from last week. If you leave log_truncate_on_rotation = off (the default), then postgres will append to that log file. That means, in December, you’ll have data for every Monday throughout the year in the same file. If you set this to on, it’ll nuke the old data and give you only data from the most recent Monday. If you need to keep log files for longer than 7 days, I recommend you use a more complete name for your log files (see log_filename above).
log_line_prefix = ‘%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ‘
The log_line_prefix controls what every single line in the log file looks like. My personal favorite here is actually stolen directly from pgBadger’s suggested configuration. pgBadger is an amazing tool for parsing and analyzing the postgres log files. If you set the log_line_prefix like this, pgBadger can provide incredible detail about “what happened where, who did it, and when they did it”. Just to show you the difference….
default log file error message:
LOG:  database system was shut down at 2016-05-19 11:40:57 EDT
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
ERROR:  column "create_error_for_fun" does not exist at character 8
STATEMENT:  select create_error_for_fun;
With a rich log_line_prefix
2016-05-19 11:46:22 EDT [14913]: [1-1] user=,db=,app=,client= LOG:  database system was shut down at 2016-05-19 11:46:21 EDT
2016-05-19 11:46:22 EDT [14913]: [2-1] user=,db=,app=,client= LOG:  MultiXact member wraparound protections are now enabled
2016-05-19 11:46:22 EDT [14911]: [3-1] user=,db=,app=,client= LOG:  database system is ready to accept connections
2016-05-19 11:46:22 EDT [14917]: [1-1] user=,db=,app=,client= LOG:  autovacuum launcher started
2016-05-19 11:46:27 EDT [14921]: [1-1] user=postgres,db=postgres,app=psql,client=[local] ERROR:  column "create_error_for_fun" does not exist at character 8
2016-05-19 11:46:27 EDT [14921]: [2-1] user=postgres,db=postgres,app=psql,client=[local] STATEMENT:  select create_error_for_fun;
Now, I know who, what, where, why and when.
log_checkpoints = on
Like any database, postgres is going to use the disks. In order to deal with I/O efficiently, postgres (like many other databases) uses something called a checkpoint to synchronize its memory to those disks. Checkpoints occur periodically based on a number of things (load, configuration, etc…). The thing to keep in mind is that a checkpoint will use disk I/O; the busier the database, the more it requires. Setting this to on means that you know without a doubt when a checkpoint occurred. It’s the same ol’ story: “Every once in a while, I get long-running queries, different ones each time!”… “I’m seeing a spike in IOPS and I don’t know why!” … “Sometimes my data load gets bogged down for some reason!” … etc…
This very well could be due to a large checkpoint occurring. Since it’s based on load / configuration / time, it’s critical that the server write a log of when checkpoints occurred so that you’re not left in the dark. There’s also useful information in these logs about how postgres is behaving (it can even help you tune your memory settings).
2016-05-19 12:01:26 EDT [15274]: [1-1] user=,db=,app=,client= LOG:  checkpoint starting: immediate force wait
2016-05-19 12:01:26 EDT [15274]: [2-1] user=,db=,app=,client= LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s
(NB: There was some line-wrap here, I’ve manually entered carriage returns. The second line prints as one, long line)
It’s important to note that this is going to up the volume of your logs. But, it’s minimal and the benefits far outweigh the few extra bytes needed to write the message. (pgBadger will parse these up and give you a nice, clear picture of your checkpoint behavior).
log_autovacuum_min_duration = 0
(The default is log_autovacuum_min_duration = -1, which disables this logging.) This setting isn’t just a boolean on/off. Essentially, you are telling postgres: “When an autovacuum runs for x milliseconds or longer, write a message to the log”. Setting this to 0 (zero) means that you will log all autovacuum operations to the log file.
For the uninitiated, autovacuum is essentially a background process that does garbage collection in postgres. If you’re just starting out, what you really need to know is the following:
- It’s critical that autovacuum stay enabled
- autovacuum is another background process that uses IOPS
Because autovacuum is necessary and uses IOPS, it’s critical that you know what it’s doing and when. Just like log_checkpoints (above), autovacuum runs are based on load (thresholds on update / delete velocities on each table). This means that vacuum can kick off at virtually any time.
2016-05-19 12:32:25 EDT [16040]: [4-1] user=,db=,app=,client= LOG:  automatic vacuum of table "postgres.public.pgbench_branches": index scans: 1
    pages: 0 removed, 12 remain
    tuples: 423 removed, 107 remain, 3 are dead but not yet removable
    buffer usage: 52 hits, 1 misses, 1 dirtied
    avg read rate: 7.455 MB/s, avg write rate: 7.455 MB/s
    system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
2016-05-19 12:32:25 EDT [16040]: [5-1] user=,db=,app=,client= LOG:  automatic analyze of table "postgres.public.pgbench_branches" system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
2016-05-19 12:32:27 EDT [16040]: [6-1] user=,db=,app=,client= LOG:  automatic analyze of table "postgres.public.pgbench_history" system usage: CPU 0.01s/0.13u sec elapsed 1.70 sec
log_temp_files = 0
To be as efficient as possible, postgres tries to do everything it can in memory. Sometimes, you just run out of that fickle resource. Postgres has a built-in ‘swap-like’ system that will employ temp files within the data directory to deal with the issue. If you’ve spent any time around disks (especially spinning rust), you’ll know that swap can cause some serious performance issues. Just like checkpoints and autovacuum, temp files are going to happen automatically. Unlike those other two processes, they only occur if the queries you are running need the temp space. From a developer’s perspective, I want to know if the process that I’m engineering is going to use temp. From a DBA’s perspective, I want to know if the dang developers did something that needs temp space (more likely, my dang maintenance jobs are using it). To help in tuning your queries and maintenance processes, log your temp files. It’ll tell you what size temp file was needed and which query caused it:
2016-05-19 12:31:20 EDT [15967]: [1-1] user=postgres,db=postgres,app=pgbench,client=[local] LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp15967.0", size 200204288
2016-05-19 12:31:20 EDT [15967]: [2-1] user=postgres,db=postgres,app=pgbench,client=[local] STATEMENT:  alter table pgbench_accounts add primary key (aid)
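If you want to see this in action, a quick sketch (using the pgbench tables from the log above) is to shrink work_mem in a session and sort something big; the spill shows up in the log exactly as above:

-- Force a sort to spill to disk so log_temp_files fires.
SET work_mem = '1MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT aid FROM pgbench_accounts ORDER BY abalance;
-- Look for "Sort Method: external merge  Disk: ..." in the plan output,
-- and a matching "temporary file" line in the server log.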
log_lock_waits = on
Databases are servicing lots of clients all trying to do very similar work against the same set of data. This can cause contention (it’s the nature of the beast). log_lock_waits lets you see where your contention is. It will give you detailed, specific information about what waits occurred and the context in which they occurred.
2016-05-19 13:14:50 EDT [17094]: [1-1] user=postgres,db=postgres,app=psql,client=[local] LOG:  process 17094 still waiting for RowExclusiveLock on relation 16847 of database 12403 after 1000.794 ms at character 13
2016-05-19 13:14:50 EDT [17094]: [2-1] user=postgres,db=postgres,app=psql,client=[local] DETAIL:  Process holding the lock: 17086. Wait queue: 17094.
2016-05-19 13:14:50 EDT [17094]: [3-1] user=postgres,db=postgres,app=psql,client=[local] STATEMENT:  delete from pgbench_tellers ;
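You can reproduce a message like that with two sessions; a minimal sketch, again using the pgbench tables:

-- Session 1: grab a row lock and sit on it.
BEGIN;
UPDATE pgbench_tellers SET tbalance = tbalance + 1 WHERE tid = 1;

-- Session 2: this blocks; after deadlock_timeout (1s by default)
-- log_lock_waits writes the "still waiting for ... lock" message.
DELETE FROM pgbench_tellers;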
These are what I call ‘reasonable defaults’ for logging in postgres. Again, these are the settings that I configure every time I set up a new cluster, whether it’s for dev / test / toy / prod.
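If you’d rather not hand-edit postgresql.conf, here is a sketch of applying the same defaults with ALTER SYSTEM (available on 9.4 and newer; logging_collector still needs a restart, the rest only a reload):

ALTER SYSTEM SET logging_collector = on;              -- requires a restart
ALTER SYSTEM SET log_filename = 'postgresql-%a.log';
ALTER SYSTEM SET log_truncate_on_rotation = on;
ALTER SYSTEM SET log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ';
ALTER SYSTEM SET log_checkpoints = on;
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
ALTER SYSTEM SET log_temp_files = 0;
ALTER SYSTEM SET log_lock_waits = on;
SELECT pg_reload_conf();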
Happy querying!
Tomas Vondra: Auditing Users and Roles in PostgreSQL
One of the services we offer are security reviews (or audits, if you want), covering a range of areas related to security. It may be a bit surprising, but a topic that often yields the most serious issues is roles and privileges. Perhaps the reason roles and privileges are such a frequent source of issues is that they seem quite simple and similar to things engineers are familiar with (e.g. the Unix system of users and groups), but it turns out there are a few key differences with major consequences.
The other parts are either very straightforward and understandable even for sysadmins without much PostgreSQL experience (e.g. authentication config in pg_hba.conf), or the engineers recognize the complexity and take their time to familiarize themselves with the details (a good example of this is Row Level Security).
That is not to say there are no interesting topics (e.g. how to use RLS with application-level users), but I'll leave that for another blog post, as this one is about roles and privileges.
So let's look at roles and privileges a bit closer …
Owner is a small superuser
When it comes to roles, the initial checks are mostly expected. The role should not be a superuser (as superusers simply bypass various checks), and in general should not have any excessive privileges (e.g. CREATEDB, CREATEROLE and so on).
But it also should not own the database objects (tables, functions, …), since owners can simply grant themselves arbitrary privileges on the objects they own, which turns them into small superusers.
Consider the following example, where we attempt to protect the table from the owner by revoking all the privileges from that role:
db=# CREATE USER u;
CREATE ROLE
db=# SELECT rolsuper FROM pg_roles WHERE rolname = 'u';
 rolsuper
----------
 f
(1 row)
db=# \c 'user=u dbname=db'
You are now connected to database "db" as user "u".
So we have created a user who is not a superuser, and we have connected using that account (that's the slightly cryptic psql command). Let's create a table (so the user is an owner) and restrict our own access to it:
db=> CREATE TABLE t (id INT);
CREATE TABLE
db=> REVOKE ALL ON t FROM u;
REVOKE
db=> SELECT * FROM t;
ERROR:  permission denied for relation t
So that works, right? Well, the problem is a user with access to SQL (e.g. an “attacker” that discovered a SQL injection vulnerability) can do this:
db=> GRANT ALL ON t TO u;
GRANT
db=> select * from t;
 id
----
(0 rows)
The owner can simply grant all privileges back to himself, defeating the whole privilege system. A single SQL injection vulnerability and it's game over. Another issue with owners is that they are not subject to RLS by default, although that can be fixed with a simple ALTER TABLE ... FORCE ROW LEVEL SECURITY.
In any case, this should be a clear hint that the application should use a dedicated role (or multiple roles), not owning any of the objects.
BTW, users are often surprised when I mention that we can grant privileges on individual columns, e.g. allow SELECT on a subset of columns, UPDATE on a different subset of columns, and so on.
When combined with SECURITY DEFINER functions, this is a great way to restrict access to columns the application should not access directly, while still allowing special operations. For example, it shouldn't be possible to select all passwords (even if hashed) or e-mails, but it should be possible to verify a password or an e-mail. SECURITY DEFINER functions are great for that, but sadly it's one of those powerful yet severely underused features :-(
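As a rough sketch of that pattern (the table, column and role names are made up for illustration, and crypt() assumes the pgcrypto extension):

-- The app role may read and update a few columns, but never sees the hash.
GRANT SELECT (id, login, email), UPDATE (email) ON accounts TO app_user;

-- Password verification goes through a SECURITY DEFINER function owned by
-- a role that can read the password_hash column.
CREATE FUNCTION check_password(p_login text, p_password text)
RETURNS boolean AS $$
    SELECT password_hash = crypt(p_password, password_hash)
      FROM accounts
     WHERE login = p_login;
$$ LANGUAGE sql SECURITY DEFINER SET search_path = public;

REVOKE ALL ON FUNCTION check_password(text, text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION check_password(text, text) TO app_user;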
Role inheritance
Let's assume you have a role that owns the objects, and a separate role used by the application. In fact, if you have a sufficiently complex application, chances are you've split it into multiple parts, perhaps segregated into schemas, and each module uses a separate set of roles (owner + application, possibly more).
This gives you the ability to create application roles covering only part of the application: e.g. the administration panel needs access to all modules, while a public web interface only needs read-only access to a small subset of modules.
CREATE ROLE module_users;     -- full access to user info
CREATE ROLE module_users_ro;  -- limited access to user info (register/verify)
CREATE ROLE module_posts;     -- full access to blog posts
CREATE ROLE module_posts_ro;  -- read-only access to blog posts

... roles for additional modules ...

CREATE USER admin_user        -- full access
    IN ROLE module_users, module_posts;

CREATE USER web_user          -- limited access
    IN ROLE module_users_ro, module_posts_ro;
In other words, roles may be seen as groups and used for making the privileges easier to manage. There are two aspects that make this different from Unix-like groups: it's possible to use a multi-level hierarchy of roles (while Unix groups are flat), and inheritance (we'll get to that in a minute).
The above scheme works just fine, but only if you keep the connections for the two users (admin_user and web_user) separate. With a small number of users (modules, applications) that's manageable, as you can maintain separate connection pools, but as the number of connection pools grows it ceases to serve the purpose. But can we use a single connection pool and keep the benefit of separate users?
Well, yes. We can create another user role for the connection pool and grant it membership in the existing user roles (admin_user and web_user).
CREATE USER pool_user IN ROLE admin_user, web_user
This seems a bit strange, because the new user becomes a member of the admin_user and web_user roles (users are just roles with the LOGIN privilege), effectively inheriting all their privileges. Wasn't the whole point to use roles with limited privileges?
Let me introduce the SET ROLE command, which can be used to switch the session to an arbitrary role the user is a member of. So as the pool_user user is a member of both the admin_user and web_user roles, the connection pool or application may use this:
SET ROLE admin_user
to switch it to “full” privileges for the admin interface, or
SET ROLE web_user
when the connection is intended for the website.
These commands are akin to dropping privileges in Unix. The init scripts are executed as root, but you really don't want to run all the services as root, so the init script does something like sudo -u or chpst to switch to an unprivileged user.
But wait, we can actually do the opposite. We can start with “no privileges” by default, all we need to do is create the role like this:
CREATE USER pool_user NOINHERIT IN ROLE admin_user, web_user
The user is still a member of the two roles (and so can switch to them using SET ROLE), but inherits no privileges from them. This has the benefit that if the pool or application fails to do the SET ROLE, it will fail due to lack of privileges on the database objects (instead of silently proceeding with full privileges). So instead of starting with full privileges and eventually dropping most of them, with NOINHERIT we start with no privileges and then acquire a limited subset of them.
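A quick sketch of what that looks like in practice (the posts table is made up; the roles come from the earlier example):

-- Connected as pool_user (created with NOINHERIT): no privileges yet.
SELECT * FROM posts;    -- ERROR: permission denied for relation posts
SET ROLE web_user;
SELECT * FROM posts;    -- works, with web_user's read-only privileges
RESET ROLE;             -- back to the unprivileged pool_user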
But why am I wasting time explaining all this SET ROLE and INHERIT or NOINHERIT stuff? Well, it has implications for testing.
Note: You have to trust the pool/application to actually execute the SET ROLE command with the right target role, and the user must not be able to execute custom SQL on the connection (because then it's just a matter of RESET ROLE to gain the full privileges, or SET ROLE to switch to another role). If that's not the case, the shared connection pool is not a path forward for you.
Testing roles
Pretty much no one tests privileges. Or, to be more accurate, everyone tests the positive case implicitly, because if you don't get the necessary privileges the application breaks down. But only very few people verify that there are no unnecessary/unexpected privileges.
The most straightforward way to test absence of privileges (user has no access) might be to walk through all existing objects (tables, columns) and try all compatible privileges. But that’s obviously a lot of combinations and a lot of additional schema-specific work (data types, constraints, …).
Luckily, PostgreSQL provides a collection of useful functions for exactly this purpose (showing just table-related ones, there are additional functions for other object types):
has_any_column_privilege(...)
has_column_privilege(...)
has_table_privilege(...)
So, for example, it's trivial to check which roles have the INSERT privilege on a given table:
SELECT rolname FROM pg_roles WHERE has_table_privilege(rolname, 'table', 'INSERT')
or listing tables accessible by a given role:
SELECT oid, relname FROM pg_class WHERE has_table_privilege('user', oid, 'INSERT')
And similarly for other privileges and object types. The testing seems fairly trivial – simply run a bunch of queries for the application users, check that the result matches expectation and we’re done.
Note: It's also possible to use the information_schema, e.g. table_privileges, which essentially just runs a query with has_table_privilege and formats the output nicely.
Except there's a small catch: inheritance. It works just fine as long as the role inherits privileges through membership, but as soon as there's a NOINHERIT somewhere, those privileges will not be considered when checking access (both in the functions and in information_schema). Which makes sense, because the current user does not currently have the privileges, but can gain them easily using SET ROLE.
But of course, PostgreSQL also includes the pg_has_role() function, so we can merge the privileges from all the roles, for example like this:
SELECT DISTINCT relname
  FROM pg_roles CROSS JOIN pg_class
 WHERE pg_has_role('user', rolname, 'MEMBER')
   AND has_table_privilege(rolname, pg_class.oid, 'SELECT')
Making this properly testable requires more work (to handle additional object types and applicable privileges), but you get the idea.
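One way to make it testable is to compare the computed privileges against an expected list and fail on anything extra; a sketch (the role, table names and expected rows are made up):

-- Any row returned here is an unexpected privilege for web_user.
WITH actual AS (
    SELECT c.relname::text AS relname, p.priv
      FROM pg_class c
     CROSS JOIN (VALUES ('SELECT'), ('INSERT'), ('UPDATE'), ('DELETE')) AS p(priv)
     CROSS JOIN pg_roles r
     WHERE c.relkind = 'r'
       AND c.relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
       AND pg_has_role('web_user', r.rolname, 'MEMBER')
       AND has_table_privilege(r.rolname, c.oid, p.priv)
)
SELECT * FROM actual
EXCEPT
SELECT * FROM (VALUES ('posts', 'SELECT'),
                      ('users', 'SELECT')) AS expected(relname, priv);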
Summary
Let me briefly summarize this blog post:
- separate the owner and application user – Don't use a single role for both things.
- consider using SET ROLE – Either start with the privileges and drop them (INHERIT), or start with none and acquire them (NOINHERIT).
- test the expected privileges – Ideally run this as part of regular unit tests if possible.
- keep it simple – It's definitely better to have a simple hierarchy of roles you understand.
Shaun M. Thomas: PG Phriday: Trusty Table Tiers
I always advocate breaking up large Postgres tables for a few reasons. Beyond query performance concerns, maintaining one monolithic structure is always more time consuming and consequently more dangerous. The time required to create a dozen small indexes may be slightly longer than for a single larger one, but we can treat the smaller indexes as incremental. If we want to rebuild, add more indexes, or fix any corruption, why advocate an all-or-nothing proposition? Deleting from one large table will be positively glacial compared to simply dropping an entire expired partition. The list just goes on and on.
On the other hand, partitioning in Postgres can be pretty intimidating. There are so many manual steps involved, that it’s easy to just kick the can down the road and tackle the problem later, or not at all. Extensions like the excellent pg_partman remove much of the pain involved in wrangling an army of partitions, and we strongly suggest using some kind of tool-kit instead of reinventing the wheel.
The main limitation with most existing partition management libraries is that they never deviate from the examples listed in the Postgres documentation. It’s always: create inherited tables, add redirection triggers, automate, rinse, repeat. In most cases, this is exactly the right approach. Unfortunately triggers are slow, and especially in an OLTP context, this can introduce sufficient overhead that partitions are avoided entirely.
Well, there is another way to do partitioning that's almost never mentioned. The idea is to actually utilize the base table as a storage target and, in lieu of triggers, schedule data movement during low-volume time periods. The primary benefit to this is that there's no more trigger overhead. It also means we can poll the base table itself for recent data with the ONLY clause. This is a massive win for extremely active tables, and the reason tab_tier was born.
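To illustrate, with the sensor_log table defined just below, reading only the recent, still-unpartitioned rows is as simple as:

-- Only the base table is scanned; the monthly partitions are skipped entirely.
SELECT count(*)
  FROM ONLY sensor_log
 WHERE reading_date > CURRENT_DATE - INTERVAL '1 day';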
Let’s create some data for testing this out:
CREATE TABLE sensor_log (
  id           INT PRIMARY KEY,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL
);

INSERT INTO sensor_log (id, location, reading, reading_date)
SELECT s.id, s.id % 1000, s.id % 100,
       CURRENT_DATE - ((s.id * 10) || 's')::INTERVAL
  FROM generate_series(1, 5000000) s(id);

CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);

ANALYZE sensor_log;
Now we have 5-million rows in a table with a defined date column that’s a perfect candidate for partitioning. The way this data is currently distributed, we have content going back to late 2014. Imagine in this scenario we don’t need this much live information at all times. So we decide to keep one week of logs for active use, and relegate everything else into some kind of monthly partition.
This is how all of that would look in tab_tier:
CREATE EXTENSION tab_tier;

SELECT tab_tier.register_tier_root('public', 'sensor_log', 'reading_date');

UPDATE tab_tier.tier_root
   SET root_retain = '1 week'::INTERVAL,
       part_period = '1 month'::INTERVAL
 WHERE root_schema = 'public'
   AND root_table = 'sensor_log';

SELECT tab_tier.bootstrap_tier_parts('public', 'sensor_log');

\dt

                 List of relations
 Schema |          Name          | Type  |  Owner
--------+------------------------+-------+----------
 public | sensor_log             | table | postgres
 public | sensor_log_part_201410 | table | postgres
 public | sensor_log_part_201411 | table | postgres
 public | sensor_log_part_201412 | table | postgres
 public | sensor_log_part_201501 | table | postgres
 public | sensor_log_part_201502 | table | postgres
 public | sensor_log_part_201503 | table | postgres
 public | sensor_log_part_201504 | table | postgres
 public | sensor_log_part_201505 | table | postgres
 public | sensor_log_part_201506 | table | postgres
 public | sensor_log_part_201507 | table | postgres
 public | sensor_log_part_201508 | table | postgres
 public | sensor_log_part_201509 | table | postgres
 public | sensor_log_part_201510 | table | postgres
 public | sensor_log_part_201511 | table | postgres
 public | sensor_log_part_201512 | table | postgres
 public | sensor_log_part_201601 | table | postgres
 public | sensor_log_part_201602 | table | postgres
 public | sensor_log_part_201603 | table | postgres
 public | sensor_log_part_201604 | table | postgres
 public | sensor_log_part_201605 | table | postgres
Taking this piece by piece, the first thing we did after creating the extension itself was to call the register_tier_root function. This officially tells tab_tier about the table, and creates a record with configuration elements we can tweak. And that's exactly what we do by setting the primary retention window and the partition size. Creating all of the partitions manually is pointless, so we also invoke bootstrap_tier_parts. Its job is to check the range of dates currently represented in the table, and create all of the partitions necessary to store it.
What did not happen here is any data movement. This goes back to our original concern regarding maintenance. Some tables may be several GB or even TB in size, and moving all of that data as one gargantuan operation would be a really bad idea. Instead, tab_tier provides the migrate_tier_data function to relocate data for a specific partition.
With a bit of clever SQL, we can even generate a script for it:
COPY (
  SELECT 'SELECT tab_tier.migrate_tier_data(''public'', ''sensor_log'', ''' ||
         REPLACE(part_table, 'sensor_log_part_', '') || ''');' AS part_name
    FROM tab_tier.tier_part
    JOIN tab_tier.tier_root USING (tier_root_id)
   WHERE root_schema = 'public'
     AND root_table = 'sensor_log'
   ORDER BY part_table
) TO '/tmp/move_parts.sql';

\i /tmp/move_parts.sql

SELECT COUNT(*) FROM ONLY sensor_log;

 count
-------
 60480

SELECT COUNT(*) FROM sensor_log_part_201504;

 count
--------
 259200
Following some debugging notices, all of our data has moved to the appropriate partition. We verified that by checking the base table and a randomly chosen partition for record counts. At this point, the table is now ready for regular maintenance. In this case “maintenance” means regularly calling the cap_tier_partitions and migrate_all_tiers functions. The first ensures target partitions always exist, and the second moves any pending data to a waiting partition for all tables we’ve registered.
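In practice that boils down to a couple of scheduled calls during a quiet window; a sketch, assuming the no-argument form the descriptions suggest:

-- e.g. from a nightly cron job via psql
SELECT tab_tier.cap_tier_partitions();  -- make sure upcoming partitions exist
SELECT tab_tier.migrate_all_tiers();    -- move data older than root_retain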
And that’s it. We’re completely done with this table. If we stopped here, we could be secure in the knowledge we no longer have to worry about some gigantic monolith ruining our day some time in the future. But that’s not how tab_tier got its name. One or two levels does not a tier make; the real “secret sauce” is its support for long term storage.
One thing we didn’t really cover, and most partition systems never even consider, is that partitioning is only half of the story. On an extremely active system, having months or years of data just sitting around is generally frowned upon. The mere presence of older data might encourage using it, transforming our finely tuned OLTP engine into a mixed-workload wreck. One or two queries against those archives, and suddenly our cache is tainted and everything is considerably slower.
We need to move that data off of the system, and there are quite a few ways to do that. Some might use ETL scripts or systems like Talend to accomplish that goal. Or we can just use tab_tier and a Postgres foreign table. Let’s now dictate that only six months of archives should ever exist on the primary server. Given that constraint, this is how we could proceed:
-- Do this on some kind of archive server

CREATE USER arc_user PASSWORD 'PasswordsAreLame';

CREATE TABLE sensor_log (
  id           INT PRIMARY KEY,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL,
  snapshot_dt  TIMESTAMP WITHOUT TIME ZONE
);

GRANT ALL ON sensor_log TO arc_user;

-- Back on the data source...

UPDATE tab_tier.tier_root
   SET lts_threshold = '6 months'::INTERVAL,
       lts_target = 'public.sensor_log_archive'
 WHERE root_schema = 'public'
   AND root_table = 'sensor_log';

CREATE EXTENSION postgres_fdw;

CREATE USER arc_user PASSWORD 'PasswordsAreLame';
GRANT tab_tier_role TO arc_user;
GRANT ALL ON ALL TABLES IN SCHEMA public TO tab_tier_role;

CREATE SERVER arc_srv
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (dbname 'postgres', host 'archive-host');

CREATE USER MAPPING FOR arc_user
  SERVER arc_srv
  OPTIONS (user 'arc_user', password 'PasswordsAreLame');

CREATE FOREIGN TABLE sensor_log_archive (
  id           INT,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL,
  snapshot_dt  TIMESTAMP WITHOUT TIME ZONE
)
SERVER arc_srv
OPTIONS (table_name 'sensor_log');

GRANT INSERT ON sensor_log_archive TO tab_tier_role;

-- Connect as arc_user, then run this:

SELECT tab_tier.archive_tier('public', 'sensor_log');

SELECT COUNT(*) FROM sensor_log_archive;

  count
---------
 3263360
Whew! That was a lot of work. Maybe a future version of tab_tier should provide a wrapper for that. In any case, all we did was set up a foreign table on a remote server, create a separate user to handle the data movement, and tell tab_tier about our six month threshold for long term storage, and the target table itself.
Using a foreign table isn’t required here, since the target can be any kind of table, but isn’t that the whole point of this exercise? The cool thing about Postgres foreign data wrappers is that we could have used any of them. In this case we’re just moving data to another remote Postgres instance, but we could have dumped everything into Cassandra or Hadoop instead. Take that, subspace!
For those who noticed all of the ridiculous GRANT statements, please remember this is only for demonstration purposes. A real system would probably use ALTER DEFAULT PRIVILEGES to give tab_tier_role more limited control over a specific schema and tables specifically designed for archival. The extension doesn’t add its own privileges—even to tables it creates—in case controls are tightly locked down. We don’t want to hijack any carefully laid down security. Instead tab_tier just propagates any ACLs it finds on root tables to new partitions.
This is the same reason we ran the archive_tier (or archive_all_tiers) routine as a different user. Since we’re using a foreign user mapping, we want to limit data-leak potential by isolating the movement process from the table owner or a superuser. We recommend using this approach for any foreign table usage whenever possible.
With all of that out of the way, we still need to clean up. We archived all of the partition content, but the partitions themselves are still sitting around and gathering dust. Let’s fix that by running one final step as the owner of sensor_log or any superuser:
SELECT part_table
  FROM tab_tier.tier_part
 WHERE is_archived;

       part_table
------------------------
 sensor_log_part_201410
 sensor_log_part_201411
 sensor_log_part_201412
 sensor_log_part_201501
 sensor_log_part_201502
 sensor_log_part_201503
 sensor_log_part_201504
 sensor_log_part_201505
 sensor_log_part_201506
 sensor_log_part_201507
 sensor_log_part_201508
 sensor_log_part_201509
 sensor_log_part_201510

SELECT tab_tier.drop_archived_tiers();

SELECT COUNT(*) FROM sensor_log_archive;

  count
---------
 1736640
During the archival process itself, tab_tier marks the related metadata so archived tables will no longer be used in any of the data movement functions. It also makes them an easy target for removal with a maintenance function. We can see that everything worked, as a large portion of our data is no longer part of the sensor_log inheritance tree. Now the archived data is securely located on another system that’s probably geared more toward OLAP use, or some incomprehensible Hive we don’t have to worry about.
I, for one, welcome our incomprehensible Hive overlords.
REGINA OBE: pgRouting 2.2.3 released with support for PostgreSQL 9.6beta1
pgRouting 2.2.3 was released last week. The main change is that this version now supports PostgreSQL 9.6. Many thanks to Vicky Vergara for working through the issues with PostgreSQL 9.6 and getting it to work. Vicky has also been doing a good chunk of the coding (a lot of Boost refactoring and integrating more Boost features), testing, and documentation in pgRouting, osm2pgrouting, and QGIS pgRoutingLayer in general for pgRouting 2.1, 2.2, and upcoming 2.3. We are very indebted to her for her hard work.
If you are a Windows user testing the waters of PostgreSQL 9.6beta1, we have pgRouting 2.2.3 binaries and PostGIS 2.3.0dev binaries at http://postgis.net/windows_downloads.
Continue reading "pgRouting 2.2.3 released with support for PostgreSQL 9.6beta1"
Károly Nagy: Postgresql server fails to start in recovery with systemd
I ran into an issue while trying to set up a simple test system for myself to move from CentOS 6 to 7. When I was about to start a standby slave, systemd reported a PostgreSQL startup timeout and stopped it, although it was running perfectly fine doing recovery.
Issue
In /var/log/messages, systemd reports a failure:
systemd: postgresql-9.5.service start operation timed out. Terminating.
systemd: Failed to start PostgreSQL 9.5 database server.
systemd: Unit postgresql-9.5.service entered failed state.
systemd: postgresql-9.5.service failed.
Meanwhile, in the PostgreSQL log:
[1342]: [3-1] host=,user=,db=,tx=0,vtx= LOG:  received smart shutdown request
[1383]: [3-1] host=,user=,db=,tx=0,vtx= LOG:  shutting down
[1383]: [4-1] host=,user=,db=,tx=0,vtx= LOG:  database system is shut down
Cause
In the systemd service file, postgresql is started with the -w flag, which means “wait until the operation completes”. Hence systemd fails after the configured timeout.
ExecStart=/usr/pgsql-9.5/bin/pg_ctl start -D ${PGDATA} -s -w -t 300
Fix
Change the -w flag to -W in the systemd service file (/usr/lib/systemd/system/postgresql-9.5.service) and reload the daemon.
systemctl daemon-reload
service postgresql-9.5 start
Hopefully this will save you a couple of minutes of debugging.
Craig Ringer: PostgreSQL-based application performance: latency and hidden delays
Goldfields Pipeline, by SeanMac (Wikimedia Commons)
If you’re trying to optimise the performance of your PostgreSQL-based application you’re probably focusing on the usual tools: EXPLAIN (BUFFERS, ANALYZE), pg_stat_statements, auto_explain, log_min_duration_statement, etc.
Maybe you’re looking into lock contention with log_lock_waits, monitoring your checkpoint performance, etc too.
But did you think about network latency? Gamers know about network latency, but did you think it mattered for your application server?
Latency matters
Typical client/server round-trip network latencies can range from 0.01ms (localhost) through the ~0.5ms of a switched network, 5ms of WiFi, 20ms of ADSL, 300ms of intercontinental routing, and even more for things like satellite and WWAN links.
A trivial SELECT can take in the order of 0.1ms to execute server-side. A trivial INSERT can take 0.5ms.
Every time your application runs a query it has to wait for the server to respond with success/failure and possibly a result set, query metadata, etc. This incurs at least one network round trip delay.
When you’re working with small, simple queries network latency can be significant relative to the execution time of your queries if your database isn’t on the same host as your application.
Many applications, particularly ORMs, are very prone to running lots of quite simple queries. For example, if your Hibernate app is fetching an entity with a lazily fetched @OneToMany relationship to 1000 child items, it’s probably going to do 1001 queries thanks to the n+1 select problem, if not more. That means it’s probably spending 1000 times your network round-trip latency just waiting. You can left join fetch to avoid that… but then you transfer the parent entity 1000 times in the join and have to deduplicate it.
Similarly, if you’re populating the database from an ORM, you’re probably doing hundreds of thousands of trivial INSERTs… and waiting after each and every one for the server to confirm it’s OK.
It’s easy to focus on query execution time and try to optimise that, but there’s only so much you can do with a trivial INSERT INTO ... VALUES .... Drop some indexes and constraints, make sure it’s batched into a transaction, and you’re pretty much done.
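One thing you can do today, with no driver changes required, is to cut the number of round trips by sending multi-row inserts inside a single transaction; a sketch (the table name is made up):

BEGIN;
-- one round trip carries many rows instead of one
INSERT INTO items (id, name) VALUES
    (1, 'first'),
    (2, 'second'),
    (3, 'third');
-- ... more multi-row INSERTs ...
COMMIT;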
What about getting rid of all the network waits? Even on a LAN they start to add up over thousands of queries.
COPY
One way to avoid latency is to use COPY. To use PostgreSQL’s COPY support your application or driver has to produce a CSV-like set of rows and stream them to the server in a continuous sequence. Or the server can be asked to send your application a CSV-like stream.
Either way, the app can’t interleave a COPY with other queries, and copy-inserts must be loaded directly into a destination table. A common approach is to COPY into a temporary table, then from there do an INSERT INTO ... SELECT ..., UPDATE ... FROM ..., DELETE FROM ... USING ..., etc. to use the copied data to modify the main tables in a single operation.
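A minimal sketch of that pattern (the target table name is assumed):

-- Stage the rows with one streamed COPY, then apply them in a single statement.
CREATE TEMP TABLE staging (LIKE target_table INCLUDING DEFAULTS);
COPY staging FROM STDIN (FORMAT csv);
-- ... stream the CSV rows from the client here ...
INSERT INTO target_table
SELECT * FROM staging
ON CONFLICT DO NOTHING;   -- 9.5+, optional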
That’s handy if you’re writing your own SQL directly, but many application frameworks and ORMs don’t support it, plus it can only directly replace simple INSERTs. Your application, framework or client driver has to deal with the conversion to the special representation needed by COPY, look up any required type metadata itself, etc.
(Notable drivers that do support COPY include libpq, PgJDBC, psycopg2, and the Pg gem… but not necessarily the frameworks and ORMs built on top of them.)
PgJDBC – batch mode
PostgreSQL’s JDBC driver has a solution for this problem. It relies on support present in PostgreSQL servers since 8.4 and on the JDBC API’s batching features to send a batch of queries to the server then wait only once for confirmation that the entire batch ran OK.
Well, in theory. In reality some implementation challenges limit this so that batches can only be done in chunks of a few hundred queries at best. The driver can also only run queries that return result rows in batched chunks if it can figure out how big the results will be ahead of time. Despite those limitations, use of Statement.executeBatch() can offer a huge performance boost to applications that are doing tasks like bulk data loading into remote database instances.
Because it’s a standard API it can be used by applications that work across multiple database engines. Hibernate, for example, can use JDBC batching though it doesn’t do so by default.
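As a hedged illustration of what that looks like at the JDBC level, the sketch below batches a parameterised INSERT; the items table and connection details are made up for the example.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchInsertSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "postgres", "secret");
        conn.setAutoCommit(false);

        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO items (id, name) VALUES (?, ?)")) {
            for (int i = 0; i < 10000; i++) {
                ps.setInt(1, i);
                ps.setString(2, "item " + i);
                ps.addBatch();            // queue the statement locally instead of sending it now
                if (i % 500 == 499) {
                    ps.executeBatch();    // send a chunk and wait once, not once per row
                }
            }
            ps.executeBatch();            // send whatever is left
        }
        conn.commit();
        conn.close();
    }
}

For Hibernate, the hibernate.jdbc.batch_size configuration property is the usual switch for turning this behaviour on.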
libpq and batching
Most (all?) other PostgreSQL drivers have no support for batching. PgJDBC implements the PostgreSQL protocol completely independently, whereas most other drivers internally use the C library libpq that’s supplied as part of PostgreSQL.
libpq does not support batching. It does have an asynchronous non-blocking API, but the client can still only have one query “in flight” at a time. It must wait until the results of that query are received before it can send another.
The PostgreSQL server supports batching just fine, and PgJDBC uses it already. So I’ve written batch support for libpq and submitted it as a candidate for the next PostgreSQL version. Since it only changes the client, if accepted it’ll still speed things up when connecting to older servers.
I’d be really interested in feedback from authors and advanced users of libpq-based client drivers and developers of libpq-based applications. The patch applies fine on top of PostgreSQL 9.6beta1 if you want to try it out. The documentation is detailed and there’s a comprehensive example program.
Performance
I thought a hosted database service like RDS or Heroku Postgres would be a good example of where this kind of functionality would be useful. In particular, accessing them from outside their own networks really shows how much latency can hurt.
At ~320ms network latency:
- 500 inserts without batching: 167.0s
- 500 inserts with batching: 1.2s
… which is over 120x faster.
You won’t usually be running your app over an intercontinental link between the app server and the database, but this serves to highlight the impact of latency. Even over a unix socket to localhost I saw over a 50% performance improvement for 10000 inserts.
Batching in existing apps
It is unfortunately not possible to automatically enable batching for existing applications. Apps have to use a slightly different interface where they send a series of queries and only then ask for the results.
It should be fairly simple to adapt apps that already use the asynchronous libpq interface, especially if they use non-blocking mode and a select()/poll()/epoll()/WaitForMultipleObjectsEx loop. Apps that use the synchronous libpq interfaces will require more changes.
Batching in other client drivers
Similarly, client drivers, frameworks and ORMs will generally need interface and internal changes to permit the use of batching. If they’re already using an event loop and non-blocking I/O they should be fairly simple to modify.
I’d love to see Python, Ruby, etc users able to access this functionality, so I’m curious to see who’s interested. Imagine being able to do this:
import psycopg2
conn = psycopg2.connect(...)
cur = conn.cursor()

# this is just an idea, this code does not work with psycopg2:
futures = [ cur.async_execute(sql) for sql in my_queries ]
for future in futures:
    result = future.result # waits if result not ready yet
    ... process the result ...
conn.commit()
Asynchronous batched execution doesn’t have to be complicated at the client level.
COPY is fastest
Where practical, clients should still favour COPY. Here are some results from my laptop:
inserting 1000000 rows batched, unbatched and with COPY

batch insert elapsed:      23.715315s
sequential insert elapsed: 36.150162s
COPY elapsed:              1.743593s
Done.
Batching the work provides a surprisingly large performance boost even on a local unix socket connection…. but COPY leaves both individual insert approaches far behind it in the dust.
Use COPY.
The image
The image for this post is of the Goldfields Water Supply Scheme pipeline from Mundaring Weir near Perth in Western Australia to the inland (desert) goldfields. It’s relevant because it took so long to finish and was under such intense criticism that its designer and main proponent, C. Y. O’Connor, committed suicide 12 months before it was commissioned. Locally, people often (incorrectly) say that he died after the pipeline was built, when no water initially flowed – it had taken so long that everyone assumed the project had failed. Then, weeks later, out the water poured.
Daniel Pocock: PostBooks, PostgreSQL and pgDay.ch talk
PostBooks 4.9.5 was recently released and the packages for Debian (including jessie-backports), Ubuntu and Fedora have been updated.
Postbooks at pgDay.ch in Rapperswil, Switzerland
pgDay.ch is coming on Friday, 24 June. It is at the HSR Hochschule für Technik Rapperswil, at the eastern end of Lake Zurich.
I'll be making a presentation about Postbooks in the business track at 11:00.
Getting started with accounting using free, open source software
If you are not currently using a double-entry accounting system or if you are looking to move to a system that is based on completely free, open source software, please see my comparison of free, open source accounting software.
Free and open source solutions offer significant advantages: flexibility, the freedom for businesses to choose any programmer to modify the code, and standard features such as SQL back-ends, multi-user support and multi-currency support. These are all things that proprietary vendors charge extra money for.
Accounting software is the lowest common denominator in the world of business software, so people keen on the success of free and open source software may find that encouraging businesses to use one of these solutions is a great way to lay a foundation where other free software solutions can thrive.
PostBooks new web and mobile front end
xTuple, the team behind Postbooks, has been busy developing a new Web and Mobile front-end for their ERP, CRM and accounting suite, powered by the same PostgreSQL backend as the Linux desktop client.
More help is needed to create official packages of the JavaScript dependencies before the Web and Mobile solution itself can be packaged.
Pavel Stehule: plpgsql_check 1.0.5 released
https://manager.pgxn.org/distributions/plpgsql_check/1.0.5
https://github.com/okbob/plpgsql_check/releases/tag/v1.0.5
Raghavendra Rao: Ways to access Oracle Database in PostgreSQL
Below are a few methods for connecting to an Oracle database from PostgreSQL:
- Using ODBC Driver
- Using Foreign Data Wrappers
- Using Oracle Call Interface (OCI) Driver
Using ODBC Driver
Install unixODBC Driver
tar -xvf unixODBC-2.3.4.tar.gz
cd unixODBC-2.3.4/
./configure --sysconfdir=/etc
make
make install
Install Oracle ODBC Driver
rpm -ivh oracle-instantclient11.2-basic-11.2.0.4.0-1.x86_64.rpm
rpm -ivh oracle-instantclient11.2-odbc-11.2.0.4.0-1.x86_64.rpm
rpm -ivh oracle-instantclient11.2-devel-11.2.0.4.0-1.x86_64.rpm
Binary/Libraries location: /usr/lib/oracle/11.2/client64
Install ODBC-Link
tar -zxvf ODBC-Link-1.0.4.tar.gz
cd ODBC-Link-1.0.4
export PATH=/opt/PostgreSQL/9.5/bin:$PATH
which pg_config
make USE_PGXS=1
make USE_PGXS=1 install
Libraries and SQL files location: /opt/PostgreSQL/9.5/share/postgresql/contrib
Installation will create an ODBC-Link module SQL file in the $PGHOME/contrib directory. Load the SQL file, which will create a schema named "odbclink" with the necessary functions in it.
psql -p 5432 -d oratest -U postgres -f /opt/PostgreSQL/9.5/share/postgresql/contrib/odbclink.sql
At this point, we have installed the unixODBC driver, the Oracle ODBC driver and the ODBC-Link module for PostgreSQL. As a first step, we need to create a DSN using the Oracle ODBC driver.
Edit the /etc/odbcinst.ini file and add the driver definition:
## Driver for Oracle
[MyOracle]
Description =ODBC for oracle
Driver =/usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
UsageCount=1
FileUsage = 1
Driver Logging = 7
Edit the /etc/odbc.ini file and create the DSN with the driver mentioned in /etc/odbcinst.ini:
## Host: pg.raghav-node1.com, PORT: 1521
## Oracle Instance Name: ORA11G, Username: mmruser, Password: mmruser
## ODBC Data source: Ora
[Ora]
Description = myoracledb database
Driver = MyOracle
Trace = yes
TraceFile = /tmp/odbc_oracle.log
Database = //pg.raghav-node1.com:1521/ORA11G
UserID = mmruser
Password = mmruser
Port = 1521
After creating the DSN, load all the Oracle and unixODBC driver libraries by setting environment variables, and test connectivity using the OS command-line tools "dltest" and "isql":
[root@172.16.210.161 ~]# export ORACLE_HOME=/usr/lib/oracle/11.2/client64
[root@172.16.210.161 ~]# export LD_LIBRARY_PATH=/usr/local/unixODBC-2.3.4/lib:/usr/lib/oracle/11.2/client64/lib
[root@172.16.210.161 ~]# export ODBCINI=/etc/odbc.ini
[root@172.16.210.161 ~]# export ODBCSYSINI=/etc/
[root@172.16.210.161 ~]# export TWO_TASK=//pg.raghav-node1.com:1521/ORA11G
[root@172.16.210.161 ~]# dltest /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
SUCCESS: Loaded /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
[root@172.16.210.161 ~]# isql ora -v
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL>
Now, set the same environment variables for the postgres user so the libraries are loaded, and restart the PostgreSQL cluster for them to take effect. Then connect to PostgreSQL and call the odbclink functions to connect to the Oracle database.
[root@172.16.210.163 ~]# su - postgres
[postgres@172.16.210.163 ~]$ export ORACLE_HOME=/usr/lib/oracle/11.2/client64
[postgres@172.16.210.163 ~]$ export LD_LIBRARY_PATH=/usr/local/unixODBC-2.3.4/lib:/usr/lib/oracle/11.2/client64/lib
[postgres@172.16.210.163 ~]$ export ODBCINI=/etc/odbc.ini
[postgres@172.16.210.163 ~]$ export ODBCSYSINI=/etc/
[postgres@172.16.210.163 ~]$ export TWO_TASK=//pg.raghav-node1.com:1521/ORA11G
[postgres@172.16.210.163 ~]$ dltest /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
SUCCESS: Loaded /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
[postgres@172.16.210.163 ~]$ /opt/PostgreSQL/9.5/bin/pg_ctl -D /opt/PostgreSQL/9.5/data/ stop -mf
[postgres@172.16.210.163 ~]$ /opt/PostgreSQL/9.5/bin/pg_ctl -D /opt/PostgreSQL/9.5/data/ start
[postgres@172.16.210.163 ~]$ psql
psql.bin (9.5.2)
Type "help" for help.
postgres=# select odbclink.connect('DSN=Ora');
connect
---------
1
(1 row)
Cool, right? For retrieving and manipulating data, refer to the ODBC-Link README file.
Using Foreign Data Wrappers
SQL/MED (Management of External Data) is an extension to the SQL standard that allows managing external data stored outside the database. SQL/MED provides two components: foreign data wrappers and datalinks. PostgreSQL introduced Foreign Data Wrappers (FDW) in version 9.1 with read-only support and added write support for this part of the SQL standard in version 9.3. Today the implementation has a good number of features around it, and many varieties of FDW are available to access different remote SQL databases.
oracle_fdw provides an easy and efficient way to access an Oracle database, and in my opinion it is one of the coolest methods for accessing a remote database. To compile oracle_fdw with PostgreSQL 9.5, we need the Oracle Instant Client libraries and pg_config set in PATH. We can use the same Oracle Instant Client libraries that we used for ODBC-Link. Let's see how it works.
First, set the environment variables for the Oracle Instant Client (OIC) libraries and pg_config:
export PATH=/opt/PostgreSQL/9.5/bin:$PATH
export ORACLE_HOME=/usr/lib/oracle/11.2/client64
export LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib
Unzip the oracle_fdw module and compile it with PostgreSQL 9.5:
unzip oracle_fdw-1.4.0.zip
cd oracle_fdw-1.4.0/
make
make install
Now switch to the 'postgres' user, restart the cluster so that it loads the Oracle Instant Client libraries required by the oracle_fdw extension, and create the extension inside the database:
[postgres@172.16.210.161 9.5]$ export ORACLE_HOME=/usr/lib/oracle/11.2/client64/lib
[postgres@172.16.210.161 9.5]$ export LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib:$LD_LIBRARY_PATH
[postgres@172.16.210.161 9.5]$ /opt/PostgreSQL/9.5/bin/pg_ctl -D /opt/PostgreSQL/9.5/data/ stop -mf
[postgres@172.16.210.161 9.5]$ /opt/PostgreSQL/9.5/bin/pg_ctl -D /opt/PostgreSQL/9.5/data/ start
[postgres@172.16.210.161 9.5]$ psql
Password:
psql.bin (9.5.2)
Type "help" for help.
postgres=# create extension oracle_fdw;
CREATE EXTENSION
Now you can access the Oracle database.
postgres=# CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw OPTIONS (dbserver '//pg.raghav-node1.com/ORA11G');
CREATE SERVER
postgres=# GRANT USAGE ON FOREIGN SERVER oradb TO postgres;
GRANT
postgres=# CREATE USER MAPPING FOR postgres SERVER oradb OPTIONS (user 'scott', password 'tiger');
CREATE USER MAPPING
postgres=#
postgres=# CREATE FOREIGN TABLE oratab (ecode integer,name char(30)) SERVER oradb OPTIONS(schema 'SCOTT',table 'EMP');
CREATE FOREIGN TABLE
postgres=# select * from oratab limit 3;
ecode | name
-------+--------------------------------
7369 | SMITH
7499 | ALLEN
7521 | WARD
(3 rows)
Using Oracle Call Interface (OCI) Drivers
Oracle Call Interface (OCI) is a type-2 driver freely available on the Oracle site that allows clients to connect to an Oracle database. EDB Postgres Advanced Server (also called EPAS), a proprietary product, has a built-in OCI-based database link module called dblink_ora, which connects to the Oracle database using the Oracle OCI drivers. All you have to do to use the dblink_ora module is install EPAS (installation not covered here) and tell EPAS where it can find the Oracle OCI driver libraries. We can make use of the same Oracle Instant Client by specifying its library location in the LD_LIBRARY_PATH environment variable and restarting the EPAS cluster for the change to take effect.
First, switch to the "enterprisedb" user, load the libraries, and restart the cluster. That's all; we are ready to access the Oracle database.
[enterprisedb@172.16.210.129 ~]$ export LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib
[enterprisedb@172.16.210.129 bin]$ /opt/PostgresPlus/9.5AS/bin/pg_ctl -D /opt/PostgresPlus/9.5AS/data/ restart
Note: EPAS connects to the Oracle database using the Oracle Instant Client library "libclntsh.so". If the library is not present in the Oracle client library location, create a symbolic link named libclntsh.so pointing to libclntsh.so.version.number. Refer to the documentation.
[enterprisedb@172.16.210.129 bin]$ psql
psql.bin (9.5.0.5)
Type "help" for help.
edb=# select dblink_ora_connect('oraconn','localhost','edbora','edbuser','edbuser',1521);
dblink_ora_connect
--------------------
OK
(1 row)
In the example, dblink_ora_connect establishes a connection to an Oracle database with the user-specified connection information. Later, using the link name ('oraconn' in my case), we can perform operations like SELECT, INSERT, DELETE, UPDATE and COPY using the dblink_ora* functions. All the functions are described in the EnterpriseDB documentation here.
All of the above methods can be very handy if you are working on migration projects. Hope this is helpful. Thank you.
--Raghav
Magnus Hagander: www.postgresql.org is now https only
We've just flipped the switch on www.postgresql.org to be served on https only. This has been done for a number of reasons:
- In response to popular request
- Google, and possibly other search engines, have started to give higher scores to sites on https, and we were previously redirecting accesses to cleartext
- Simplification of the site code, which now doesn't have to keep track of which pages need to be secure and which do not
- Prevention of evil things like WiFi hotspot providers injecting ads or javascript into the pages
We have not yet enabled HTTP Strict Transport Security, but will do so in a couple of days once we have verified all functionality. We have also not enabled HTTP/2 yet; this will probably come at a future date.
Please help us out with testing this, and let us know if you find something that's not working, by emailing the pgsql-www mailinglist.
There are still some other postgresql.org websites that are not available over https, and we will be working on those as well over the coming weeks or months.
Umair Shahid: Using Hibernate Query Language (HQL) with PostgreSQL
In my previous blog, I talked about using Java arrays to talk to PostgreSQL arrays. This blog is going to go one step further in using Java against the database. Hibernate is an ORM implementation available to use with PostgreSQL. Here we discuss its query language, HQL.
The syntax for HQL is very close to that of SQL, so anyone who knows SQL should be able to ramp up very quickly. The major difference is that rather than addressing tables and columns, HQL deals in objects and their properties. Essentially, it is a complete object oriented language to query your database using Java objects and their properties. As opposed to SQL, HQL understands inheritance, polymorphism, & association. Code written in HQL is translated by Hibernate to SQL at runtime and executed against the PostgreSQL database.
An important point to note here is that references to objects and their properties in HQL are case-sensitive; all other constructs are case-insensitive.
Why Use HQL?
The main driver to using HQL would be database portability. Because its implementation is designed to be database agnostic, if your application uses HQL for querying the database, you can interchange the underlying database by making simple changes to the configuration XML file. As opposed to native SQL, the actual code will remain largely unchanged if your application starts talking to a different database.
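As a hedged sketch of what that portability looks like in practice, the snippet below sets the PostgreSQL-specific pieces programmatically (these properties normally live in the configuration XML file); the connection details are placeholders, not values from this article.

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class PortableConfigSketch {
    public static SessionFactory buildSessionFactory() {
        Configuration cfg = new Configuration()
            // Only these database-specific settings change when you swap the backend;
            // the HQL in the application stays the same.
            .setProperty("hibernate.dialect", "org.hibernate.dialect.PostgreSQL9Dialect")
            .setProperty("hibernate.connection.driver_class", "org.postgresql.Driver")
            .setProperty("hibernate.connection.url", "jdbc:postgresql://localhost/postgres")
            .setProperty("hibernate.connection.username", "postgres")
            .setProperty("hibernate.connection.password", "secret");
        return cfg.buildSessionFactory();
    }
}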
Prominent Features
A complete list of features implemented by HQL can be found on their website. Here, we present examples of some basic and salient features that will help you get going with HQL. These examples use a table named ‘largecities’ that lists the 10 largest metropolitan areas of the world. The table definition and data are:
postgres=# \d largecities
        Table "public.largecities"
 Column |          Type          | Modifiers
--------+------------------------+-----------
 rank   | integer                | not null
 name   | character varying(255) |
Indexes:
    "largecities_pkey" PRIMARY KEY, btree (rank)
postgres=# select * from largecities;
 rank |    name
------+-------------
    1 | Tokyo
    2 | Seoul
    3 | Shanghai
    4 | Guangzhou
    5 | Karachi
    6 | Delhi
    7 | Mexico City
    8 | Beijing
    9 | Lagos
   10 | Sao Paulo
(10 rows)
HQL works with a class that this table is mapped to in order to create objects in memory with its data. The class is defined as:
@Entity
public class LargeCities {
    @Id
    private int rank;
    private String name;

    public int getRank() { return rank; }
    public String getName() { return name; }
    public void setRank(int rank) { this.rank = rank; }
    public void setName(String name) { this.name = name; }
}
Notice the @Entity and @Id annotations, which declare the class ‘LargeCities’ as an entity and the property ‘rank’ as its identifier.
The FROM Clause
The FROM clause is used if you want to load all rows of the table as objects in memory. The sample code given below retrieves all rows from table ‘largecities’ and lists out the data from objects to stdout.
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("FROM LargeCities");
    List<LargeCities> cities = (List<LargeCities>)query.list();
    session.close();
    for (LargeCities c : cities)
        System.out.println(c.getRank() + " " + c.getName());
} catch (Exception e) {
    System.out.println(e.getMessage());
}
Note that ‘LargeCities’ referred to in the HQL query is not the ‘largecities’ table but rather the ‘LargeCities’ class. This is the object oriented nature of HQL.
Output from the above program is as follows:
1 Tokyo
2 Seoul
3 Shanghai
4 Guangzhou
5 Karachi
6 Delhi
7 Mexico City
8 Beijing
9 Lagos
10 Sao Paulo
The WHERE Clause
There can be instances where you would want to specify a filter on the objects you want to see. Taking the above example forward, you might want to see just the top 5 largest metropolitans in the world. A WHERE clause can help you achieve that as follows:
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("FROM LargeCities WHERE rank < 6");
    List<LargeCities> cities = (List<LargeCities>)query.list();
    session.close();
    for (LargeCities c : cities)
        System.out.println(c.getRank() + " " + c.getName());
} catch (Exception e) {
    System.out.println(e.getMessage());
}
Output from the above code is:
1 Tokyo
2 Seoul
3 Shanghai
4 Guangzhou
5 Karachi
The SELECT Clause
The plain FROM query above retrieves all columns from the table as properties of the object in Java. There are instances where you would want to retrieve only selected properties rather than all of them. In such a case, you can specify a SELECT clause that identifies the precise columns you want to retrieve.
The code below selects just the city name for retrieval. Note that, because it is now just one column being retrieved, Hibernate loads the result as a list of Strings rather than a list of LargeCities objects.
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("SELECT name FROM LargeCities");
    List<String> cities = (List<String>)query.list();
    session.close();
    for (String c : cities)
        System.out.println(c);
} catch (Exception e) {
    System.out.println(e.getMessage());
}
The output of this code is:
Tokyo
Seoul
Shanghai
Guangzhou
Karachi
Delhi
Mexico City
Beijing
Lagos
Sao Paulo
Named Parameters
Much like prepared statements, you can have named parameters through which you can use variables to assign values to HQL queries at runtime. The following example uses a named parameter to find out the rank of ‘Beijing’.
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("SELECT rank FROM LargeCities WHERE name = :city_name");
    query.setParameter("city_name", "Beijing");
    List<Integer> rank = (List<Integer>)query.list();
    session.getTransaction().commit();
    session.close();
    for (Integer c : rank)
        System.out.println("Rank is: " + c.toString());
} catch (Exception e) {
    System.out.println(e.getMessage());
}
Output for this code:
Rank is: 8
Pagination
When programming, numerous scenarios present themselves where code is required to be processed in chunks or pages. The process is called pagination of data and HQL provides a mechanism to handle that with a combination of setFirstResult and setMaxResults, methods of the Query interface. As the names suggest, setFirstResult allows you to specify which record should be the starting point for record retrieval while setMaxResults allows you to specify the maximum number of records to retrieve. This combination is very helpful in Java or in web apps where a large result set is shown split into pages and the user has the ability to specify the page size.
The following code breaks up our ‘largecities’ examples into 2 pages and retrieves data for them.
try {
    SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    Query query = session.createQuery("FROM LargeCities");
    query.setFirstResult(0);
    query.setMaxResults(5);
    List<LargeCities> cities = (List<LargeCities>)query.list();
    System.out.println("*** Page 1 ***");
    for (LargeCities c : cities)
        System.out.println(c.getRank() + " " + c.getName());
    query.setFirstResult(5);
    cities = (List<LargeCities>)query.list();
    System.out.println("*** Page 2 ***");
    for (LargeCities c : cities)
        System.out.println(c.getRank() + " " + c.getName());
    session.close();
} catch (Exception e) {
    System.out.println(e.getMessage());
}
An important point to keep in mind here is that Hibernate usually does pagination in memory rather than at the database query level. This means that for large data sets, it might be more efficient to use cursors, temp tables, or some other construct for pagination.
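One hedged alternative, sketched below using the LargeCities mapping from earlier, is to stream rows through Hibernate's ScrollableResults instead of paging them in memory; whether this actually results in a server-side cursor depends on the driver's fetch-size and transaction settings.

import org.hibernate.Query;
import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;

public class ScrollSketch {
    public static void printAll(Session session) {
        session.beginTransaction();                      // keep a transaction open while streaming
        Query query = session.createQuery("FROM LargeCities");
        query.setFetchSize(50);                          // hint: fetch rows from the server in chunks
        ScrollableResults rows = query.scroll(ScrollMode.FORWARD_ONLY);
        while (rows.next()) {
            LargeCities c = (LargeCities) rows.get(0);   // one entity at a time, not a full list
            System.out.println(c.getRank() + " " + c.getName());
        }
        rows.close();
        session.getTransaction().commit();
    }
}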
Other Features
A comprehensive list of features is available on the Hibernate website, but a few more worth mentioning here are (a short sketch of two of them follows the list):
- UPDATE Clause
- DELETE Clause
- INSERT Clause
- JOINs
- Aggregate Methods
- avg
- count
- max
- min
- sum
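As a small, hedged illustration of a couple of these features (an aggregate and a bulk UPDATE), using the LargeCities entity from above; the query text is invented for the example:

import org.hibernate.Query;
import org.hibernate.Session;

public class OtherFeaturesSketch {
    public static void demo(Session session) {
        session.beginTransaction();

        // Aggregate method: count the mapped entities.
        Query countQuery = session.createQuery("SELECT count(c) FROM LargeCities c");
        Long total = (Long) countQuery.uniqueResult();
        System.out.println("Number of cities: " + total);

        // Bulk UPDATE clause: executeUpdate() returns the number of affected rows.
        Query update = session.createQuery(
                "UPDATE LargeCities SET name = upper(name) WHERE rank > :r");
        update.setParameter("r", 5);
        System.out.println("Updated rows: " + update.executeUpdate());

        session.getTransaction().commit();
    }
}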
Drawbacks of Using HQL
HQL gives its users a lot of flexibility and a rich set of options to use while talking to a database. The flexibility does come at a price, however. Because HQL is designed to be generic and largely database-agnostic, you should watch out for the following when using it.
- At times, you will need to use advanced features and functions that are specific to PostgreSQL. As an example, you might want to harness the power of the newly introduced JSONB data type, or use window functions to analyze your data. Because HQL tries to be as generic as possible, you will need to fall back to native SQL in order to use such features (a sketch follows this list).
- Because of the way left joins are designed, if you are joining an object to another table / object in a one-to-many or many-to-many format, you can potentially get duplicate data. This problem is exacerbated in case of cascading left joins and HQL has to preserve references to these duplicates, essentially ending up transferring a lot of duplicate data. This has the potential to significantly impact performance.
- Because HQL does the object-relational mapping itself, you don’t get full control over how and what data gets fetched. One such infamous issue is the N+1 problem. Although you can find workarounds within HQL, identifying the problem can at times get very tricky.
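For the first point, here is a hedged sketch of dropping down to native SQL from the same session, using a window function over the largecities table from the earlier examples:

import java.util.List;
import org.hibernate.SQLQuery;
import org.hibernate.Session;

public class NativeQuerySketch {
    public static void demo(Session session) {
        session.beginTransaction();
        // createSQLQuery() sends the text to PostgreSQL as-is, so database-specific
        // features such as window functions (or JSONB operators) are available.
        SQLQuery query = session.createSQLQuery(
                "SELECT name, row_number() OVER (ORDER BY rank DESC) AS rn FROM largecities");
        List<Object[]> rows = (List<Object[]>) query.list();
        for (Object[] row : rows)
            System.out.println(row[0] + " -> " + row[1]);
        session.getTransaction().commit();
    }
}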
Pavel Stehule: Orafce 3.3.0 was released
Nikolay Shaplov: postgres: reloption ALTER INDEX bug
For example, if you do this for a bloom index:
alter index bloomidx set ( length=15 );
postgres will run this successfully and change the value of the reloptions attribute in pg_class, but the bloom index will work incorrectly afterwards.
And there is no way to forbid this from inside an extension.
I think I would add a flag to the reloption descriptor that tells whether it is allowed to change that reloption using ALTER INDEX or not.