Channel: Planet PostgreSQL

Egor Spivac: How to get some information about PostgreSQL structure (Part 3)


Constraints

Get constraints list for current database:
SELECT DISTINCT (pc2.relname || '.' || r.conname) AS fullname, 
r.conname AS constraint_name,
r.contype AS constraint_type,
r.condeferrable AS is_deferrable,
r.condeferred AS is_deferred,
r.confupdtype AS update_action,
r.confdeltype AS delete_action,
pc1.relname AS foreign_table,
pc2.relname AS this_table,
kcu1.constraint_schema AS this_schema,
pg_catalog.pg_get_constraintdef(r.oid, true) as sqlstr
FROM pg_constraint AS r
LEFT JOIN pg_class AS pc1 ON pc1.oid = r.confrelid
LEFT JOIN pg_class AS pc2 ON pc2.oid = r.conrelid
LEFT JOIN information_schema.key_column_usage AS kcu1 ON 
(kcu1.table_name=pc2.relname AND kcu1.constraint_name=r.conname)
ORDER BY 1;

Tablespaces

Get tablespaces list:
SELECT spcname AS name FROM pg_tablespace ORDER BY spcname ASC

Views

Get all views for schema public with sql code:
SELECT c.oid, c.xmin, c.relname, pg_get_userbyid(c.relowner) AS viewowner,  
c.relacl, description, pg_get_viewdef(c.oid, true) AS code 
FROM pg_class c
LEFT OUTER JOIN pg_description des ON (des.objoid=c.oid and des.objsubid=0)
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE (
(c.relhasrules AND (EXISTS (
SELECT r.rulename FROM pg_rewrite r
WHERE ( (r.ev_class = c.oid) AND (bpchar(r.ev_type) = '1'::bpchar))
))
)
OR (c.relkind = 'v'::char))
AND n.nspname='public'
ORDER BY relname ASC

See other parts:
Part 2
Part 1

Paul Ramsey: PostGIS Apologia

Nathaniel Kelso has provided feedback from an (occasionally disgruntled) users point-of-view about ways to make PostGIS friendlier. I encourage you to read the full post, since it includes explanatory material that I'm going to trim away here to explain the whys and wherefores of how we got to where we are.

TL;DR: philosophical reasons for doing things; historical reasons for doing things; not my problem; just never got around to that.
Request 1a: Core FOSS4G projects should be stable and registered with official, maintained APT Ubuntu package list.

Request 1b: The APT package distribution of core FOSS4G projects should work with the last 2 versions (equivalent to 2 years) of Ubuntu LTS support releases, not just the most recent cutting edge dot release.
Spoken like an Ubuntu user! I would put the list of "platforms that have enough users to require packaging support" at: Windows, OSX, Centos (RHEL), Fedora, Ubuntu, Debian, SUSE. Multiply by 2 for 32/64 bit support, and add a few variants for things like multiple OSX package platforms (MacPorts, HomeBrew, etc). Reality: the PostGIS team isn't going to do this, people who want support for their favourite platform have to do it themselves.

The only exception to this rule is Windows, which Regina Obe supports, but that's because she's actually a dual category person: a PostGIS developer who also really wants her platform supported.

The best Linux support is for Red Hat variants, provided by Devrim Gunduz in the PostgreSQL Yum repositories. I think Devrim's example is actually the best one, since it takes a PostgreSQL packager to do a really bang up job of packaging a PostgreSQL add-on like PostGIS. Unfortunately the Ubuntu PostgreSQL packager doesn't do PostGIS as well.
Request 1c: Backport key bug fixes to the prior release series
This is actually done as a matter of course. If you know of a fix that is not backported, ticket it. In general, if you put your tickets against the earliest milestone they apply to, the odds of a fix hitting all extant versions go up, since the developer doesn't have to go back and confirm the bug is historical rather than new to the development version. The only fixes that might not get done are ones that can't be done without major code re-structuring, since that kind of thing tends to introduce as many problems as it solves.
Request 2.1a: Include a default PostGIS spatial database as part of the basic install, called “default_postgis_db” or something similar.
This is a packaging issue, and some packagers (Windows in particular, but also the OpenGeo Suite) include a template_postgis database, since it makes it easier to create new spatial databases (create database foo template template_postgis).
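A minimal sketch of that pattern, assuming your packager has already created a template_postgis database (the database name foo is just an example):

CREATE DATABASE foo TEMPLATE template_postgis;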

Anyways, as a packaging issue, unless the PostGIS team took on all packaging there would be no way to ensure this happened in a uniform way everywhere, which is what one would need for it to actually make things easier (for it to become general knowledge, so that "oh, just use the ______ database" became global advice).

More on creating spatial databases below.
Request 2.1b: Include a default PostGIS Postgres user as part of the basic install, called “postgis_user” or something similar.
I'm not sure I see the utility of this. From a data management point of view, you already have the PostgreSQL super user, postgres, around as a guaranteed-to-exist default user.
Request 2.1c: If I name a spatially enabled database in shp2pgsql that doesn’t yet exist, make one for me
Unless you have superuser credentials I can't do this. So, maybe?
Request 2.1d: It’s too hard to manually setup a spatial database, with around a printed page of instructions that vary with install. It mystifies Postgres pros as well as novices.
Indeed it is! I will hide behind the usual defence, of course, "it's not our fault!" It's just the way PostgreSQL deals with extensions, including their own (load pgcrypto, for example, or fuzzystrmatch). The best hack we have is the packaging hack that pre-creates a template_postgis, which works pretty well.

Fortunately, as of PostgreSQL 9.1+ and PostGIS 2.0+ we have the "CREATE EXTENSION" feature, so from here on in, spatializing (and unspatializing (and upgrading)) a spatial database will be blissfully easy: just CREATE EXTENSION postgis (and DROP EXTENSION postgis (and ALTER EXTENSION postgis UPDATE TO 2.1.0)).
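That workflow, as a sketch for PostgreSQL 9.1+ with PostGIS 2.0+ (the target version number is only illustrative):

CREATE EXTENSION postgis;
ALTER EXTENSION postgis UPDATE TO '2.1.0';
DROP EXTENSION postgis;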
Request 2.1e: Default destination table names in shp2pgsql.
We have this, I just checked (1.5 and 2.0). The usage line indicates it, and it actually happens. I'm pretty sure it's worked this way for a long time too, it's not a new thing.
Request 2.1f: Automatically pipe the output to actually put the raw SQL results into PostGIS.
I'll plead historical legacy on this one. The first version (c. 2001) of the loader was just a loader, no dumper, so adding in a database connection would have been needless complexity: just pipe it to psql, right?

Then we got a dumper, so now we had database connection logic lying around, but the loader had existing semantics and users. Also the code was crufty and it would have had to be re-written to get a direct load.

Then we got a GUI (1.5), and that required re-writing the core of the loader to actually do a direct database load. But we wanted to keep the commandline version working the same as before so our existing user base wouldn't get a surprise change. So at this point doing a direct database loader is actually trivial, but we deliberately did not, to avoid tossing a change at our 10 years of legacy users.

So this is very doable, the question is whether we want to make a change like this to a utility that has been unaltered for years.

Incidentally, from an easy-to-use-for-newbies point of view the GUI is obviously way better than the command line. Why not use that? It's what I use in all my PostGIS courses now.
Request 2.1g: If my shapefile has a PRJ associated with it (as most do), auto populate the -s option.
You have no idea how long I've wanted to do this. A very long time. It is, however, very hard to do. PRJ files don't come (except the ones generated by GeoTools) with EPSG numbers in them. You have to figure out the numbers by (loosely) comparing the values in the file to the values in the full EPSG database. (That's what the http://prj2epsg.org web site does.)

Now that we've added GDAL as a dependency in 2.0, we do at least have access to an existing PRJ WKT parser. However, I don't think the OGR API provides enough hooks to do something like load up all the WKT definitions in spatial_ref_sys (which is what we'll have to do regardless) and search through them with sufficient looseness.

So this remains an area of active research. Sadly, it's probably not something that anyone will ever fund, which means, given the level of effort necessary to make it happen, it probably won't happen.
Related 2.1h Projection on the fly: If you still can’t reproject data on the fly, something is wrong. If table X is in projection 1 (eg web merc) and table Y is in projection 2 (eg geographic), PostGIS ought to “just work”, without me resorting to a bunch of ST_Transform commands that include those flags. The SRID bits in those functions should be optional, not required.
Theoretically possible, but it has some potentially awful consequences for performance. You can only do index-assisted things with objects that share an SRS (SRID), since the indexes are built in one SRS space. So picking a side of an argument and pushing it into the same SRS as the other argument could cause you to miss out on an index opportunity. It's perhaps worth thinking more about, though, since people with heterogeneous SRID situations will be stuck in low-performing situations whether we auto-transform or not.

The downside of all such "automagic" is that it leads people into non-optimal set-ups very naturally, so they end up wondering why PostGIS sucks for performance when actually it is their data setup that sucks.
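For illustration, here is the explicit version that auto-transforming would hide (table and column names are made up); note that the ST_Transform on y.geom means an index on that column, built in its own SRS, can't help:

SELECT x.id, y.id
FROM table_x x  -- stored in web mercator (SRID 3857)
JOIN table_y y  -- stored in geographic (SRID 4326)
  ON ST_Intersects(x.geom, ST_Transform(y.geom, 3857));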
Request 2.1i: Reasonable defaults in shp2pgsql import flags.
Agree 100%. Again, we're just not changing historical defaults fast enough. The GUI has better defaults, but it wouldn't hurt for the commandline to have them too.
Request 2.1j: Easier creation of point features from csv or dbf.
A rat-hole of unknowable depth (CSV handling, etc.), but agreed, it would be a really common and useful utility. I just write a new perl script every time :)
Request 2.3a: Forward compatible pgdumps. Dumps from older PostGIS & Postgres combinations should always import into newer combinations of PostGIS and Postgres.
Upgrade has been ugly for a long time, and again it's "not our fault", in that until PostgreSQL 9.1, pg_dump always included our functions in the dump files. If you strip out the PostGIS function signature stuff, it's easy to get a clean and quiet restore into new versions, since we happily read old dumped PostGIS data and always have.

If you don't mind a noisy restore it's also always been possible to just drop a dump onto a new database and ignore the errors as function signatures collide and get a good restore.

With "CREATE EXTENSION" in PostgreSQL 9.1, we will now finally be able to pg_dump clean dumps that don't include the function information, so this story more or less goes away.
Request 2.3b: Offer an option to skip PostGIS simple feature topology checks when importing a pgdump.
It's important to note that there are two levels of validity checking in PostGIS. One level is "dumbass validity checking", which can happen at parse time. Do rings close? Do linestrings have more than one point? That kind of thing. For a brief period in PostGIS history we had some ugly situations where it was possible to create or ingest dumbass geometry through one code path and impossible to output it or ingest it through others. This was bad and wrong. It's hopefully mostly gone. We should now mostly ingest and output dumbass things, because those things do happen. We hope you'll clean or remove them at a later time, though.

Be thankful we aren't ArcSDE, which not only doesn't accept dumbass things, it doesn't accept anything that fails any rule of their whole validity model.
Request 3a: Topology should only be enforced as an optional add on, even for simple Polygon geoms. OGC’s view of polygon topology for simple polygons is wrong (or at the very least too robust).

Request 3b: Teach PostGIS the same winding rule that allows graphics software to fill complex polygons regarding self-intersections. Use that for simple point in polygon tests, etc. Only force me to clean the geometry for complicated map algebra.

Request 3c: Teach OGC a new trick about “less” simple features.

Request 3d: Beyond the simple polygon gripe, I’d love it if GEOS / PostGIS could become a little more sophisticated. Adobe Illustrator for several versions now allows users to build shapes using their ShapeBuilder tool where there are loops, gaps, overshoots, and other geometry burrs. It just works. Wouldn’t that be amazing? And it would be even better than ArcGIS.
We don't enforce validity, we just don't work very well if it's not present.

Most of these complaints stem presumably from working with Natural Earth data which, since it exists, is definitionally "real world" data, but also includes some of the most unbelievably degenerate geometry I have ever experienced.

Rather than build special cases to handle degeneracy into every geometry function in the system, the right approach, IMO, is to build two functions that convert degenerate data into structured data that captures the "intent" of the original.

One function, ST_MakeValid, has already made an appearance in PostGIS 2.0. It isn't 100% perfect, but it makes a good attempt and fixes many common invalidities that previously we had no answer for beyond "you're SOL". ST_MakeValid tries to fix invalidity without changing the input vertices at all.

The second function, ST_MakeClean, does not exist yet. ST_MakeClean would do everything ST_MakeValid does, but would also include a tolerance factor to use in tossing out unimportant structures (little spikes, tiny loops, minor ring crossings) that aren't part of the "intent" of the feature.
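As a hedged sketch of how the function that does exist today is typically applied (table and column names are made up; ST_MakeClean, as noted, is still hypothetical):

UPDATE parcels
   SET geom = ST_MakeValid(geom)
 WHERE NOT ST_IsValid(geom);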

Josh Berkus: Launchpad minimizes downtime using 9.1 replication

Bruce Momjian: Unix Domain Socket Location


All Postgres servers support TCP/IP connections, including localhost connections that allow clients to connect to servers on the same machine. Unix-like operating systems also support local or Unix-domain socket connections. These connections do not use the TCP/IP stack but rather a more efficient stack for local connections. (I previously showed that Unix-domain socket communication is measurably faster.)

Unix-domain socket connections require a socket file in the local file system. These are not normal files but more like entry points to listening servers. By default, Postgres places these socket files in the /tmp directory:

srwxrwxrwx  1 postgres postgres    0 Jul 30 20:27 .s.PGSQL.5432=
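If you are unsure where your server puts the socket file, the setting can be checked from SQL; the parameter is unix_socket_directory in these releases (later versions rename it to unix_socket_directories):

SHOW unix_socket_directory;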

Continue Reading »

David Fetter: Give me your tired, your poor...

Josh Berkus: My Autumn Travel Schedule

Just updating folks on where I'll be and what I'll be presenting, in case anyone wants to say "hello" or buy me a beer:

August: LinuxCon, San Diego.  Presenting "The Accidental DBA".

September: Postgres Open, Chicago.  Presenting an updated "Super Jumbo Deluxe".    Also, doing a big PostgreSQL 9.2 talk for SFPUG.

October: pgconf.EU.  Not sure what I'll be presenting; talk acceptances aren't final yet.  But, hey, Prague!

November: no conferences, thank goodness.

December: Back to San Diego for Usenix LISA.  Working a booth and doing a Guru Session with Selena Deckelmann and Joe Conway, and likely a BOF as well.  Drop by the booth!   Possible San Francisco pgDay in December; watch this space for more information.

Leo Hsu and Regina Obe: Feature or Frustration


Lately I'm reminded that one person's feature is another person's frustration. I've been following Paul's PostGIS Apologia detailing why things are done a certain way in PostGIS, in response to Nathaniel Kelso's: A friendlier PostGIS? Top three areas for improvement. I've also been following Henrik Ingo: comparing Open Source GIS Implementation to get a MySQL user's perspective on PostGIS / PostgreSQL. Jo Cook has some interesting thoughts as well in her PostGIS for beginners amendment to Paul's comments. I have to say that Nathaniel, Henrik, Jo, and the commenters on those entries all have overlapping frustrations with PostgreSQL and PostGIS. The number one frustration is caused by how bad a job we do at pointing out avenues to get a friendly installation experience. I do plan to change this by at least documenting the most popular PostGIS package maintainers soon.

One of the things that Henrik mentioned was his frustration with trying to install PostGIS via the Yum PostgreSQL repository, and in fact not even knowing about the PostgreSQL Yum repository that has current and bleeding-edge versions of PostgreSQL. I was surprised he didn't know, because as a long-time user of PostgreSQL I dismissed this as common knowledge. This made me realize just how out of touch I've become with my former newbie self, and I consider this a very bad thing. I was also very surprised about another feature he complained about - CREATE EXTENSION did not work for him because he accidentally installed the wrong version of PostGIS in his PostgreSQL 9.1. The main reason for his frustration was something I thought was a neat feature of PostGIS: that PostGIS is not packaged into PostgreSQL core, and you can in fact have various versions of PostGIS installed in the same PostgreSQL cluster. Unlike the OGC spatial offerings of other databases (SQL Server, Oracle, MySQL), this allows the PostGIS dev group to work on their own time schedule largely apart from PostgreSQL development group pressures. It also means we can take advantage of breaking changes introduced in PostGIS 2.+, for example, without impacting existing apps people have running 1.5, and it allows people to take advantage of newer features even if they are running an earlier PostgreSQL version.


Continue reading "Feature or Frustration"

Selena Deckelmann: Submissions for Lightning Talks for Postgres Open being accepted


By popular demand, we’re having a session of lightning talks at Postgres Open this year!

What is a lightning talk, you ask? It’s a 5-minute talk on a topic of your choosing. (For this conference, it should be at least vaguely postgres- or database-related.) Make it as serious or entertaining as you like. If you’ve never given a talk at a conference before, this is a great way to try it out. The audience is forgiving, and it’s only 5 minutes!

Slides are not required, but are helpful.

The session will be 5pm – 6pm on Tuesday, Sept 18.
Sign up today! (https://docs.google.com/spreadsheet/viewform?formkey=dGlBcWxmeVFZZjFfenFOdXAxWjFqZVE6MQ#gid=0)

There’s a limited number of spaces, so get your talks in now! :)

(Many thanks to Gabrielle for writing this blog post!)

(And psst – don’t forget to buy your tickets! :)


Tatsuo Ishii: Pgpool-II talk at PostgreSQL Conference Europe 2012

I'm going to give a pgpool-II talk at the upcoming PostgreSQL Conference Europe 2012. The talk is titled "Boosting performance and reliability by using pgpool-II", and I will explain how to enhance DB performance by using pgpool-II 3.2's "on memory query cache". I will also explain how to set up "watchdog", which is another new feature of pgpool-II 3.2. By using it, you can avoid the SPOF (single point of failure) problem of pgpool-II itself without using extra HA software.

The conference will be held in Prague, the Czech Republic, October 23-26. I've never been to Prague, and am excited at this opportunity to visit the beautiful old city!

Andreas Scherbaum: New PostgreSQL pg_docbot is live


Last night a long-running project of mine went live: pg_docbot v2.

For years, Jan Wieck provided a helper bot (rtfm_please) in the #postgresql IRC channel on the freenode network. Because of protocol changes in the freenode network, this bot was no longer functional. Together with some others, we decided to write a quick and dirty new bot. As it is with dirty hacks, not everything was optimal: after timeouts the bot was not able to reconnect - more precisely, the POE framework did not even recognize the timeout. Extending the bot and adding new functionality was also complicated. For a while I collected all these problems in my personal bugtracker, and about two years ago I started a full rewrite.

Some of the new key features:

  • pg_docbot's channel limit is gone: a user on the freenode network can only join 20-something channels; the new bot was designed from the ground up to handle multiple IRC connections and circumvent this problem
  • function to identify stale urls: the new ?lost command shows all unconnected urls
  • registered users are now either "op" or "admin": all operators can issue ?learn and ?forget, admins can - of course - do everything
  • new command to post to all channels: the ?wallchan command lets the bot post to all channels
  • i18n: every channel has a configured language, the default is English - all messages in this channel are posted in the configured language (if a translation is available)
  • watchdog on board: every session is monitored and reconnected, if necessary - no more "ads: can you please restart the bot?"
  • nickname handling: every session monitors its (registered) nickname and will reclaim the nick if necessary; nickserv handling is also included now
  • commands are recognized in different languages: a nice by-product of i18n, most commands can be used in different languages - like "search" (English) and "suche" (German)
  • bot can join and leave channels on the fly: not much to say about this, just that you can have the bot in a temporary PostgreSQL channel if you like
  • channels can have passwords now: this works both for configured channels as well as on-the-fly joined channels
  • autojoin channels: configured but not joined channels are rejoined after a while; it is also possible to configure channels without autojoining them
  • statistics: the bot keeps anonymous stats about its usage, like ?search, ?learn, ?forget and so on

There is still a lot to do, not all of my tickets are closed. If you want pg_docbot talking in your language, please send me translations. The pg_docbot code is on git.postgresql.org.

Next things on my todo list:

  • verify each URL from time to time: mark unreachable URLs as invalid
  • intelligent sort order: not yet sure how to solve this problem, right now there is no specific sort order
  • move pg_docbot to PostgreSQL infrastructure
  • web interface: the bot should redirect the user to its website if there are more than, say, 2 or 3 urls, to avoid flooding the IRC channels
  • integration in postgresql.org website: the pg_docbot database contains very useful knowledge, there are plans to integrate this into the search on the main website
  • integration with explain.depesz.com: every time the bot sees a link from a paste site, it should scan the content and generate a posting on explain.depesz.com
  • monitor planet.postgresql.org: publish new postings in IRC channels
  • allow better search: like using a regexp
  • ...



Continue reading "New PostgreSQL pg_docbot is live"

Josh Berkus: Today's Security Update: XML vulnerabilities

If you're subscribed to the PostgreSQL News feed (and if not, why aren't you?) you already know that we released another security patch today.  This update release patches two security holes having to do with XML functionality, which can be used by any authenticated user for privilege escalation.  Everyone should apply the updates at the next reasonable downtime, and users of contrib/xml2 should schedule a downtime this weekend.

You'll notice that this security update release is on a Friday.  This is not our normal practice, since it makes it difficult for IT departments to respond in a timely fashion (especially the folks in Australia/Japan, who will be getting the announcement on Saturday).  However, as our number of packagers for different platforms has grown, it's become increasingly difficult to schedule coordinated releases, especially during the summer and the India/Pakistan holiday season.  As it is, the Solaris folks will be getting their binaries late.

Anyway, about these vulnerabilities:  The first one is a vulnerability in the built-in xml_parse() function.  Among other things, this function used to allow users to validate the XML against a DTD contained in an external file.  However, a creative user could point it at any file which the "postgres" user has permissions on, and read parts of that file in the error messages from the DTD reader.  Since "files the postgres user has access to" includes pg_hba.conf and .pgpass (if defined), this is a dangerous capability.  As such, we have had to disable the DTD-validation feature.  This validation feature may return at a later date if we can figure out a reasonable way to make it safe.

The second security hole is in the outdated extension (or "contrib module"), xml2.  Users still use xml2 rather than the built-in XML because of its XSLT support.  The problem is that xslt_process() had a documented "feature" which allowed the function to transform XML documents fetched from external resources, such as URLs -- without requiring superuser permissions.  Not only could this be used to read local files on the PostgreSQL server, it could be used to write them as well, making this a much worse security hole if you have xml2 installed.   As such, the xslt_process() feature has been disabled, and will probably not return.

We've been sitting on both of these patches for an embarrassingly long time (for our project, at least), because we were looking for a solution which didn't involve disabling functionality which is potentially valuable to some users.

The best way to close both of these security holes is to apply the updates, of course.  It's just a 5-minute downtime, and the updates contain patches for a couple dozen minor issues which can cause data loss or downtimes in certain circumstances.  However, if you cannot apply the updates right away, the obvious workaround for the xml_parse() issue is to revoke EXECUTE permission on the function from "public" and all users.  This doesn't help you if you are using the XML features, though. 
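The same idea, as a hedged sketch applied to the contrib/xml2 function (this is not the exact workaround described above for xml_parse(); check the function names and signatures present in your installation with \df before revoking anything):

REVOKE EXECUTE ON FUNCTION xslt_process(text, text) FROM PUBLIC;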

gabrielle roth: My Picks for Postgres Open 2012

The schedule for Postgres Open is out! Here are the talks I’m planning on attending:

Tuesday:
  • Range Types in PostgreSQL 9.2 (Jonathan S Katz)
  • Large Scale MySQL Migration to PostgreSQL (Dimitri Fontaine) - because I love case studies
  • 12 Calm Years of PostgreSQL in Critical Messaging (John Scott) - another case study
  • This is PostGIS (Paul Ramsey) [...]

Joel Jacobson: Production upgrade from 8.4 to 9.1


On Wednesday, 2012-08-15, at 05:00 am, the work began. Four hours of downtime later, at 09:00 am, the upgrade was complete.
Two brand new identical HP DL380 Gen8 servers with 400GB SSD-disks and 192GB RAM are now serving our customers.

This was one of the most nerve-racking moments of my life; a revert would have been impossible once the API was brought back online.
Luckily, it turned out we had done a decent job testing everything; only a few glitches caused some minor problems during the following hours.

Many thanks to Magnus Hagander from Redpill Linpro who helped out with the upgrade.
I had never seen pg_basebackup in action before; impressive and very user-friendly, it even had a nice progress bar!

Trustly Group AB is a very PostgreSQL-centric payment company, so a moment like this of course called for some cake and champagne!

PostgreSQL upgrade from 8.4 to 9.1

It’s been very frustrating to see all the new cool features in 9.0 and 9.1 while being stuck on 8.4.
What I’ve been looking forward to the most:

  • Streaming replication to off-load work from the master to slaves in Hot Standby mode
  • Synchronous replication to get rid of DRBD
  • Calling functions with named parameters (see the sketch below)
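A minimal sketch of that last item, with an illustrative function; named notation in these releases uses the := syntax:

CREATE FUNCTION charge(amount numeric, currency text DEFAULT 'SEK')
RETURNS text AS $$
    SELECT $1::text || ' ' || $2;
$$ LANGUAGE sql;

SELECT charge(amount := 100, currency := 'EUR');
SELECT charge(amount := 250);  -- currency falls back to its default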

Hubert 'depesz' Lubaczewski: Waiting for 9.3 – Implement SQL-standard LATERAL subqueries.

    On 7th of August, Tom Lane committed patch: Implement SQL-standard LATERAL subqueries.   This patch implements the standard syntax of LATERAL attached to a sub-SELECT in FROM, and also allows LATERAL attached to a function in FROM, since set-returning function calls are expected to be one of the principal use-cases.   The main change here [...]
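A minimal sketch of the sub-SELECT form (table and column names are made up): for each row of the outer table, the LATERAL subquery can reference that row's columns.

SELECT o.id, best.*
FROM orders o,
     LATERAL (
         SELECT oi.product, oi.price
         FROM order_items oi
         WHERE oi.order_id = o.id
         ORDER BY oi.price DESC
         LIMIT 3
     ) AS best;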

    Jim Mlodgenski: Philly PUG first meeting scheduled


    Szymon Guz: Using Different PostgreSQL Versions at The Same Time.


    When I work for multiple clients on multiple different projects, I usually need a bunch of different stuff on my machine. One of the things I need is having multiple PostgreSQL versions installed.

I use Ubuntu 12.04. Installing PostgreSQL there is quite easy. Currently two versions are available out of the box: 8.4 and 9.1. To install them I used the following command:

    ~$ sudo apt-get install postgresql-9.1 postgresql-8.4 postgresql-client-common
    

Now I have the above two versions installed.

    Starting the database is also very easy:

    ~$ sudo service postgresql restart
     * Restarting PostgreSQL 8.4 database server   [ OK ] 
     * Restarting PostgreSQL 9.1 database server   [ OK ] 
    

The problem I had for a very long time was using the proper psql version. Both databases installed their own programs like pg_dump and psql. Normally you can use pg_dump from the higher PostgreSQL version; however, using different psql versions can be dangerous, and you'd better not mix them up. Psql uses a lot of queries which dig deep into the PostgreSQL internal tables to get information about the database. Those internals sometimes change from one database version to another, so the best solution is to use the psql from the PostgreSQL installation you want to connect to.

The solution to this problem turned out to be quite simple. There is a pg_wrapper program which can take care of the different versions. It is enough to provide information about the PostgreSQL version you want to connect to, and it will automatically choose the correct psql version.

Below you can see the results of using the psql --version command, which prints the psql version. As you can see, different psql versions are chosen according to the --cluster parameter.

    ~$ psql --cluster 8.4/main --version
    psql (PostgreSQL) 8.4.11
    contains support for command-line editing
    ~$ psql --cluster 9.1/main --version
    psql (PostgreSQL) 9.1.4
    contains support for command-line editing
    

You can find more information in the program manual using man pg_wrapper, or in the pg_wrapper manual.

    Andrew Dunstan: Adding an "if not exists" option when adding an enum label.

Back in May I blogged about complaints that adding a label to an enum type isn't transactional. Today I submitted a patch which will alleviate some of the pain that causes, by allowing an "if not exists" option to the command:
    andrew=# alter type foo add value 'a';
    ERROR:  duplicate key value violates unique constraint
    "pg_enum_typid_label_index"
    DETAIL:  Key (enumtypid, enumlabel)=(16386, a) already exists.
    andrew=# alter type foo add value if not exists 'a';
    ALTER TYPE
    

    David Wheeler: Sqitch: Depend On It!


    Sqitch v0.90 dropped last week (updated to v0.902 today). The focus of this release of the “sane database change management” app was cross-project dependencies. Jim Nasby first put the idea for this feature into my head, and then I discovered that our first Sqitch-using project at work needs it, so blame them.

    Read More »

    Chris Travers: Intro to PostgreSQL as Object-Relational Database Management System

    This is a very brief intro to PostgreSQL as an object-relational database management system.   In future blog posts, we will look at more hands-on examples of these features in action.  Keep in mind these are advanced features typically used by advanced applications.

This is a very brief guide to the concepts we will be looking at more deeply in future posts, tying them together in recipes and examples.   While PostgreSQL was initially designed to explore object-relational modelling possibilities, the toolkit today is somewhat different from what was initially intended, and therefore the focus of this series will be how to use PostgreSQL in an Object-Relational manner, rather than tracking the history of various components.

    How is PostgreSQL "Object-Relational?"

    The term Object-Relational has been applied to databases which attempt to bridge the relational and object-oriented worlds with varying degrees of success.  Bridging this gap is typically seen as desirable because object-oriented and relational models are very different paradigms and programmers often do not want to switch between them.  There are, however, fundamental differences that make this a very hard thing to do well.  The best way to think of PostgreSQL in this way is as a relational database management system with some object-oriented features.

By blending object-primitive and relational models, it is often possible to provide much more sophisticated data models than one can using the relatively limited standard types in SQL.  This can be done both as an interface between an application and database, and as intra-query logic.  In future posts I will offer specific examples of each concept and explore how PostgreSQL differs from Oracle, DB2, and Informix in this area.

PostgreSQL is a development platform in a box.  It supports stored procedures written in entirely procedural languages like PL/PGSQL or Perl without loaded modules, and more object-oriented languages like Python or Java, often through third party modules.  To be sure, you can't write a graphical interface inside PostgreSQL, and it would not be a good idea to write additional network servers, such as web servers, directly inside the database.  However, the environment allows you to create sophisticated interfaces for managing and transforming your data. Because it is a platform in a box, the various components need to be understood as different and yet interoperable.  In fact the primary concerns of object-oriented programming are all supported by PostgreSQL, but this is done in a way that is almost, but not quite, entirely unlike traditional object oriented programming.  For this reason the "object-relational" label tends to be a frequent source of confusion.

    Data storage in PostgreSQL is entirely relational, although this can be degraded using types which are not atomic, such as arrays, XML, JSON, and hstore.  Before delving into object-oriented approaches, it is important to master the relational model of databases.  For the novice, this section is therefore entirely informational.  For the advanced developer, however, it is hoped that it will prove inspirational.

In object-oriented terms, every relation is a class, but not every class is a relation.  Operations are performed on sets of objects (an object being a row), and new row structures can be created ad hoc.  PostgreSQL is, however, a strictly typed environment and so in many cases, polymorphism requires some work.

    Data Abstraction and Encapsulation in PostgreSQL


    The relational model itself provides some tools for data abstraction and encapsulation, and these features are taken to quite some length in PostgreSQL.  Taken together these are very powerful tools and allow for things like calculated fields to be simulated in relations and even indexed for high performance.

    Views are the primary tool here.  With views, you can create an API for your data which is abstracted from the physical storage.  Using the rules system, you can redirect inserts, updates, and deletes from the view into underlying relations, preferably using user defined functions.  Being relations, views are also classes and methods.  Views cannot simply be inherited and workarounds cause many hidden gotchas.

A second important tool here is the ability to define what appear to be calculated fields using stored procedures.  If I create a table called "employee" with three fields (first_name, middle_name, last_name), among others, and create a function called "name" which accepts a single employee argument and concatenates these together as "last_name, first_name middle_name", then if I submit a query which says:

    select e.name from employee e;

    it will transform this into:

    select name(e) from employee e;

    This gives you a way to do calculated fields in PostgreSQL without resorting to views. Note that these can be done on views as well because views are relations.  These are not real fields though.  Without the relation reference, it will not do the transformation (so SELECT name from employee will not have the same effect).
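A minimal sketch of that example, written with positional parameter references so it works as an old-style SQL-language function:

CREATE TABLE employee (
    first_name  text,
    middle_name text,
    last_name   text
    -- ... plus the other fields
);

CREATE FUNCTION name(employee) RETURNS text AS $$
    SELECT $1.last_name || ', ' || $1.first_name || ' ' || $1.middle_name;
$$ LANGUAGE sql;

SELECT e.name FROM employee e;  -- rewritten by PostgreSQL into name(e)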

    Messaging and Class API's in PostgreSQL


A relation is a class.  The class is accessed using SQL, which defines a new data structure in its output.  This data structure, unless defined elsewhere as a relation or a complex type, cannot have methods attached to it and therefore cannot be used with the class.method syntax described above.  There are exceptions to this rule, of course, but they are beyond the scope of this introduction.  In general it is safest to assume that the output of one query, particularly one with named output fields, cannot safely be used as the input to another.

A second messaging apparatus in PostgreSQL is the LISTEN/NOTIFY framework, which can be used along with triggers to issue notifications to other processes when a transaction commits.  This approach allows you to create queue tables, use triggers to move data into these tables (creating 'objects' in the process), and then issue a notification to another process when the data commits and becomes visible.  This allows very complex and interactive environments to be built from modular pieces.

    Polymorphism in PostgreSQL


    PostgreSQL is very extensible in terms of all sorts of aspects of the database.  Not only can types be created and defined, but also operators can be defined or overloaded.

A more important polymorphism feature is the ability to cast one data type as another.  Casts can be implicit or explicit.  Implicit casts, which have largely been removed from many areas of PostgreSQL, allow PostgreSQL to cast data types when necessary to find functions or operators that are applicable.  Implicit casting can be dangerous because minor errors can lead to unexpected results.  '2012-05-31' is not 2012-05-31.  The latter is an integer expression that reduces to 1976.  If you create an implicit cast that turns an integer into a date being the first of the year, the lack of quoting will insert incorrect dates into your database without raising an error ('1976-01-01' instead of the intended '2012-05-31').  Implicit casts can still have some uses.
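A two-line illustration of that quoting trap:

SELECT 2012-05-31;          -- integer arithmetic, evaluates to 1976
SELECT DATE '2012-05-31';   -- the intended date value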

    Inheritance in PostgreSQL


In PostgreSQL, tables can inherit from other tables.  Their methods are inherited, but implicit casts are not chained, nor are their indexes inherited.  This allows you to develop object inheritance hierarchies in PostgreSQL.  Multiple inheritance is possible, unlike any other ORDBMS that I have looked at on the market (Oracle, DB2, and Informix all support single inheritance).

    Table inheritance is an advanced concept and has many gotchas.  Please refer to the proper sections of the manual for more on this topic.  On the whole it is probably best to work with table inheritance first in areas where it is more typically used, such as table partitioning, and later look at it in terms of object-relational capabilities.
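A minimal sketch of the syntax (table names are made up); as noted above, indexes on the parent are not inherited by the child:

CREATE TABLE vehicle (
    id    serial PRIMARY KEY,
    maker text
);

CREATE TABLE car (
    doors int
) INHERITS (vehicle);

INSERT INTO car (maker, doors) VALUES ('Example Motors', 4);

SELECT maker FROM vehicle;  -- rows of car are visible through the parent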

Overall, the best way to look at PostgreSQL as an object-relational database is as a database which provides very good relational capabilities plus some advanced features that allow one to create object-relational systems on top of it.  These systems can then move freely between object-oriented and relational worldviews but are still more relational than object-oriented.  At any rate they bear little resemblance to object-oriented programming environments today.  With PostgreSQL this is very much a toolkit approach for object-relational databases, building on a solid relational foundation.  This means that these are advanced functions which are powerful in the hands of experienced architects, but may be skipped over at first.

    Bruce Momjian: Upcoming Conferences


    My conference schedule has solidified and I am presenting at events in Philadelphia, Chicago, Moscow, and Prague during the next two months.


