The Tipping Point
In PostgreSQL 9.1 and earlier, benchmarks that I and others did all showed that the optimal number of active database connections was usually somewhere around ((2 * core_count) + effective_spindle_count). Above this number, both throughput and latency got worse. In every version since then the tipping point has moved higher, but the effect is still present. In graphs you often see this with Transactions Per Second on the y axis and Concurrency (i.e., the number of active connections) on the x axis: a steep climb, followed by a "knee", and then a performance drop-off. The good news is that every major release for a while has moved the knee to the right and decreased the slope past the knee -- but the knee is still there.
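To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python of that 9.1-era starting point. It is just an illustration of the formula above, not a tuning recommendation; core_count and effective_spindle_count are values you would estimate for your own hardware (with a fully cached data set, the effective spindle count is essentially zero):

    # Back-of-the-envelope sketch of the 9.1-era formula discussed above.
    def starting_pool_size(core_count, effective_spindle_count):
        # Above roughly this many active connections, both throughput
        # and latency tended to get worse in those benchmarks.
        return (2 * core_count) + effective_spindle_count

    # Hypothetical box: 8 cores, storage that keeps about 4 drives busy.
    print(starting_pool_size(8, 4))  # -> 20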
Users and Database Connections are Different Things
Sometimes people will say "I want to support 2000 users, with fast response time." It is pretty much guaranteed that if you try to do that with 2000 actual database connections, performance will be horrible. If you have a machine with a lot of cores and the active data set is fully cached, you will see much better performance for those 2000 users by funnelling the requests through a small number of database connections -- depending on the PostgreSQL version, that may be anywhere from 2 to maybe 10 times the number of cores.
To understand why that is true, this thought experiment should help. Consider a hypothetical database server machine with only one resource to share -- a single core. This core will time-slice equally among all concurrent requests with no overhead. Let's say 100 requests all arrive at the same moment, each needing one second of CPU time. The core works on all of them, time-slicing among them until they all finish 100 seconds later. Now consider what happens if you put a connection pool in front which accepts 100 client connections but makes only one request at a time to the database server, queuing any requests that arrive while the connection is busy. Now when 100 requests arrive at the same time, one client gets a response in 1 second, another in 2 seconds, and so on, with the last client getting its response in 100 seconds. Nobody had to wait longer to get a response, throughput is the same, but the average latency is 50.5 seconds rather than 100 seconds.
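The arithmetic is easy to verify. A few lines of Python reproduce the thought experiment directly (this models the idealized schedulers described above, not a real server):

    n = 100  # one-second requests, all arriving at the same moment

    # Perfect time-slicing on one core: every request finishes together.
    timeslice = [n] * n                  # each completes at the 100-second mark

    # Pool with a single connection: request i finishes at second i.
    queued = list(range(1, n + 1))

    print(sum(timeslice) / n)  # 100.0 -- average latency with time-slicing
    print(sum(queued) / n)     # 50.5  -- average latency behind the pool
    print(max(queued))         # 100   -- the worst case is no worse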
A real database server has more resources that can be used in parallel, but the same principle holds: once they are saturated, you only hurt things by adding more concurrent database requests. It is actually worse than the above thought experiment, because with more tasks you have more task switches, increased contention for locks and cache, L2 and L3 cache line contention, and many other issues which cut into both throughput and latency. On top of that, while a high work_mem setting can help a query in a number of ways, that setting is a limit per plan node for each connection, so with a large number of connections you need to keep it very small to avoid flushing cache or causing swapping; a smaller work_mem setting, in turn, leads to the choice of slower plans, or slower run times for the same plans, from such things as hash tables spilling to disk.

Some database products effectively build a connection pool or some form of request queuing into the server, but the PostgreSQL community has taken the position that, since the best connection pooling is done closer to the client software, it will leave this to the users to manage. Most poolers have some way to limit the database connections to a hard number while allowing more concurrent client requests than that, queuing requests as necessary. This is what you want, and it should be done on a transactional basis, not per statement or per connection. Care must be taken to handle session properties correctly, and in some cases this poses a barrier to using the ideal type of connection pooling; in such cases it is still a good idea to find ways to keep the number of database connections as small as practical, using whatever techniques are available.
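To make the client-side version of that concrete, here is a minimal sketch in Python using psycopg2's ThreadedConnectionPool. The connection limit, the DSN, and the semaphore (added so that surplus requests wait in line rather than erroring when the pool is exhausted) are all assumptions for the example, not recommendations:

    import threading
    from psycopg2.pool import ThreadedConnectionPool

    MAX_DB_CONNECTIONS = 10  # hard cap on real connections; illustrative value
    pool = ThreadedConnectionPool(1, MAX_DB_CONNECTIONS,
                                  dsn="dbname=app user=app")  # hypothetical DSN
    # ThreadedConnectionPool raises an error when exhausted, so gate checkouts
    # with a semaphore to make extra callers queue instead of failing.
    slots = threading.Semaphore(MAX_DB_CONNECTIONS)

    def run_transaction(work):
        """Borrow a connection for exactly one transaction, then return it."""
        with slots:                      # queue here if all connections are busy
            conn = pool.getconn()
            try:
                with conn:               # commits on success, rolls back on error
                    with conn.cursor() as cur:
                        work(cur)
            finally:
                pool.putconn(conn)       # release to the next waiting request

The important part is the shape, not the library: many client requests, a small fixed number of database connections, and a queue in between, scoped per transaction so session state never leaks from one client to another.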
Where's the Beef?
One other analogy may help show how a connection pooler can help -- consider a butcher shop with a counter, behind which are four butchers. If any butcher is idle when a customer walks in, that butcher will immediately offer to help the customer -- no problem. Now, if it is rush hour and 20 customers walk in, you can either have them take numbers once all the butchers are busy, or you can have a mad free-for-all at the counter. In the free-for-all, a butcher is slicing a quantity of meat for one customer when another comes up and demands some attention, so the butcher sets aside the first person's order and starts working on the second person's order. Then a third person comes up and the butcher starts on a third order. As the customers vie for attention, each butcher switches from one job to another to keep all of them from feeling their respective orders are being neglected. Some customers might, by bad luck, be neglected long enough to see new customers enter the shop, get served, and leave -- without yet seeing the completion of their own, smaller orders. Of course, there would be overhead in keeping track of the various orders and switching among them repeatedly, but even without that, customers would be waiting longer, on average, than if the shop put in a "take a number" system.
Wrap-Up
At some point PostgreSQL may add a built-in connection pool or an admission control mechanism which can queue a request to start a database transaction when the number of active transactions is at some configurable limit. If that ever happens, things could be simpler on the application side. Until then, it is often possible to handle more users with better performance by using a client-side connection pool (e.g., Apache Commons DBCP) or an external connection pool (e.g., PgBouncer).
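For the external route, a minimal PgBouncer configuration along these lines expresses the same idea; the addresses, file paths, and sizes here are placeholders to adapt, not recommendations:

    [databases]
    ; hypothetical database entry; adjust host/port/dbname for your setup
    app = host=127.0.0.1 port=5432 dbname=app

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; pool on a transactional basis, as discussed above
    pool_mode = transaction
    ; accept many client connections...
    max_client_conn = 2000
    ; ...but hold only a small number of real server connections
    default_pool_size = 20

Note pool_mode = transaction: that is the per-transaction pooling described above, and it is also where the session-property caveat bites, since session-level features such as temporary tables or session-scoped prepared statements do not mix well with it.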