Keith Fiske: A Large Database Does Not Mean Large shared

A co-worker of mine did a blog post last year that I’ve found incredibly useful when assisting clients with getting shared_buffers tuned accurately.

Setting shared_buffers the hard way

You can follow his queries there for using pg_buffercache to find out how your shared_buffers are actually being used. But I had an incident recently that I thought would be interesting to share that shows how shared_buffers may not need to be set nearly as high as you believe it should. Or it can equally show you that you that you definitely need to increase it. Object names have been sanitized to protect the innocent.

To set the stage, the database total size is roughly 260GB and the use case is high data ingestion with some reporting done on just the most recent data at the time. shared_buffers is set to 8GB. The other thing to note is that this is the only database in the cluster. pg_buffercache is installed on a per database basis, so you’ll have to install it on each database in the cluster and do some additional totalling to figure out your optimal setting in the end.

database=# SELECT c.relname
  , pg_size_pretty(count(*) * 8192) as buffered
  , round(100.0 * count(*) / ( SELECT setting FROM pg_settings WHERE name='shared_buffers')::integer,1) AS buffers_percent
  , round(100.0 * count(*) * 8192 / pg_relation_size(c.oid),1) AS percent_of_relation
 FROM pg_class c
 INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
 INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
 GROUP BY c.oid, c.relname
 ORDER BY 3 DESC
 LIMIT 10;
               relname               | buffered | buffers_percent | percent_of_relation
-------------------------------------+----------+-----------------+---------------------
 table1                              | 7479 MB  |            91.3 |                 9.3
 table2                              | 362 MB   |             4.4 |               100.0
 table3                              | 311 MB   |             3.8 |                 0.8
 table4                              | 21 MB    |             0.3 |               100.0
 pg_attrdef_adrelid_adnum_index      | 16 kB    |             0.0 |               100.0
 table4                              | 152 kB   |             0.0 |                 7.7
 index5                              | 16 kB    |             0.0 |                14.3
 pg_index_indrelid_index             | 40 kB    |             0.0 |                 8.8
 pg_depend_depender_index            | 56 kB    |             0.0 |                 1.0
 pg_cast_source_target_index         | 16 kB    |             0.0 |               100.0

You can see that table1 is taking up a vast majority of the space here and it’s a large table, so only 9% of it is actually in shared_buffers. What’s more interesting though is how much of the space for that table is actually in high demand.

database=# SELECT pg_size_pretty(count(*) * 8192) 
FROM pg_class c
INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
WHERE c.oid::regclass = 'table1'::regclass
AND usagecount >= 2;
  pg_size_pretty
----------------------
 2016 kB

Data blocks that go into and come out of postgres all go through shared_buffers. Just to review the blog post I linked to, whenever a block is used in shared memory, it increments a clock-sweep algorithm that ranges from 1-5, 5 being extremely high use data blocks. This means high usage blocks are likely to be kept in shared_buffers (if there’s room) and low usage blocks will get moved out if space for higher usage ones is needed. We believe that a simple insert or update sets a usagecount of 1. So, now we look at the difference when usage count is dropped to that.

database=# SELECT pg_size_pretty(count(*) * 8192) 
FROM pg_class c
INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
WHERE c.oid::regclass = 'public.ip_addresses_taggings'::regclass
AND usagecount >= 1;
 pg_size_pretty
----------------------
 4946 MB

So the shared_buffers is actually getting filled mostly by the data ingestion process, but relatively very little of it is of any further use afterwards. If anything of greater importance was needed in shared_buffers, there’s plenty of higher priority space and that inserted data would quickly get flushed out of shared memory due to having a low usagecount.

So with having pg_buffercache installed, we’ve found that the below query seems to be a good estimate on an optimal, minimum shared_buffers setting

database=# SELECT pg_size_pretty(count(*) * 8192) as ideal_shared_buffers
 FROM pg_class c
 INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
 INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
 WHERE usagecount >= 3;
 ideal_shared_buffers
----------------------
 640 MB

This is the sort of query you would run after you have had your database running through your expected workload for a while. Also, note my use of the key word minimal. This does not account for unexpected spikes in shared_buffers usage that may occur during a session of reporting queries or something like that. So you definitely want to set it higher than this, but it can at least show you how effectively postgres is using its shared memory. In general we’ve found the typical suggestion of 8GB to be a great starting point for shared_buffers.

So, in the end, the purpose of this post was to show that shared_buffers is something that needs further investigation to really set optimally and there is a pretty easy method to figuring it out once you know where to look.