Modern applications often require the ability to track data changes along with the data itself. Think about stock prices, exchange rates and the like. It’s also possible that the concept of “current data” doesn’t even have a meaning within a specific application realm.
For such cases a “history table” can be a good solution, though not the only one. The aim of a history table is to keep a precise track of how a certain set of values has changed over time.
Let’s stick with the stock prices example and start with an initial naive implementation.
create table stockprices ( stock text primary key, price numeric );
(Sorry, I cannot really bear uppercase SQL. My fault!)
This simple table can indeed record the prices of stocks. But it’s way too simple, as we also need to know when each price was updated. Let’s go back to the editor.
create table stockprices ( stock text primary key, price numeric, enter timestamp );
The idea is that every time you insert a row or update it with a new price, you also update the timestamp column to record when the change happened. The primary key is there because the “natural” way to query this table is to use the stock code column to get its current price.
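In PostgreSQL this insert-or-update logic can be expressed with a single upsert statement. A minimal sketch, with made-up values:

insert into stockprices ( stock, price, enter )
values ( 'GOOGL', 400.00, now() )
on conflict ( stock ) -- the primary key on stock makes this possible
do update set price = excluded.price, enter = excluded.enter;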
select * from stockprices where stock='GOOGL';
Easy.
But there’s nothing historical here: you just keep the latest value. There are no old values, as after the first insertion you keep updating the very same row. So it’s time to go historical!
First things first. If there needs to be a primary key, it won’t be the stock column. We need multiple rows for the same stock with different enter timestamps. Here we go!
create table stockprices ( stock text, price numeric, enter timestamp );
Of course we will still use the stock code column to query this table, so we’ll need an index over that same column. We also add an index over the timestamp column: its purpose will become clear soon.
create index on stockprices( stock );
create index on stockprices( enter desc ); -- look here!
The previous query now becomes:
select distinct on( stock ) * from stockprices where stock='GOOGL' order by stock, enter desc;
Here we sort by stock code and, within each stock code, by reverse enter timestamp (that is, “latest first”). The distinct on clause will pick the first row in each group, which is the one with the latest price. The main advantage here is that the same table works both for the historical data and for the current data. There’s actually no difference between older and newer data!
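To see the mechanism at work, imagine two hypothetical insertions for the same stock (dates and prices are made up):

insert into stockprices values ( 'GOOGL', 400.00, '2018-04-09 10:00' );
insert into stockprices values ( 'GOOGL', 402.50, '2018-04-10 10:00' );
-- the query above now returns only the row entered on 2018-04-10

Both rows stay in the table; the query alone decides which one is “current”.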
If you think we’re done, you’re wrong. We’re just halfway. Stay tuned then.
When dealing with history tables, my experience suggests a few facts.
- The time a datum enters the table isn’t always the time at which it is valid. It can enter either earlier (then it’s a forecast or prevision) or later (then it’s a correction).
- Corrections can happen, even multiple times, and you need to keep track of all of them.
- A datum can cease to exist after a certain moment, as if it never existed.
- You can have two types of queries: one looking for current values, and one looking for values as seen at a certain moment in the past (and sometimes also in the future).
Before going further I’d like to remind you of a few technical facts that are useful for our purpose.
When you delete rows in a table, no row is actually deleted: they are just marked as “recyclable”. The same happens to the related entries in the indexes.
Similarly, when you update rows in a table, no row is actually updated. The old rows are “deleted” (see above) and new rows with the updated values are inserted. The same happens to the related entries in the indexes.
These things aren’t bad per se, but if deletions and updates happen very often, the table and its indexes get “bloated” with a lot of those “recyclable” entries. Bloated tables do not represent a big performance issue. Bloated indexes do. In fact, we almost always go through indexes to access rows, so having “holes” punched into the table storage isn’t a big deal: the index still allows direct row access. The same isn’t true for indexes. The larger they are, the more RAM they need and the more I/O you need to perform to load them. Keeping the indexes at their bare minimum storage requirements is a good thing.

This is why most RDBMS have a feature, called “vacuuming”, that compacts both the table storage and the indexes and sweeps those holes away. Vacuuming is also a good thing. It’s a pity you cannot really do it while the tables are being actively used: a table lock is needed to perform the compaction. And the larger the table to be vacuumed, the longer that lock will stay active. Vacuuming is thus a maintenance activity you will tend to perform during off-peak time, or even with no application allowed to access those tables.
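In PostgreSQL terms, the compacting operation described above is vacuum full, while a plain vacuum merely marks the dead entries as reusable and can run alongside normal activity:

-- plain vacuum: marks recyclable entries as reusable, no exclusive lock needed
vacuum stockprices;
-- full vacuum: rewrites and compacts the table and its indexes, takes an exclusive lock
vacuum full stockprices;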
History tables help to address this problem. History tables usually don’t have any deletions or updates at all. Just insertions. And queries, of course. For this reason history tables and their indexes only get new rows “appended”: no hole is punched into the storage and no vacuuming is needed thereafter. It’s a concept very similar to the ledgers used in bookkeeping: you always append new data. If you need to correct, you append new data to cancel the old and yet more data for the correction. This is also why history tables are well suited for the so-called “big data” applications. Time to get back to our history table implementation.
In a history table, besides the actual data we need (the stock code and its price in our example), we need a few more columns for housekeeping. Some of these cannot be fully explained at this stage and will become clearer later.
We need to be able to identify every single row unequivocally. A bigserial will do, as its capacity exceeds 9·10^18 (billions of billions). This isn’t always needed but, I admit it, I like the idea of having it everywhere.
We need to know when a certain row has been inserted, no matter what the data validity time is. So we add a timestamp column for that.
We need to know since when a certain row is to be considered effective, so another timestamp is needed.
Finally, we need to be able to say that a certain stock code isn’t used any more, so that its latest price isn’t valid any more. I used a bool for that.
Our table structure now needs some makeup.
create table stockprices (
  stock text not null,
  price numeric not null,
  enter timestamp not null default now(),
  valid timestamp not null default '-infinity'::timestamp,
  erase bool not null default false,
  id bigserial primary key
);
I have added a few default values, not really needed in real life, but useful to remember the meaning of some columns.
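Thanks to those defaults, a plain insertion only needs the data columns and the validity timestamp. A hypothetical example:

insert into stockprices ( stock, price, valid )
values ( 'GOOGL', 400.00, '2018-04-09'::timestamp );
-- enter defaults to now(), erase to false, id comes from the sequence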
Our query also needs some makeup, becoming a little bit more complex.
We first select the current values for all the available rows.
select distinct on( stock ) * from stockprices where valid <= now() order by stock, valid desc, enter desc;
We will get a row for each single stock code with the latest price as of now(). We are ordering by reverse (desc) validity and by reverse insertion time. Why?
Let’s imagine we have the stock code ‘GOOGL’ at EUR 400.00 on 2018-04-09. We then discover that the value is wrong: it was actually 401.00 on that very same date. So we insert a new row whose enter is newer than that of the previous one. We can even apply more “corrections” to the very same “row”, each with a newer and newer enter timestamp. We keep track of all the values we have entered, while the query will pick up only the latest of the latest. No deletions and no updates.
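A hypothetical correction would then look like this: same stock and validity timestamp as the wrong row, the right price, and an enter timestamp that simply defaults to now():

insert into stockprices ( stock, price, valid )
values ( 'GOOGL', 401.00, '2018-04-09'::timestamp );
-- same valid as the wrong row: the newer enter wins the tie in the sort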
Once we are here, we can expunge all those rows which have the erase flag set to true, by keeping only those where it is false.
with x as (
  select distinct on( stock ) *
  from stockprices
  where valid < now()
  order by stock, valid desc, enter desc
)
select * from x where erase is false;
As the now() function returns newer and newer timestamps, the query will also pull into the game all those rows that have been inserted as “future” data, those whose valid column has been set in the future. For example, this can make a certain stock code “disappear” at a certain timestamp, with no maintenance intervention on the table: a row with the erase column set to true and the proper valid timestamp will enter the query row set and make that stock disappear at the right time!
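A sketch of such a scheduled disappearance (the zero price is just a placeholder, as the column is not null, and the dates are made up), followed by the “as seen at a certain moment” variant of the query from the list above, where now() is replaced by a fixed timestamp on both time columns:

-- make 'GOOGL' disappear as of 2019-01-01
insert into stockprices ( stock, price, valid, erase )
values ( 'GOOGL', 0, '2019-01-01'::timestamp, true );

-- values as seen on 2018-06-30: ignore later validities and later corrections
with x as (
  select distinct on( stock ) *
  from stockprices
  where valid < '2018-06-30'::timestamp
    and enter <= '2018-06-30'::timestamp
  order by stock, valid desc, enter desc
)
select * from x where erase is false;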
But we haven’t finished yet. Why?
The indexes haven’t got their makeup yet. Indexes are why we don’t use plain files!
Everyone should follow a simple rule while designing indexes for DBs. This rule says: “No one knows about the DB more than the RDBMS itself“.
Of course, we can start from a simple index design, usually based on reasonableness. We fill the tables with some meaningful data and then ask PostgreSQL to show us how it’s doing with those data and the available indexes. It’s the wonderful world of the explain PostgreSQL extension. This is not (yet) part of the SQL standard and is one of the most interesting extensions.
First of all we need to use real life data. Sort of. You can write a small program to generate an SQL file that populates our table with a number of rows. My personal first test is based on 100k rows involving 500 different stock codes over 365 days, with prices ranging from 0 to 500. In order to simplify the generation of the data I’ve changed the column stock from text to int. All data are random, so there can be stock codes that never get into the table: we also need some “not found” results, don’t we?
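Instead of an external program, a quick generator can also be sketched directly in SQL with generate_series; the ranges mirror the ones above, and the start date is an arbitrary assumption:

insert into stockprices ( stock, price, valid )
select (random() * 500)::int,                                  -- stock code
       round( (random() * 500)::numeric, 2 ),                  -- price
       timestamp '2018-01-01' + random() * interval '365 days' -- validity
from generate_series( 1, 100000 );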
The table, at the moment, should only have the single index needed for the primary key. It should resemble this one:
tmp1=# \d stockprices
                                        Table "public.stockprices"
 Column |            Type             | Collation | Nullable |                  Default
--------+-----------------------------+-----------+----------+------------------------------------------
 stock  | integer                     |           | not null |
 price  | numeric                     |           | not null |
 valid  | timestamp without time zone |           | not null | '-infinity'::timestamp without time zone
 enter  | timestamp without time zone |           | not null | now()
 erase  | boolean                     |           | not null | false
 id     | bigint                      |           | not null | nextval('stockprices_id_seq'::regclass)
Indexes:
    "stockprices_pkey" PRIMARY KEY, btree (id)
Our query doesn’t look for anything special. It “just” collapses the whole table to the latest values as of now(). We see it uses all columns but price (which is the column we are actually interested in) and id (which is a unique row identification number, useless for most of our needs).
We could then start by creating an index for each of those columns. Let’s do it.
create index on stockprices ( stock );
create index on stockprices ( valid );
create index on stockprices ( enter );
create index on stockprices ( erase );
After a while all the indexes will be created. I’d like to point out that if we had created those indexes before loading the data, we would have got a uselessly slow “table population process”. Creating them all at once after the load saves a lot of time. More on this later.
We can now ask our command line tool to show some timing information for query execution:
\timing
I actually have this command in my ~/.psqlrc file, so it’s always on by default.
It’s time to fire our query and see how it will be performed by PostgreSQL.
explain analyze
with x as (
  select distinct on( stock ) *
  from stockprices
  where valid < now()
  order by stock, valid desc, enter desc -- Sort on 3 columns
)
select * from x where erase is false;
This is the output on my system:
tmp1-# select * from x where erase is false;
                                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on x  (cost=13976.82..13986.82 rows=250 width=61) (actual time=84.501..99.082 rows=500 loops=1)
   Filter: (erase IS FALSE)
   CTE x
     ->  Unique  (cost=13476.82..13976.82 rows=500 width=33) (actual time=84.497..99.015 rows=500 loops=1)
           ->  Sort  (cost=13476.82..13726.82 rows=100000 width=33) (actual time=84.496..94.757 rows=100000 loops=1)
                 Sort Key: stockprices.stock, stockprices.valid DESC, stockprices.enter DESC
                 Sort Method: external merge  Disk: 5680kB
                 ->  Seq Scan on stockprices  (cost=0.00..2435.00 rows=100000 width=33) (actual time=0.016..23.901 rows=100000 loops=1)
                       Filter: (valid < now())
 Planning time: 0.745 ms
 Execution time: 99.771 ms
(11 rows)

Time: 101,181 ms
In order to make life easier for DBAs, there’s a neat online helper to better understand this output, written and hosted by Hubert ‘depesz‘ Lubaczewski. It’s called … uhm … explain. You simply paste your explain output there and you get back a more readable format.
From the bottom up, I see on line #5 a sequential table scan (which isn’t a nice thing at all!) to select only those rows that are not in the future. At line #4 the sort over three columns is run, then on line #3 the rows are “squashed” to be made unique. At line #1 the test on the erase flag is run to expunge all rows which have been erased.
As a first attempt at making it better, I try to create a single index for that sort and leave the one on the erase column alone.
drop index stockprices_enter_idx;
drop index stockprices_valid_idx;
drop index stockprices_stock_idx;
create index on stockprices( stock, valid desc, enter desc );
Then I reboot my system (to be sure all disk caches are gone) and try the query again:
                                                                          QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on x  (cost=7736.16..7746.16 rows=250 width=61) (actual time=0.115..48.671 rows=500 loops=1)
   Filter: (erase IS FALSE)
   CTE x
     ->  Unique  (cost=0.42..7736.16 rows=500 width=33) (actual time=0.109..48.513 rows=500 loops=1)
           ->  Index Scan using stockprices_stock_valid_enter_idx on stockprices  (cost=0.42..7486.18 rows=99990 width=33) (actual time=0.107..42.855 rows=100000 loops=1)
                 Index Cond: (valid < now())
 Planning time: 0.486 ms
 Execution time: 48.750 ms
(8 rows)

Time: 49,875 ms
Bingo! I’ve cut the time in half.
The table sequential scan is gone, replaced by an index scan plus a condition. Scanning an index should be much better than scanning a table. Shouldn’t it?
It’d be nice to squeeze some more time out of this query, but I’ll leave that to the keen reader. I’ll save you some time, though: it’s useless (as of v10.4) to add the filter condition “erase IS FALSE” to the unified index (or to create a new one). That condition is run over the CTE (common table expression, a temporary un-indexable table) we’ve called x.
For sure we can ditch the index on the flag, as it’s not used at all and would just cause more work for the RDBMS during insertions.
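Assuming the name PostgreSQL generated for it automatically, that is:

drop index stockprices_erase_idx;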
Of course we haven’t yet tried a complete query, the one where you look for a specific stock code. The query could simply be:
explain
with x as (
  select distinct on( stock ) *
  from stockprices
  where valid < now()
  order by stock, valid desc, enter desc
)
select * from x where erase is false and stock=42; -- Test here? No!
I won’t even try it. The condition on the stock code would be sequentially applied to the CTE, just like the one on the flag. It wouldn’t take advantage of any index!
The right way is to apply the selection within the CTE, like this:
explain analyze
with x as (
  select distinct on( stock ) *
  from stockprices
  where valid < now() and stock=142 -- The test is better here!
  order by stock, valid desc, enter desc
)
select * from x where erase is false;
The results are different. Just a little bit:
                                                                     QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on x  (cost=501.03..504.31 rows=82 width=61) (actual time=0.840..0.895 rows=1 loops=1)
   Filter: (erase IS FALSE)
   CTE x
     ->  Unique  (cost=500.54..501.03 rows=164 width=33) (actual time=0.834..0.888 rows=1 loops=1)
           ->  Sort  (cost=500.54..501.03 rows=198 width=33) (actual time=0.833..0.862 rows=215 loops=1)
                 Sort Key: stockprices.valid DESC, stockprices.enter DESC
                 Sort Method: quicksort  Memory: 41kB
                 ->  Bitmap Heap Scan on stockprices  (cost=6.45..492.98 rows=198 width=33) (actual time=0.210..0.670 rows=215 loops=1)
                       Recheck Cond: ((stock = 142) AND (valid < now()))
                       ->  Bitmap Index Scan on stockprices_stock_valid_enter_idx  (cost=0.00..6.40 rows=198 width=0) (actual time=0.143..0.143 rows=215 loops=1)
                             Index Cond: ((stock = 142) AND (valid < now()))
 Planning time: 0.314 ms
 Execution time: 0.966 ms
(14 rows)

Time: 2,070 ms
We are almost done. We have a query to create a view of the “current values”, and a query to select the current value for a single stock code.
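That first query can indeed be wrapped into an actual view. A sketch, with an assumed name:

create view current_stockprices as
with x as (
  select distinct on( stock ) *
  from stockprices
  where valid < now()
  order by stock, valid desc, enter desc
)
select * from x where erase is false;

Since now() is evaluated at query time, the view always reflects the current moment, future rows included.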
What we’re missing is a query for a set of stock codes. Of course, I am not going to make use of predicates like ... where (stock=42 or stock=142 or stock=242). First, that would require you to compose a dynamic query (which isn’t a bad thing in general, but is still error prone). Second, it just multiplies the query (or a part of it) by the number of different stock codes you are looking for. If there are 100 of them, the query will likely be repeated 100 times. So what?

The answer is INNER JOIN.
Let’s imagine we have a generic tool table we use for selections. This table can be used by multiple users (and for multiple purposes) and still keep track of all the selections that have been made.
create table stockselection (
  selid bigserial not null,
  stock int not null,
  primary key( selid, stock )
);
This is how it works. You insert the first stock code to search for like this:
insert into stockselection ( stock ) values ( 42 ) returning selid;
 selid
-------
   666
(1 row)

INSERT 0 1
So you get the newly created value for the selid column. That’s the “selection id” that allows multiple selection queries to be run by “just” using different selection ids.
Then you insert the remaining stock codes using that very same selection id with queries like this one:
insert into stockselection values ( 666, 142 );
insert into stockselection values ( 666, 242 );
...
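If you prefer a single statement for the remaining codes, unnest offers a compact alternative (codes made up as usual):

insert into stockselection ( selid, stock )
select 666, s from unnest( array[142, 242] ) as s;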
The query with the stock selection feature will become:
explain
with x as (
  select distinct on( stock ) *
  from stockprices
  natural inner join stockselection -- inner join
  where valid < now() and selid=666 -- selection id
  order by stock, valid desc, enter desc
)
select * from x where erase is false;
This is it. More or less. Of course, there’s plenty of room for further enhancements: this article is meant to show ideas and give hints, not boxed solutions. Some more work can be needed depending upon your very own case.
Enjoy your coding.