Back in 2005, I started a new job with a company that did work for other businesses. Their data model was designed by developers and they had no DBA, as is pretty common in smaller organizations. A critical part of our main application relied on an event log that captured customer activity and relayed their final decision to the client for reconciliation. One day someone noticed that there was a bug in the system that resolved the final decision from the event stack, and panic ensued. I put a lot of thought into it that night and had a fix for it the next morning. As usual, my approach relied on Postgres, and carried a fun lesson.
As it turns out, some business logic belongs in the database. Imagine an existing data model with business logic that must adhere to these criteria:
- All front-end events are captured to an event log table.
- Depending on customer activity, an event chain can be initiated.
- Within this event chain, the final decision must be captured.
- Decisions are ranked based on client pay scale for customer progress through the interface.
Nothing proprietary or complicated—just a basic ranking algorithm. Unfortunately in this case, whatever system maintained that ranking had some subtle flaw we needed to fix, and that included past data for the current billing cycle. We needed something that would fix the decision tree, and then prevent it from getting broken again in the future.
We can represent the data with a pretty simple model:
```sql
CREATE TABLE event_type
(
  event_type_id  SERIAL PRIMARY KEY,
  event_name     VARCHAR NOT NULL,
  score          INT NOT NULL,
  can_init       BOOLEAN NOT NULL DEFAULT FALSE
);

CREATE TABLE event_log
(
  event_id       SERIAL PRIMARY KEY,
  session_id     UUID NOT NULL,
  event_type_id  INT NOT NULL REFERENCES event_type (event_type_id),
  event_data     VARCHAR NOT NULL,
  created_dt     TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now()
);

CREATE TABLE decision_log
(
  session_id     UUID PRIMARY KEY,
  init_event_id  INT NOT NULL REFERENCES event_log (event_id),
  decision_id    INT NULL REFERENCES event_log (event_id)
);

INSERT INTO event_type (event_name, score, can_init)
VALUES ('start', 0, TRUE);

INSERT INTO event_type (event_name, score)
VALUES ('pending', 10), ('order', 20), ('cancel', 30);
```
The business logic requires something more, however. To a DBA, it sounds like a job for at least one trigger. But how do we organize them, and how should they fire? Going back to the criteria, we know that all events are captured to the `event_log` table, so it needs a trigger. Since it's a log of events, there are no updates or deletes. If we assume that `decision_log` can receive inserts from another vector, it should have a separate trigger to control ranking.

So we need one trigger after inserts on `event_log` and one before updates on `decision_log`. Here's what that might resemble:
```sql
CREATE OR REPLACE FUNCTION sp_capture_event()
RETURNS TRIGGER AS
$$
BEGIN
  PERFORM event_type_id
     FROM event_type
    WHERE event_type_id = NEW.event_type_id
      AND can_init;

  IF FOUND THEN
    INSERT INTO decision_log (session_id, init_event_id)
    VALUES (NEW.session_id, NEW.event_id);
  ELSE
    UPDATE decision_log
       SET decision_id = NEW.event_id
     WHERE session_id = NEW.session_id;
  END IF;

  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_capture_events_a_i
AFTER INSERT ON event_log
   FOR EACH ROW EXECUTE PROCEDURE sp_capture_event();

CREATE OR REPLACE FUNCTION sp_update_decision()
RETURNS TRIGGER AS
$$
BEGIN
  IF OLD.decision_id IS NULL THEN
    RETURN NEW;
  END IF;

  PERFORM n.event_id
     FROM event_log o
     JOIN event_type ot USING (event_type_id),
          event_log n
     JOIN event_type nt USING (event_type_id)
    WHERE o.event_id = OLD.decision_id
      AND n.event_id = NEW.decision_id
      AND nt.score > ot.score;

  IF FOUND THEN
    RETURN NEW;
  END IF;

  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_control_decisions_b_u
BEFORE UPDATE ON decision_log
   FOR EACH ROW EXECUTE PROCEDURE sp_update_decision();
```
The trigger on `event_log` can issue an `INSERT` or `UPDATE` based on event type, since only some event types can initiate a new chain, and customers can only have one chain per session. Its job is a very simple one. The trigger on `decision_log` isn't very complex either, and there were several ways we could have implemented it. The query shown simply checks whether the new event outranks the old one, and only allows the update when it does.
We could have independently selected the ranks into variables and compared them with an `IF` statement, or used `DISTINCT ON` to return the top-ranked event and only allowed the update if it matched the incoming event, and so on. It's been over ten years, so I have no idea how I did it back then, but the solution above relies on the assumption that we only need to fetch information for two events. Using an `IF` statement would mean variables and their associated overhead, while the `DISTINCT ON` approach could theoretically match tens or hundreds of rows before producing the final result.
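For comparison, a `DISTINCT ON` variant of the decision trigger might have looked something like this. This is only a sketch against the tables defined above, not the version we deployed, and it illustrates the drawback: it scans every event in the session rather than just the two rows we care about.

```sql
-- Hypothetical alternative body for sp_update_decision() using
-- DISTINCT ON. The AFTER INSERT trigger on event_log has already
-- stored the new event, so it is included in this scan.
CREATE OR REPLACE FUNCTION sp_update_decision()
RETURNS TRIGGER AS
$$
DECLARE
  top_event INT;
BEGIN
  IF OLD.decision_id IS NULL THEN
    RETURN NEW;
  END IF;

  -- Find the single top-ranked non-init event for this session.
  SELECT DISTINCT ON (e.session_id) e.event_id
    INTO top_event
    FROM event_log e
    JOIN event_type t USING (event_type_id)
   WHERE e.session_id = NEW.session_id
     AND NOT t.can_init
   ORDER BY e.session_id, t.score DESC;

  -- Only allow the update when the incoming decision is that event.
  IF top_event = NEW.decision_id THEN
    RETURN NEW;
  END IF;

  RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```

On a tie between equally ranked events, `DISTINCT ON` keeps whichever row sorts first, so this version can also behave differently from the two-row comparison when scores are equal.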
In any case, we need to test this arrangement. Let's use the `pgcrypto` extension, the Postgres 9.2+ ability to set arbitrary configuration variables, and a prepared statement to create some event traffic:
```sql
CREATE EXTENSION pgcrypto;

SELECT set_config('local.sess', gen_random_uuid()::text, FALSE);

PREPARE event_ins(INT, VARCHAR) AS
INSERT INTO event_log (session_id, event_type_id, event_data)
SELECT current_setting('local.sess')::UUID, $1, $2;

EXECUTE event_ins(1, 'starting up...');
EXECUTE event_ins(2, 'ordering stuff...');
EXECUTE event_ins(3, 'Ordered!');

SELECT init_event_id, decision_id
  FROM decision_log;

 init_event_id | decision_id 
---------------+-------------
             1 |           3
```
Not bad. By this point, we've created a few basic events in the event log, and the decision log appears to be working as intended. But we have one more thing to verify: will it reject outranked events? That's important, because that's what got us into trouble in the first place. If we can prove the trigger will directly reject bad changes, the integrity of the decision tree is preserved in a way the application can't circumvent.
Let’s see what happens if the customer tries to screw up the process:
```sql
EXECUTE event_ins(2, 'No... wait...');

SELECT init_event_id, decision_id
  FROM decision_log;

 init_event_id | decision_id 
---------------+-------------
             1 |           3

EXECUTE event_ins(4, 'I changed my mind.');

SELECT init_event_id, decision_id
  FROM decision_log;

 init_event_id | decision_id 
---------------+-------------
             1 |           5
```
Pending events have a lower priority than order events, so the decision tree did not change when the customer went back in the chain. The events are still logged and stamped so we can see the entire session, but the final state the client pays for is preserved. In this particular instance, the customer decided they didn't want whatever service was offered, and the client got a free pass.
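Since every event is still captured, we can reconstruct the whole session at any time. A query along these lines (a hypothetical audit query, not part of the original system) would show the full chain next to its scores, reusing the `local.sess` variable from the test above:

```sql
-- Show every event in the current test session, in order, with
-- the rank each one carries. The decision trigger only ever keeps
-- the highest score seen here.
SELECT e.event_id, t.event_name, t.score, e.event_data, e.created_dt
  FROM event_log e
  JOIN event_type t USING (event_type_id)
 WHERE e.session_id = current_setting('local.sess')::UUID
 ORDER BY e.created_dt;
```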
That solved the conundrum of preventing further corruption of our data, but we still had to fix the existing billing cycle. Luckily, the `DISTINCT ON` approach that isn't quite right for the trigger works great for rebuilding the ideal rows. A fix would go something like this:
```sql
WITH ranked AS
(
  SELECT DISTINCT ON (session_id)
         e.session_id, e.event_id AS decision_id
    FROM event_log e
    JOIN event_type t USING (event_type_id)
   WHERE e.created_dt >= date_trunc('month', now())
     AND NOT t.can_init
   ORDER BY e.session_id, t.score DESC
)
UPDATE decision_log d
   SET decision_id = r.decision_id
  FROM ranked r
 WHERE d.session_id = r.session_id
   AND d.decision_id != r.decision_id;
```
That single update statement finds all sessions for the current billing cycle and gets the properly ranked decision event for each. Then it changes only the decisions that were wrong due to the bug. Even if that fix touches millions of rows, it's a one-time operation. Our trigger ensures only valid entries will make it into the decision table from now on, thus ending the chaos. The only remaining task was to retire the code that previously maintained the decision stack.
I came up with that fix soon after starting at the company, and it remained in place until at least the day I left almost five years later. It wasn’t a particularly difficult solution, and it would be trivial to argue that more complicated processes should be implemented at the application level instead. But for simple integrity-enforcing rules, some business logic works best when it can’t be circumvented, even accidentally.
In the end, as basic as this approach was, it uses several Postgres features and bits of SQL syntax. Triggers, functions, `DISTINCT ON`, and `UPDATE ... FROM` are only the beginning. Being able to use all available tools greatly enhances developer agility and understanding, as well as the quality of the final result. So don't shy away from putting some code in the database; sometimes, that's where it belongs.