The number one problem with people is that they're not computers. Ok, ok, thats a little broad.
We'll be more precise than that. The problem with many developers writing queries and procedures for databases is that they have trouble breaking away from the mindset of writing procedural application code. For example, explicit FOR loops are absolutely necessary in the C/C++, Java, PHP, .NET, etc. world, but they have a much more limited place when dealing with databases, especially in modern database procedural languages. By all means, they still have their uses, but far fewer of them. The goal is to think in sets, not loops.
Let's go over a pretty extreme example (in Postgres; I LOVE Postgres! Yay for the recently released 9.0):
Here's a stripped-down, real-life example of going variable-crazy and using an explicit loop in PL/pgSQL without any good reason:
DECLARE
    v_date date;
    v_hours int;
    v_datetime timestamp;
BEGIN
    FOR v_date, v_hours IN SELECT date_value, hours_value FROM table_foo LOOP
        v_datetime := v_date + (v_hours::text || ' hours')::interval;
        INSERT INTO table_bar VALUES (v_datetime);
    END LOOP;
END
As the designer of some new functionality in our main customer application, I essentially asked one of our developers to write a procedure that retrieves two values (a date and a number of hours) from a row, constructs a timestamp value from them, and inserts that value into another table. (In reality, it's a bit more complicated, but that's the essence.) What I got above was not what I expected.
Much better, and more obvious, is:
INSERT INTO table_bar SELECT date_value + (interval '1 hour' * hours_value) FROM table_foo;
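As a side note, the interval arithmetic can also be written more directly. This is just a sketch of an equivalent form, and it assumes a newer Postgres than the 9.0 discussed here (make_interval arrived in 9.4):

```sql
-- Equivalent set-based INSERT using make_interval (Postgres 9.4+)
INSERT INTO table_bar
SELECT date_value + make_interval(hours => hours_value)
FROM table_foo;
```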
Now, I know most people are thinking, "Wow, that's a pretty extreme example," or "That rarely happens." Unfortunately, in my case, it is not rare. The first version was given to me by a developer with 20+ years of experience.
Let's compare some speed differences using Postgres 9.0's new DO statement:
CREATE TABLE table_foo
    (date_value  DATE,
     hours_value INT);

CREATE TABLE table_bar
    (datetime_value TIMESTAMP);

INSERT INTO table_foo
SELECT (current_date + interval '1 day' * round(random() * 100))::date,
       round(random() * 100)
FROM generate_series(1,1000000) AS t(v);
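The Time: lines in the transcripts below come from psql's timing display; if you want to reproduce them yourself, turn it on first (this is a psql meta-command, not SQL):

```sql
-- psql meta-command: report execution time after each statement
\timing on
```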
postgres=# DO $$
DECLARE
    v_date date;
    v_hours int;
    v_datetime timestamp;
BEGIN
    FOR v_date, v_hours IN SELECT date_value, hours_value FROM table_foo LOOP
        v_datetime := v_date + (v_hours::text || ' hours')::interval;
        INSERT INTO table_bar VALUES (v_datetime);
    END LOOP;
END;$$ LANGUAGE PLPGSQL;
DO
Time: 15726.413 ms
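One caveat if you rerun this comparison yourself: the first run leaves a million rows in table_bar, so reset it before timing the second version to keep the two runs comparable:

```sql
-- Start the second timing run from an empty target table
TRUNCATE table_bar;
```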
postgres=# DO $$
postgres$# BEGIN INSERT INTO table_bar
postgres$# SELECT date_value + (interval '1 hour' * hours_value)
postgres$# FROM table_foo; END;
postgres$# $$ LANGUAGE PLPGSQL;
DO
Time: 3267.074 ms
So, about 5 times faster.
(I didn't need to wrap the second one in PL/pgSQL, obviously, but in reality it was - and needed to be - so I left it as such.)
This result should be obvious and not surprising to anybody, hopefully.
So... think in sets, and don't do dumb stuff.
In other news... sooooooooooooo glad I moved all my important apps from MySQL to Postgres a while back.