"The database is growing at 2GB a minute. We're 40 minutes away from running out of disk space."
"Sounds like I should probably take a look."
I looked at the database size, which was 160GB. But the database SAN share was up to 1.4GB used out of 1.6GB. WTF?
Then I looked at the filesystem and did directory sizes. pg_tmp was over a terabyte. Oooooooohhh.
Apparently they'd accidentally pushed a new report to the application which worked OK in testing, but with certain parameters created a 15 billion item sort. And since it was slow, users called it several times. Ooops.
Enter temp_file_limit, a parameter added by Mark Kirkwood to PostgreSQL 9.2. This is a limit on per-session usage of temporary files for sorts, hashes, and similar operations. If a user goes over the limit, their query gets cancelled and they see an error.
This is an excellent way to prevent a single user, or a bad application change, from DOSing your database server. Set it to something high; I'm using 10GB or 20GB, or 10% of available disks space, whichever is less. But even a high limit like that will save you from some unexpected downtime.
"Sounds like I should probably take a look."
I looked at the database size, which was 160GB. But the database SAN share was up to 1.4GB used out of 1.6GB. WTF?
Then I looked at the filesystem and did directory sizes. pg_tmp was over a terabyte. Oooooooohhh.
Apparently they'd accidentally pushed a new report to the application which worked OK in testing, but with certain parameters created a 15 billion item sort. And since it was slow, users called it several times. Ooops.
Enter temp_file_limit, a parameter added by Mark Kirkwood to PostgreSQL 9.2. This is a limit on per-session usage of temporary files for sorts, hashes, and similar operations. If a user goes over the limit, their query gets cancelled and they see an error.
This is an excellent way to prevent a single user, or a bad application change, from DOSing your database server. Set it to something high; I'm using 10GB or 20GB, or 10% of available disks space, whichever is less. But even a high limit like that will save you from some unexpected downtime.