On Linux, a "compiler" is usually a synonym for gcc, but clang has been gaining more and more adoption. Over the years, Phoronix has published several articles comparing the performance of various clang and gcc versions, suggesting that while clang improves over time, gcc still wins in most benchmarks - except maybe "compilation time", where clang is a clear winner. But none of those benchmarks is really a database-style application, so the question is how much difference you can get by switching a compiler (or a compiler version). So I ran a bunch of tests, with gcc versions 4.1-4.9, clang 3.1-3.5, and just for fun with icc 2013 and 2015. Here are the results.
I did the two usual types of tests - pgbench, representing a transactional workload (lots of small queries), and a subset of the TPC-DS benchmark, representing analytical workloads (a few queries chewing through large amounts of data).
I'll present results from a machine with an i5-2500k CPU, 8GB of RAM and an SSD drive, running Gentoo with kernel 3.12.20. I did rudimentary PostgreSQL tuning, mostly by tweaking postgresql.conf like this:
```
shared_buffers = 1GB
work_mem = 128MB
maintenance_work_mem = 256MB
checkpoint_segments = 64
effective_io_concurrency = 32
```
I do have results from another machine too, but in general they confirm the results presented here. PostgreSQL was compiled like this:
```
./configure --prefix=...
make install
```
i.e. nothing special (no custom tweaks, etc.). The rest of the system is compiled with gcc 4.7.
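Switching compilers then just means pointing configure to a different one. A minimal sketch (the installation prefixes and compiler binary names are only illustrative - autoconf picks up the compiler from the CC environment variable):

```
# build with a specific clang version
CC=clang-3.5 ./configure --prefix=/opt/pg-clang-3.5
make && make install

# and the same with a specific gcc version
CC=gcc-4.9 ./configure --prefix=/opt/pg-gcc-4.9
make && make install
```

The CC value is recorded at configure time, so the subsequent make uses the chosen compiler automatically.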
pgbench
I ran pgbench with three dataset sizes - small (~150MB), medium (~25% of RAM) and large (~200% of RAM). For each scale I ran pgbench with 4 clients (which is the number of cores on the CPU) for 15 minutes, repeated 3x, and averaged the results. All of this both in read-write and read-only mode.
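For illustration, a single run at one scale looks roughly like this (a sketch only - the database name and the -j option are my choice; the scales and the 15-minute duration come from the description above):

```
# initialize the dataset - scale 10 is the "small" (~150MB) case,
# scale 140 the "medium" (~25% of RAM) one
pgbench -i -s 10 pgbench

# read-only run: 4 clients, 15 minutes (-S = SELECT-only script)
pgbench -S -c 4 -j 4 -T 900 pgbench

# read-write run: same, but with the default TPC-B-like script
pgbench -c 4 -j 4 -T 900 pgbench
```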
The first observation is that once you start hitting the drives, the compiler makes absolutely no measurable difference. That makes the results from all the read-write tests (for all scales) uninteresting, as well as the read-only test on the large dataset - for all these tests I/O is the main bottleneck (and that's something the compiler can't really influence).
So we're left with just the read-only benchmark on small and medium datasets, where the results look like this:
compiler | tps (small scale=10) | tps (medium scale=140) |
---|---|---|
gcc 4.1.2 | 52932 | 49837 |
gcc 4.2.4 | 53071 | 50219 |
gcc 4.3.6 | 52147 | 49396 |
gcc 4.4.7 | 52597 | 49834 |
gcc 4.5.4 | 53537 | 50143 |
gcc 4.6.4 | 53238 | 49959 |
gcc 4.7.4 | 54383 | 51033 |
gcc 4.8.3 | 54494 | 51627 |
gcc 4.9.2 | 55084 | 52515 |
clang 3.1 | 55160 | 51748 |
clang 3.2 | 55848 | 52197 |
clang 3.3 | 54946 | 51906 |
clang 3.4 | 55297 | 52306 |
clang 3.5 | 55800 | 52458 |
icc 2013 | 52249 | 49197 |
icc 2015 | 52064 | 49064 |
Let's use the gcc 4.1.2 results as a baseline, and express the other results as a percentage of the baseline. So 100 means "same as gcc 4.1.2", 90 means "10% slower than gcc 4.1.2" and so on. On a chart it then looks like this (the higher the number, the better):
Not really a dramatic difference:
- gcc 4.9 and clang 3.5 are winners, with ~4-5% improvement over gcc 4.1.2
- gcc improves over time, with the exception of 4.3/4.4, where the performance dropped below 4.1
- clang is very fast right from 3.1, peaking at 3.2 (which is slightly better than 3.5)
- surprisingly, icc gives the worst results here
TPC-DS
Now, the data warehouse benchmark. I've used a small dataset (1GB), so that it fits into memory - otherwise we'd hit the I/O bottlenecks and the compilers would make no difference. First, let's load the data - the script performs these operations (sketched below the list):
- COPY data into all the tables
- create indexes
- VACUUM FULL (not really necessary)
- VACUUM FREEZE
- ANALYZE
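A rough outline of what the load script does - the database name, file paths and table list here are illustrative, not the actual script:

```
# COPY the data into all the tables (TPC-DS uses '|'-delimited files)
for t in store_sales store_returns catalog_sales; do   # ... and the other TPC-DS tables
    psql tpcds -c "\copy $t from '/data/$t.dat' with (format csv, delimiter '|')"
done

psql tpcds -f create-indexes.sql   # CREATE INDEX statements for all tables
psql tpcds -c "VACUUM FULL"        # not really necessary, as noted above
psql tpcds -c "VACUUM FREEZE"
psql tpcds -c "ANALYZE"
```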
The results (in seconds) look like this:
compiler | copy | indexes | vacuum full | vacuum freeze | analyze | total |
---|---|---|---|---|---|---|
gcc 4.1.2 | 110 | 131 | 168 | 5 | 8 | 422 |
gcc 4.2.4 | 105 | 128 | 162 | 5 | 8 | 408 |
gcc 4.3.6 | 103 | 127 | 160 | 4 | 7 | 401 |
gcc 4.4.7 | 102 | 127 | 160 | 4 | 7 | 400 |
gcc 4.5.4 | 101 | 126 | 160 | 4 | 6 | 397 |
gcc 4.6.4 | 103 | 128 | 162 | 5 | 8 | 406 |
gcc 4.7.4 | 100 | 122 | 156 | 3 | 6 | 387 |
gcc 4.8.3 | 101 | 122 | 155 | 3 | 6 | 387 |
gcc 4.9.2 | 102 | 118 | 150 | 3 | 8 | 381 |
clang 3.1 | 108 | 129 | 162 | 4 | 8 | 411 |
clang 3.2 | 104 | 125 | 160 | 4 | 6 | 399 |
clang 3.3 | 105 | 125 | 160 | 3 | 6 | 399 |
clang 3.4 | 106 | 126 | 161 | 3 | 8 | 404 |
clang 3.5 | 105 | 127 | 162 | 4 | 8 | 406 |
icc 2013 | 106 | 129 | 163 | 4 | 8 | 410 |
icc 2015 | 105 | 125 | 160 | 4 | 6 | 400 |
According to the totals, the difference between the slowest (gcc 4.1.2) and fastest (gcc 4.9.2) is ~10%. Again, gcc continuously improves, which is nice. Clang actually slightly slows down since 3.2, which is not so nice, and clang 3.5 is ~6.5% slower than gcc 4.9.2. And icc is somewhere in between, with a nice speedup between 2013 and 2015 versions.
But that was just loading the data - what about the actual queries? TPC-DS specifies 99 query templates. Some of those use features not yet available in PostgreSQL, leaving us with 61 PostgreSQL-compatible templates. Sadly, 2 of those did not complete within 30 minutes on the 1GB dataset (clearly, room for improvement), so the actual benchmark consists of 59 queries.
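For context, running the queries boils down to something like this (a sketch - the file layout and database name are mine, not the actual benchmark driver):

```
# run each query template three times, with a 30-minute cap per execution
for q in queries/*.sql; do
    for run in 1 2 3; do
        start=$(date +%s%N)
        timeout 1800 psql tpcds -f "$q" > /dev/null
        end=$(date +%s%N)
        echo "$q run $run: $(( (end - start) / 1000000 )) ms"
    done
done
```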
The chart of the total duration of three runs per query, using gcc 4.1.2 as a baseline (just like for pgbench, but this time lower numbers are better), looks like this:
Clearly, the differences are much more significant than in the pgbench results. Again, gcc continuously improves over time, with 4.9.2 being the winner here - the difference between 4.1.2 and 4.9.2 is an astonishing ~15%. That's a pretty significant improvement - good work, GCC developers!
Clang results fluctuate a lot - 3.1, 3.3 and 3.5 are quite good (not as good as gcc 4.9.2, though).
And icc is again somewhere in the middle - faster than gcc 4.1.2, but nowhere near as fast as gcc 4.9.2 or the "good" clang versions. And this time the 2015 version actually slowed down (contrary to the previous results).
Summary
If your workload is transactional (pgbench-like), the compiler does not matter that much - either you're hitting disks (and the compiler does not matter at all), or the differences are within 5% of gcc 4.1.2. And if a gain this small is significant enough for you to warrant switching compilers, you should probably consider getting slightly more powerful hardware instead (a CPU with more cores, faster RAM, better storage, ...).
Analytical workloads are a different case - gcc is a clear winner, and if you're using an ancient version (say, 4.3 or older), you can get a ~10% speedup by switching to 4.7, or ~15% by switching to 4.9. In any case, the newer the version, the better.