With a default configuration of PostgreSQL, the information in the PostgreSQL logs is generally pretty small: some warnings and the occasional typo in psql. However, once you start tweaking the configuration parameters and logging more things like query times, the logs get a lot bigger. Having PostgreSQL generate over 10GB of logs per day is not unheard of if you set log_min_duration_statement low enough on a busy server. You can analyze logs of that size with pgBadger pretty easily, but to get the real value, you want to look at the logs over the course of time: things like how the query load compares to the same time last week, last month, or last year. If you save the daily reports from pgBadger, you can look at them over time, but that's a manual process. Another option is to put your logs into Hadoop and use MapReduce to analyze the information over time.
The Hadoop ecosystem has a service built for exactly that purpose called Flume. It lets you set up a Flume agent that listens for the logs and then pushes them into Hadoop. Setting everything up is pretty simple.
First, you’ll want to set up PostgreSQL to send its logs to syslog. In postgresql.conf, you’ll set:
log_destination = 'syslog'
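If you want more context in what gets shipped, a few related settings are worth a look too. This is just a sketch; the ident string and the duration threshold below are example values, not anything you have to use:

# Example postgresql.conf additions: tag the syslog entries and log slow queries
syslog_facility = 'LOCAL0'          # facility PostgreSQL logs under (LOCAL0 is the default)
syslog_ident = 'postgres'           # tag used in the syslog messages
log_min_duration_statement = 250    # example threshold: log statements running longer than 250 ms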
Then you want to set up your syslog server to forward messages to Flume. Depending on your syslog server, you’ll add a rule like this to rsyslog.conf:
*.* @127.0.0.1:5140
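That rule forwards every syslog message to Flume, not just the PostgreSQL ones. If you only want the database logs, a sketch that assumes PostgreSQL is logging under its default LOCAL0 facility would be:

# Forward only the LOCAL0 facility (PostgreSQL's default syslog_facility) over UDP to Flume
local0.*    @127.0.0.1:5140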
Finally, you’ll need to start Flume with a configuration that listens for syslog messages and pushes them into an HDFS sink.
# pglogs.conf: A single-node Flume configuration

# Name the components on this agent
pglogs.sources = sl
pglogs.sinks = k1
pglogs.channels = c1

# Describe/configure the source
pglogs.sources.sl.type = syslogudp
pglogs.sources.sl.port = 5140
pglogs.sources.sl.host = 0.0.0.0

# Describe the sink
pglogs.sinks.k1.type = hdfs

# Use a channel which buffers events in memory
pglogs.channels.c1.type = memory
pglogs.channels.c1.capacity = 1000
pglogs.channels.c1.transactionCapacity = 100

# Describe the HDFS sink
pglogs.sinks.k1.hdfs.path = /user/postgres/logs
pglogs.sinks.k1.hdfs.filePrefix = pg
pglogs.sinks.k1.hdfs.fileSuffix = .log
pglogs.sinks.k1.hdfs.fileType = DataStream

# Bind the source and sink to the channel
pglogs.sources.sl.channels = c1
pglogs.sinks.k1.channel = c1
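With the configuration saved, the agent can be launched with the standard flume-ng command. A sketch, assuming the file lives at conf/pglogs.conf; the agent name passed to --name has to match the pglogs prefix used in the properties:

# Start the agent in the foreground, logging to the console
flume-ng agent --conf conf --conf-file conf/pglogs.conf --name pglogs -Dflume.root.logger=INFO,console

Once it's running, PostgreSQL log lines forwarded by rsyslog should start showing up as pg*.log files under /user/postgres/logs in HDFS, ready for whatever MapReduce analysis you want to run over them.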