bokamba / logforge / parse / PostgreSQL

$ logforge parse postgresql

Parse PostgreSQL logs → regex, Grok, Wazuh & rsyslog

PostgreSQL's log format is not one fixed layout — it is whatever log_line_prefix says it is, and that is the single most important fact for parsing it. The prefix is a printf-style template of %-escapes the DBA configures, and it prepends a header to every log line before the LEVEL and message. A very common setting is '%m [%p] %q%u@%d ' which produces a millisecond timestamp with timezone, the process ID in brackets, and then (only for session-bound lines, thanks to %q) the user@database — giving a header like '2026-07-03 14:22:15.003 UTC [4821] jdoe@appdb '. Other deployments add %a (application name), %h (client host), %l (session line number), or %c (session ID), and some omit the user@db entirely, so there is no universal Postgres pattern: a parser MUST be matched to the deployment's actual log_line_prefix.

After the configurable prefix comes the part that is stable: a LEVEL: followed by the message. The levels are Postgres's own set — LOG, ERROR, FATAL, PANIC, WARNING, STATEMENT, DETAIL, HINT, CONTEXT — and they drive both severity and multi-line structure. ERROR/FATAL lines are frequently followed by continuation lines at DETAIL:, HINT:, STATEMENT:, and CONTEXT: levels that belong to the same event, so Postgres logging is genuinely multi-line: an ERROR and its trailing STATEMENT: (which echoes the offending SQL) must be correlated to be useful. The other high-value line type comes from log_min_duration_statement: when a query exceeds the configured threshold Postgres emits a LOG line of the form 'duration: 42.318 ms statement: SELECT …', and those 'duration:' lines are the raw material for slow-query analysis. Every field before the 'duration:' — the timestamp, PID, and user@db — comes from the prefix, so again the prefix governs the parse.

For operations and security the load-bearing fields are the level (isolate ERROR/FATAL/PANIC from routine LOG noise), the user@database and client host from the prefix (who connected and from where — repeated FATAL 'password authentication failed for user' lines are a brute-force signal), the PID (to thread a session's lines together), and the duration plus statement text (slow-query and, when auditing, the actual SQL executed). Because the header is deployment-defined and the event body can span several LEVEL-tagged lines, a robust Postgres parser is two things at once: a prefix matcher tuned to log_line_prefix, and a multi-line correlator that attaches DETAIL/HINT/STATEMENT continuations to their parent ERROR.

Open this in LogForge →

What a PostgreSQL line looks like

The freeform sample below is fed verbatim into the engine to produce every parser on this page.

2026-07-03 14:22:15.123 UTC [1234] jdoe@onber LOG:  duration: 1201.334 ms  statement: SELECT * FROM orders WHERE id = 42
2026-07-03 14:22:19.884 UTC [1290] admin@onber LOG:  duration: 843.201 ms  statement: SELECT * FROM users WHERE id = 7

Detected fields

The engine classified this sample as freeform and consolidated 18 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured.

  • timestamp : timestamp · literal
  • timestamp2 : timestamp
  • _lit1 : literal · literal
  • number : number
  • literal : literal
  • _lit2 : literal · literal
  • _lit3 : literal · literal
  • number2 : number
  • _lit4 : literal · literal
  • _lit5 : literal · literal
  • _lit6 : literal · literal
  • _lit7 : literal · literal
  • _lit8 : literal · literal
  • literal2 : literal
  • _lit9 : literal · literal
  • _lit10 : literal · literal
  • _lit11 : literal · literal
  • number3 : number

Regex (named capture groups)

# sample: 2026-07-03 14:22:15.123 UTC [1234] jdoe@onber LOG:  duration: 1201.334 ms  statement: SELECT * FROM orders WHERE id = 42
# groups: timestamp2=14:22:15.123, number=1234, literal=jdoe@onber, number2=1201.334, literal2=orders, number3=42
^2026-07-03 (?<timestamp2>\d+:\d+:\d+\.\d+) UTC \[(?<number>-?\d+(?:\.\d+)?)\] (?<literal>[A-Za-z]+@[A-Za-z]+) LOG:  duration: (?<number2>-?\d+(?:\.\d+)?) ms  statement: SELECT \* FROM (?<literal2>[A-Za-z]+) WHERE id = (?<number3>-?\d+(?:\.\d+)?)$

Grok pattern (Logstash / Elastic)

2026-07-03 %{TIME:timestamp2} UTC \[%{NUMBER:number}\] %{NOTSPACE:literal} LOG:  duration: %{NUMBER:number2} ms  statement: SELECT \* FROM %{NOTSPACE:literal2} WHERE id = %{NUMBER:number3}
  • note constant field "timestamp" embedded as literal anchor "2026-07-03" (varying=false)

Wazuh decoder (OS_Regex XML)

<!--
  Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
  sample: 2026-07-03 14:22:15.123 UTC [1234] jdoe@onber LOG:  duration: 1201.334 ms  statement: SELECT * FROM orders WHERE id = 42
  test with: /var/ossec/bin/wazuh-logtest
-->

<decoder name="postgresql-freeform">
  <prematch>^\d+-\d+-\d+ </prematch>
</decoder>

<decoder name="postgresql-freeform">
  <parent>postgresql-freeform</parent>
  <regex>^2026-07-03 (\S+) UTC [(\d+)] (\w+) LOG:  duration: (\d+.\d+) ms  statement: SELECT \p FROM (\w+) WHERE id = (\d+)</regex>
  <order>timestamp2, number, literal, number2, literal2, number3</order>
</decoder>
  • note literal '*' in the log text has no exact OS_Regex representation — matched with \p (any punctuation character)
  • note constant field "timestamp" embedded as literal anchor "2026-07-03"
  • note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest

rsyslog template / liblognorm rulebase

version=2
# postgresql — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
#   module(load="mmnormalize")
#   action(type="mmnormalize" rulebase="/etc/rsyslog.d/postgresql.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=postgresql:2026-07-03 %timestamp2:word% UTC [%number:number%] %literal:word% LOG:  duration: %number2:float% ms  statement: SELECT * FROM %literal2:word% WHERE id = %number3:number%
  • note field "timestamp2": samples do not uniformly match engine type "timestamp"; using a generic parser
  • note chosen parser types: timestamp2=word, number=number, literal=word, number2=float, literal2=word, number3=number

FAQ

Why does no single regex parse all PostgreSQL logs?
Because the line header is defined by the log_line_prefix setting, which each DBA configures differently (%m, %p, %u, %d, %h, %a, and more, in any order). Two Postgres servers can produce structurally different headers. A parser must be matched to the specific deployment's log_line_prefix; there is no universal Postgres pattern.
What do the LOG, ERROR, FATAL, and PANIC levels mean?
They are severity levels after the prefix. LOG is routine operational info, WARNING is a concern, ERROR aborts the current statement, FATAL aborts the session (e.g. failed authentication), and PANIC aborts all sessions and restarts the server. ERROR/FATAL/PANIC are what you alert on; a burst of FATAL "password authentication failed" is a brute-force signal.
Where do the "duration:" slow-query lines come from?
From log_min_duration_statement. When a statement runs longer than the configured threshold, Postgres logs a LOG line like "duration: 42.318 ms statement: SELECT …". Everything before "duration:" is the configured prefix; the duration and statement text are what you extract for slow-query analysis.
How do I handle multi-line PostgreSQL error events?
An ERROR or FATAL line is often followed by continuation lines tagged DETAIL:, HINT:, STATEMENT:, and CONTEXT: that belong to the same event — the STATEMENT: line echoes the offending SQL. Correlate these back to their parent by proximity and shared PID/session, rather than treating each line as an independent event.

Try it on your own PostgreSQL lines

Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.

Open this sample in LogForge →