
$ logforge parse docker

Parse Docker container logs → regex, Grok, Wazuh & rsyslog

"Docker logs" means two different formats, and conflating them is the first mistake. This page targets the Docker DAEMON — dockerd itself and the many Go-based services (containerd, and most CNCF tooling) that log with the logrus library. Logrus's text formatter writes a line of key=value pairs led by a quoted RFC3339Nano timestamp and a level: time="2026-07-03T14:22:15.003042Z" level=info msg="API listen on /var/run/docker.sock" and so on. That is distinct from the OTHER thing people call container logs: the stdout/stderr of the application inside a container, which the default json-file logging driver wraps as one JSON object per line — {"log":"the app's actual output line\n","stream":"stdout","time":"2026-07-03T14:22:15.003Z"} — where your real application log is a string inside the "log" key. Know which one you have before choosing a parser; this page is the daemon/logrus key=value form.

The logrus format's hazards are the usual key=value ones plus a few specific to it. The time value is quoted and uses nanosecond precision with a Z or numeric offset, so it must be captured as a quoted field, not split on the 'T'. level is one of a fixed set (trace, debug, info, warning, error, fatal, panic) and is a reliable field to branch severity on. msg is quoted whenever it contains spaces (which is almost always) and can itself contain escaped quotes, so a naive space-split shreds it — capture the quoted value as a unit. Beyond those three, logrus emits arbitrary structured fields the developer attached — a component=, a container= ID, an error= — in the same key=value style, in an order that is not guaranteed, so parse them order-independently.
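The hazards above (quoted values, escaped quotes, no guaranteed field order) can be handled with a small tokenizer instead of a space-split. This is a minimal sketch, not logrus's own parser: parse_logrus_line is a hypothetical helper name, and it assumes values are either double-quoted with backslash escapes or bare runs of non-space characters.

```python
import re

# One key=value pair. The value is either a double-quoted string that may
# contain backslash escapes, or a bare run of non-space characters.
PAIR = re.compile(r'(?P<key>[^\s=]+)=(?:"(?P<quoted>(?:[^"\\]|\\.)*)"|(?P<bare>\S*))')


def parse_logrus_line(line: str) -> dict[str, str]:
    """Parse key=value pairs in whatever order they appear on the line."""
    fields = {}
    for m in PAIR.finditer(line):
        if m.group("quoted") is not None:
            # Assumption: only \" and \\ are unescaped here; other escape
            # sequences are left exactly as written in the log line.
            value = re.sub(r'\\(["\\])', r'\1', m.group("quoted"))
        else:
            value = m.group("bare")
        fields[m.group("key")] = value
    return fields
```

Because each pair is matched on its own, a line with msg before level, or with an extra error= field, still yields the same dictionary keys. The time value comes back as the whole quoted string, ready for a separate timestamp parse.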

For operations and detection the fields that carry weight are level (isolate warning/error/fatal from the info noise), msg (the event), and whichever structured keys the daemon attached: a container or image ID to localize which workload emitted the event, an error string for root cause, and component to tell dockerd subsystems apart. If instead you are parsing the json-file container output, the meaningful work is unwrapping the "log" field and then parsing THAT according to whatever the application inside emits (often its own JSON or text format) — a two-stage parse the daemon format does not require.
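Isolating warning, error, and fatal lines from the info noise can be sketched as a ranked filter over the level names. The helper name at_least and the choice to drop lines with a missing or unrecognized level are assumptions for illustration.

```python
import re

# logrus level names, in increasing order of severity.
LEVELS = ["trace", "debug", "info", "warning", "error", "fatal", "panic"]
RANK = {name: i for i, name in enumerate(LEVELS)}

# Takes the first level= token on the line. Assumption: no earlier quoted
# value contains the text " level=", which would confuse this shortcut.
LEVEL_RE = re.compile(r'(?:^|\s)level=(\w+)')


def at_least(lines, minimum="warning"):
    """Yield only lines whose level is at or above `minimum`."""
    threshold = RANK[minimum]
    for line in lines:
        m = LEVEL_RE.search(line)
        # Lines with no level, or an unknown level, rank -1 and are dropped.
        if m and RANK.get(m.group(1), -1) >= threshold:
            yield line
```

For anything beyond a quick filter, extracting level with a full key=value parse is the safer route, since it is not fooled by level= appearing inside msg.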


What a Docker daemon line looks like

The key=value sample below is fed verbatim into the engine to produce every parser on this page.

time="2026-07-03T14:22:15.123456789Z" level=info msg="Container started" container=9f3c2d1a4b7e image="nginx:1.27" name=web-1
time="2026-07-03T14:22:19.884211003Z" level=error msg="Health check failed" container=ab112cd3f001 image="api:2.4.1" name=api-2 exitCode=1

Detected fields

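The engine classified this sample as kv and consolidated 7 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured. Note that container and name carry that mark here even though their values differ between the two sample lines, and every parser below captures them rather than anchoring on them.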

  • time : timestamp
  • level : severity
  • msg : quoted_string
  • container : literal
  • image : quoted_string
  • name : literal
  • exitcode : number

Regex (named capture groups)

# sample: time="2026-07-03T14:22:15.123456789Z" level=info msg="Container started" container=9f3c2d1a4b7e image="nginx:1.27" name=web-1
# groups: time=2026-07-03T14:22:15.123456789Z, level=info, msg=Container started, container=9f3c2d1a4b7e, image=nginx:1.27, name=web-1
^time="(?<time>[^"]*)" level=(?<level>[A-Za-z]+) msg="(?<msg>(?:[^"\\]|\\.)*)" container=(?<container>[0-9a-f]+) image="(?<image>[^"]*)" name=(?<name>\S+)(?: exitCode=(?<exitcode>-?\d+))?$
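Python's re module spells named groups (?P<name>…) rather than (?<name>…), so the pattern needs a small adaptation there. The sketch below is one possible adaptation, not the engine's output: it assumes container IDs are lowercase hex and names are any run of non-space characters, and match_line is a hypothetical helper name.

```python
import re

# Fixed-order pattern: the keys must appear in exactly this sequence.
LINE_RE = re.compile(
    r'^time="(?P<time>[^"]*)" level=(?P<level>[A-Za-z]+) msg="(?P<msg>[^"]*)"'
    r' container=(?P<container>[0-9a-f]+) image="(?P<image>[^"]*)"'
    r' name=(?P<name>\S+)(?: exitCode=(?P<exitcode>-?\d+))?$'
)


def match_line(line: str):
    """Return the named groups as a dict, or None if the line does not match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None
```

A line whose keys arrive in a different order returns None, which is the trade-off of a single anchored regex compared with parsing each key=value pair independently. The optional exitcode group comes back as None when the key is absent.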

Grok pattern (Logstash / Elastic)

# custom patterns
DOCKER_NOTDQUOTE [^"]*

time="%{TIMESTAMP_ISO8601:time}" level=%{LOGLEVEL:level} msg="%{DOCKER_NOTDQUOTE:msg}" container=%{NOTSPACE:container} image="%{DOCKER_NOTDQUOTE:image}" name=%{NOTSPACE:name}(?: exitCode=%{NUMBER:exitcode})?
  • note kv-structured input — consider the Logstash kv filter instead of (or after) grok
  • note 1 optional field(s) wrapped in (?:…)? inline regex — grok has no native optional syntax
  • note custom patterns emitted — save the '# custom patterns' block to a file in your patterns_dir

Wazuh decoder (OS_Regex XML)

<!--
  Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
  sample: time="2026-07-03T14:22:15.123456789Z" level=info msg="Container started" container=9f3c2d1a4b7e image="nginx:1.27" name=web-1
  test with: /var/ossec/bin/wazuh-logtest
-->

<decoder name="docker-kv">
  <prematch>^time="</prematch>
</decoder>

<decoder name="docker-kv">
  <parent>docker-kv</parent>
  <regex offset="after_parent">^(\d+-\d+-\d+T\d+:\d+:\d+.\d+Z)" level=(\w+)</regex>
  <order>time, level</order>
</decoder>

<decoder name="docker-kv">
  <parent>docker-kv</parent>
  <regex offset="after_parent"> container=(\w+)</regex>
  <order>container</order>
</decoder>

<decoder name="docker-kv">
  <parent>docker-kv</parent>
  <regex offset="after_parent"> name=(\w+)</regex>
  <order>name</order>
</decoder>

<decoder name="docker-kv">
  <parent>docker-kv</parent>
  <regex offset="after_parent"> exitCode=(\d+)</regex>
  <order>exitcode</order>
</decoder>
  • note field "msg" skipped: no safe OS_Regex pattern for its values (mixed shapes / mid-line free text)
  • note field "image" skipped: no safe OS_Regex pattern for its values (mixed shapes / mid-line free text)
  • note kv fields are extracted by same-named sibling decoders (offset="after_parent"), so per-line field order/absence is tolerated — the shared name is what makes Wazuh evaluate every sibling
  • note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest

rsyslog template / liblognorm rulebase

version=2
# docker — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
#   module(load="mmnormalize")
#   action(type="mmnormalize" rulebase="/etc/rsyslog.d/docker.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=docker:time="%time:date-rfc5424%" level=%level:word% msg="%msg:char-to{"extradata":"\""}%" container=%container:word% image="%image:char-to{"extradata":"\""}%" name=%name:word% exitCode=%exitcode:number%
rule=docker:time="%time:date-rfc5424%" level=%level:word% msg="%msg:char-to{"extradata":"\""}%" container=%container:word% image="%image:char-to{"extradata":"\""}%" name=%name:word%
  • note kv structure: rsyslog offers mmfields (fast, fixed single-char separator, untyped) and mmnormalize (this rulebase, typed fields + literal anchors); mmnormalize was chosen for typed extraction
  • note chosen parser types: time=date-rfc5424, level=word, msg=char-to("), container=word, image=char-to("), name=word, exitcode=number
  • note optional columns (exitcode): liblognorm has no optional parts within a single rule — emitted a second rule variant with only the always-present columns (max 2 variants; lines with other column combinations will not match and need extra rule= lines)

FAQ

What is the difference between dockerd logs and container stdout logs?
They are two formats. dockerd (the daemon) and Go tooling use logrus key=value with a quoted time="…" and level=. A container's own stdout/stderr, captured by the default json-file driver, is wrapped as {"log":"…","stream":"stdout","time":"…"} — your app's line lives inside the "log" key. This page parses the daemon/logrus form; container output needs the JSON unwrap first.
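A quick way to tell the two formats apart before choosing a parser is to look at how the line starts. The sketch below is a heuristic under stated assumptions, not an official detection method: classify_docker_line is a hypothetical helper name, and it assumes daemon lines lead with time=" as in the examples on this page.

```python
import json


def classify_docker_line(line: str) -> str:
    """Return 'json-file', 'logrus-kv', or 'unknown' for a single log line."""
    text = line.strip()
    # json-file driver: one JSON object per line with log, stream and time keys.
    if text.startswith("{"):
        try:
            obj = json.loads(text)
        except json.JSONDecodeError:
            return "unknown"
        if isinstance(obj, dict) and obj.keys() >= {"log", "stream", "time"}:
            return "json-file"
        return "unknown"
    # logrus text form as shown on this page: a quoted time, then a level.
    if text.startswith('time="') and " level=" in text:
        return "logrus-kv"
    return "unknown"
```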
Why does my Docker daemon msg field get split incorrectly?
Because logrus quotes the msg value when it contains spaces (msg="API listen on /var/run/docker.sock") and it usually does. Splitting the line on every space breaks it apart. Capture the quoted value as a single unit, and handle escaped quotes inside it, rather than treating each word as a token.
What timestamp format does the Docker daemon use?
RFC3339Nano — an ISO 8601 timestamp with nanosecond fractional seconds and a Z or numeric UTC offset, emitted as a quoted value: time="2026-07-03T14:22:15.003042Z". Capture the whole quoted string, then parse it as RFC 3339; do not split on the internal T or the fractional dot.
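Python's datetime stores microseconds, so a nanosecond timestamp loses its last three digits unless they are kept separately. The following is a minimal sketch of one way to do that; parse_rfc3339_nano is a hypothetical helper name, and it accepts only the shape described above (a T separator, an optional fraction of up to nine digits, and Z or a ±hh:mm offset).

```python
from datetime import datetime, timedelta, timezone
import re

TS_RE = re.compile(
    r'^(?P<base>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})'
    r'(?:\.(?P<frac>\d{1,9}))?'
    r'(?P<tz>Z|[+-]\d{2}:\d{2})$'
)


def parse_rfc3339_nano(value: str) -> tuple[datetime, int]:
    """Return (aware datetime truncated to microseconds, nanosecond part)."""
    m = TS_RE.match(value)
    if m is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    # Right-pad the fraction to nine digits so "003042" means 3042000 ns.
    nanos = int((m.group("frac") or "").ljust(9, "0"))
    if m.group("tz") == "Z":
        tz = timezone.utc
    else:
        sign = -1 if m.group("tz")[0] == "-" else 1
        hours, minutes = int(m.group("tz")[1:3]), int(m.group("tz")[4:6])
        tz = timezone(sign * timedelta(hours=hours, minutes=minutes))
    dt = datetime.strptime(m.group("base"), "%Y-%m-%dT%H:%M:%S")
    return dt.replace(microsecond=nanos // 1000, tzinfo=tz), nanos
```

Returning the nanosecond part alongside the datetime keeps full precision available for ordering events that fall within the same microsecond.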
How do I unwrap application logs from the json-file driver?
Each line is a JSON object with log, stream, and time keys. Parse the JSON, take the "log" value (which ends in a newline), and that string is your application's actual output — which you then parse a second time according to whatever format the app emits (its own JSON, a text line, etc.). It is a two-stage parse.
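
The two-stage parse can be sketched as two small functions. The first stage follows the json-file wrapper described above. The second stage is an assumption for illustration only, since the application's format is unknown: it tries JSON when the line looks like an object and otherwise keeps the raw text. Both function names are hypothetical.

```python
import json


def unwrap_json_file_line(raw: str) -> dict:
    """Stage 1: decode the json-file wrapper and drop the trailing newline."""
    record = json.loads(raw)
    return {
        "stream": record["stream"],
        "time": record["time"],
        "line": record["log"].rstrip("\n"),
    }


def parse_app_line(line: str) -> dict:
    """Stage 2 (assumed app format): JSON object if possible, else raw text."""
    if line.startswith("{"):
        try:
            parsed = json.loads(line)
            if isinstance(parsed, dict):
                return parsed
        except json.JSONDecodeError:
            pass
    return {"message": line}
```

In practice the second function is where an application-specific parser would go; the wrapper stage stays the same regardless of what the application emits.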

Try it on your own Docker container lines

Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.
