$ logforge parse kubernetes

Parse Kubernetes (klog) logs → regex, Grok, Wazuh & rsyslog

Kubernetes control-plane and node components (kube-apiserver, kube-scheduler, kube-controller-manager, kubelet, and kube-proxy) log through klog, the successor to Google's glog, and the line format is unmistakable once you have seen it. Each line opens with a single severity letter (I, W, E, or F for Info, Warning, Error, Fatal), immediately followed by the date as MMDD, a space, the time as HH:MM:SS.microseconds, the thread ID, and the source file:line closed by a ] bracket. The header therefore looks like 'I0703 14:22:15.003042 1 controller.go:214]', and the actual message follows it. A line like 'W0703 14:22:15.884219 1 reflector.go:324] watch of *v1.Pod ended with: too old resource version' packs severity, sub-second time, and the exact source location into a rigid prefix.
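
A minimal Python sketch of that header layout, assuming the classic 'Lmmdd hh:mm:ss.uuuuuu threadid file:line] message' shape described above. The names KLOG_HEADER, SEVERITY_NAMES, and parse_header are illustrative and are not part of klog or LogForge.

```python
import re

# Illustrative pattern for the klog header (an assumption based on the
# layout described in the text, not a pattern taken from klog itself):
#   Lmmdd hh:mm:ss.uuuuuu threadid file:line] message
KLOG_HEADER = re.compile(
    r"^(?P<severity>[IWEF])"                  # single severity letter
    r"(?P<month>\d{2})(?P<day>\d{2})\s+"      # MMDD, no year
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"  # time with microseconds
    r"(?P<thread_id>\d+)\s+"                  # thread ID (may be space-padded)
    r"(?P<file>[^\s:]+):(?P<line>\d+)\]\s"    # source file:line, closed by ]
    r"(?P<message>.*)$"                       # everything after the bracket
)

SEVERITY_NAMES = {"I": "info", "W": "warning", "E": "error", "F": "fatal"}


def parse_header(line):
    """Return the header fields of one klog line as a dict, or None if it does not match."""
    m = KLOG_HEADER.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["severity"] = SEVERITY_NAMES[fields["severity"]]
    fields["thread_id"] = int(fields["thread_id"])
    fields["line"] = int(fields["line"])
    return fields


sample = "W0703 14:22:15.884219 1 reflector.go:324] watch of *v1.Pod ended with: too old resource version"
parsed = parse_header(sample)
print(parsed["severity"], parsed["file"], parsed["line"])
```

The whitespace between fields is matched with \s+ rather than a fixed count, so the same pattern tolerates both single-space and space-padded thread IDs.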

The detail that catches everyone is the timestamp: klog does not record a year. It writes only month and day (0703), so any date math has to infer the year from context and cope with the December-to-January boundary, exactly like RFC 3164 syslog. The microsecond precision after the seconds is present and useful for ordering high-rate control-plane events. The thread ID and file:line are diagnostic gold because they point you straight at the source line that emitted the message, but they are positional, not keyed, so the parser matches them by position in the prefix. Modern Kubernetes components increasingly use structured klog, where the message is a quoted string followed by explicit key="value" pairs ("Pod status updated" pod="default/web-abc" status="Running"). A robust parser therefore handles both the classic free-text tail and the structured key=value tail after the same fixed prefix.
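
One way to supply the missing year, sketched in Python. The function name infer_timestamp, the now reference clock, and the one-day max_skew tolerance are all assumptions made for illustration; klog itself offers nothing like this, and the right tolerance depends on how late your pipeline may see a line.

```python
from datetime import date, datetime, timedelta


def infer_timestamp(mmdd, hms_micro, now, max_skew=timedelta(days=1)):
    """Attach a year to a klog MMDD + time pair.

    Picks the most recent year for which the resulting timestamp is not
    more than `max_skew` ahead of the reference clock `now`.
    """
    month, day = int(mmdd[:2]), int(mmdd[2:])
    clock = datetime.strptime(hms_micro, "%H:%M:%S.%f").time()
    # Start one year ahead so a line stamped just after midnight on Jan 1
    # still resolves correctly when `now` is slightly behind on Dec 31.
    for year in range(now.year + 1, now.year - 9, -1):
        try:
            candidate = datetime.combine(date(year, month, day), clock)
        except ValueError:
            continue  # e.g. 0229 in a non-leap year
        if candidate - now <= max_skew:
            return candidate
    raise ValueError(f"no plausible year for {mmdd!r}")


# A December line read in early January resolves to the previous year.
rollover = infer_timestamp("1231", "23:59:59.000001", datetime(2025, 1, 2, 0, 5, 0))
# A line from earlier the same day keeps the current year.
same_year = infer_timestamp("0703", "14:22:15.003042", datetime(2025, 7, 3, 15, 0, 0))
print(rollover.isoformat(), same_year.isoformat())
```

The heuristic only works when lines are processed reasonably close to when they were written; replaying an archive that is more than a year old needs the year from another source, such as the file's modification time.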

For operations, the severity letter is the first triage axis: a burst of E-lines from the apiserver, or a single F (fatal, which precedes a component crash), is what you alert on. The file:line tells you which subsystem is unhappy (reflector.go and its 'too old resource version' warnings are normal watch churn; authentication.go errors are not). When the structured form is in play, the key=value pairs give you the object identity (pod, namespace, node) needed to correlate an event to a specific workload. Because the prefix has a fixed, positional layout while the tail is either free text or key=value, klog is a two-mode parse: one pattern for the severity/time/threadid/file:line header, then a mode switch on the message body.
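
The mode switch on the message body can be sketched in Python as follows. This assumes a structured body always starts with a double-quoted message and that anything else is classic free text; parse_body, QUOTED_MSG, and KV_PAIR are hypothetical names, and escape sequences inside values are kept as written rather than decoded.

```python
import re

# A quoted message at the start of the body, allowing backslash escapes.
QUOTED_MSG = re.compile(r'^"(?P<msg>(?:[^"\\]|\\.)*)"(?P<rest>.*)$')
# key="quoted value" or key=bare-token pairs in the remainder.
KV_PAIR = re.compile(r'(?P<key>[A-Za-z_][\w./-]*)=(?P<value>"(?:[^"\\]|\\.)*"|\S+)')


def parse_body(body):
    """Split a klog message body into a message and key/value fields.

    Returns mode "structured" when the body opens with a quoted message,
    otherwise mode "text" with the body passed through unchanged.
    """
    m = QUOTED_MSG.match(body)
    if m is None:
        return {"mode": "text", "msg": body, "fields": {}}
    fields = {}
    for kv in KV_PAIR.finditer(m.group("rest")):
        value = kv.group("value")
        if value.startswith('"'):
            value = value[1:-1]  # drop the surrounding quotes only
        fields[kv.group("key")] = value
    return {"mode": "structured", "msg": m.group("msg"), "fields": fields}


structured = parse_body('"Sync failed" namespace="infra" name="api" reason="ImagePullBackOff"')
classic = parse_body("watch of *v1.Pod ended with: too old resource version")
print(structured["fields"]["namespace"], classic["mode"])
```

Keeping the body parser separate from the header pattern means the header can stay strict and positional while the tail logic evolves independently.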

What a Kubernetes (klog) line looks like

The freeform sample below is fed verbatim into the engine to produce every parser on this page.

I0703 14:22:15.123456    1234 controller.go:210] "Reconciling object" namespace="shop" name="web" reason="Scheduled"
E0703 14:22:19.884211    1290 controller.go:245] "Sync failed" namespace="infra" name="api" reason="ImagePullBackOff"

Detected fields

The engine classified this sample as freeform and consolidated 8 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured.

  • literal : literal
  • timestamp : timestamp
  • number : number
  • literal2 : literal
  • quoted_string : quoted_string
  • literal3 : literal
  • literal4 : literal
  • literal5 : literal

Regex (named capture groups)

# sample: I0703 14:22:15.123456    1234 controller.go:210] "Reconciling object" namespace="shop" name="web" reason="Scheduled"
# groups: literal=I0703, timestamp=14:22:15.123456, number=1234, literal2=controller.go:210], quoted_string=Reconciling object, literal3=namespace="shop", literal4=name="web", literal5=reason="Scheduled"
^(?<literal>[A-Za-z]+\d+) (?<timestamp>\d+:\d+:\d+\.\d+)    (?<number>-?\d+(?:\.\d+)?) (?<literal2>[A-Za-z]+\.[A-Za-z]+:\d+\]) "(?<quoted_string>[^"]*)" (?<literal3>[A-Za-z]+="[A-Za-z]+") (?<literal4>[A-Za-z]+="[A-Za-z]+") (?<literal5>[A-Za-z]+="[A-Za-z]+")$
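
To try the generated pattern from Python, note that the standard re module spells named groups (?P<name>...) rather than (?<name>...). The sketch below rewrites the group syntax and applies the pattern to the two sample lines. The GENERATED constant is copied from the pattern above, while the conversion step and the split of the first group into severity and date are illustrative additions.

```python
import re

# The generated pattern, copied verbatim (note the literal four-space run).
GENERATED = r'^(?<literal>[A-Za-z]+\d+) (?<timestamp>\d+:\d+:\d+\.\d+)    (?<number>-?\d+(?:\.\d+)?) (?<literal2>[A-Za-z]+\.[A-Za-z]+:\d+\]) "(?<quoted_string>[^"]*)" (?<literal3>[A-Za-z]+="[A-Za-z]+") (?<literal4>[A-Za-z]+="[A-Za-z]+") (?<literal5>[A-Za-z]+="[A-Za-z]+")$'

# Python's re needs (?P<name>...); leave lookbehinds (?<= and (?<! untouched.
pattern = re.compile(re.sub(r"\(\?<(?![=!])", "(?P<", GENERATED))

samples = [
    'I0703 14:22:15.123456    1234 controller.go:210] "Reconciling object" namespace="shop" name="web" reason="Scheduled"',
    'E0703 14:22:19.884211    1290 controller.go:245] "Sync failed" namespace="infra" name="api" reason="ImagePullBackOff"',
]

matches = [pattern.match(line) for line in samples]

# The first group fuses the severity letter and MMDD; split them for triage.
first = matches[0]
severity, mmdd = first.group("literal")[0], first.group("literal")[1:]
print(severity, mmdd, first.group("number"), first.group("quoted_string"))
```

The pattern hard-codes the four-space run seen in the sample, so a line whose thread ID has a different number of digits may not match. Relaxing that run to one or more spaces, and the key="value" groups to accept digits, slashes, and hyphens, are reasonable first adjustments before using it on real logs.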

Grok pattern (Logstash / Elastic)

# custom patterns
KUBERNETES_NOTDQUOTE [^"]*

%{NOTSPACE:literal} %{TIME:timestamp}    %{NUMBER:number} %{NOTSPACE:literal2} "%{KUBERNETES_NOTDQUOTE:quoted_string}" %{NOTSPACE:literal3} %{NOTSPACE:literal4} %{GREEDYDATA:literal5}
  • note custom patterns emitted — save the '# custom patterns' block to a file in your patterns_dir

Wazuh decoder (OS_Regex XML)

<!--
  Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
  sample: I0703 14:22:15.123456    1234 controller.go:210] "Reconciling object" namespace="shop" name="web" reason="Scheduled"
  test with: /var/ossec/bin/wazuh-logtest
-->

<decoder name="kubernetes-freeform">
  <prematch>^\w+ </prematch>
</decoder>

<decoder name="kubernetes-freeform">
  <parent>kubernetes-freeform</parent>
  <regex>^(\w+) (\S+)    (\d+) (\S+) "(\.+)" (\S+) (\S+) (\S+)</regex>
  <order>literal, timestamp, number, literal2, quoted_string, literal3, literal4, literal5</order>
</decoder>
  • note no stable literal prefix found — <prematch> anchors on the leading field pattern; tighten it for your environment
  • note field "quoted_string": free-text capture (\.+) bounded by a quote anchor — OS_Regex greediness may over-consume if the anchor repeats
  • note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest

rsyslog template / liblognorm rulebase

version=2
# kubernetes — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
#   module(load="mmnormalize")
#   action(type="mmnormalize" rulebase="/etc/rsyslog.d/kubernetes.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=kubernetes:%literal:word% %timestamp:word%    %number:number% %literal2:word% "%quoted_string:char-to{"extradata":"\""}%" %literal3:word% %literal4:word% %literal5:word%
  • note field "timestamp": samples do not uniformly match engine type "timestamp"; using a generic parser
  • note chosen parser types: literal=word, timestamp=word, number=number, literal2=word, quoted_string=char-to("), literal3=word, literal4=word, literal5=word

FAQ

How do I read the klog prefix on a Kubernetes log line?
The prefix is Lmmdd hh:mm:ss.uuuuuu threadid file:line]. The first character is the severity letter (I/W/E/F = Info/Warning/Error/Fatal), then the month-day, the time with microseconds, the thread ID, and the source file and line. Everything after the ] is the message. It is positional, so match it by position rather than by keys.
Why is there no year in a Kubernetes (klog) timestamp?
klog inherits glog's format, which records only MMDD and the time — never the year. Parsers must supply the year from context and handle the December-to-January rollover, the same problem RFC 3164 syslog has. The microsecond precision is present, which helps order rapid control-plane events.
What is structured klog and how does it change parsing?
Newer Kubernetes components emit the message as a quoted string followed by explicit key="value" pairs (pod=, namespace=, node=). This sits after the same fixed klog prefix, so a robust parser reads the header positionally, then switches modes: free-text tail for classic klog, key=value tail for structured klog.
What do the severity letters mean and which should I alert on?
I is informational, W a warning, E an error, and F fatal (the component logs it and then exits). Alert on F immediately — it means a crash — and on sustained bursts of E from the apiserver or controllers. Many W lines (like reflector "too old resource version") are normal watch churn and are not by themselves actionable.

Try it on your own Kubernetes (klog) lines

Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.
