What is NDJSON / JSON Lines and why do logs use it?

NDJSON (newline-delimited JSON), also called JSON Lines, is one complete JSON value per line (for logs, an object) with no enclosing array. Logs use it because each line is independently valid and parseable: tools such as tail -f can stream the file line by line, a single corrupt line does not break the rest, and appending is a plain write to the end of the file. A whole-file JSON array, by contrast, would require rewriting the closing bracket on every append.
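
The line-by-line property can be illustrated with a minimal Python sketch. The helper name iter_ndjson and the shortened sample lines are illustrative assumptions, not part of any library or of the page's sample.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per valid JSON-object line; skip blank and corrupt lines."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # a single bad line does not stop the stream
        if isinstance(record, dict):
            yield record


# Hypothetical input: the middle line is truncated and therefore invalid.
sample = [
    '{"level":"error","service":"checkout","msg":"payment failed"}',
    '{"level":"info","service":"auth"',
    '{"level":"info","service":"auth","msg":"login ok"}',
]
records = list(iter_ndjson(sample))
print([r["service"] for r in records])
```

Because the function accepts any iterable of strings, an open file handle can be passed in directly and is consumed one line at a time.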

How are nested JSON log fields flattened for a SIEM?

Nested objects are collapsed into dotted or underscored keys: the name key inside the gateway object becomes gateway.name or gateway_name, and the code key becomes gateway.code or gateway_code. The exact delimiter depends on the shipper (Filebeat, Fluent Bit, Vector) and its config. Flattening lets flat-schema stores index and filter on what were nested paths.
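
As a rough sketch of what a shipper does internally, the recursive function below flattens nested objects with a configurable separator. The function name flatten and its parameters are assumptions made for illustration; real shippers have their own options and edge-case handling (arrays, key collisions).

```python
import json


def flatten(obj: dict, sep: str = ".", prefix: str = "") -> dict:
    """Collapse nested dicts into single-level keys joined by sep."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, sep, path))
        else:
            flat[path] = value  # scalars and lists are kept as-is
    return flat


event = json.loads(
    '{"ip":"203.0.113.45","gateway":{"name":"stripe","code":"card_declined"}}'
)
dotted = flatten(event)                # keys such as gateway.name
underscored = flatten(event, sep="_")  # keys such as gateway_name
print(dotted)
print(underscored)
```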

Why do my JSON log lines have different fields from each other?

Because the schema is per-event: an error event may carry order_id and a nested gateway object, while an info event carries neither and may add keys of its own, such as mfa. This is normal for structured logging. Downstream systems should treat the union of possible keys as optional rather than require a fixed set; the presence or absence of a key is itself a signal.
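
One way to see which keys are safe to require is to compare the key sets across events. The sketch below does this for the two sample lines used on this page; the variable names are illustrative, and a real pipeline would compute this over a much larger sample.

```python
import json

lines = [
    '{"ts":"2026-07-03T14:22:15.003Z","level":"error","service":"checkout",'
    '"msg":"payment failed","order_id":"ord_9f3c","user":"jdoe",'
    '"ip":"203.0.113.45","gateway":{"name":"stripe","code":"card_declined"}}',
    '{"ts":"2026-07-03T14:22:18.220Z","level":"info","service":"auth",'
    '"msg":"login ok","user":"berkay","ip":"192.0.2.10","mfa":true}',
]
events = [json.loads(line) for line in lines]

key_sets = [set(event) for event in events]
all_keys = set().union(*key_sets)          # every top-level key seen anywhere
always_present = set.intersection(*key_sets)  # keys seen on every line
optional = all_keys - always_present

print(sorted(always_present))
print(sorted(optional))

# Read optional keys with .get() so a missing key is None, not an exception.
order_ids = [event.get("order_id") for event in events]
```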

Which key holds the timestamp in a JSON log?

There is no universal key; it depends on the logging library. Common ones are ts, time, @timestamp (the Elastic convention), and t. Check what your framework emits: the value is usually ISO 8601 with milliseconds (2026-07-03T14:22:15.003Z), but some libraries write epoch seconds or epoch milliseconds instead.
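
A minimal sketch of a tolerant timestamp reader follows. The candidate key order and the threshold used to tell epoch seconds from epoch milliseconds are assumptions chosen for illustration, not a standard.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed lookup order; adjust to what your logging library emits.
CANDIDATE_KEYS = ("ts", "time", "@timestamp", "t")


def extract_timestamp(event: dict) -> Optional[datetime]:
    """Return the first parseable timestamp as an aware UTC datetime, else None."""
    for key in CANDIDATE_KEYS:
        if key not in event:
            continue
        value = event[key]
        if isinstance(value, bool):
            continue  # bool is a subclass of int; never a timestamp
        if isinstance(value, (int, float)):
            # Heuristic (assumption): values above 1e11 are milliseconds.
            seconds = value / 1000 if value > 1e11 else value
            return datetime.fromtimestamp(seconds, tz=timezone.utc)
        if isinstance(value, str):
            try:
                # Older Python versions do not accept a trailing "Z".
                return datetime.fromisoformat(value.replace("Z", "+00:00"))
            except ValueError:
                continue
    return None


iso = extract_timestamp({"ts": "2026-07-03T14:22:15.003Z"})
epoch_s = extract_timestamp({"time": 0})
epoch_ms = extract_timestamp({"t": 1_000_000_000_000})
print(iso, epoch_s, epoch_ms)
```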

Parse JSON application logs → regex, Grok, Wazuh & rsyslog

What a JSON application line looks like

The JSON sample below is fed verbatim into the engine to produce every parser on this page.

{"ts":"2026-07-03T14:22:15.003Z","level":"error","service":"checkout","msg":"payment failed","order_id":"ord_9f3c","user":"jdoe","ip":"203.0.113.45","gateway":{"name":"stripe","code":"card_declined"}}
{"ts":"2026-07-03T14:22:18.220Z","level":"info","service":"auth","msg":"login ok","user":"berkay","ip":"192.0.2.10","mfa":true}

Detected fields

The engine classified this sample as json and consolidated 10 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured.

ts : timestamp
level : severity
service : quoted_string
msg : quoted_string
order_id : quoted_string
user : username
ip : ipv4
gateway_name : quoted_string
gateway_code : quoted_string
mfa : literal

Regex (named capture groups)

# sample: {"ts":"2026-07-03T14:22:15.003Z","level":"error","service":"checkout","msg":"payment failed","order_id":"ord_9f3c","user":"jdoe","ip":"203.0.113.45","gateway":{"name":"stripe","code":"card_declined"}}
# groups: ts=2026-07-03T14:22:15.003Z, level=error, service=checkout, msg=payment failed, order_id=ord_9f3c, user=jdoe, ip=203.0.113.45, gateway_name=stripe, gateway_code=card_declined
^(?=.*?"ts":"(?<ts>[^"]*)")(?=.*?"level":"(?<level>[^"]*)")(?=.*?"service":"(?<service>[^"]*)")(?=.*?"msg":"(?<msg>[^"]*)")(?=.*?"order_id":"(?<order_id>[^"]*)"|)(?=.*?"user":"(?<user>[^"]*)")(?=.*?"ip":"(?<ip>[^"]*)")(?=.*?"name":"(?<gateway_name>[^"]*)"|)(?=.*?"code":"(?<gateway_code>[^"]*)"|)(?=.*?"mfa":(?<mfa>[A-Za-z]+)|).*$

note input is JSON — use a JSON parser (jq, Logstash json filter, …) instead of a regex where possible
note a single linear template could not reproduce every input line — fields are captured with order-independent lookaheads instead
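
To try the lookahead approach in Python, the named groups need the (?P<name>…) spelling, because Python's re module does not accept (?<name>…). The sketch below rebuilds an equivalent pattern programmatically and runs it against both sample lines. The REQUIRED and OPTIONAL tables and the name LINE_RE are illustrative assumptions, not output of the generator.

```python
import re

# Keys present on every sample line.
REQUIRED = ["ts", "level", "service", "msg", "user", "ip"]
# Capture-group name -> JSON key for string fields that may be absent.
OPTIONAL = {"order_id": "order_id", "gateway_name": "name", "gateway_code": "code"}

# Each lookahead scans from the start of the line, so key order is irrelevant.
parts = [rf'(?=.*?"{key}":"(?P<{key}>[^"]*)")' for key in REQUIRED]
# The trailing "|" adds an empty alternative, which makes the field optional.
parts += [
    rf'(?=.*?"{key}":"(?P<{group}>[^"]*)"|)' for group, key in OPTIONAL.items()
]
parts.append(r'(?=.*?"mfa":(?P<mfa>[A-Za-z]+)|)')  # unquoted boolean value
LINE_RE = re.compile("^" + "".join(parts) + ".*$")

line1 = (
    '{"ts":"2026-07-03T14:22:15.003Z","level":"error","service":"checkout",'
    '"msg":"payment failed","order_id":"ord_9f3c","user":"jdoe",'
    '"ip":"203.0.113.45","gateway":{"name":"stripe","code":"card_declined"}}'
)
line2 = (
    '{"ts":"2026-07-03T14:22:18.220Z","level":"info","service":"auth",'
    '"msg":"login ok","user":"berkay","ip":"192.0.2.10","mfa":true}'
)

m1 = LINE_RE.match(line1)
m2 = LINE_RE.match(line2)
print(m1.groupdict())
print(m2.groupdict())
```

Groups for absent fields come back as None. Note that the pattern matches keys by name only, so a "name" key anywhere in the line would populate gateway_name; a JSON parser avoids that ambiguity.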

Grok pattern (Logstash / Elastic)

# custom patterns
JSON_NOTDQUOTE [^"]*

\{"ts":"%{TIMESTAMP_ISO8601:ts}","level":"%{LOGLEVEL:level}","service":"%{JSON_NOTDQUOTE:service}","msg":"%{JSON_NOTDQUOTE:msg}(?:","order_id":"%{JSON_NOTDQUOTE:order_id})?","user":"%{USERNAME:user}","ip":"%{IPV4:ip}(?:","gateway":\{"name":"%{JSON_NOTDQUOTE:gateway_name})?(?:","code":"%{JSON_NOTDQUOTE:gateway_code})?(?:","mfa":%{WORD:mfa})?

note json input — consider the Logstash json codec/filter instead of grok
note 4 optional field(s) wrapped in (?:…)? inline regex — grok has no native optional syntax
note custom patterns emitted — save the '# custom patterns' block to a file in your patterns_dir

Wazuh decoder (OS_Regex XML)

<!--
  Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
  sample: {"ts":"2026-07-03T14:22:15.003Z","level":"error","service":"checkout","msg":"payment failed","order_id":"ord_9f3c","user":"jdoe","ip":"203.0.113.45","gateway":{
  test with: /var/ossec/bin/wazuh-logtest
-->

<decoder name="json-json">
  <prematch>^{</prematch>
  <plugin_decoder>JSON_Decoder</plugin_decoder>
</decoder>

note JSON input: emitted a JSON_Decoder plugin decoder — Wazuh extracts every key automatically as dynamic fields (nested keys become dotted names)
note field "ip" mapped to Wazuh conventional field "srcip"
note field names above are what the other LogForge generators use; JSON_Decoder will use the raw JSON keys instead
note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest

rsyslog template / liblognorm rulebase

version=2
# json — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
#   module(load="mmnormalize")
#   action(type="mmnormalize" rulebase="/etc/rsyslog.d/json.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=json:{"ts":"%ts:date-rfc5424%","level":"%level:char-to{"extradata":"\""}%","service":"%service:char-to{"extradata":"\""}%","msg":"%msg:char-to{"extradata":"\""}%","order_id":"%order_id:char-to{"extradata":"\""}%","user":"%user:char-to{"extradata":"\""}%","ip":"%ip:ipv4%","gateway":{"name":"%gateway_name:char-to{"extradata":"\""}%","code":"%gateway_code:char-to{"extradata":"\""}%","mfa":%mfa:char-to{"extradata":"\""}%"}}
rule=json:{"ts":"%ts:date-rfc5424%","level":"%level:char-to{"extradata":"\""}%","service":"%service:char-to{"extradata":"\""}%","msg":"%msg:char-to{"extradata":"\""}%","user":"%user:char-to{"extradata":"\""}%","ip":"%ip:ipv4%"

note json structure: rsyslog mmjsonparse handles CEE/JSON natively — consider action(type="mmjsonparse") instead of this rulebase
note trailing literal "\"}}" reconstructed from line 1
note chosen parser types: ts=date-rfc5424, level=char-to("), service=char-to("), msg=char-to("), order_id=char-to("), user=char-to("), ip=ipv4, gateway_name=char-to("), gateway_code=char-to("), mfa=char-to(")
note optional columns (order_id, gateway_name, gateway_code, mfa): liblognorm has no optional parts within a single rule — emitted a second rule variant with only the always-present columns (max 2 variants; lines with other column combinations will not match and need extra rule= lines)

Try it on your own JSON application lines

Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.
