What an Nginx access line looks like
The Combined sample below is fed verbatim into the engine to produce every parser on this page.
203.0.113.45 - - [03/Jul/2026:14:22:15 +0300] "GET /api/health HTTP/1.1" 200 2 "-" "kube-probe/1.29"
198.51.100.23 - - [03/Jul/2026:14:22:19 +0300] "POST /login HTTP/1.1" 401 231 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0"
Detected fields
The engine classified this sample as freeform and consolidated 11 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured.
- ip1 : ipv4
- _lit1 : literal · literal
- _lit2 : literal · literal
- timestamp : timestamp
- method : http_method
- path : path
- quoted_string : quoted_string · literal
- status : http_status
- number : number
- url : url
- user_agent : user_agent
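The literal-versus-captured split described above can be illustrated with a small Python sketch. It is not LogForge's actual algorithm: the tokenizer regex and the names `tokenize` and `classify_columns` are assumptions made for illustration. A column whose token is identical on every sample line is treated as a literal anchor, and everything else as a capture.

```python
import re

SAMPLES = [
    '203.0.113.45 - - [03/Jul/2026:14:22:15 +0300] "GET /api/health HTTP/1.1" 200 2 "-" "kube-probe/1.29"',
    '198.51.100.23 - - [03/Jul/2026:14:22:19 +0300] "POST /login HTTP/1.1" 401 231 "https://example.com/" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0"',
]

# Hypothetical tokenizer: keep [bracketed] and "quoted" spans whole,
# split everything else on whitespace.
TOKEN_RE = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def tokenize(line):
    return TOKEN_RE.findall(line)


def classify_columns(lines):
    """Label each column 'literal' if every line agrees on it, else 'capture'."""
    rows = [tokenize(line) for line in lines]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("sample lines do not share one column layout")
    return ["literal" if len(set(col)) == 1 else "capture" for col in zip(*rows)]


roles = classify_columns(SAMPLES)
print(roles)
```

With this coarse tokenizer only the two `-` columns come out as literals, because the whole request is one token. A finer tokenizer that also split inside the request would see HTTP/1.1 repeat on both lines, which matches the constant quoted_string entry in the field list.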
Regex (named capture groups)
# sample: 203.0.113.45 - - [03/Jul/2026:14:22:15 +0300] "GET /api/health HTTP/1.1" 200 2 "-" "kube-probe/1.29"
# groups: ip1=203.0.113.45, timestamp=03/Jul/2026:14:22:15 +0300, method=GET, path=/api/health, status=200, number=2, url=-, user_agent=kube-probe/1.29
^(?<ip1>\d{1,3}(?:\.\d{1,3}){3}) - - \[(?<timestamp>\d+/[A-Za-z]+/\d+:\d+:\d+:\d+ \+\d+)\] "(?<method>[^"]*) (?<path>(?:/[^\s"']*|[A-Za-z]:[^\s"']*)) HTTP/1\.1" (?<status>\d{3}) (?<number>-?\d+(?:\.\d+)?) "(?<url>[^"]*)" "(?<user_agent>[^"]*)"$
Grok pattern (Logstash / Elastic)
# custom patterns
NGINX_NOTDQUOTE [^"]*
%{IPV4:ip1} - - \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{UNIXPATH:path} HTTP/1\.1" %{INT:status} %{NUMBER:number} "%{NGINX_NOTDQUOTE:url}" "%{NGINX_NOTDQUOTE:user_agent}"
- note constant field "quoted_string" embedded as literal anchor "HTTP/1.1" (varying=false)
- note field "url" (url): samples do not all match %{URI}; using %{NGINX_NOTDQUOTE} instead
- note custom patterns emitted — save the '# custom patterns' block to a file in your patterns_dir
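A Grok pattern is a named-group regex with the sub-patterns looked up by name. The Python sketch below shows that expansion for the pattern above, written with its closing double quote after user_agent. The entries in `PATTERNS` are simplified stand-ins chosen for illustration, not the official Logstash definitions, and `grok_to_regex` is a hypothetical helper, so treat a match here as a sanity check rather than proof that Logstash will behave identically.

```python
import re

# Simplified stand-ins for the stock Grok definitions (assumptions, not the
# official Logstash pattern files) plus the custom pattern from this page.
PATTERNS = {
    "IPV4": r"\d{1,3}(?:\.\d{1,3}){3}",
    "HTTPDATE": r"\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}",
    "WORD": r"\w+",
    "UNIXPATH": r"(?:/[\w%!$@:.,+~-]*)+",
    "INT": r"[+-]?\d+",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "NGINX_NOTDQUOTE": r'[^"]*',
}

GROK = (
    r'%{IPV4:ip1} - - \[%{HTTPDATE:timestamp}\] '
    r'"%{WORD:method} %{UNIXPATH:path} HTTP/1\.1" %{INT:status} %{NUMBER:number} '
    r'"%{NGINX_NOTDQUOTE:url}" "%{NGINX_NOTDQUOTE:user_agent}"'
)


def grok_to_regex(pattern):
    """Replace each %{NAME:field} with a named group built from PATTERNS."""
    def expand(m):
        name, field = m.group(1), m.group(2)
        return f"(?P<{field}>{PATTERNS[name]})"
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, pattern))


LINE = ('203.0.113.45 - - [03/Jul/2026:14:22:15 +0300] '
        '"GET /api/health HTTP/1.1" 200 2 "-" "kube-probe/1.29"')

match = grok_to_regex(GROK).fullmatch(LINE)
fields = match.groupdict() if match else None
print(fields)
```

The url group captures the bare "-" of the first sample, which is the case the note about %{URI} describes: a dash is not a URI, so only the looser not-a-double-quote pattern accepts both lines.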
Wazuh decoder (OS_Regex XML)
<!--
Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
sample: 203.0.113.45 - - [03/Jul/2026:14:22:15 +0300] "GET /api/health HTTP/1.1" 200 2 "-" "kube-probe/1.29"
test with: /var/ossec/bin/wazuh-logtest
-->
<decoder name="nginx-freeform">
<prematch>^\d+.\d+.\d+.\d+ </prematch>
</decoder>
<decoder name="nginx-freeform">
<parent>nginx-freeform</parent>
<regex>^(\d+.\d+.\d+.\d+) - - [(\d+/\w+/\d+:\d+:\d+:\d+ \p\d+)] "(\w+) (\S+) HTTP/1.1" (\d+) (\d+) "(\.+)" "(\.+)"</regex>
<order>srcip, timestamp, method, path, status, number, url, user_agent</order>
</decoder>
- note no stable literal prefix found — <prematch> anchors on the leading field pattern; tighten it for your environment
- note field "ip1" mapped to Wazuh conventional field "srcip"
- note field "url": free-text capture (\.+) bounded by a quote anchor — OS_Regex greediness may over-consume if the anchor repeats
- note field "user_agent": free-text capture (\.+) bounded by end of line — OS_Regex greediness may over-consume if the anchor repeats
- note constant field "quoted_string" embedded as literal anchor "HTTP/1.1"
- note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest
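The over-consumption the notes warn about can be seen in miniature with Python's `re` module. This is only an analogy: Python uses a backtracking engine, not OS_Regex, and the tail patterns below are deliberately left without the leading fields (an assumption made to expose the effect). The point is what an unbounded capture does when the quote anchor appears more than once on the line.

```python
import re

LINE = ('203.0.113.45 - - [03/Jul/2026:14:22:15 +0300] '
        '"GET /api/health HTTP/1.1" 200 2 "-" "kube-probe/1.29"')

# Unbounded captures: the first group starts at the request's opening quote
# and runs across the earlier quoted fields.
greedy = re.search(r'"(.+)" "(.+)"$', LINE)

# Bounded captures: each group stops at the next double quote.
bounded = re.search(r'"([^"]*)" "([^"]*)"$', LINE)

print(greedy.groups())
print(bounded.groups())
```

The unbounded version returns the request, status, and size fused into its first group, while the bounded version returns the referer and user agent. In the generated decoder the literal prefix in front of the free-text groups plays the bounding role, which is why trimming that prefix during tuning deserves a re-run of wazuh-logtest.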
rsyslog template / liblognorm rulebase
version=2
# nginx — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
# module(load="mmnormalize")
# action(type="mmnormalize" rulebase="/etc/rsyslog.d/nginx.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=nginx:%ip1:ipv4% - - [%timestamp:char-to{"extradata":"]"}%] "%method:word% %path:word% HTTP/1.1" %status:number% %number:number% "%url:char-to{"extradata":"\""}%" "%user_agent:char-to{"extradata":"\""}%"
- note trailing literal "\"" reconstructed from line 1
- note field "timestamp": samples do not uniformly match engine type "timestamp"; using a generic parser
- note chosen parser types: ip1=ipv4, timestamp=char-to(]), method=word, path=word, status=number, number=number, url=char-to("), user_agent=char-to(")
FAQ
- What is the difference between the nginx main, combined, and common log formats?
- nginx ships one predefined format, named 'combined', and uses it whenever access_log names no format. It is the Common Log Format plus two trailing quoted fields, the Referer and the User-Agent; the bare 'common' format stops after the response size. 'main' is not built in: it is just a conventional name people give their own custom log_format, so it carries whatever fields they defined. Always check the log_format directive rather than assuming.
- Why does my nginx timestamp fail to parse as ISO 8601?
- Because it is not ISO 8601. nginx writes the local time as 03/Jul/2026:14:22:15 +0300: day/abbreviated-month/year, a colon before the time, and a numeric UTC offset. Parse it with a %d/%b/%Y:%H:%M:%S %z strptime pattern (Grok's HTTPDATE handles it). nginx always writes the month as an English abbreviation, but strptime's %b follows the parser's locale, so parse under a C or English locale.
- How do I split the nginx request field into method, path, and protocol?
- The $request field is logged as a single quoted string like "GET /api/health HTTP/1.1". Capture the whole quoted value first, then split it on spaces into method, request URI, and HTTP version. Capturing it as one value first avoids miscounting fields when the URI contains raw (unencoded) spaces or the request line is malformed.
- Can the same regex parse both nginx and Apache access logs?
- For the combined format, usually yes — nginx combined and Apache's combined layout are identical field-for-field. The differences that bite are custom log_format lines, Apache's optional %v (vhost) or response-time fields, and whichever module added extra columns. Always validate the pattern against a real sample from each server rather than trusting that 'combined means combined'.
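The timestamp and request-splitting answers above can be sketched together in Python. This is a minimal example under stated assumptions: the group names and the helpers `split_request` and `parse_line` are illustrative, the line is assumed to be in the combined layout, and the process is assumed to run under a C or English locale so that %b recognises "Jul".

```python
import re
from datetime import datetime

# Capture the request as one quoted value; group names are illustrative.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time_local>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+) "(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"$'
)


def split_request(request):
    """Split 'METHOD URI HTTP/x.y'; return (None, None, None) if malformed."""
    parts = request.split(" ")
    if len(parts) < 3 or not parts[-1].startswith("HTTP/"):
        return None, None, None
    # Rejoin the middle so a URI containing raw spaces stays in one piece.
    return parts[0], " ".join(parts[1:-1]), parts[-1]


def parse_line(line):
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # %z reads the numeric offset, so the result is a timezone-aware datetime.
    rec["time"] = datetime.strptime(rec["time_local"], "%d/%b/%Y:%H:%M:%S %z")
    rec["method"], rec["uri"], rec["protocol"] = split_request(rec["request"])
    return rec


rec = parse_line('203.0.113.45 - - [03/Jul/2026:14:22:15 +0300] '
                 '"GET /api/health HTTP/1.1" 200 2 "-" "kube-probe/1.29"')
print(rec["time"].isoformat(), rec["method"], rec["uri"], rec["protocol"])
```

For the first sample line this prints 2026-07-03T14:22:15+03:00 GET /api/health HTTP/1.1. A line that fails `LINE_RE` returns None rather than raising, which makes it easy to count how many real lines from each server the pattern actually covers.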
Try it on your own Nginx access lines
Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.