What an Apache access (combined) line looks like
The Combined sample below is fed verbatim into the engine to produce every parser on this page.
192.0.2.10 - jdoe [03/Jul/2026:14:22:15 +0300] "GET /wp-admin/ HTTP/1.1" 302 512 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/126.0"
203.0.113.99 - - [03/Jul/2026:14:22:40 +0300] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.6.0"
Detected fields
The engine classified this sample as freeform and consolidated 11 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured.
- ip1 : ipv4
- _lit1 : literal · literal
- literal : literal
- timestamp : timestamp
- method : http_method · literal
- quoted_string : quoted_string
- quoted_string2 : quoted_string · literal
- status : http_status
- number : number
- url : url
- user_agent : user_agent
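The constant-versus-captured split described above can be illustrated with a short Python sketch. This is not LogForge's actual implementation: the function name split_constant_fields and the input shape (one dict of already-extracted field values per sample line) are assumptions made for illustration. It only shows the idea that a field with a single distinct value across all samples becomes a literal anchor.

```python
from typing import Dict, List, Tuple


def split_constant_fields(
    rows: List[Dict[str, str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Separate fields that never vary from fields that do.

    Hypothetical helper: `rows` holds one dict per sample line, mapping a
    field name to the value seen on that line.
    """
    constants: Dict[str, str] = {}
    varying: List[str] = []
    for name in rows[0]:
        distinct = {row[name] for row in rows}
        if len(distinct) == 1:
            # Identical on every line: would be baked in as a literal anchor.
            constants[name] = rows[0][name]
        else:
            # Differs between lines: would be captured.
            varying.append(name)
    return constants, varying


# Three of the fields from the two sample lines on this page.
rows = [
    {"method": "GET", "quoted_string2": "HTTP/1.1", "status": "302"},
    {"method": "GET", "quoted_string2": "HTTP/1.1", "status": "404"},
]
constants, varying = split_constant_fields(rows)
print(constants)
print(varying)
```

With only two samples, a field can look constant by coincidence. Both lines here use GET, so the generated patterns anchor on it and would not match a POST request; adding a sample line with a different method should turn method into a captured field.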
Regex (named capture groups)
# sample: 192.0.2.10 - jdoe [03/Jul/2026:14:22:15 +0300] "GET /wp-admin/ HTTP/1.1" 302 512 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/126.0"
# groups: ip1=192.0.2.10, literal=jdoe, timestamp=03/Jul/2026:14:22:15 +0300, quoted_string=/wp-admin/, status=302, number=512, url=https://example.com/, user_agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/126.0
^(?<ip1>\d{1,3}(?:\.\d{1,3}){3}) - (?<literal>(?:[A-Za-z]+|-)) \[(?<timestamp>\d+/[A-Za-z]+/\d+:\d+:\d+:\d+ \+\d+)\] "GET (?<quoted_string>(?:/[A-Za-z]+-[A-Za-z]+/|/\.[A-Za-z]+)) HTTP/1\.1" (?<status>\d{3}) (?<number>-?\d+(?:\.\d+)?) "(?<url>[^"]*)" "(?<user_agent>[^"]*)"$
Grok pattern (Logstash / Elastic)
# custom patterns
APACHE_NOTDQUOTE [^"]*
%{IPV4:ip1} - %{NOTSPACE:literal} \[%{HTTPDATE:timestamp}\] "GET %{NOTSPACE:quoted_string} HTTP/1\.1" %{INT:status} %{NUMBER:number} "%{APACHE_NOTDQUOTE:url}" "%{APACHE_NOTDQUOTE:user_agent}
- note constant field "method" embedded as literal anchor "GET" (varying=false)
- note constant field "quoted_string2" embedded as literal anchor "HTTP/1.1" (varying=false)
- note field "url" (url): samples do not all match %{URI}; using %{APACHE_NOTDQUOTE} instead
- note custom patterns emitted — save the '# custom patterns' block to a file in your patterns_dir
Wazuh decoder (OS_Regex XML)
<!--
Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
sample: 192.0.2.10 - jdoe [03/Jul/2026:14:22:15 +0300] "GET /wp-admin/ HTTP/1.1" 302 512 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/5
test with: /var/ossec/bin/wazuh-logtest
-->
<decoder name="apache-freeform">
<prematch>^\d+.\d+.\d+.\d+ </prematch>
</decoder>
<decoder name="apache-freeform">
<parent>apache-freeform</parent>
<regex>^(\d+.\d+.\d+.\d+) - (\w+) [(\d+/\w+/\d+:\d+:\d+:\d+ \p\d+)] "GET (\S+) HTTP/1.1" (\d+) (\d+) "(\.+)" "(\.+)"</regex>
<order>srcip, literal, timestamp, quoted_string, status, number, url, user_agent</order>
</decoder>
- note no stable literal prefix found — <prematch> anchors on the leading field pattern; tighten it for your environment
- note field "ip1" mapped to Wazuh conventional field "srcip"
- note field "url": free-text capture (\.+) bounded by a quote anchor — OS_Regex greediness may over-consume if the anchor repeats
- note field "user_agent": free-text capture (\.+) bounded by end of line — OS_Regex greediness may over-consume if the anchor repeats
- note constant field "method" embedded as literal anchor "GET"
- note constant field "quoted_string2" embedded as literal anchor "HTTP/1.1"
- note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest
rsyslog template / liblognorm rulebase
version=2
# apache — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
# module(load="mmnormalize")
# action(type="mmnormalize" rulebase="/etc/rsyslog.d/apache.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=apache:%ip1:ipv4% - %literal:word% [%timestamp:char-to{"extradata":"]"}%] "GET %quoted_string:word% HTTP/1.1" %status:number% %number:number% "%url:char-to{"extradata":"\""}%" "%user_agent:char-to{"extradata":"\""}%"
- note trailing literal "\"" reconstructed from line 1
- note field "timestamp": samples do not uniformly match engine type "timestamp"; using a generic parser
- note chosen parser types: ip1=ipv4, literal=word, timestamp=char-to(]), quoted_string=word, status=number, number=number, url=char-to("), user_agent=char-to(")
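Before deploying any of the generated parsers, it can help to run the named-group regex from the Regex section against the sample lines. The Python sketch below does that. Python's re module spells named groups (?P<name>...) rather than (?<name>...), so the pattern is converted first. The helper names (to_python_named_groups, parse_line, parse_timestamp) and the third test line are illustrative assumptions, not part of the generated output.

```python
import re
from datetime import datetime
from typing import Dict, Optional

# Pattern copied unchanged from the "Regex (named capture groups)" section.
GENERATED = r'''^(?<ip1>\d{1,3}(?:\.\d{1,3}){3}) - (?<literal>(?:[A-Za-z]+|-)) \[(?<timestamp>\d+/[A-Za-z]+/\d+:\d+:\d+:\d+ \+\d+)\] "GET (?<quoted_string>(?:/[A-Za-z]+-[A-Za-z]+/|/\.[A-Za-z]+)) HTTP/1\.1" (?<status>\d{3}) (?<number>-?\d+(?:\.\d+)?) "(?<url>[^"]*)" "(?<user_agent>[^"]*)"$'''


def to_python_named_groups(pattern: str) -> str:
    """Rewrite (?<name>...) as (?P<name>...), leaving lookbehinds alone."""
    return re.sub(r"\(\?<(?![=!])", "(?P<", pattern)


LINE_RE = re.compile(to_python_named_groups(GENERATED))


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the captured fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def parse_timestamp(value: str) -> datetime:
    """Parse the bracketed Apache time; %b assumes English month names."""
    return datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")


SAMPLES = [
    '192.0.2.10 - jdoe [03/Jul/2026:14:22:15 +0300] "GET /wp-admin/ HTTP/1.1" 302 512 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/126.0"',
    '203.0.113.99 - - [03/Jul/2026:14:22:40 +0300] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.6.0"',
]

# Hypothetical extra line (not from the page) with a path shape the two
# samples never showed.
UNSEEN = '198.51.100.7 - - [03/Jul/2026:14:23:01 +0300] "GET /index.html HTTP/1.1" 200 1024 "-" "curl/8.6.0"'

for line in SAMPLES + [UNSEEN]:
    fields = parse_line(line)
    if fields is None:
        print("NO MATCH:", line[:60])
    else:
        print(fields["ip1"], fields["status"], parse_timestamp(fields["timestamp"]))
```

Both sample lines match, but the extra line does not: the quoted_string group was inferred from only two paths (/wp-admin/ and /.env), so it accepts just those two shapes, and the method is anchored on GET. Regenerating from a larger and more varied set of real lines should loosen both.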
FAQ
- What is the difference between Apache combined and common log format?
- The 'common' format (CLF) ends after the response size: host, ident, user, time, request, status, bytes. 'combined' appends two quoted header fields, Referer and User-Agent. Both are just nicknames defined by LogFormat directives, so a given server logs whatever its active LogFormat says — inspect the config, not the file name.
- Why does my Apache line have an extra field my parser does not expect?
- Almost certainly a customized LogFormat. Common additions are a leading %v (virtual host), a %D/%T response-time column, %p (port), or %{X-Forwarded-For}i to capture the real client behind a proxy. Grab a representative sample from the actual server and regenerate the parser against it rather than assuming stock combined.
- How do I recover the real client IP when Apache is behind a load balancer?
- The %h field will be the proxy or load-balancer address. To get the originating client you need the X-Forwarded-For header, which is only present if the LogFormat includes %{X-Forwarded-For}i. If it does, capture that field and take the left-most address in the comma-separated list as the client (trusting it only as far as you trust your proxy chain).
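The left-most-address rule from the last answer can be sketched in Python as follows. This is a minimal illustration: the function name client_from_xff is an assumption, and it takes as input the already-captured value of a %{X-Forwarded-For}i column, which Apache logs as "-" when the header is absent.

```python
import ipaddress
from typing import Optional


def client_from_xff(header_value: str) -> Optional[str]:
    """Return the left-most address of an X-Forwarded-For value, or None.

    Hypothetical helper: returns None when the column is empty or "-", or
    when the first entry is not a valid IPv4/IPv6 address (for example
    "unknown").
    """
    if not header_value or header_value.strip() == "-":
        return None
    first = header_value.split(",")[0].strip()
    try:
        # Normalise and validate; raises ValueError on anything that is
        # not a bare IP address.
        return str(ipaddress.ip_address(first))
    except ValueError:
        return None


print(client_from_xff("203.0.113.99, 198.51.100.7, 192.0.2.10"))
print(client_from_xff("-"))
```

The left-most entry is supplied by the client side of the chain and can be forged, so a stricter variant would walk the list from the right and stop at the first address that is not one of your own proxies.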
Try it on your own Apache access (combined) lines
Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.
Open this sample in LogForge →