$ logforge parse haproxy

Parse HAProxy HTTP logs → regex, Grok, Wazuh & rsyslog

HAProxy, the load balancer and reverse proxy in front of a great many web stacks, emits one richly-structured line per request when configured with 'option httplog'. It logs via syslog (HAProxy has no built-in file logging; you point it at a local syslog socket or a remote collector), so each line carries a standard RFC 3164 header (Jul 3 14:22:15 lb01 haproxy[990]:) followed by HAProxy's own dense, positional HTTP log format. That message is compact but information-rich, and its power is also its parsing difficulty: several fields are themselves slash-separated composite values that only make sense once you know the schema.

Take the example message 192.0.2.10:51234 [03/Jul/2026:14:22:15.123] https-in~ api/srv2 0/0/1/12/13 200 512 - - ---- 5/5/0/1/0 0/0 "GET /api/health HTTP/1.1" field by field. First come the client IP and port, then the accept date in brackets (with millisecond precision), then the frontend that received the request (https-in, where the trailing ~ marks an SSL/TLS frontend), then the backend and the specific server that handled it, written as backend/server (api/srv2). Next comes the timers group Tq/Tw/Tc/Tr/Tt (0/0/1/12/13): time to receive the request, queue wait, connect, server response, and total time, all in milliseconds. After it are the HTTP status code (200) and bytes read (512). The two single dashes are the captured request cookie and the captured response cookie, neither of which was captured here. ---- is the termination state, a four-character code whose first two characters explain why and at what stage the session ended (a widely referenced field for diagnosing timeouts, aborts, and errors). Then come the connection counts actconn/feconn/beconn/srv_conn/retries (5/5/0/1/0), the queue positions srv_queue/backend_queue (0/0), and finally the quoted HTTP request line.

The reason a generic HTTP-log parser fails on HAProxy is precisely those composite fields: the timers, connection counts, and queue positions are single tokens containing slashes that must be split into their five, five, and two sub-fields respectively, and the termination-state code is a fixed-width flag string, not a word. For operations and detection the load-bearing fields are the total time Tt and the individual timers (latency and where it is spent), the status code, the backend/server that served the request (isolating a bad node), the termination state (Cx, sx, and PR codes each point at a distinct failure), and the client IP plus request line for the usual web-security signals.
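
The splitting step described above can be sketched in a few lines of Python. This is a minimal illustration, not LogForge output: the helper names split_composite and split_backend are hypothetical, and the sub-field names are simply the ones used in the text.

```python
from typing import Dict, Tuple

# Sub-field names for each composite token, in log order (as named in the text).
TIMER_NAMES = ("Tq", "Tw", "Tc", "Tr", "Tt")
CONN_NAMES = ("actconn", "feconn", "beconn", "srv_conn", "retries")
QUEUE_NAMES = ("srv_queue", "backend_queue")


def split_composite(token: str, names: Tuple[str, ...]) -> Dict[str, int]:
    """Split one slash-separated token into named integer sub-fields."""
    parts = token.split("/")
    if len(parts) != len(names):
        raise ValueError(
            f"expected {len(names)} sub-fields, got {len(parts)}: {token!r}"
        )
    # int() accepts an optional leading sign, so signed values parse as-is.
    return {name: int(part) for name, part in zip(names, parts)}


def split_backend(token: str) -> Tuple[str, str]:
    """Split the backend/server token (e.g. 'api/srv2') into its two names."""
    backend, _, server = token.partition("/")
    return backend, server


timers = split_composite("0/0/1/12/13", TIMER_NAMES)
conns = split_composite("5/5/0/1/0", CONN_NAMES)
queues = split_composite("0/0", QUEUE_NAMES)
backend, server = split_backend("api/srv2")
```

Raising on a wrong sub-field count is a deliberate choice in this sketch: a token with four or six parts usually means the line is not in the HTTP log format at all, and silently zipping it would mislabel every value.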

What a HAProxy HTTP line looks like

The syslog sample below is fed verbatim into the engine to produce every parser on this page.

Jul  3 14:22:15 lb01 haproxy[990]: 192.0.2.10:51234 [03/Jul/2026:14:22:15.123] https-in~ api/srv2 0/0/1/12/13 200 512 - - ---- 5/5/0/1/0 0/0 "GET /api/health HTTP/1.1"
Jul  3 14:22:19 lb01 haproxy[990]: 198.51.100.23:44002 [03/Jul/2026:14:22:19.880] https-in~ api/srv1 0/0/0/44/45 503 217 - - sH-- 7/7/1/2/0 0/0 "POST /api/checkout HTTP/1.1"

Detected fields

The engine classified this sample as syslog3164 and consolidated 17 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured.

  • timestamp : timestamp
  • hostname : hostname · literal
  • program : literal · literal
  • pid : number · literal
  • ipv4_port : ipv4_port
  • literal : literal
  • _lit1 : literal · literal
  • literal2 : literal
  • path : path
  • number : number
  • number2 : number
  • _lit2 : literal · literal
  • _lit3 : literal · literal
  • literal3 : literal
  • path2 : path
  • _lit4 : literal · literal
  • quoted_string : quoted_string

Regex (named capture groups)

# sample: Jul  3 14:22:15 lb01 haproxy[990]: 192.0.2.10:51234 [03/Jul/2026:14:22:15.123] https-in~ api/srv2 0/0/1/12/13 200 512 - - ---- 5/5/0/1/0 0/0 "GET /api/health HTTP/1.1"
# groups: timestamp=Jul  3 14:22:15, ipv4_port=192.0.2.10:51234, literal=03/Jul/2026:14:22:15.123, literal2=api/srv2, path=0/0/1/12/13, number=200, number2=512, literal3=----, path2=5/5/0/1/0, quoted_string=GET /api/health HTTP/1.1
^(?<timestamp>[A-Za-z]+  \d+ \d+:\d+:\d+) lb01 haproxy\[990\]: (?<ipv4_port>\d{1,3}(?:\.\d{1,3}){3}:\d{1,5}) \[(?<literal>\d+/[A-Za-z]+/\d+:\d+:\d+:\d+\.\d+)\] https-in~ (?<literal2>[A-Za-z]+/[A-Za-z]+\d+) (?<path>\d+/\d+/\d+/\d+/\d+) (?<number>-?\d+(?:\.\d+)?) (?<number2>-?\d+(?:\.\d+)?) - - (?<literal3>(?:----|[A-Za-z]+--)) (?<path2>\d+/\d+/\d+/\d+/\d+) 0/0 "(?<quoted_string>[^"]*)"$
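
To try this regex from Python, note that Python's re module writes named groups as (?P<name>...) rather than (?<name>...). The sketch below keeps the pattern exactly as shown above and rewrites only the group openers; parse_line is a hypothetical helper name, and the hard-coded anchors (lb01, haproxy[990], https-in~, 0/0) mean it matches only lines shaped like the two samples.

```python
import re
from typing import Dict, Optional

# The regex from this page, unchanged, split across adjacent raw strings.
PAGE_REGEX = (
    r'^(?<timestamp>[A-Za-z]+  \d+ \d+:\d+:\d+) lb01 haproxy\[990\]: '
    r'(?<ipv4_port>\d{1,3}(?:\.\d{1,3}){3}:\d{1,5}) '
    r'\[(?<literal>\d+/[A-Za-z]+/\d+:\d+:\d+:\d+\.\d+)\] https-in~ '
    r'(?<literal2>[A-Za-z]+/[A-Za-z]+\d+) (?<path>\d+/\d+/\d+/\d+/\d+) '
    r'(?<number>-?\d+(?:\.\d+)?) (?<number2>-?\d+(?:\.\d+)?) - - '
    r'(?<literal3>(?:----|[A-Za-z]+--)) (?<path2>\d+/\d+/\d+/\d+/\d+) 0/0 '
    r'"(?<quoted_string>[^"]*)"$'
)

# Rewrite each "(?<name>" opener to Python's "(?P<name>" spelling.
HAPROXY_RE = re.compile(re.sub(r"\(\?<(\w+)>", r"(?P<\1>", PAGE_REGEX))

SAMPLES = [
    'Jul  3 14:22:15 lb01 haproxy[990]: 192.0.2.10:51234 '
    '[03/Jul/2026:14:22:15.123] https-in~ api/srv2 0/0/1/12/13 200 512 '
    '- - ---- 5/5/0/1/0 0/0 "GET /api/health HTTP/1.1"',
    'Jul  3 14:22:19 lb01 haproxy[990]: 198.51.100.23:44002 '
    '[03/Jul/2026:14:22:19.880] https-in~ api/srv1 0/0/0/44/45 503 217 '
    '- - sH-- 7/7/1/2/0 0/0 "POST /api/checkout HTTP/1.1"',
]


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the named groups for one log line, or None if it does not match."""
    match = HAPROXY_RE.match(line)
    return match.groupdict() if match else None


parsed = [parse_line(line) for line in SAMPLES]
```

Every captured value is a string; the composite groups (path, path2) still need to be split on their slashes before the timers and connection counts can be used as numbers.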

Grok pattern (Logstash / Elastic)

# custom patterns
HAPROXY_NOTDQUOTE [^"]*

%{SYSLOGTIMESTAMP:timestamp} lb01 haproxy\[990\]: %{HOSTPORT:ipv4_port} \[%{DATA:literal}\] https-in~ %{NOTSPACE:literal2} %{NOTSPACE:path} %{NUMBER:number} %{NUMBER:number2} - - %{NOTSPACE:literal3} %{NOTSPACE:path2} 0/0 "%{HAPROXY_NOTDQUOTE:quoted_string}
  • note constant field "hostname" embedded as literal anchor "lb01" (varying=false)
  • note constant field "pid" embedded as literal anchor "990" (varying=false)
  • note field "path" (path): samples do not all match %{PATH}; using %{NOTSPACE} instead
  • note field "path2" (path): samples do not all match %{PATH}; using %{NOTSPACE} instead
  • note custom patterns emitted — save the '# custom patterns' block to a file in your patterns_dir

Wazuh decoder (OS_Regex XML)

<!--
  Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
  sample: Jul  3 14:22:15 lb01 haproxy[990]: 192.0.2.10:51234 [03/Jul/2026:14:22:15.123] https-in~ api/srv2 0/0/1/12/13 200 512 - - - -- - 5/5/0/1/0 0/0 "GET /api/health HT
  test with: /var/ossec/bin/wazuh-logtest
-->

<decoder name="haproxy-syslog3164">
  <program_name>^haproxy$</program_name>
</decoder>

<decoder name="haproxy-syslog3164">
  <parent>haproxy-syslog3164</parent>
  <regex offset="after_parent">^(\d+.\d+.\d+.\d+:\d+)</regex>
  <order>ipv4_port</order>
</decoder>
  • note syslog header fields handled by Wazuh pre-decoding (not re-parsed): timestamp, hostname, program, pid
  • note parent matches by <program_name> (1 program(s) seen in the samples) — extend the alternation for other programs
  • note field "literal" has no safe OS_Regex pattern before the "]" terminator — template truncated; field(s) omitted: literal, _lit1, literal2, path, number, number2, _lit2, _lit3, literal3, path2, _lit4, quoted_string
  • note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest

rsyslog template / liblognorm rulebase

# rsyslog template — put in /etc/rsyslog.d/haproxy.conf
# Emits the detected RFC 3164 header fields plus the raw
# message body as JSON-ish text using standard rsyslog properties.
# Bind the template to an action, e.g.:
#   action(type="omfile" file="/var/log/haproxy.log" template="haproxy")
# A literal "%" inside a template string must be escaped as "%%".
template(name="haproxy" type="string" string="{\"timestamp\":\"%timereported%\",\"host\":\"%hostname%\",\"program\":\"%programname%\",\"procid\":\"%procid%\",\"msg\":\"%msg:::json%\"}\n")

# --- mmnormalize rulebase for the message body ----------------------
# rsyslog properties cover only the syslog header. To extract the fields
# inside the message body, save the lines below (starting at 'version=2',
# which must be the VERY FIRST line of the file) as /etc/rsyslog.d/haproxy.rb
# and load them with:
#   module(load="mmnormalize")
#   action(type="mmnormalize" rulebase="/etc/rsyslog.d/haproxy.rb")
version=2
rule=haproxy_msg: %ipv4_port:word% [%literal:char-to{"extradata":"]"}%] https-in~ %literal2:word% %path:word% %number:number% %number2:number% - - %literal3:word% %path2:word% 0/0 "%quoted_string:rest%
  • note header property mapping: timestamp -> %timereported%, hostname -> %hostname%, program -> %programname%, pid -> %procid%
  • note message-body fields cannot be extracted by rsyslog properties alone — an mmnormalize (liblognorm v2) rulebase for the msg part is appended below the template; save it as its own .rb file
  • note dropped header separator "]:" before the first message field (it belongs to the syslog tag, not the msg property)
  • note chosen parser types: ipv4_port=word, literal=char-to(]), literal2=word, path=word, number=number, number2=number, literal3=word, path2=word, quoted_string=rest

FAQ

What do the slash-separated numbers in an HAProxy log mean?
There are three such groups in the HTTP log. The timers Tq/Tw/Tc/Tr/Tt (e.g. 0/0/1/12/13) are request, queue, connect, response, and total time in milliseconds. The connection counts actconn/feconn/beconn/srv_conn/retries track concurrency and retries. srv_queue/backend_queue give queue positions. Each token must be split on its slashes into sub-fields.
What is the HAProxy termination state field?
It is the four-character code that reads ---- on a clean request (other examples: sD--, PR--, CD--). The first character is the session-termination cause and the second is the TCP/HTTP state when it ended; the last two characters describe persistence-cookie handling. It is the primary field for diagnosing why a request failed: timeouts, client or server aborts, and proxy-side denials each have distinct codes.
Why does HAProxy log to syslog instead of a file?
HAProxy has no native file logging by design; it sends logs to a syslog endpoint (a local /dev/log socket or a remote server) via the "log" directive. This keeps HAProxy fast and non-blocking and delegates rotation and storage to syslog. To capture the HTTP format shown here you also need "option httplog" on the frontend.
How do I read HAProxy request latency from the log?
Use the timers group Tq/Tw/Tc/Tr/Tt. Tt is the total time from accept to close; Tr is the server response time; Tc is the connect time; Tw is time queued; Tq is time spent receiving the request. A high Tr points at a slow backend, a high Tw at saturation/queueing, and a high Tc at connection problems.
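
The termination-state and latency answers above can be combined into a small triage helper. This is a sketch under stated assumptions: describe_termination and dominant_timer are hypothetical names, and the two lookup tables are a partial list recalled from the HAProxy documentation, so check them against the manual for your version before relying on them.

```python
from typing import Dict

# Partial tables (assumption: recalled from the HAProxy docs, not exhaustive).
CAUSES: Dict[str, str] = {
    "-": "normal completion",
    "C": "aborted by the client",
    "S": "aborted or refused by the server",
    "P": "aborted by the proxy (limit or deny rule)",
    "c": "client-side timeout",
    "s": "server-side timeout",
}
STATES: Dict[str, str] = {
    "-": "normal completion",
    "R": "waiting for the client request",
    "Q": "waiting in the queue",
    "C": "waiting for the server connection",
    "H": "waiting for server response headers",
    "D": "transferring data",
    "L": "sending the last data to the client",
}

# Timers that make up the total, in log order (Tt itself is excluded).
PHASE_NAMES = ("Tq", "Tw", "Tc", "Tr")


def describe_termination(code: str) -> str:
    """Explain the first two characters of a four-character termination state."""
    if len(code) != 4:
        raise ValueError(f"termination state must be 4 characters: {code!r}")
    cause = CAUSES.get(code[0], "unlisted cause")
    state = STATES.get(code[1], "unlisted state")
    return f"{cause} / {state}"


def dominant_timer(timers: str) -> str:
    """Return the name of the largest phase timer in a Tq/Tw/Tc/Tr/Tt token."""
    values = [int(v) for v in timers.split("/")]
    if len(values) != 5:
        raise ValueError(f"expected 5 timers, got {len(values)}: {timers!r}")
    # On ties max() keeps the first pair, i.e. the earliest phase.
    name, _ = max(zip(PHASE_NAMES, values[:4]), key=lambda pair: pair[1])
    return name


summary = describe_termination("sH--")
slowest = dominant_timer("0/0/0/44/45")
```

For the second sample line this reports a server-side timeout while waiting for response headers, with Tr as the dominant timer, which matches the FAQ's reading that a high Tr points at a slow backend.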

Try it on your own HAProxy HTTP lines

Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.

Open this sample in LogForge →