$ logforge parse palo-alto

Parse Palo Alto Networks PAN-OS (CEF) logs → regex, Grok, Wazuh & rsyslog

Palo Alto Networks' PAN-OS firewalls can forward logs in several encodings; one of the most portable is CEF (Common Event Format), the ArcSight-originated standard that many SIEMs ingest directly. A PAN-OS CEF event is one line with two parts. The first is the CEF header: the literal 'CEF:0' version marker followed by exactly six pipe-delimited fields — CEF:0|Device Vendor|Device Product|Device Version|Signature ID|Name|Severity — so for PAN-OS that reads CEF:0|Palo Alto Networks|PAN-OS|11.1|THREAT|url|5, telling you the vendor, product, software version, the log type as the signature/class (THREAT, TRAFFIC, SYSTEM…), a human name, and a 0–10 severity. The second part, everything after the seventh pipe, is the extension: an open-ended set of space-separated key=value pairs carrying the actual event data.
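The header/extension split described above can be sketched as a small character scanner, so that an escaped pipe (\|) stays inside its field instead of ending it. The function name split_cef_header and the cef_* field names (borrowed from the field list on this page) are assumptions made for this example, not LogForge's implementation. The sketch also assumes the header is closed by a seventh pipe even when the extension is empty.

```python
HEADER_FIELDS = (
    "cef_version", "cef_vendor", "cef_product", "cef_device_version",
    "cef_signature_id", "cef_name", "cef_severity",
)


def split_cef_header(line: str) -> tuple[dict[str, str], str]:
    """Return (header fields, raw extension) for one CEF line."""
    if not line.startswith("CEF:"):
        raise ValueError("line does not start with the CEF: marker")
    fields: list[str] = []
    buf: list[str] = []
    i = len("CEF:")
    while i < len(line) and len(fields) < len(HEADER_FIELDS):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line) and line[i + 1] in "|\\":
            buf.append(line[i + 1])  # unescape \| and \\ in a single pass
            i += 2
        elif ch == "|":
            fields.append("".join(buf))  # an unescaped pipe closes the field
            buf = []
            i += 1
        else:
            buf.append(ch)
            i += 1
    if len(fields) != len(HEADER_FIELDS):
        raise ValueError("expected seven pipe-terminated header fields")
    # Everything after the seventh pipe is the extension, left untouched here.
    return dict(zip(HEADER_FIELDS, fields)), line[i:]


header, extension = split_cef_header(
    "CEF:0|Palo Alto Networks|PAN-OS|11.1|THREAT|url|5|rt=Jul 03 2026 14:22:15 src=192.0.2.55"
)
print(header["cef_signature_id"], header["cef_severity"], extension[:3])
```

Scanning once rather than calling str.split("|") is what keeps a header such as Fire\|Wall in one field; the extension is returned raw because it follows different escaping rules.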

The extension uses CEF's standardized short key names wherever one exists: src and dst for source/destination IP, spt and dpt for the ports, suser for the user, act for the action, rt for the receipt time. Anything without a standard slot goes into vendor-specific keys. The app, cat, and request keys in the sample are themselves part of the standard CEF dictionary; PAN-OS fills them with its own values (app=web-browsing, cat=malware, request=https://…). Parsing is non-trivial for three reasons that trip up naive splitters. First, the pipe delimiter can appear escaped (\|) inside a header field, so you cannot blindly split on '|'. Second, in the extension a value may contain spaces, and CEF's rule is that a key=value pair ends where the next 'key=' begins: you split on the ' key=' boundary, not on every space, or a request= URL with a space will corrupt everything after it. Third, literal equals signs and backslashes inside values are backslash-escaped and must be unescaped after extraction.
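The second and third points can be illustrated with a minimal sketch that finds every ' key=' boundary first and only then slices out and unescapes the values. The names parse_extension and _unescape are hypothetical, and the sketch assumes keys consist of word characters and that any '=' inside a value has been escaped by the sender as the format requires.

```python
import re

# A key either starts the extension or follows a space, and is followed by "=".
_KEY_BOUNDARY = re.compile(r"(?:^| )(\w+)=")
# Escape sequences handled inside extension values: \=  \\  \n  \r
_ESCAPE = re.compile(r"\\([=\\nr])")
_DECODED = {"n": "\n", "r": "\r"}


def _unescape(value: str) -> str:
    """Decode backslash escapes in one pass so '\\\\n' is not read as a newline."""
    return _ESCAPE.sub(lambda m: _DECODED.get(m.group(1), m.group(1)), value)


def parse_extension(extension: str) -> dict[str, str]:
    """Split a CEF extension on key boundaries instead of on every space."""
    matches = list(_KEY_BOUNDARY.finditer(extension))
    fields: dict[str, str] = {}
    for current, following in zip(matches, matches[1:] + [None]):
        # A value runs from just after "key=" to the space before the next key.
        end = following.start() if following else len(extension)
        fields[current.group(1)] = _unescape(extension[current.end():end])
    return fields


fields = parse_extension(
    r"rt=Jul 03 2026 14:22:15 request=https://example.net/a b?x\=1 cat=malware"
)
print(fields)
```

Here rt keeps its embedded spaces and request keeps both its space and its (unescaped) equals sign, because neither ' 03' nor ' b?x' is followed directly by '='.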

For detection the header severity and the log type (Signature ID field) let you triage at a glance, while the extension carries the investigative fields: src/dst and spt/dpt for the connection, suser for attribution, act for the verdict (block-url, allow, deny), and PAN-OS's app, cat, and request for what was actually blocked and why. A THREAT/url event with act=block-url and cat=malware pointing at an external request URL is exactly the kind of high-severity, high-context line that CEF was designed to carry cleanly across the wire into a correlation engine — provided your parser respects the header/extension split and the escaping rules.
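A triage rule along these lines might look like the sketch below. It assumes the header and extension have already been merged into one flat dict using the field names from this page; the function name is_malware_url_block and the default severity threshold of 5 are illustrative choices, not values recommended by PAN-OS or LogForge.

```python
def is_malware_url_block(event: dict[str, str], min_severity: int = 5) -> bool:
    """True for THREAT events that blocked a URL categorised as malware."""
    try:
        severity = int(event.get("cef_severity", ""))
    except ValueError:
        return False  # missing or non-numeric severity: do not escalate
    return (
        event.get("cef_signature_id") == "THREAT"
        and severity >= min_severity
        and event.get("act") == "block-url"
        and event.get("cat") == "malware"
    )


first = {
    "cef_signature_id": "THREAT", "cef_severity": "5",
    "act": "block-url", "cat": "malware",
    "request": "https://malware-cdn.example.net/payload.bin",
}
second = {
    "cef_signature_id": "THREAT", "cef_severity": "4",
    "act": "alert", "cat": "spyware",
    "request": "http://tracker.example.org/pixel.gif",
}
print(is_malware_url_block(first), is_malware_url_block(second))
```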

What a Palo Alto Networks PAN-OS (CEF) line looks like

The CEF sample below is fed verbatim into the engine to produce every parser on this page.

CEF:0|Palo Alto Networks|PAN-OS|11.1|THREAT|url|5|rt=Jul 03 2026 14:22:15 src=192.0.2.55 dst=198.51.100.99 spt=52881 dpt=443 suser=bhapci app=web-browsing act=block-url request=https://malware-cdn.example.net/payload.bin cat=malware
CEF:0|Palo Alto Networks|PAN-OS|11.1|THREAT|virus|4|rt=Jul 03 2026 14:23:07 src=192.0.2.61 dst=203.0.113.12 spt=51002 dpt=80 suser=akaya app=web-browsing act=alert request=http://tracker.example.org/pixel.gif cat=spyware

Detected fields

The engine classified this sample as kv and consolidated 17 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured.

  • cef_version : number · literal
  • cef_vendor : literal · literal
  • cef_product : literal · literal
  • cef_device_version : number · literal
  • cef_signature_id : literal · literal
  • cef_name : literal
  • cef_severity : number
  • rt : timestamp
  • src : ipv4
  • dst : ipv4
  • spt : port
  • dpt : port
  • suser : username
  • app : literal · literal
  • act : severity
  • request : url
  • cat : literal
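One plausible way to arrive at the literal marking is to compare each field's value across all parsed sample lines. The sketch below assumes every line has already been turned into a dict; constant_fields is a hypothetical helper, and LogForge's actual logic is not shown on this page.

```python
def constant_fields(rows: list[dict[str, str]]) -> dict[str, str]:
    """Return the fields that are present with the same value in every row."""
    if not rows:
        return {}
    first, rest = rows[0], rows[1:]
    return {
        key: value
        for key, value in first.items()
        if all(row.get(key) == value for row in rest)
    }


rows = [
    {"cef_signature_id": "THREAT", "cef_name": "url", "app": "web-browsing", "dpt": "443"},
    {"cef_signature_id": "THREAT", "cef_name": "virus", "app": "web-browsing", "dpt": "80"},
]
print(constant_fields(rows))
```

With only two sample lines, a field such as app looks constant simply because both events happened to be web-browsing; a larger sample would likely turn it back into a capture.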

Regex (named capture groups)

# sample: CEF:0|Palo Alto Networks|PAN-OS|11.1|THREAT|url|5|rt=Jul 03 2026 14:22:15 src=192.0.2.55 dst=198.51.100.99 spt=52881 dpt=443 suser=bhapci app=web-browsing act=block-url request=https://malware-cdn.example.net/payload.bin cat=malware
# groups: cef_name=url, cef_severity=5, rt=Jul 03 2026 14:22:15, src=192.0.2.55, dst=198.51.100.99, spt=52881, dpt=443, suser=bhapci, act=block-url, request=https://malware-cdn.example.net/payload.bin, cat=malware
^CEF:0\|Palo Alto Networks\|PAN-OS\|11\.1\|THREAT\|(?<cef_name>[A-Za-z]+)\|(?<cef_severity>-?\d+(?:\.\d+)?)\|rt=(?<rt>[A-Za-z]+ \d+ \d+ \d+:\d+:\d+) src=(?<src>\d{1,3}(?:\.\d{1,3}){3}) dst=(?<dst>\d{1,3}(?:\.\d{1,3}){3}) spt=(?<spt>\d{1,5}) dpt=(?<dpt>\d{1,5}) suser=(?<suser>[A-Za-z0-9._@-]+) app=web-browsing act=(?<act>(?:[A-Za-z]+-[A-Za-z]+|[A-Za-z]+)) request=(?<request>[A-Za-z][A-Za-z0-9+.-]*://[^\s"']+) cat=(?<cat>[A-Za-z]+)$
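To try the pattern above from Python, note that Python's re module spells named groups (?P<name>…) rather than (?<name>…). The sketch below copies the generated pattern unchanged and rewrites only the group syntax; PATTERN and CEF_RE are names chosen for this example.

```python
import re

# The generated pattern, copied verbatim and split across lines for readability.
PATTERN = (
    r"^CEF:0\|Palo Alto Networks\|PAN-OS\|11\.1\|THREAT\|(?<cef_name>[A-Za-z]+)\|"
    r"(?<cef_severity>-?\d+(?:\.\d+)?)\|rt=(?<rt>[A-Za-z]+ \d+ \d+ \d+:\d+:\d+) "
    r"src=(?<src>\d{1,3}(?:\.\d{1,3}){3}) dst=(?<dst>\d{1,3}(?:\.\d{1,3}){3}) "
    r"spt=(?<spt>\d{1,5}) dpt=(?<dpt>\d{1,5}) suser=(?<suser>[A-Za-z0-9._@-]+) "
    r"app=web-browsing act=(?<act>(?:[A-Za-z]+-[A-Za-z]+|[A-Za-z]+)) "
    r"""request=(?<request>[A-Za-z][A-Za-z0-9+.-]*://[^\s"']+) cat=(?<cat>[A-Za-z]+)$"""
)

# Rewrite "(?<name>" to "(?P<name>"; the lookahead leaves "(?<=" and "(?<!" alone.
CEF_RE = re.compile(re.sub(r"\(\?<(?=[A-Za-z_])", "(?P<", PATTERN))

SAMPLE = (
    "CEF:0|Palo Alto Networks|PAN-OS|11.1|THREAT|url|5|rt=Jul 03 2026 14:22:15 "
    "src=192.0.2.55 dst=198.51.100.99 spt=52881 dpt=443 suser=bhapci "
    "app=web-browsing act=block-url "
    "request=https://malware-cdn.example.net/payload.bin cat=malware"
)

match = CEF_RE.match(SAMPLE)
print(match.groupdict() if match else "no match")
```

Because the constant header fields and app=web-browsing are baked in as literals, a TRAFFIC event, a different PAN-OS version, or a line whose keys arrive in another order will not match this pattern.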

Grok pattern (Logstash / Elastic)

# custom patterns
PALO_ALTO_MDYTIME %{MONTH} %{MONTHDAY} %{YEAR} %{TIME}

CEF:0\|Palo Alto Networks\|PAN-OS\|11\.1\|THREAT\|%{DATA:cef_name}\|%{NUMBER:cef_severity}\|rt=%{PALO_ALTO_MDYTIME:rt} src=%{IPV4:src} dst=%{IPV4:dst} spt=%{INT:spt} dpt=%{INT:dpt} suser=%{USERNAME:suser} app=web-browsing act=%{NOTSPACE:act} request=%{URI:request} cat=%{GREEDYDATA:cat}
  • note kv-structured input — consider the Logstash kv filter instead of (or after) grok
  • note constant field "cef_version" embedded as literal anchor "0" (varying=false)
  • note constant field "cef_device_version" embedded as literal anchor "11.1" (varying=false)
  • note field "act" (severity): samples do not all match %{LOGLEVEL}; using %{NOTSPACE} instead
  • note custom patterns emitted — save the '# custom patterns' block to a file in your patterns_dir

Wazuh decoder (OS_Regex XML)

<!--
  Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
  sample: CEF:0|Palo Alto Networks|PAN-OS|11.1|THREAT|url|5|rt=Jul 03 2026 14:22:15 src=192.0.2.55 dst=198.51.100.99 spt=52881 dpt=443 suser=bhapci app=web-browsing act=b
  test with: /var/ossec/bin/wazuh-logtest
-->

<decoder name="palo-alto-kv">
  <prematch>^CEF:\d+\|</prematch>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent">\|rt=(\w+ \d+ \d+ \d+:\d+:\d+)</regex>
  <order>rt</order>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent"> src=(\d+.\d+.\d+.\d+)</regex>
  <order>srcip</order>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent"> dst=(\d+.\d+.\d+.\d+)</regex>
  <order>dstip</order>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent"> spt=(\d+)</regex>
  <order>srcport</order>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent"> dpt=(\d+)</regex>
  <order>dstport</order>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent"> suser=(\w+)</regex>
  <order>srcuser</order>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent"> act=(\w+)</regex>
  <order>action</order>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent"> request=(\w+://\S+)</regex>
  <order>url</order>
</decoder>

<decoder name="palo-alto-kv">
  <parent>palo-alto-kv</parent>
  <regex offset="after_parent"> cat=(\w+)</regex>
  <order>cat</order>
</decoder>
  • note constant field "cef_version" skipped (identical in every line)
  • note constant field "cef_vendor" skipped (identical in every line)
  • note constant field "cef_product" skipped (identical in every line)
  • note constant field "cef_device_version" skipped (identical in every line)
  • note constant field "cef_signature_id" skipped (identical in every line)
  • note field "cef_name" skipped: positional header field without a key literal cannot be safely captured in OS_Regex (values may contain spaces; \.+ is unsafe mid-pattern)
  • note field "cef_severity" skipped: positional header field without a key literal cannot be safely captured in OS_Regex (values may contain spaces; \.+ is unsafe mid-pattern)
  • note field "src" mapped to Wazuh conventional field "srcip"
  • note field "dst" mapped to Wazuh conventional field "dstip"
  • note field "spt" mapped to Wazuh conventional field "srcport"
  • note field "dpt" mapped to Wazuh conventional field "dstport"
  • note field "suser" mapped to Wazuh conventional field "srcuser"
  • note constant field "app" skipped (identical in every line)
  • note field "act" mapped to Wazuh conventional field "action"
  • note field "request" mapped to Wazuh conventional field "url"
  • note kv fields are extracted by same-named sibling decoders (offset="after_parent"), so per-line field order/absence is tolerated — the shared name is what makes Wazuh evaluate every sibling
  • note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest

rsyslog template / liblognorm rulebase

version=2
# palo_alto — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
#   module(load="mmnormalize")
#   action(type="mmnormalize" rulebase="/etc/rsyslog.d/palo_alto.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=palo_alto:CEF:0|Palo Alto Networks|PAN-OS|11.1|THREAT|%cef_name:char-to{"extradata":"|"}%|%cef_severity:number%|rt=%rt:string-to{"extradata":" src="}% src=%src:ipv4% dst=%dst:ipv4% spt=%spt:number% dpt=%dpt:number% suser=%suser:word% app=web-browsing act=%act:word% request=%request:word% cat=%cat:word%
  • note kv structure: rsyslog offers mmfields (fast, fixed single-char separator, untyped) and mmnormalize (this rulebase, typed fields + literal anchors); mmnormalize was chosen for typed extraction
  • note field "rt": samples do not uniformly match engine type "timestamp"; using a generic parser
  • note field "rt": values contain spaces, so a single word would stop at the first space and the rule would not match; matched with string-to up to the literal " src=" instead
  • note chosen parser types: cef_name=char-to(|), cef_severity=number, rt=string-to( src=), src=ipv4, dst=ipv4, spt=number, dpt=number, suser=word, act=word, request=word, cat=word

FAQ

What are the six fields in a CEF header?
After the CEF:0 version marker: Device Vendor, Device Product, Device Version, Signature ID (the event class/type), Name (a human-readable label), and Severity (0–10). They are pipe-delimited, and a literal pipe inside any of them is escaped as \|. Everything after the seventh pipe is the key=value extension.
Does PAN-OS only support CEF, or are there other log formats?
PAN-OS can forward logs in several formats — its native comma-separated syslog, LEEF, and CEF among them. CEF is a popular choice for SIEM portability because so many platforms parse it natively. Which one you receive depends on the Log Forwarding profile and syslog server settings configured on the firewall, so confirm the encoding before choosing a parser.
Why does splitting the CEF extension on spaces corrupt my fields?
Because extension values can contain spaces. CEF's rule is that a key=value pair runs until the next ' key=' token begins, so you must split on the key boundary, not on every space. A field like request=https://host/path with a space in it will otherwise bleed into the following key. Backslash-escaped equals signs and backslashes inside values also need unescaping after you split; pipes need escaping only in the header.
Which CEF extension fields map to the connection five-tuple?
CEF's standard short keys are src and dst for source and destination IP, spt and dpt for source and destination port, and proto for the protocol. suser is the source user and act is the action. PAN-OS also populates app, cat (category), and request, which are standard CEF keys as well.
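As a small illustration of that mapping, the sketch below pulls the five-tuple out of an already-parsed extension dict. FiveTuple and five_tuple are hypothetical names; proto is treated as optional because the sample lines on this page do not carry it.

```python
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src: str
    dst: str
    spt: int
    dpt: int
    proto: Optional[str]


def five_tuple(extension: dict[str, str]) -> FiveTuple:
    """Build the connection five-tuple from CEF's standard short keys."""
    return FiveTuple(
        src=extension["src"],
        dst=extension["dst"],
        spt=int(extension["spt"]),
        dpt=int(extension["dpt"]),
        proto=extension.get("proto"),  # None when the event omits the protocol
    )


conn = five_tuple(
    {"src": "192.0.2.55", "dst": "198.51.100.99", "spt": "52881", "dpt": "443"}
)
print(conn)
```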

Try it on your own Palo Alto Networks PAN-OS (CEF) lines

Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.

Open this sample in LogForge →