What a CEF / Trellix / ArcSight (SIEM) line looks like
The CEF sample below is fed verbatim into the engine to produce every parser on this page.
CEF:0|Trellix|Endpoint Security|10.7|1092|Threat detected and blocked|8|src=192.0.2.10 dst=198.51.100.20 spt=51234 dpt=445 suser=jdoe act=blocked fname=invoice_scan.exe fileHash=44d88612fea8a8f36de82e1278abb02f
CEF:0|Trellix|Endpoint Security|10.7|1041|Malware quarantined|6|src=192.0.2.44 dst=198.51.100.77 spt=49888 dpt=443 suser=asmith act=quarantined fname=update_patch.dll fileHash=e99a18c428cb38d5f260853678922e03
Detected fields
The engine classified this sample as kv and consolidated 15 fields across 2 lines. Fields marked literal were identical on every sample line, so they are baked into the pattern as anchors rather than captured.
- cef_version : number · literal
- cef_vendor : literal · literal
- cef_product : literal · literal
- cef_device_version : number · literal
- cef_signature_id : number
- cef_name : literal
- cef_severity : number
- src : ipv4
- dst : ipv4
- spt : port
- dpt : port
- suser : username
- act : literal
- fname : literal
- filehash : literal
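The literal-versus-captured split described above can be reproduced by hand. The following is a minimal sketch, not LogForge's actual implementation: it assumes the two sample lines contain no escaped pipes and no spaces inside extension values, and the helper names (`parse_line`, `split_constant_and_varying`) are made up for illustration.

```python
SAMPLES = [
    "CEF:0|Trellix|Endpoint Security|10.7|1092|Threat detected and blocked|8|"
    "src=192.0.2.10 dst=198.51.100.20 spt=51234 dpt=445 suser=jdoe act=blocked "
    "fname=invoice_scan.exe fileHash=44d88612fea8a8f36de82e1278abb02f",
    "CEF:0|Trellix|Endpoint Security|10.7|1041|Malware quarantined|6|"
    "src=192.0.2.44 dst=198.51.100.77 spt=49888 dpt=443 suser=asmith act=quarantined "
    "fname=update_patch.dll fileHash=e99a18c428cb38d5f260853678922e03",
]

# Field names follow the detected-fields list on this page.
HEADER_NAMES = [
    "cef_version", "cef_vendor", "cef_product", "cef_device_version",
    "cef_signature_id", "cef_name", "cef_severity",
]


def parse_line(line):
    """Flatten one CEF line into a dict of header and extension fields.

    Simplification: no escaped pipes, no spaces inside extension values.
    """
    parts = line.split("|", 7)
    header, extension = parts[:7], parts[7]
    header[0] = header[0].split(":", 1)[1]  # "CEF:0" -> "0"
    fields = dict(zip(HEADER_NAMES, header))
    for pair in extension.split():
        key, _, value = pair.partition("=")
        fields[key.lower()] = value
    return fields


def split_constant_and_varying(lines):
    """Return (constant, varying) field names across all sample lines."""
    rows = [parse_line(line) for line in lines]
    constant, varying = [], []
    for name in rows[0]:
        values = {row.get(name) for row in rows}
        (constant if len(values) == 1 else varying).append(name)
    return constant, varying


constant, varying = split_constant_and_varying(SAMPLES)
print("constant:", constant)
print("varying: ", varying)
```

On the two sample lines, the constant names are the four fields marked literal in the list above. Everything else varies and would be captured.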
Regex (named capture groups)
# sample: CEF:0|Trellix|Endpoint Security|10.7|1092|Threat detected and blocked|8|src=192.0.2.10 dst=198.51.100.20 spt=51234 dpt=445 suser=jdoe act=blocked fname=invoice_scan.exe fileHash=44d88612fea8a8f36de82e1278abb02f
# groups: cef_signature_id=1092, cef_name=Threat detected and blocked, cef_severity=8, src=192.0.2.10, dst=198.51.100.20, spt=51234, dpt=445, suser=jdoe, act=blocked, fname=invoice_scan.exe, filehash=44d88612fea8a8f36de82e1278abb02f
^CEF:0\|Trellix\|Endpoint Security\|10\.7\|(?<cef_signature_id>-?\d+(?:\.\d+)?)\|(?<cef_name>(?:[A-Za-z]+ [A-Za-z]+ [A-Za-z]+ [A-Za-z]+|[A-Za-z]+ [A-Za-z]+))\|(?<cef_severity>-?\d+(?:\.\d+)?)\|src=(?<src>\d{1,3}(?:\.\d{1,3}){3}) dst=(?<dst>\d{1,3}(?:\.\d{1,3}){3}) spt=(?<spt>\d{1,5}) dpt=(?<dpt>\d{1,5}) suser=(?<suser>[A-Za-z0-9._@-]+) act=(?<act>[A-Za-z]+) fname=(?<fname>[A-Za-z]+_[A-Za-z]+\.[A-Za-z]+) fileHash=(?<filehash>(?:[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+|\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+))$
Grok pattern (Logstash / Elastic)
CEF:0\|Trellix\|Endpoint Security\|10\.7\|%{NUMBER:cef_signature_id}\|%{DATA:cef_name}\|%{NUMBER:cef_severity}\|src=%{IPV4:src} dst=%{IPV4:dst} spt=%{INT:spt} dpt=%{INT:dpt} suser=%{USERNAME:suser} act=%{NOTSPACE:act} fname=%{NOTSPACE:fname} fileHash=%{GREEDYDATA:filehash}
- note kv-structured input: consider the Logstash kv filter instead of (or after) grok
- note constant field "cef_version" embedded as literal anchor "0" (varying=false)
- note constant field "cef_device_version" embedded as literal anchor "10.7" (varying=false)
Wazuh decoder (OS_Regex XML)
<!--
Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
sample: CEF:0|Trellix|Endpoint Security|10.7|1092|Threat detected and blocked|8|src=192.0.2.10 dst=198.51.100.20 spt=51234 dpt=445 suser=jdoe act=blocked fname=invoice_
test with: /var/ossec/bin/wazuh-logtest
-->
<decoder name="cef-kv">
  <prematch>^CEF:\d+\|\w+\|</prematch>
</decoder>
<decoder name="cef-kv">
  <parent>cef-kv</parent>
  <regex offset="after_parent">\|src=(\d+.\d+.\d+.\d+)</regex>
  <order>srcip</order>
</decoder>
<decoder name="cef-kv">
  <parent>cef-kv</parent>
  <regex offset="after_parent"> dst=(\d+.\d+.\d+.\d+)</regex>
  <order>dstip</order>
</decoder>
<decoder name="cef-kv">
  <parent>cef-kv</parent>
  <regex offset="after_parent"> spt=(\d+)</regex>
  <order>srcport</order>
</decoder>
<decoder name="cef-kv">
  <parent>cef-kv</parent>
  <regex offset="after_parent"> dpt=(\d+)</regex>
  <order>dstport</order>
</decoder>
<decoder name="cef-kv">
  <parent>cef-kv</parent>
  <regex offset="after_parent"> suser=(\w+)</regex>
  <order>srcuser</order>
</decoder>
<decoder name="cef-kv">
  <parent>cef-kv</parent>
  <regex offset="after_parent"> act=(\w+)</regex>
  <order>action</order>
</decoder>
<decoder name="cef-kv">
  <parent>cef-kv</parent>
  <regex offset="after_parent"> fname=(\w+.\w+)</regex>
  <order>fname</order>
</decoder>
<decoder name="cef-kv">
  <parent>cef-kv</parent>
  <regex offset="after_parent"> fileHash=(\w+)</regex>
  <order>filehash</order>
</decoder>
- note constant field "cef_version" skipped (identical in every line)
- note constant field "cef_vendor" skipped (identical in every line)
- note constant field "cef_product" skipped (identical in every line)
- note constant field "cef_device_version" skipped (identical in every line)
- note field "cef_signature_id" skipped: positional header field without a key literal cannot be safely captured in OS_Regex (values may contain spaces; \.+ is unsafe mid-pattern)
- note field "cef_name" skipped: positional header field without a key literal cannot be safely captured in OS_Regex (values may contain spaces; \.+ is unsafe mid-pattern)
- note field "cef_severity" skipped: positional header field without a key literal cannot be safely captured in OS_Regex (values may contain spaces; \.+ is unsafe mid-pattern)
- note field "src" mapped to Wazuh conventional field "srcip"
- note field "dst" mapped to Wazuh conventional field "dstip"
- note field "spt" mapped to Wazuh conventional field "srcport"
- note field "dpt" mapped to Wazuh conventional field "dstport"
- note field "suser" mapped to Wazuh conventional field "srcuser"
- note field "act" mapped to Wazuh conventional field "action"
- note kv fields are extracted by same-named sibling decoders (offset="after_parent"), so per-line field order/absence is tolerated — the shared name is what makes Wazuh evaluate every sibling
- note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest
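To see why same-named sibling decoders tolerate reordered or missing keys, here is a rough Python emulation of the idea. It is a sketch under assumptions, not how Wazuh is implemented: the OS_Regex patterns are hand-translated to Python `re` (OS_Regex `\w` is approximated as `[\w@-]`, and the unescaped `.` becomes `\.`), and `decode` is a hypothetical helper. Each sibling pattern is searched independently in the text after the prematch, so a missing key simply contributes nothing.

```python
import re

# Parent: same prematch as the decoder above.
PREMATCH = re.compile(r"^CEF:\d+\|\w+\|")

# Approximation of OS_Regex \w, which also accepts "-" and "@".
W = r"[\w@-]"

# One entry per sibling decoder: (Wazuh field name, translated pattern).
SIBLINGS = [
    ("srcip", re.compile(r"\|src=(\d+\.\d+\.\d+\.\d+)")),
    ("dstip", re.compile(r" dst=(\d+\.\d+\.\d+\.\d+)")),
    ("srcport", re.compile(r" spt=(\d+)")),
    ("dstport", re.compile(r" dpt=(\d+)")),
    ("srcuser", re.compile(rf" suser=({W}+)")),
    ("action", re.compile(rf" act=({W}+)")),
    ("fname", re.compile(rf" fname=({W}+\.{W}+)")),
    ("filehash", re.compile(rf" fileHash=({W}+)")),
]


def decode(line):
    """Return the extracted fields, or None if the parent prematch fails."""
    parent = PREMATCH.match(line)
    if parent is None:
        return None
    rest = line[parent.end():]  # roughly what offset="after_parent" sees
    fields = {}
    for name, pattern in SIBLINGS:
        found = pattern.search(rest)  # each sibling is tried on its own
        if found:
            fields[name] = found.group(1)
    return fields


# Keys reordered, and spt/suser/fname/fileHash absent.
line = ("CEF:0|Trellix|Endpoint Security|10.7|1092|Threat detected and blocked|8|"
        "src=192.0.2.10 act=blocked dpt=445 dst=198.51.100.20")
print(decode(line))
```

One caveat the sketch makes visible: the src pattern is anchored on the preceding pipe and the others on a leading space, so src is only found when it is the first extension key, and any other key is missed if it comes first.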
rsyslog template / liblognorm rulebase
version=2
# cef — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
# module(load="mmnormalize")
# action(type="mmnormalize" rulebase="/etc/rsyslog.d/cef.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=cef:CEF:0|Trellix|Endpoint Security|10.7|%cef_signature_id:number%|%cef_name:char-to{"extradata":"|"}%|%cef_severity:number%|src=%src:ipv4% dst=%dst:ipv4% spt=%spt:number% dpt=%dpt:number% suser=%suser:word% act=%act:word% fname=%fname:word% fileHash=%filehash:word%
- note kv structure: rsyslog offers mmfields (fast, fixed single-char separator, untyped) and mmnormalize (this rulebase, typed fields + literal anchors); mmnormalize was chosen for typed extraction
- note chosen parser types: cef_signature_id=number, cef_name=char-to(|), cef_severity=number, src=ipv4, dst=ipv4, spt=number, dpt=number, suser=word, act=word, fname=word, filehash=word
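The detected-fields list types spt and dpt as port, while the rulebase uses number for them. If the port range or address validity matters downstream, one option is to check the values after normalization. The following is a minimal sketch, assuming the normalized event arrives as a dict of strings; `COERCERS` and `coerce` are illustrative names, not part of rsyslog or liblognorm.

```python
import ipaddress


def to_number(text):
    """Accept plain ASCII digit strings only."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a number: {text!r}")
    return int(text)


def to_port(text):
    """A number that must also fall in the TCP/UDP port range."""
    port = to_number(text)
    if port > 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def to_ipv4(text):
    # Raises ipaddress.AddressValueError (a ValueError subclass) on bad input.
    return ipaddress.IPv4Address(text)


# Field names and types mirror the detected-fields list on this page.
COERCERS = {
    "cef_signature_id": to_number,
    "cef_severity": to_number,
    "src": to_ipv4,
    "dst": to_ipv4,
    "spt": to_port,
    "dpt": to_port,
}


def coerce(fields):
    """Convert known fields to typed values; leave the rest as strings."""
    return {key: COERCERS.get(key, str)(value) for key, value in fields.items()}


typed = coerce({"src": "192.0.2.10", "spt": "51234", "cef_severity": "8", "suser": "jdoe"})
print(typed)
```

Fields without an entry in `COERCERS` pass through unchanged, so unknown extension keys do not cause a failure.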
FAQ
- Is CEF the same as LEEF?
- No, though they solve the same problem and look superficially alike. CEF (ArcSight) uses a pipe-delimited header (the CEF:0 version prefix plus six fields) followed by a space-separated key=value extension with dictionary keys like src and dst. LEEF (IBM QRadar) uses a different pipe header and, in LEEF 2.0, a caret or tab delimiter between attributes with its own key names. They are not interchangeable: a CEF parser will not read LEEF and vice versa.
- What do the six CEF header fields mean?
- After CEF:0: Device Vendor, Device Product, Device Version, Signature ID, Name, and Severity (0–10). Vendor/Product/Version identify the source, Signature ID groups like events, Name is a human label, and Severity drives triage. The extension after the seventh pipe holds the event-specific key=value data.
- How do I extract a file hash or filename from a CEF event?
- They live in the extension as product-specific keys — commonly fname for the file name and fileHash (or the more specific fileHash variants) for the digest. Extract them by splitting the extension on key boundaries and reading the values. A fileHash gives you an indicator you can enrich against threat-intel feeds; watch for spaces in fname, which require boundary-aware splitting.
- Why do custom CEF extension keys break a strict parser?
- Because the extension is open-ended: beyond ArcSight's standard dictionary, any vendor can add its own keys, and different signatures emit different key sets. A parser that expects a fixed list of keys in a fixed order will fail. Match key=value pairs wherever they appear (order-independent), and treat unknown keys as capturable rather than fatal.
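The last two answers (boundary-aware splitting and order-independent matching) can be combined in one small parser. This is a sketch under stated assumptions rather than a complete CEF implementation: a key is taken to be a run of word characters directly followed by `=`, a value runs until the next ` key=` boundary or the end of the line, and only the `\|`, `\=`, and `\\` escapes are undone. `parse_cef` and the key `xVendorKey` are hypothetical names used for illustration.

```python
import re

HEADER_NAMES = [
    "cef_version", "cef_vendor", "cef_product", "cef_device_version",
    "cef_signature_id", "cef_name", "cef_severity",
]

# Split on pipes that are not preceded by a backslash.
# Simplification: a header value ending in an escaped backslash is not handled.
UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")

# key=value, where the value stops at the next " key=" boundary or at end of line.
PAIR = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|\s*$)")


def parse_cef(line):
    """Return (header, extension) dicts; unknown extension keys are kept."""
    parts = UNESCAPED_PIPE.split(line.strip(), maxsplit=7)
    if len(parts) != 8 or not parts[0].startswith("CEF:"):
        raise ValueError("not a CEF line")
    parts[0] = parts[0][len("CEF:"):]
    header = {
        name: re.sub(r"\\([|\\])", r"\1", value)  # undo \| and \\
        for name, value in zip(HEADER_NAMES, parts[:7])
    }
    extension = {
        key: re.sub(r"\\([=\\])", r"\1", value)  # undo \= and \\
        for key, value in PAIR.findall(parts[7])
    }
    return header, extension


# Reordered keys, a space inside fname, and a key outside the standard dictionary.
line = ("CEF:0|Trellix|Endpoint Security|10.7|1092|Threat detected and blocked|8|"
        "dst=198.51.100.20 fname=invoice scan.exe xVendorKey=some value src=192.0.2.10")
header, extension = parse_cef(line)
print(header["cef_name"])
print(extension)
```

Because the value pattern looks ahead for the next key rather than stopping at the first space, `fname=invoice scan.exe` survives intact. A value that itself contains an unescaped ` word=` sequence remains ambiguous under this approach.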
Try it on your own CEF / Trellix / ArcSight (SIEM) lines
Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.