What an AWS VPC Flow line looks like
The delimited sample below is fed verbatim into the engine to produce every parser on this page.
2 123456789012 eni-0abc12de34567890 203.0.113.45 192.0.2.10 51234 443 6 12 1520 1783085000 1783085060 ACCEPT OK
2 445566778899 eni-0f9e87dc65432100 198.51.100.77 192.0.2.20 40112 53 17 3 240 1783085100 1783085160 REJECT OK
Detected fields
The engine classified this sample as freeform and consolidated 14 fields across 2 lines. Fields carrying the trailing "· literal" marker (number and _lit1) were identical on every sample line, so they are baked into the pattern as anchors rather than captured.
- number : number · literal
- number2 : number
- literal : literal
- ip1 : ipv4
- ip2 : ipv4
- number3 : number
- number4 : number
- number5 : number
- number6 : number
- number7 : number
- timestamp : timestamp
- timestamp2 : timestamp
- literal2 : literal
- _lit1 : literal · literal
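The generated names are positional placeholders. Assuming the sample uses the version-2 default column order described in the FAQ, they can be renamed to the AWS column names after capture. The Python sketch below is an illustration, not LogForge output; GENERATED_TO_AWS and rename_fields are hypothetical names.

```python
from typing import Dict

# Generated field name -> AWS default-format column, in positional order.
# Assumes the version-2 default layout; a custom log format needs its own map.
GENERATED_TO_AWS: Dict[str, str] = {
    "number": "version",
    "number2": "account-id",
    "literal": "interface-id",
    "ip1": "srcaddr",
    "ip2": "dstaddr",
    "number3": "srcport",
    "number4": "dstport",
    "number5": "protocol",
    "number6": "packets",
    "number7": "bytes",
    "timestamp": "start",
    "timestamp2": "end",
    "literal2": "action",
    "_lit1": "log-status",
}


def rename_fields(captured: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the captured groups keyed by AWS column names.

    Unknown keys are passed through unchanged.
    """
    return {GENERATED_TO_AWS.get(key, key): value for key, value in captured.items()}


if __name__ == "__main__":
    print(rename_fields({"ip1": "203.0.113.45", "number5": "6", "literal2": "ACCEPT"}))
```

Renaming after capture keeps the generated patterns untouched, so they can be regenerated without redoing the mapping.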
Regex (named capture groups)
# sample: 2 123456789012 eni-0abc12de34567890 203.0.113.45 192.0.2.10 51234 443 6 12 1520 1783085000 1783085060 ACCEPT OK
# groups: number2=123456789012, literal=eni-0abc12de34567890, ip1=203.0.113.45, ip2=192.0.2.10, number3=51234, number4=443, number5=6, number6=12, number7=1520, timestamp=1783085000, timestamp2=1783085060, literal2=ACCEPT
^2 (?<number2>-?\d+(?:\.\d+)?) (?<literal>(?:[A-Za-z]+-\d+[A-Za-z]+\d+[A-Za-z]+\d+|[A-Za-z]+-\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+)) (?<ip1>\d{1,3}(?:\.\d{1,3}){3}) (?<ip2>\d{1,3}(?:\.\d{1,3}){3}) (?<number3>-?\d+(?:\.\d+)?) (?<number4>-?\d+(?:\.\d+)?) (?<number5>-?\d+(?:\.\d+)?) (?<number6>-?\d+(?:\.\d+)?) (?<number7>-?\d+(?:\.\d+)?) (?<timestamp>\d+) (?<timestamp2>\d+) (?<literal2>[A-Za-z]+) OK$
Grok pattern (Logstash / Elastic)
# custom patterns
AWS_VPC_FLOW_EPOCH \d{10}(?:\d{3})?
2 %{NUMBER:number2} %{NOTSPACE:literal} %{IPV4:ip1} %{IPV4:ip2} %{NUMBER:number3} %{NUMBER:number4} %{NUMBER:number5} %{NUMBER:number6} %{NUMBER:number7} %{AWS_VPC_FLOW_EPOCH:timestamp} %{AWS_VPC_FLOW_EPOCH:timestamp2} %{NOTSPACE:literal2} OK
- note constant field "number" embedded as literal anchor "2" (varying=false)
- note custom patterns emitted — save the '# custom patterns' block to a file in your patterns_dir
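Before deploying either pattern, it can help to run the named-capture regex from the Regex section against the two sample lines. The sketch below does that in Python, which is a choice made for this example rather than something LogForge emits. Python's re module spells named groups (?P<name>...), so the pattern is converted with a plain string replace; that shortcut is safe here only because the pattern contains no lookbehind. FLOW_RE and parse_line are hypothetical names.

```python
import re
from typing import Dict, Optional

# Pattern copied from the Regex section, split across lines for readability.
PCRE_STYLE = (
    r"^2 (?<number2>-?\d+(?:\.\d+)?) "
    r"(?<literal>(?:[A-Za-z]+-\d+[A-Za-z]+\d+[A-Za-z]+\d+"
    r"|[A-Za-z]+-\d+[A-Za-z]+\d+[A-Za-z]+\d+[A-Za-z]+\d+)) "
    r"(?<ip1>\d{1,3}(?:\.\d{1,3}){3}) (?<ip2>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?<number3>-?\d+(?:\.\d+)?) (?<number4>-?\d+(?:\.\d+)?) "
    r"(?<number5>-?\d+(?:\.\d+)?) (?<number6>-?\d+(?:\.\d+)?) "
    r"(?<number7>-?\d+(?:\.\d+)?) "
    r"(?<timestamp>\d+) (?<timestamp2>\d+) (?<literal2>[A-Za-z]+) OK$"
)

# Convert (?<name>...) to Python's (?P<name>...) spelling.
FLOW_RE = re.compile(PCRE_STYLE.replace("(?<", "(?P<"))

SAMPLES = [
    "2 123456789012 eni-0abc12de34567890 203.0.113.45 192.0.2.10 "
    "51234 443 6 12 1520 1783085000 1783085060 ACCEPT OK",
    "2 445566778899 eni-0f9e87dc65432100 198.51.100.77 192.0.2.20 "
    "40112 53 17 3 240 1783085100 1783085160 REJECT OK",
]


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the named groups for one flow line, or None if it does not match."""
    match = FLOW_RE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(parse_line(sample))
```

Because the pattern anchors on the literal version 2 and a trailing OK, records with any other log-status (NODATA or SKIPDATA) return None here and would need a separate rule.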
Wazuh decoder (OS_Regex XML)
<!--
Generated by LogForge - Wazuh decoder (OS_Regex dialect, not PCRE)
sample: 2 123456789012 eni-0abc12de34567890 203.0.113.45 192.0.2.10 51234 443 6 12 1520 1783085000 1783085060 ACCEPT OK
test with: /var/ossec/bin/wazuh-logtest
-->
<decoder name="aws-vpc-flow-freeform">
  <prematch>^\d+ </prematch>
</decoder>
<decoder name="aws-vpc-flow-freeform">
  <parent>aws-vpc-flow-freeform</parent>
  <regex>^2 (\d+) (\w+) (\d+.\d+.\d+.\d+) (\d+.\d+.\d+.\d+) (\d+) (\d+) (\d+) (\d+)</regex>
  <order>number2, literal, srcip, ip2, number3, number4, number5, number6</order>
</decoder>
<decoder name="aws-vpc-flow-freeform">
  <parent>aws-vpc-flow-freeform</parent>
  <regex offset="after_regex">^ (\d+) (\d+) (\d+) (\w+) OK</regex>
  <order>number7, timestamp, timestamp2, literal2</order>
</decoder>
- note field "ip1" mapped to Wazuh conventional field "srcip"
- note constant field "number" embedded as literal anchor "2"
- note decoder order and prematch specificity may need site-specific tuning (other decoders in your ruleset can shadow these) — validate with /var/ossec/bin/wazuh-logtest
rsyslog template / liblognorm rulebase
version=2
# aws_vpc_flow — liblognorm v2 rulebase (generated by LogForge)
# Usage with rsyslog (mmnormalize runs liblognorm):
# module(load="mmnormalize")
# action(type="mmnormalize" rulebase="/etc/rsyslog.d/aws_vpc_flow.rb" useRawMsg="on")
# Literal "%" is escaped as "%%"; raw tabs are written as \x09.
rule=aws_vpc_flow:2 %number2:number% %literal:word% %ip1:ipv4% %ip2:ipv4% %number3:number% %number4:number% %number5:number% %number6:number% %number7:number% %timestamp:number% %timestamp2:number% %literal2:word% OK
- note chosen parser types: number2=number, literal=word, ip1=ipv4, ip2=ipv4, number3=number, number4=number, number5=number, number6=number, number7=number, timestamp=number, timestamp2=number, literal2=word
FAQ
- What are the columns in a default VPC Flow Log line?
- The version-2 default is, in order: version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status. They are space-delimited and strictly positional, so the parser reads by column position — there are no keys.
- Why is the protocol a number like 6 or 17?
- It is the IANA protocol number, not a name: 6 is TCP, 17 is UDP, 1 is ICMP. Any rule that thinks in protocol names must map these numbers first. This is a common source of misparsed flow logs when someone expects "TCP" and gets "6".
- What do the start and end fields represent?
- They are Unix epoch seconds bounding the aggregation window for the flow, because VPC Flow Logs aggregate traffic over a capture window rather than logging each packet. So packets and bytes are totals across that window, not per-packet values — keep that in mind when computing rates.
- Does the column order ever change?
- Yes. VPC Flow Logs support a custom format where you choose the fields and their order (adding tcp-flags, vpc-id, pkt-srcaddr, flow-direction, and others). When a custom format is in use the default positional layout no longer applies — the format string is authoritative and the parser must follow it, not the version-2 default.
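The FAQ points above (positional columns, numeric protocols, windowed totals, and custom formats) can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not LogForge output: the function names are hypothetical, the format string is taken to be a space-separated list of field names optionally wrapped as ${name}, and only the three protocol numbers named above are mapped.

```python
from typing import Dict, Optional

# Version-2 default column order, as listed in the FAQ.
DEFAULT_FORMAT = (
    "version account-id interface-id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log-status"
)

# Only the IANA protocol numbers mentioned in the FAQ; extend as needed.
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


def parse_flow_line(line: str, fmt: str = DEFAULT_FORMAT) -> Optional[Dict[str, str]]:
    """Split a line by position, naming columns from the format string.

    Returns None when the column count does not match the format.
    """
    names = [name.strip("${}") for name in fmt.split()]
    values = line.split()
    if len(values) != len(names):
        return None
    return dict(zip(names, values))


def protocol_name(record: Dict[str, str]) -> str:
    """Map the numeric protocol to a name, leaving unknown values as-is."""
    raw = record["protocol"]
    return PROTOCOL_NAMES.get(int(raw), raw) if raw.isdigit() else raw


def bytes_per_second(record: Dict[str, str]) -> Optional[float]:
    """Average byte rate over the aggregation window, or None if unavailable."""
    try:
        window = int(record["end"]) - int(record["start"])
        total = int(record["bytes"])
    except (KeyError, ValueError):
        return None
    return total / window if window > 0 else None


if __name__ == "__main__":
    line = ("2 445566778899 eni-0f9e87dc65432100 198.51.100.77 192.0.2.20 "
            "40112 53 17 3 240 1783085100 1783085160 REJECT OK")
    record = parse_flow_line(line)
    print(protocol_name(record), bytes_per_second(record))  # UDP 4.0
```

Passing the format string explicitly keeps the parser correct when a custom format reorders or adds fields; for example, parse_flow_line("203.0.113.45 192.0.2.10 6", "${srcaddr} ${dstaddr} ${protocol}") names the three columns from the format rather than from the default layout.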
Try it on your own AWS VPC Flow lines
Paste a few real lines, review the detected fields, and copy whichever format your stack needs. Free, no account, nothing uploaded.