
LogAnvil

100% client-side

Generate realistic test logs from 12 real source formats — then inject detectable anomalies a SOC should catch. Seeded and deterministic: the same settings always produce the same bytes.

Private: runs 100% in your browser, nothing is uploaded. Output is fully deterministic: same source + seed + config ⇒ byte-identical logs.

summary

lines: 12
time span: 5.3 s
anomaly lines: 1
seed: 42
HTTP 5xx spike: 1

generated log

172.16.31.77 - - [13/Sep/2020:12:26:40 +0000] "POST /static/style.css HTTP/1.1" 200 17486 "https://example.com/login" "Googlebot/2.1 (+http://www.google.com/bot.html)"
192.168.10.7 - - [13/Sep/2020:12:26:40 +0000] "POST /favicon.ico HTTP/1.1" 200 43943 "https://example.com/login" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
13.107.42.14 - - [13/Sep/2020:12:26:40 +0000] "GET /logout HTTP/1.1" 200 50147 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
162.243.128.12 - - [13/Sep/2020:12:26:41 +0000] "GET /api/v1/orders/5140 HTTP/1.1" 502 15691 "https://example.com/login" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15"    [anomaly: HTTP 5xx spike]
198.51.100.23 - - [13/Sep/2020:12:26:41 +0000] "GET /logout HTTP/1.1" 200 42231 "https://example.com/login" "python-requests/2.31.0"
192.168.100.254 - - [13/Sep/2020:12:26:42 +0000] "GET /search HTTP/1.1" 200 1013 "https://www.bing.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 17_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Mobile/15E148 Safari/604.1"
34.223.14.100 - - [13/Sep/2020:12:26:42 +0000] "PUT /logout HTTP/1.1" 200 33300 "https://news.ycombinator.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
192.168.10.7 - - [13/Sep/2020:12:26:43 +0000] "GET /static/app.js HTTP/1.1" 200 20669 "https://news.ycombinator.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0"
203.0.113.201 - - [13/Sep/2020:12:26:44 +0000] "POST /api/v1/orders/29593 HTTP/1.1" 200 20051 "https://news.ycombinator.com/" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
192.168.1.133 - - [13/Sep/2020:12:26:44 +0000] "GET /search HTTP/1.1" 200 44780 "https://www.google.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 17_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Mobile/15E148 Safari/604.1"
172.16.4.9 - - [13/Sep/2020:12:26:44 +0000] "GET /static/app.js HTTP/1.0" 200 25148 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
192.168.1.50 - - [13/Sep/2020:12:26:45 +0000] "GET /metrics HTTP/1.0" 200 31312 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

The 5 detectable patterns

Each anomaly is a documented, detectable signature — the kind of thing your SIEM rules should fire on. Seed them into otherwise-clean logs to build test data that proves your detections actually work.

pattern | what it looks like
SSH brute-force | A burst of failed logins from one source IP, then a success: the classic password-guessing signature.
HTTP 5xx spike | A run of 5xx responses from one client/path: an outage, or an app under attack, that a SOC should page on.
Port scan | One source IP hitting many sequential destination ports: reconnaissance sweeping for open services.
Impossible travel | The same user seen from two IPs that geolocate far apart within minutes: a session that could not physically happen.
Oversized transfer | A response/transfer orders of magnitude too large: a possible exfiltration or data-dump event.
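
As a rough illustration of what a rule for the HTTP 5xx spike row might check, the sketch below counts 5xx responses per client IP in a combined-format access log and prints the clients at or above a threshold. The function name `find_5xx_clients`, the threshold, and the shortened sample lines are assumptions made for this example; they are not part of LogAnvil or of any particular SIEM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list client IPs with at least MIN 5xx responses.
# Assumes combined log format, where field 1 is the client IP and
# field 9 is the HTTP status when a line is split on whitespace.
find_5xx_clients() {
  local min="$1" file="$2"
  awk -v min="$min" '
    $9 ~ /^5[0-9][0-9]$/ { hits[$1]++ }      # count 5xx responses per client
    END {
      for (ip in hits)
        if (hits[ip] >= min) print ip, hits[ip]
    }
  ' "$file" | sort
}

# Illustrative sample only: shortened lines in the same shape as the log above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
162.243.128.12 - - [13/Sep/2020:12:26:41 +0000] "GET /api/v1/orders/5140 HTTP/1.1" 502 15691 "-" "curl"
162.243.128.12 - - [13/Sep/2020:12:26:41 +0000] "GET /api/v1/orders/5140 HTTP/1.1" 503 120 "-" "curl"
198.51.100.23 - - [13/Sep/2020:12:26:41 +0000] "GET /logout HTTP/1.1" 200 42231 "-" "curl"
EOF

find_5xx_clients 2 "$sample"    # prints: 162.243.128.12 2
rm -f "$sample"
```

A real rule would normally also bound the count to a time window; this sketch ignores timestamps to stay short.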

12 source formats

Every source emits its real on-the-wire shape — lifted from the LogForge catalogue — so the output round-trips straight back through a parser.

source | family
Nginx access (combined) | CLF
Apache access (combined) | CLF
OpenSSH / sshd auth | syslog
Syslog (RFC 3164 / BSD) | syslog
Syslog (RFC 5424) | syslog
FortiGate firewall (traffic) | key=value
Cisco ASA firewall | syslog
Windows Security Event | key=value
JSON application | JSON
Docker container (logfmt) | key=value
Kubernetes (klog) | key=value
AWS VPC Flow (v2) | CSV

Deterministic by design. Output flows from a seeded PRNG and a clock walked from a fixed start — no Date.now(), no Math.random(). Commit a seed to a fixture and regenerate the exact same log anywhere, forever.
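
To make the idea concrete, here is a minimal sketch of seeded generation in shell and awk. It is not LogAnvil's actual algorithm: the Park-Miller generator, the `gen_lines` name, the fixed start time, and the output fields are all assumptions chosen for illustration. The point is only that every value derives from the seed and a clock walked forward from a constant, so two runs with the same arguments emit the same bytes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of seeded generation; not LogAnvil's real implementation.
# A Park-Miller (minimal standard) LCG supplies every random value, and the
# clock advances from a fixed start instead of reading wall time.
gen_lines() {
  local seed="$1" count="$2"
  awk -v seed="$seed" -v count="$count" 'BEGIN {
    state = seed % 2147483647
    if (state <= 0) state = 1               # the LCG state must be non-zero
    clock = 1600000000                      # fixed start, in epoch seconds
    for (i = 0; i < count; i++) {
      state = (state * 16807) % 2147483647  # next pseudo-random value
      clock += state % 3                    # step the clock by 0..2 seconds
      state = (state * 16807) % 2147483647
      printf "%d status=%d bytes=%d\n", clock, 200 + (state % 2) * 300, state % 50000
    }
  }'
}

# Same seed and count twice: the checksums should match.
a="$(gen_lines 42 5 | cksum)"
b="$(gen_lines 42 5 | cksum)"
[ "$a" = "$b" ] && echo "seed 42: identical bytes on both runs"
```

The multiplier and modulus keep every intermediate product below 2^53, so awk's floating-point arithmetic stays exact.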

Stream it into your own SIEM

Generation is client-side, so to feed a live collector you download the output and pipe it from your own machine — nothing touches this server and you control the destination and the rate. Generate, hit download (.log or .jsonl), then use one of these:

Paced UDP syslog (logger) — ~10 lines/sec

while IFS= read -r line; do
  logger -n SIEM_HOST -P 514 -d -- "$line"
  sleep 0.1              # ~10 EPS — lower it to go faster
done < anvil.log

Paced UDP syslog (netcat)

while IFS= read -r line; do
  printf '%s\n' "$line" | nc -u -w0 SIEM_HOST 514
  sleep 0.1
done < anvil.log

Bulk TCP (netcat) — as fast as the socket allows

nc SIEM_HOST 514 < anvil.log

HTTP collector (JSONL export) — one POST per line

while IFS= read -r line; do
  curl -sS -X POST -H 'content-type: application/json' \
       -d "$line" https://your-collector.example/ingest
done < anvil.jsonl

Replace SIEM_HOST with your collector. Tune sleep for your target EPS. Test responsibly — only send to systems you own or are explicitly authorized to test.
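
If you would rather think in events per second than in sleep intervals, a small helper can do the conversion. The sketch below is an assumption for convenience, not part of the tool: `eps_to_sleep` is a hypothetical name, and it only computes the nominal interval. The real rate will be somewhat lower, because each loop iteration also spends time starting `logger`, `nc`, or `curl`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a target events-per-second rate into the
# per-line sleep interval used by the paced loops above.
eps_to_sleep() {
  LC_ALL=C awk -v eps="$1" 'BEGIN {
    if (eps + 0 <= 0) exit 1               # reject zero, negative, or non-numeric rates
    printf "%.3f\n", 1 / eps
  }'
}

eps_to_sleep 10     # prints 0.100
eps_to_sleep 250    # prints 0.004
```

Inside a paced loop this could be used as `sleep "$(eps_to_sleep 50)"`, assuming your `sleep` accepts fractional seconds as the examples above already do.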

Generate by source

A dedicated page for each source, with a live-generated deterministic sample and an anomaly-injection showcase.

FAQ

What is a synthetic log generator for?

It produces realistic, fake log lines so you can test a pipeline without shipping real production data. Use it to smoke-test a parser or decoder, seed a SIEM demo, load-test an ingest path, or build a detection lab, with data that looks exactly like nginx, sshd, FortiGate, Windows or any of the other supported sources but contains no real users, IPs, or secrets.

Is the output deterministic?

Yes. Generation is fully seeded: the same source, line count, seed, EPS, format and anomaly settings always produce byte-identical output. That means you can commit a seed to a test fixture and regenerate the exact same log file on any machine, in CI, or a year later — no snapshot files to store.

What anomalies can it inject, and why?

Five documented, detectable patterns a SOC should catch: an SSH brute-force burst (failed logins then a success from one IP), an HTTP 5xx spike, a port scan (one source hitting many sequential ports), impossible travel (one user from two far-apart IPs within minutes), and an oversized transfer (a response orders of magnitude too large). Each slider sets how much of the output that anomaly perturbs, so you can build test data that your rules SHOULD alert on — and prove they do.
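
For the oversized-transfer pattern, one simple check is to compare each response size against the average of all the other lines. The sketch below does that for a combined-format access log. The `find_oversized` name and the "factor times the mean of the rest" rule are assumptions chosen for illustration; a production rule would likely use a per-path or per-client baseline instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "client bytes" for lines whose response size is
# more than FACTOR times the mean size of all other lines in the file.
# Assumes combined log format, where field 10 is the response size in bytes.
find_oversized() {
  local factor="$1" file="$2"
  awk -v factor="$factor" '
    NR == FNR { sum += $10; n++; next }    # pass 1: total bytes and line count
    # pass 2: compare this line against the mean of the remaining n-1 lines
    n > 1 && $10 * (n - 1) > factor * (sum - $10) { print $1, $10 }
  ' "$file" "$file"
}
```

Called as `find_oversized 100 anvil.log`, it reads the file twice and lists any client whose transfer is over 100 times the average of the rest.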

Does anything get uploaded?

No. Generation runs 100% in your browser, and no log data is ever sent anywhere. The only network call the tool ever makes is an optional license check when you activate a Pro key, and that sends just the key. The free tier never calls any API.

What is the line-count limit?

The free tier generates up to 1,000 lines per run, which is plenty for parser smoke-tests and demos. A Bokamba Pro license raises the cap to 100,000 lines for load-testing and larger fixtures. Either way it is all client-side and instant.

Can I feed the output straight into a parser?

Yes — that is the point. Copy or download the log, or hit “Parse this with LogForge →” to hand the generated lines directly to LogForge and get a working regex / Grok / Wazuh decoder for that shape. LogAnvil is LogForge in reverse: it emits the exact on-the-wire format each source produces.
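
Before handing a download to a real parser, a quick local sanity check can confirm that every line has the expected shape. The sketch below counts lines that fail a hand-written regular expression for the combined access-log format. The regex is an assumption written for this example, not something LogForge produced, and `count_unparsed` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count lines that do NOT look like combined-format
# access-log lines. Prints 0 when every line matches.
count_unparsed() {
  # -v inverts the match and -c counts; grep exits 1 when the count is 0,
  # so that status is treated as success here.
  grep -Evc '^[0-9.]+ [^ ]+ [^ ]+ \[[^]]+] "[A-Z]+ [^ ]+ HTTP/[0-9.]+" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$' "$1" \
    || [ "$?" -eq 1 ]
}
```

Running `count_unparsed anvil.log` on an Nginx or Apache export should print `0`; any other number points at lines worth inspecting before they reach the collector.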

Got a log and need to parse it instead? → Build a parser with LogForge — paste any log line and get a working regex, Grok, Wazuh decoder or rsyslog template.