Why I Wrote My Own Log Analyzer Instead of Using Splunk

A cybersecurity student's tour through the gap between theory and real incident response

judeh0747

~4 min read · April 20, 2026 (Updated: April 20, 2026)

The first time I tried to analyze a real set of Windows Event Logs, I had three browser tabs open: the MITRE ATT&CK matrix, a cheat sheet for Event IDs, and a blog post about Kerberoasting I was still halfway through. I had about 40MB of EVTX files from a lab scenario and no SIEM to throw them into. I tried opening them in Event Viewer. I tried converting them to JSON. An hour in, I realized I didn't actually know what I was looking for — and that was the whole problem.

Every tutorial about threat hunting assumes you already have a stack. Elastic, Splunk, Sentinel — pick your flavor. Tutorials walk you through writing queries once logs are already ingested and normalized. That's fine if you're a SOC analyst with a full pipeline. But there's a whole category of situations where that stack doesn't exist: you're on an incident response engagement at a client site with no network access, you're a student trying to learn detection engineering on your laptop, you inherited a zip of logs from a breach and need triage today, or you're building CI/CD security checks and need a CLI that exits non-zero when it finds something.

The tools that exist in this space are mostly Windows-specific (Chainsaw, Hayabusa, and EvtxHussar are all excellent and all EVTX-only) or require spinning up infrastructure. Nothing handles EVTX and syslog and CEF and JSON in one binary with no setup. So I wrote one.

ThreatLens (github.com/TiltedLunar123/ThreatLens) is a Python CLI that parses the four log formats you actually encounter — EVTX, Syslog (RFC 3164 and 5424), CEF, and JSON/NDJSON — and runs a rule-driven detection engine mapped to MITRE ATT&CK. Entirely offline. Only runtime dependency is PyYAML.

The hard part wasn't the parsers — those are well-trodden ground. The hard part was making rule authoring feel natural. I wanted an analyst to be able to write a detection in a text editor without reading a 40-page DSL spec. A rule is YAML: a name, a severity, a list of field/operator/value conditions, and an optional threshold and time window. Five failed logons from the same source IP in 60 seconds. That's it. No parser quirks, no query language.
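
To make the threshold-and-window idea concrete, here is a minimal Python sketch of "five failed logons from the same source IP in 60 seconds". It is not ThreatLens's actual code: the function name threshold_alerts and the event keys timestamp and source_ip are assumptions for illustration, and the events are assumed to have already matched the rule's conditions.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def threshold_alerts(events, group_by, threshold, window_seconds):
    """Return (group_key, timestamp) for each burst that reaches the threshold.

    Hypothetical helper: `events` is a list of dicts that each carry a
    datetime under "timestamp" plus the field named by `group_by`.
    """
    window = timedelta(seconds=window_seconds)
    recent = defaultdict(deque)  # group key -> timestamps still inside the window
    alerts = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        key = event[group_by]
        times = recent[key]
        times.append(event["timestamp"])
        # Slide the window: forget hits older than `window_seconds`.
        while event["timestamp"] - times[0] > window:
            times.popleft()
        if len(times) >= threshold:
            alerts.append((key, event["timestamp"]))
            times.clear()  # reset so one burst produces one alert
    return alerts
```

Calling threshold_alerts(failed_logons, "source_ip", 5, 60) on a list of failed-logon events would flag any source IP that reaches five hits inside a sliding 60-second window. Clearing the deque after an alert is a design choice in this sketch, so a long burst does not fire on every additional event.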

Under the hood, the rule engine is a dict lookup. Each operator (equals, contains, regex, gt, lt, and so on, twelve in total) is a small function. Adding a new one is a line of code. That simplicity is load-bearing. The moment a rule language becomes expressive, it also becomes unreadable, and analysts stop writing rules. I wanted the opposite.
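
A rough Python sketch of what a dict-based operator table can look like. This is an illustration, not the project's source: the names OPERATORS, condition_matches, and rule_matches are hypothetical, only five of the operators are shown, and the condition keys field, operator, and value are assumed from the description above.

```python
import re
from typing import Any, Callable, Dict, List

# Hypothetical operator table: each name maps to a small predicate
# taking (actual value from the event, expected value from the rule).
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: str(expected) in str(actual),
    "regex": lambda actual, expected: re.search(expected, str(actual)) is not None,
    "gt": lambda actual, expected: float(actual) > float(expected),
    "lt": lambda actual, expected: float(actual) < float(expected),
}


def condition_matches(event: Dict[str, Any], condition: Dict[str, Any]) -> bool:
    """Evaluate one field/operator/value condition against a normalized event."""
    actual = event.get(condition["field"])
    if actual is None:
        return False  # a missing field never matches
    return OPERATORS[condition["operator"]](actual, condition["value"])


def rule_matches(event: Dict[str, Any], conditions: List[Dict[str, Any]]) -> bool:
    """A rule matches when every one of its conditions matches."""
    return all(condition_matches(event, c) for c in conditions)
```

In this shape, adding an operator really is one more entry in the dict, and the evaluation code never changes.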

Three things surprised me while building this. First, parsing took the most time, not detection. The formats are well documented, but each one has traps. Syslog looks simple until you realize RFC 3164 timestamps don't include a year. EVTX is a structured binary format with channels and providers and schemas that change between Windows versions. CEF has escape rules nobody reads. I spent more time on normalization than on every detection rule combined. Detection logic is what looks impressive in a blog post, but the unglamorous parsing layer is what makes the tool actually useful.
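
The missing-year problem is easy to show. Below is one possible way to handle it in Python, offered as a sketch rather than as how ThreatLens does it: the function name parse_rfc3164_timestamp and the reference argument (for example the file's modification time) are assumptions, and %b assumes English month abbreviations in the current locale.

```python
from datetime import datetime, timedelta


def parse_rfc3164_timestamp(text, reference):
    """Parse 'Mmm dd hh:mm:ss' and guess the year relative to `reference`.

    Heuristic (an assumption, not part of the RFC): prefer the reference
    year, and fall back to the previous year when that would put the event
    more than a day in the future, e.g. December logs read in January.
    """
    for year in (reference.year, reference.year - 1):
        try:
            # Supplying the year explicitly avoids strptime's 1900 default,
            # which would reject Feb 29 because 1900 is not a leap year.
            candidate = datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # e.g. Feb 29 tried against a non-leap year
        if candidate <= reference + timedelta(days=1):
            return candidate
    raise ValueError(f"cannot place timestamp {text!r} near {reference.isoformat()}")
```

The one-day tolerance is there for modest clock skew between the log source and the machine doing the analysis. Any heuristic like this can still guess wrong on logs that span more than a year.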

Second, correlation matters more than individual alerts. A single brute-force hit is a low-severity event. A brute-force hit followed by a successful logon followed by a new local admin account in the next 90 seconds is an incident. Building the attack-chain correlator — an ordered sequence of rule matches within a time window — changed my mental model of detection engineering. You're not looking for malicious events. You're looking for malicious stories.
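
Here is a simplified Python sketch of that idea: an ordered sequence of rule matches that must all land within one time window. It is an illustration under assumptions, not the real correlator. The function name find_chains, the rule names, and the (timestamp, rule_name) tuple shape are hypothetical, and a real implementation would probably also group by host or user.

```python
from datetime import datetime, timedelta


def find_chains(matches, chain, window_seconds):
    """Return one list of timestamps per completed chain.

    `matches` is a list of (timestamp, rule_name) tuples; `chain` is the
    ordered list of rule names that together make up the story.
    """
    if len(chain) < 2:
        raise ValueError("a chain needs at least two steps")
    window = timedelta(seconds=window_seconds)
    completed = []
    partial = []  # each item holds the timestamps of the steps seen so far
    for ts, rule in sorted(matches):
        # Drop partial chains whose first step is now outside the window.
        partial = [steps for steps in partial if ts - steps[0] <= window]
        finished = None
        for steps in partial:
            if rule == chain[len(steps)]:  # this match is the next expected step
                steps.append(ts)
                if len(steps) == len(chain) and finished is None:
                    finished = steps
        if finished is not None:
            completed.append(finished)
            partial = []  # reset so one story yields one incident
        elif rule == chain[0]:
            partial.append([ts])  # a new story may be starting
    return completed
```

With a chain such as ["brute_force", "successful_logon", "new_local_admin"] and a 90-second window, the three matches only count as an incident when they appear in that order and the last one lands within 90 seconds of the first.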

Third, false positives kill tools. Every open-source security tool I've ever seen die in the wild dies from noise. I built the allowlist layer early and made the reason field required. You cannot suppress a rule without documenting why. That's a small piece of UX I think will pay off the more the tool gets used in real environments.
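
One way to enforce a required reason is to make an allowlist entry impossible to construct without one. The Python sketch below shows that approach; it is not taken from the ThreatLens source, and the class AllowlistEntry, its fields, and the helper is_suppressed are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowlistEntry:
    """Hypothetical suppression entry: silence `rule` when `field` equals `value`."""
    rule: str
    field: str
    value: str
    reason: str

    def __post_init__(self):
        # Refuse to build an entry that does not explain itself.
        if not self.reason or not self.reason.strip():
            raise ValueError(
                f"allowlist entry for rule {self.rule!r} needs a non-empty reason"
            )


def is_suppressed(alert, entries):
    """Return True when any entry matches the alert's rule name and field value."""
    return any(
        entry.rule == alert["rule"] and str(alert.get(entry.field)) == entry.value
        for entry in entries
    )
```

Validating at construction time means a config file with a blank reason fails when it is loaded, not silently at alert time.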

There's a long list of things I want to add. Cloud log formats (CloudTrail, Azure Activity) are next. Enrichment pipelines — geolocating IPs, tagging users with AD group context — would make correlations more powerful. And Sigma rule compatibility is already in, but I want to tune the translation layer so community rules fire cleanly without manual tweaking.

The core goal is already working: one binary, offline, format-agnostic, rule-driven. Point it at logs, get MITRE-mapped alerts, move on with your day. Try it: pip install threatlens, then threatlens scan /path/to/logs -o report.html -f html --min-severity high. Repo at github.com/TiltedLunar123/ThreatLens.

If you found this useful, follow me here — I'm writing about what I'm building and breaking as I work through a cybersecurity degree.

#cybersecurity #python #infosec
