Most of it looks impressive in demos:
- feed scan results,
- ask an LLM what to do,
- get a possible attack path.
But once you actually work inside real Active Directory environments, especially during labs, assessments, or internal research, you notice a problem quickly:
LLMs are good at describing attack paths. They are not reliable at reasoning through them deterministically.
That realization pushed me toward building a different kind of system — a hard-coded correlation and exploitation engine focused on Active Directory enumeration, credential relationships, privilege chains, and attack-path scoring.
Not an AI wrapper.
A real engine.
The Problem With "AI-First" Offensive Tooling
A lot of modern security tooling now depends heavily on language models for:
- attack recommendations,
- privilege escalation suggestions,
- chaining findings together,
- prioritization.
The issue is that AD environments are extremely stateful and contextual.
A valid attack path often depends on:
- ACL relationships,
- protocol constraints,
- delegation settings,
- SPNs,
- ticket behavior,
- trust relationships,
- credential reuse,
- reachable hosts,
- operational constraints,
- environmental assumptions.
An LLM may sound correct while being operationally wrong.
In offensive security, "mostly correct" is dangerous.
Especially when:
- chaining attacks,
- automating decisions,
- prioritizing lateral movement,
- or evaluating real enterprise exposure.
So instead of trying to "prompt engineer" attack paths, I started exploring something else:
What if the reasoning itself were deterministic?
The Core Idea
The system I've been working on treats Active Directory exploitation more like a graph and scoring problem than a chatbot problem.
Instead of:
"AI, tell me what to do next."
The engine:
- collects data,
- normalizes findings,
- correlates relationships,
- scores paths,
- identifies realistic escalation opportunities,
- and recommends deterministic actions.
The important distinction:
- the logic is hard-coded,
- the correlations are engineered,
- and the exploitation methodology is reproducible.
No hallucinations.
No probabilistic guessing.
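The pipeline above can be sketched in a few lines. This is a minimal, illustrative sketch (the function names, finding fields, and weights are all hypothetical, not from any real tool); the point is that identical input always produces identical output:

```python
# Deterministic correlation pipeline sketch: collect -> normalize ->
# correlate -> score -> recommend. All names and weights are illustrative.

def normalize(raw_findings):
    """Canonicalize findings so identical facts compare equal (dedup + sort)."""
    return sorted(
        {(f["type"].lower(), f["source"].upper(), f["target"].upper())
         for f in raw_findings}
    )

def correlate(findings):
    """Turn each normalized finding into a traversable relationship (edge)."""
    return [{"from": s, "to": t, "via": k} for k, s, t in findings]

def score(edge):
    """Hard-coded weights per relationship type: engineered, not guessed."""
    weights = {"admin_to": 90, "writable_share": 40, "session": 30}
    return weights.get(edge["via"], 10)

def recommend(raw_findings, top_n=3):
    """Same environment in, same ranked recommendations out. Every time."""
    edges = correlate(normalize(raw_findings))
    return sorted(edges, key=score, reverse=True)[:top_n]

raw = [
    {"type": "session", "source": "alice", "target": "WS01"},
    {"type": "admin_to", "source": "alice", "target": "SRV02"},
    {"type": "session", "source": "alice", "target": "WS01"},  # duplicate collapses
]
print(recommend(raw))  # admin_to edge ranks first, deterministically
```

No model state, no prompt sensitivity: rerunning on the same findings can never reorder the output.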
Why AD Is Perfect for Correlation Engines
Active Directory environments naturally expose relational structures:
- users,
- groups,
- ACLs,
- sessions,
- trusts,
- SPNs,
- delegation,
- shares,
- policies,
- hosts,
- certificates.
Most tools enumerate these independently.
The difficult part is connecting them meaningfully.
For example:
- a low-privileged user,
- with access to a writable share,
- containing credentials,
- that authenticate to a machine,
- with constrained delegation,
- exposing another path into a higher-value system.
Individually, none of those findings is critical.
Together, they become a chain.
That correlation layer is where a lot of tooling still struggles.
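The chain above is really just graph traversal. Here is a toy illustration (the nodes and edges are hypothetical) of why correlation matters: no single edge is critical, but a path from a low-privileged user to a high-value host emerges once the relationships are joined:

```python
# Toy AD relationship graph. Each edge alone is a low-severity finding;
# the BFS below surfaces the chain they form together.
from collections import deque

edges = {
    "low_priv_user":   ["writable_share"],
    "writable_share":  ["plaintext_creds"],   # share contains credentials
    "plaintext_creds": ["app_server"],        # creds authenticate to a machine
    "app_server":      ["sql_host"],          # constrained delegation onward
    "sql_host":        [],
}

def find_path(graph, start, goal):
    """Breadth-first search: deterministic, returns the shortest chain."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain exists

print(find_path(edges, "low_priv_user", "sql_host"))
# ['low_priv_user', 'writable_share', 'plaintext_creds', 'app_server', 'sql_host']
```

Tools like BloodHound already model AD this way; the gap is in connecting findings from different sources into one queryable graph.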
Why I Avoided Heavy AI Usage
I considered integrating LLM-style logic into the workflow, but several issues became obvious:
1. Context Explosion
AD environments generate massive amounts of data.
Feeding:
- BloodHound data,
- LDAP enumeration,
- SMB results,
- Kerberos metadata,
- credential artifacts,
- local privilege findings,
into an LLM becomes extremely expensive and noisy.
2. Operational Inconsistency
Attack paths must be reproducible.
If the same environment produces:
- different recommendations,
- different priorities,
- different logic,
depending on prompt wording or model state, automation becomes unreliable.
3. Security Engineering Requires Precision
Things like:
- delegation abuse,
- ACL exploitation,
- Kerberos abuse,
- ADCS attacks,
- trust exploitation,
often require exact conditions.
A deterministic engine can explicitly encode those conditions.
That makes the system explainable.
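As a concrete sketch of what "explicitly encode those conditions" can look like, here is a hedged example for one technique. The field names are illustrative, not taken from BloodHound or any specific collector:

```python
# Deterministic precondition check for a constrained-delegation candidate.
# Every condition is explicit: if the rule fires, the engine can say
# exactly why, and if it doesn't, it can list each failed precondition.

def constrained_delegation_candidate(account):
    """Return (is_candidate, reasons_it_is_not)."""
    reasons = []
    if not account.get("enabled", False):
        reasons.append("account disabled")
    if not account.get("msds_allowed_to_delegate_to"):
        reasons.append("no delegation targets configured")
    if account.get("protected_users_member"):
        reasons.append("member of Protected Users")
    if account.get("sensitive_flag"):  # "sensitive and cannot be delegated"
        reasons.append("NOT_DELEGATED flag set")
    return (len(reasons) == 0, reasons)

acct = {
    "enabled": True,
    "msds_allowed_to_delegate_to": ["cifs/FS01.corp.local"],
    "protected_users_member": False,
    "sensitive_flag": False,
}
ok, why = constrained_delegation_candidate(acct)
print(ok)  # if False, `why` enumerates every failed precondition
```

An LLM might flag the same account, but it cannot guarantee the same answer twice, and it cannot enumerate its reasoning as a checklist.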
What the Engine Actually Focuses On
The goal is not:
"automatically hack everything."
The focus is:
- reducing analyst fatigue,
- correlating findings,
- prioritizing realistic paths,
- and building methodology-aware automation.
Some areas the engine focuses on:
- credential relationship mapping,
- privilege inheritance,
- Kerberos attack opportunities,
- ADCS exposure,
- attack-path scoring,
- misconfiguration chaining,
- lateral movement feasibility,
- trust analysis,
- host-value prioritization.
The interesting challenge isn't enumeration.
It's orchestration.
The Hard Part Nobody Talks About
The hardest problem is not exploitation.
It's scoring.
A lot of tools can identify:
- vulnerable ACLs,
- kerberoastable accounts,
- weak permissions,
- exposed services.
But which finding matters most?
That depends on:
- path depth,
- exploit reliability,
- stealth,
- blast radius,
- operational noise,
- authentication dependencies,
- environmental assumptions.
This is where purely percentage-based "risk scoring" starts breaking down.
Real attack-path reasoning is contextual.
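To make that concrete, here is a sketch of contextual path scoring. All numbers and field names are illustrative assumptions, not a real scoring model; the point is that the score depends on the whole path, not on a flat per-finding percentage:

```python
# Contextual path scoring sketch: per-step reliability compounds,
# operational noise accumulates, and depth is penalized. Weights are
# illustrative, not calibrated.

def score_path(steps):
    reliability = 1.0
    noise = 0
    for step in steps:
        reliability *= step["reliability"]   # chance each step succeeds
        noise += step["noise"]               # detection cost accumulates
    depth_penalty = 0.9 ** len(steps)        # longer chains score lower
    return round(100 * reliability * depth_penalty - noise, 1)

# One loud, unreliable step vs. two quiet, reliable ones.
short_noisy = [
    {"name": "kerberoast + offline crack", "reliability": 0.5, "noise": 20},
]
long_quiet = [
    {"name": "readable share creds",     "reliability": 0.95, "noise": 2},
    {"name": "constrained delegation",   "reliability": 0.90, "noise": 3},
]
print(score_path(short_noisy), score_path(long_quiet))
# The longer but quieter, more reliable chain outscores the single noisy step.
```

A percentage-based severity score would rank the kerberoastable account the same way in every environment; a path score changes as the surrounding context changes.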
Why This Approach Interests Me More Than AI Wrappers
I think offensive security is slowly moving toward two extremes:
1. AI orchestration systems
Flexible, but inconsistent.
2. Deterministic correlation engines
Harder to engineer, but reliable.
Personally, I think the second category has far more long-term value for serious operators and enterprise environments.
Because eventually:
- explainability matters,
- reproducibility matters,
- operational confidence matters.
Especially in enterprise security tooling.
Final Thoughts
I don't think AI is useless in cybersecurity.
Far from it.
LLMs are genuinely useful for:
- summarization,
- documentation,
- knowledge retrieval,
- reporting,
- workflow acceleration.
But when it comes to:
- exploitation logic,
- attack-path correlation,
- privilege reasoning,
- and deterministic decision-making,
hard-coded engineering still matters a lot.
That's the direction I've been exploring: building systems that reason about environments through structured logic rather than probabilistic text generation.
Not because AI is "bad".
But because offensive security rewards precision.