thechosenone-shall-prevail/cold-relay: Cold Relay is a single-binary Active Directory security assessment tool that collects Windows authentication evidence across LDAP, Kerberos, SMB, DNS, GPO, delegation, certificate services, and more, turning that evidence into deterministic findings with an offline attack graph.

Most AI-first offensive tooling looks impressive in demos:

  • feed scan results,
  • ask an LLM what to do,
  • get a possible attack path.

But once you actually work inside real Active Directory environments, especially during labs, assessments, or internal research, you notice a problem quickly:

LLMs are good at describing attack paths. They are not reliable at reasoning through them deterministically.

That realization pushed me toward building a different kind of system — a hard-coded correlation and exploitation engine focused on Active Directory enumeration, credential relationships, privilege chains, and attack-path scoring.

Not an AI wrapper.

A real engine.

The Problem With "AI-First" Offensive Tooling

A lot of modern security tooling now depends heavily on language models for:

  • attack recommendations,
  • privilege escalation suggestions,
  • chaining findings together,
  • prioritization.

The issue is that AD environments are extremely stateful and contextual.

A valid attack path often depends on:

  • ACL relationships,
  • protocol constraints,
  • delegation settings,
  • SPNs,
  • ticket behavior,
  • trust relationships,
  • credential reuse,
  • reachable hosts,
  • operational constraints,
  • environmental assumptions.

An LLM may sound correct while being operationally wrong.

In offensive security, "mostly correct" is dangerous.

Especially when:

  • chaining attacks,
  • automating decisions,
  • prioritizing lateral movement,
  • or evaluating real enterprise exposure.

So instead of trying to "prompt engineer" attack paths, I started exploring something else:

What if the reasoning itself was deterministic?

The Core Idea

The system I've been working on treats Active Directory exploitation more like a graph and scoring problem than a chatbot problem.

Instead of:

"AI, tell me what to do next."

The engine:

  1. collects data,
  2. normalizes findings,
  3. correlates relationships,
  4. scores paths,
  5. identifies realistic escalation opportunities,
  6. and recommends deterministic actions.
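The pipeline above can be sketched end to end. This is a minimal, hypothetical illustration, not the engine's actual code: the `Finding` schema, the edge names, and the depth-based score are all illustrative assumptions.

```python
# Hypothetical sketch of the deterministic pipeline: every stage is a pure
# function, so the same raw evidence always yields the same ranked paths.
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    subject: str   # e.g. a principal ("alice")
    relation: str  # e.g. "HasSession", "GenericWrite"
    target: str    # e.g. a host or principal

def normalize(raw: list[dict]) -> list[Finding]:
    # collapse tool-specific output into one canonical shape
    return [Finding(r["subject"], r["relation"], r["target"]) for r in raw]

def correlate(findings: list[Finding]) -> dict[str, list[Finding]]:
    # build an adjacency map: who can reach what, and how
    edges: dict[str, list[Finding]] = {}
    for f in findings:
        edges.setdefault(f.subject, []).append(f)
    return edges

def score_paths(edges, start, goal, path=()):
    # depth-first enumeration; shorter chains score higher (illustrative metric)
    if start == goal:
        yield (1.0 / len(path), path)
        return
    for f in edges.get(start, []):
        if f not in path:
            yield from score_paths(edges, f.target, goal, path + (f,))

raw = [
    {"subject": "alice", "relation": "HasSession", "target": "svc-backup"},
    {"subject": "svc-backup", "relation": "GenericWrite", "target": "DC01"},
]
edges = correlate(normalize(raw))
best = max(score_paths(edges, "alice", "DC01"))  # highest-scoring path to DC01
```

Run twice on the same evidence, this produces byte-identical output, which is the property the whole approach is built around.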

The important distinction:

  • the logic is hard-coded,
  • the correlations are engineered,
  • and the exploitation methodology is reproducible.

No hallucinations.

No probabilistic guessing.

Why AD Is Perfect for Correlation Engines

Active Directory environments naturally expose relational structures:

  • users,
  • groups,
  • ACLs,
  • sessions,
  • trusts,
  • SPNs,
  • delegation,
  • shares,
  • policies,
  • hosts,
  • certificates.

Most tools enumerate these independently.

The difficult part is connecting them meaningfully.

For example:

  • a low-privileged user,
  • with access to a writable share,
  • containing credentials,
  • that authenticate to a machine,
  • with constrained delegation,
  • exposing another path into a higher-value system.

Individually, none of those findings is critical.

Together, they become a chain.

That correlation layer is where a lot of tooling still struggles.
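That chain can be encoded directly. A minimal sketch, using hypothetical hosts and illustrative integer severities (0-10); the rule that a connected chain outranks any of its links is an assumption about how such a layer might score, not the tool's actual formula:

```python
# Each finding alone is minor, but a connected sequence ending at a
# high-value host is scored as one chain. Hosts and weights are made up.
findings = [
    ("alice",   "writable_share",         "FS01",    2),
    ("FS01",    "stored_creds",           "svc-sql", 3),
    ("svc-sql", "constrained_delegation", "SQL01",   3),
]

def chain_severity(chain):
    # each edge must end where the next one begins, or there is no chain
    connected = all(chain[i][2] == chain[i + 1][0] for i in range(len(chain) - 1))
    if not connected:
        return max(step[3] for step in chain)  # fall back to worst single finding
    # a connected chain outranks any of its individual links
    return min(10, sum(step[3] for step in chain))

print(chain_severity(findings))  # 8: higher than any single 2-3 severity finding
```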

Why I Avoided Heavy AI Usage

I considered integrating LLM-style logic into the workflow, but several issues quickly became obvious:

1. Context Explosion

AD environments generate massive amounts of data.

Feeding:

  • BloodHound data,
  • LDAP enumeration,
  • SMB results,
  • Kerberos metadata,
  • credential artifacts,
  • local privilege findings,

into an LLM becomes extremely expensive and noisy.

2. Operational Inconsistency

Attack paths must be reproducible.

If the same environment produces:

  • different recommendations,
  • different priorities,
  • different logic,

depending on prompt wording or model state, automation becomes unreliable.

3. Security Engineering Requires Precision

Things like:

  • delegation abuse,
  • ACL exploitation,
  • Kerberos abuse,
  • ADCS attacks,
  • trust exploitation,

often require exact conditions.

A deterministic engine can explicitly encode those conditions.

That makes the system explainable.
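As an illustration of what "explicitly encode those conditions" can mean, here is a hypothetical precondition check for constrained delegation abuse. The attribute names mirror common AD fields, but the record schema is an assumption for the sketch, not the engine's real data model:

```python
# The technique is only suggested when every required condition holds;
# each line is an auditable, explainable precondition.
def constrained_delegation_candidate(acct: dict) -> bool:
    return (
        acct.get("enabled", False)
        and bool(acct.get("msDS-AllowedToDelegateTo"))    # delegation targets configured
        and not acct.get("sensitive_no_delegate", False)  # not marked "sensitive, cannot be delegated"
        and acct.get("has_credentials", False)            # we can actually authenticate as it
    )

acct = {
    "enabled": True,
    "msDS-AllowedToDelegateTo": ["cifs/FS01"],
    "sensitive_no_delegate": False,
    "has_credentials": True,
}
print(constrained_delegation_candidate(acct))  # True only when all conditions hold
```

If a recommendation fires, the reason is the exact list of conditions that evaluated true, which is what makes the output explainable.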

What the Engine Actually Focuses On

The goal is not:

"automatically hack everything."

The focus is:

  • reducing analyst fatigue,
  • correlating findings,
  • prioritizing realistic paths,
  • and building methodology-aware automation.

Some areas the engine focuses on:

  • credential relationship mapping,
  • privilege inheritance,
  • Kerberos attack opportunities,
  • ADCS exposure,
  • attack-path scoring,
  • misconfiguration chaining,
  • lateral movement feasibility,
  • trust analysis,
  • host-value prioritization.

The interesting challenge isn't enumeration.

It's orchestration.

The Hard Part Nobody Talks About

The hardest problem is not exploitation.

It's scoring.

A lot of tools can identify:

  • vulnerable ACLs,
  • kerberoastable accounts,
  • weak permissions,
  • exposed services.

But which finding matters most?

That depends on:

  • path depth,
  • exploit reliability,
  • stealth,
  • blast radius,
  • operational noise,
  • authentication dependencies,
  • environmental assumptions.

This is where purely percentage-based "risk scoring" starts breaking down.

Real attack-path reasoning is contextual.
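One way to make that concrete: model the factors separately and combine them, instead of collapsing everything into a single percentage. The weights and the multiplicative form below are purely illustrative assumptions:

```python
# Hedged sketch of contextual path scoring; higher is more attractive.
def score_path(depth: int, reliability: float, noise: float, blast_radius: float) -> float:
    """All float inputs in [0, 1]; depth is hop count, >= 1."""
    depth_penalty = 1.0 / depth   # shorter chains preferred
    stealth = 1.0 - noise         # noisy paths rank lower
    return reliability * stealth * blast_radius * depth_penalty

# The same target ranks differently depending on how you get there:
loud_short = score_path(depth=2, reliability=0.9, noise=0.8, blast_radius=1.0)
quiet_long = score_path(depth=4, reliability=0.9, noise=0.1, blast_radius=1.0)
```

Under this toy model the quieter four-hop path outscores the noisy two-hop one, which is exactly the kind of trade-off a flat CVSS-style percentage cannot express.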

Why This Approach Interests Me More Than AI Wrappers

I think offensive security is slowly moving toward two extremes:

1. AI orchestration systems

Flexible, but inconsistent.

2. Deterministic correlation engines

Harder to engineer, but reliable.

Personally, I think the second category has far more long-term value for serious operators and enterprise environments.

Because eventually:

  • explainability matters,
  • reproducibility matters,
  • operational confidence matters.

Especially in enterprise security tooling.

Final Thoughts

I don't think AI is useless in cybersecurity.

Far from it.

LLMs are genuinely useful for:

  • summarization,
  • documentation,
  • knowledge retrieval,
  • reporting,
  • workflow acceleration.

But when it comes to:

  • exploitation logic,
  • attack-path correlation,
  • privilege reasoning,
  • and deterministic decision-making,

hard-coded engineering still matters a lot.

That's the direction I've been exploring: building systems that reason about environments through structured logic rather than probabilistic text generation.

Not because AI is "bad".

But because offensive security rewards precision.