This architecture is designed to be entirely self-contained, utilizing an S3-compatible object storage backend (MinIO) as the state machine for a sequential vulnerability scanning pipeline.
The Core Concept: The "Inbox" Model
The pipeline is driven by "Inboxes" (buckets). Each stage of the pipeline reads a JSON ticket from its input inbox, appends findings or verdicts, and moves the updated ticket to the next stage.
- Logic: If no findings are detected after all scanners, the ticket is discarded.
- Confidence: High-confidence findings move to the sandbox for exploitation proof.
- Triage: Uncertain or low-confidence findings move to a "Potato" queue for manual human review.
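The three routing outcomes above can be sketched as a small verdict function. The confidence thresholds and the ticket field names (`findings`, `confidence`) are illustrative assumptions; the article does not pin down the exact JSON schema.

```python
# Hypothetical thresholds -- tune these against your own false-positive rate.
HIGH_CONFIDENCE = 0.8
LOW_CONFIDENCE = 0.4

def route_ticket(ticket: dict) -> str:
    """Decide the next inbox for a ticket after all scanners have run."""
    findings = ticket.get("findings", [])
    if not findings:
        return "inbox-trash"        # zero findings: discard
    confidence = max(f.get("confidence", 0.0) for f in findings)
    if confidence >= HIGH_CONFIDENCE:
        return "inbox-jul"          # sandbox for exploitation proof
    if confidence >= LOW_CONFIDENCE:
        return "inbox-pot"          # uncertain: manual "Potato" review
    return "inbox-trash"

ticket = {"repo": "https://example.com/repo.git",
          "findings": [{"id": "CVE-2024-0001", "confidence": 0.9}]}
print(route_ticket(ticket))  # inbox-jul
```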
The Sequential Pipeline
1. inbox-prime (The Entry Point)
- Purpose: Starting point for new tickets.
- Trigger: A Watcher service detects a new repository or a file change.
- Content: Basic ticket JSON (repo URL, commit SHA, etc.).
- Next Stage: Trivy scanning.
2. inbox-tri (Vulnerability & Misconfig)
- Process: Initial scan for vulnerabilities and misconfigurations.
- Next Stage: Deep secret scanning.
3. inbox-lea (Deep Secret Discovery)
- Process: Deep Git history scanning for leaked credentials.
- Sysadmin Note: Verified secrets are appended under leak_findings.
- Next Stage: SAST pattern probing.
4. inbox-pat (Static Analysis & Custom Rules)
- Process: Pattern matching for specific code flaws (C++, CUDA, etc.) using custom SAST rules.
- Next Stage: Dependency analysis.
5. inbox-dep (Dependency CVEs)
- Process: Precise scanning for dependency-level CVEs via OSV-Scanner.
- Next Stage: Automated LLM verdict.
6. inbox-jul (The "Juliet" Pre-Sandbox Queue)
- Purpose: High-confidence findings verified by a local LLM "ladder."
- Process: Ready for exploit confirmation.
- Next Stage: Exploit sandbox.
7. inbox-lim (The "Lima" Proven Gold)
- Purpose: Exploit confirmed via automated sandbox (crash/RCE/leak proof).
- Result: Final artifacts (logs/dumps) are ready for CVE submission.
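The stage-to-stage hand-off can be sketched as a worker loop that checks a ticket out of one inbox, runs its scanner, and writes the updated ticket to the next inbox. In-memory dicts stand in for the MinIO buckets here; a real worker would use an S3 client (e.g. the `minio` Python SDK). The stage names follow the article, but `run_scanner` is a placeholder for the real tools (Trivy, TruffleHog, OSV-Scanner, ...).

```python
import json

# In-memory stand-ins for the MinIO buckets.
buckets = {name: {} for name in
           ["inbox-prime", "inbox-tri", "inbox-lea", "inbox-pat",
            "inbox-dep", "inbox-jul", "inbox-lim"]}

NEXT_STAGE = {"inbox-prime": "inbox-tri", "inbox-tri": "inbox-lea",
              "inbox-lea": "inbox-pat", "inbox-pat": "inbox-dep",
              "inbox-dep": "inbox-jul", "inbox-jul": "inbox-lim"}

def run_scanner(stage: str, ticket: dict) -> dict:
    """Placeholder for the real scanner at this stage."""
    ticket.setdefault("history", []).append(stage)
    return ticket

def process_inbox(stage: str) -> None:
    """Read each ticket, append findings, move it to the next inbox."""
    for key in list(buckets[stage]):
        ticket = json.loads(buckets[stage].pop(key))
        ticket = run_scanner(stage, ticket)
        buckets[NEXT_STAGE[stage]][key] = json.dumps(ticket)

# Seed a new ticket and push it through the whole pipeline.
buckets["inbox-prime"]["ticket-001.json"] = json.dumps(
    {"repo": "https://example.com/repo.git", "commit": "abc123"})
for stage in NEXT_STAGE:
    process_inbox(stage)
print(json.loads(buckets["inbox-lim"]["ticket-001.json"])["history"])
```

Because each worker only ever pops from one bucket and pushes to the next, the ticket itself is the only state, which is what makes the scan workers stateless and restartable.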
Triage and Discard Paths
inbox-pot (The "Potato" Queue)
For findings where the automated verdict is uncertain or low-confidence. These require manual intervention to review the ticket JSON and decide whether to move it to the sandbox or the trash.
inbox-trash
Final discard for zero-finding tickets or rejected "potatoes."
Operations & Monitoring
Visualizing the Flow
Monitoring is handled via a centralized dashboard (e.g., Grafana). Key metrics to track:
- Potato Queue: Fire a high-priority alert when more than 5 tickets have been stale for over 24 hours.
- Confidence Histogram: Visualizing verdict accuracy to tune LLM thresholds.
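The stale-ticket alert can be sketched as a simple count over ticket timestamps. The `created` field and the exporter wiring into Grafana are assumptions; only the ">5 tickets, 24 hours" rule comes from the metrics above.

```python
import time

STALE_SECONDS = 24 * 3600   # 24-hour staleness window
ALERT_THRESHOLD = 5         # alert when more than 5 tickets are stale

def stale_potato_count(tickets, now):
    """Count tickets that have sat in inbox-pot longer than 24 hours."""
    return sum(1 for t in tickets if now - t["created"] > STALE_SECONDS)

now = time.time()
tickets = [{"id": i, "created": now - 25 * 3600} for i in range(6)]
if stale_potato_count(tickets, now) > ALERT_THRESHOLD:
    print("ALERT: Potato queue is backing up")
```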
Maintenance Tips
- Logs: Monitor consumer services on each node using journalctl.
- Emergency Stop: All workers can be halted by stopping the consumer service across the cluster.
Field Report: Rebuilding and Results
Every homelabber knows the "Hardware Problem Monster." While the physical infrastructure is currently undergoing a complete rebuild, the data from the initial run has proven the concept's value.
The Filter in Action: 70 down to 16
The primary goal of this pipeline was to reduce "alert fatigue" for the sysadmin (the Potato Queue). In the most recent production run, the four-scanner battery flagged over 70 potential vulnerabilities.
In a traditional setup, that's 70 tickets requiring manual triage. However, once the LLM "Ladder" (Ollama running CPU-optimized LLMs) processed those findings:
- Initial Scans: 70+ Potential CVEs.
- LLM Filtered: 16 High-Confidence Targets.
- Noise Reduction: ~77%
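The noise-reduction figure follows directly from the two counts:

```python
# Re-deriving the noise-reduction figure from the run above.
initial, filtered = 70, 16
reduction = (initial - filtered) / initial
print(f"Noise reduction: {reduction:.0%}")  # Noise reduction: 77%
```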
By using the LLM layer as a "Verdict Vault," the pipeline successfully discarded dozens of false positives and low-priority misconfigurations before a human ever had to look at a screen.
Lessons from the Rebuild
Rebuilding the lab from the ground up has allowed for a few optimizations:
- Compute Decoupling: Ensuring that the scan workers (Trivy, TruffleHog, etc.) are entirely stateless. If a node goes down, the ticket simply stays in its MinIO inbox until a new worker checks it out.
- CPU-Only Inference: Doubling down on the decentralized, CPU-only approach for the LLM verdicts to ensure the pipeline remains accessible without requiring high-end GPU clusters.
- Enhanced Monitoring: Integrating more granular Grafana tracking for the "Potato" queue to identify exactly where tickets are getting stuck in the manual triage phase.
The Clean Room Protocol: Protecting the Lab
One of the biggest challenges in modern AI-driven research is the privacy-vs-performance trade-off. To solve this, the pipeline utilizes a Sanitization Gateway before any data leaves the local mesh.
Once the local "Ironclaw" director has a high-confidence finding, it triggers a sanitization routine:
- PII & Metadata Stripping: All local paths, developer names, and repository metadata are scrubbed.
- Contextual Anonymization: The code is abstracted into a "generic" form that retains the logic flaw but removes the specific identity of the target.
- The Secure Handshake: This anonymized "logic packet" is sent to a secure cloud LLM for a final, high-reasoning sanity check.
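The stripping and anonymization steps can be sketched as a small scrubbing pass. The gateway's real rules are not published, so the regex patterns and the `<PATH>`/`<EMAIL>`/`<TARGET>` placeholders below are illustrative assumptions.

```python
import re

def sanitize(snippet: str, repo_name: str) -> str:
    """Strip local paths, emails, and the target's identity from a finding."""
    snippet = re.sub(r"/home/\S+", "<PATH>", snippet)            # local paths
    snippet = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "<EMAIL>", snippet)
    snippet = snippet.replace(repo_name, "<TARGET>")             # repo identity
    return snippet

raw = "strcpy at /home/alice/src/foo.c, report to alice@example.com (proj-x)"
print(sanitize(raw, "proj-x"))
```

The scrubbed "logic packet" still describes the flaw (an unsafe `strcpy`), but nothing in it ties the finding back to the lab or the target repository.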
This creates a "One-Way Valve" for information: Intelligence flows in from the cloud's reasoning capabilities, but sensitive lab data never flows out.
The mission remains the same: Building a decentralized, autonomous security research lab that respects privacy and runs on local iron.