## Who I Am

I'm a beginner bug hunter with limited Rust experience. I had never audited a blockchain consensus engine before this. I used an AI-assisted workflow to bridge the gap between "I can read code" and "I understand what this code does under adversarial conditions." This post is for people like me: technically curious, willing to learn, but not yet a Rust or cryptography expert.
## The Target: Arc Network and Malachite
**Arc** is Circle's Layer-1 blockchain, from the same Circle that created USDC (the second-largest stablecoin by market cap). Arc uses USDC as its native gas token and is designed for institutional financial use cases, with future participants including major financial institutions.

**Malachite** is the consensus engine underneath Arc. It implements Tendermint BFT, a battle-tested algorithm used across dozens of blockchains, in Rust. Circle open-sourced it at `https://github.com/circlefin/malachite`.

The bug bounty program runs on HackerOne at `hackerone.com/circle-bbp`.
### Why audit a consensus engine?

Consensus engines are the most security-critical component of any blockchain. If you can crash or manipulate the consensus layer:

- The chain stops producing blocks
- All transactions freeze
- In Arc's case: USDC transfers halt

A single bug in the right place can have financial consequences at institutional scale.
---
## Setting Up: The Research Environment
### What I cloned
```bash
git clone https://github.com/circlefin/malachite.git
cd malachite/code
cargo build --workspace
```

### The codebase structure
Malachite is split into focused Rust crates:
- `core-consensus`: pure BFT state machine (no I/O)
- `core-votekeeper`: vote aggregation and evidence tracking
- `core-types`: shared types (`Height`, `Round`, `VotingPower`)
- `sync`: block synchronization between peers
- `network`: libp2p peer-to-peer networking
- `signing`: cryptographic signature verification
- `engine`: wires everything together
- `config`: default parameters

The key design principle: the consensus library is pure. It takes inputs and emits `Effect` values (instructions to the host), never performing I/O directly. This makes it very testable in isolation.
### The input/output model

1. A network message arrives
2. It goes through the `process!()` macro
3. An `Effect` is emitted (e.g., `Effect::GetDecidedValues`, `Effect::Publish`)
4. The host application handles the effect

Understanding this model was the key to understanding every vulnerability I found.
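To make the input/effect idea concrete, here is a minimal sketch of a pure step function. The types `Input`, `Effect`, `State` and the function `step` are simplified stand-ins invented for illustration; only the variant names `Publish` and `GetDecidedValues` come from the description above, and the real Malachite types are richer.

```rust
// Toy model of the input/effect pattern. All types here are hypothetical
// illustrations, not Malachite's real API.

#[derive(Debug)]
enum Input {
    Vote { height: u64 },
    SyncRequest { height: u64 },
}

#[derive(Debug, PartialEq)]
enum Effect {
    Publish(String),
    GetDecidedValues { height: u64 },
}

struct State {
    height: u64,
}

// Pure step function: it only inspects the state and the input, then returns
// instructions for the host. It performs no network or disk I/O itself.
fn step(state: &State, input: Input) -> Vec<Effect> {
    match input {
        Input::Vote { height } if height == state.height => {
            vec![Effect::Publish(format!("vote seen at height {}", height))]
        }
        // Votes for other heights produce no effect in this toy model.
        Input::Vote { .. } => vec![],
        Input::SyncRequest { height } => vec![Effect::GetDecidedValues { height }],
    }
}
```

Because `step` returns data instead of doing work, a test can feed it any input and inspect the resulting effects without a network or a database. The flip side is that the cost of an effect (for example a disk read) is invisible at the point where it is emitted.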
## What is BFT Consensus? (Plain English)

Imagine 100 computers (validators) that need to agree on the same ordered list of transactions, without trusting each other completely.
Tendermint BFT works in rounds:
- One validator proposes a block
- All validators broadcast Prevote messages
- If 2/3+ agree, all broadcast Precommit messages
- If 2/3+ Precommit: block is finalized (irreversible)
- Move to next height (block number)
The critical property: as long as fewer than 1/3 of validators are malicious, the chain is safe and live.
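This means that if you can crash one third or more of the validators (measured by voting power), the rest can no longer reach the required supermajority and the chain halts. That's the threat model I was hunting in.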
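The threshold arithmetic can be written down in a few lines. This is a sketch of the standard Tendermint rule (strictly more than 2/3 of total voting power), assuming every validator has equal power; the function names `has_quorum` and `chain_halts` are mine, not Malachite's.

```rust
/// True when `power` is strictly more than 2/3 of `total`.
/// Integer arithmetic in u128 avoids both rounding and overflow.
fn has_quorum(power: u64, total: u64) -> bool {
    3 * (power as u128) > 2 * (total as u128)
}

/// True when the validators still online can no longer reach quorum.
fn chain_halts(offline: u64, total: u64) -> bool {
    !has_quorum(total.saturating_sub(offline), total)
}
```

With 100 equal validators, 67 votes are needed for quorum, so 33 crashed nodes are survivable but 34 are not.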
## My Methodology: The Two-Agent Approach
I used a two-agent AI workflow to compensate for my limited Rust experience:
- **Agent 1 (Claude Workbench, this conversation): security analyst**
  - Reads briefings, reasons about exploitability
  - Drafts HackerOne reports
  - Issues research requests
- **Agent 2 (Claude Code, local CLI): code reader**
  - Opens actual source files
  - Traces execution paths
  - Answers specific questions about reachability

All communication happened through markdown files in two folders:

- `workbench_inbox/`: my requests to Claude Code
- `briefings_output/`: Claude Code's source-backed answers
This separation forced me to be precise: every claim in my reports had to be traced to a specific file and line number.
### Example research request I sent to Claude Code

> **Request - GossipSub Dedup Scope**
>
> **Context:** Does libp2p GossipSub deduplicate vote messages before they reach the consensus handler? This affects whether F1 requires a validator key.
>
> **Files needed:**
>
> - `network/src/behaviour.rs` - look for `message_id_fn`
> - `network/src/pubsub.rs` - custom topic encoding
>
> **Questions:**
>
> 1. What is the `message_id_fn` hash function?
> 2. Does `sequence_number` get included in the hash?
> 3. Is there a consensus-level signature check before delivery?

Claude Code read the actual files and reported back:

- `message_id_fn` uses SeaHash over the full `gossipsub::Message` struct
- `sequence_number` IS included: it auto-increments, so every message is unique
- No consensus-level check before buffering
This confirmed Attack Complexity = Low for my CVSS score.
## The Vulnerability I Found (F1): Memory Exhaustion via Vote Flooding

⚠️ This vulnerability has been marked as resolved by Circle (HackerOne report #3578199). Full technical details are safe to share.

### The bug in plain English
When a Malachite node receives a vote for a future block height (height H+1 when you're currently at H), it stores the vote in a buffer to process later.
The buffer (BoundedQueue) limits how many distinct heights can be buffered (default: 10). But it has no limit on how many votes per height can be stored.
More importantly: signature verification happens AFTER buffering.
This means any network peer can:
- Connect to a validator
- Send millions of fake vote messages claiming height H+1
- Each message gets buffered before any signature check
- The validator's RAM fills up
- Node crashes
### The code (simplified)
```rust
// handle/vote.rs - the entry point
if consensus_height < vote_height {
    // ⚠️ Buffered HERE - before signature check
    state.buffer_input(vote_height, Input::Vote(signed_vote), metrics);
    return Ok(());
}
// Signature verification only happens AFTER this early return
verify_signed_vote(co, &signed_vote, &state.params).await?;
```

```rust
// bounded_queue.rs - the data structure
pub fn push(&mut self, height: Height, input: Input) -> bool {
    if let Some(bucket) = self.inner.get_mut(&height) {
        bucket.push(input); // ← Vec grows without bound ⚠️
        return true;
    }
    // Capacity check only applies to NEW height keys
}
```

### The proof of concept
I wrote a unit test that proved the unbounded growth:

`cargo test -p arc-malachitebft-core-consensus f1_per_height_vec_is_unbounded -- --nocapture`

Output:

- `[F1-POC-3] Messages pushed: 500000`
- `[F1-POC-3] Heap consumed: 100 MB`
- `[F1-POC-3] Growth rate: 675 MB/s`
- `[F1-POC-3] 4 GB exhausted: ~6 seconds from a single attacker`

That is 675 MB/s of memory growth: a node with 4 GB of RAM would be exhausted in about 6 seconds.
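As a sanity check, the reported figures can be cross-checked with simple arithmetic. The sketch below assumes 1 MB = 10^6 bytes and 4 GB = 4000 MB; the constants are copied from the PoC output, while the function names are mine.

```rust
// Figures reported by the PoC output.
const MESSAGES_PUSHED: f64 = 500_000.0;
const HEAP_CONSUMED_MB: f64 = 100.0;
const GROWTH_RATE_MB_PER_S: f64 = 675.0;

/// Average heap cost of one buffered message, in bytes (1 MB taken as 10^6 bytes).
fn bytes_per_message() -> f64 {
    HEAP_CONSUMED_MB * 1_000_000.0 / MESSAGES_PUSHED
}

/// Seconds needed to fill `ram_mb` megabytes at the measured growth rate.
fn seconds_to_exhaust(ram_mb: f64) -> f64 {
    ram_mb / GROWTH_RATE_MB_PER_S
}
```

Under those assumptions each buffered message costs about 200 bytes, and `seconds_to_exhaust(4000.0)` comes out just under 6 seconds, which matches the PoC output. Since the rate was measured inside a unit test, a remote attacker would presumably also be limited by network bandwidth.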
### Why GossipSub didn't save it

I initially thought libp2p's GossipSub deduplication might prevent this attack. Claude Code read `network/src/behaviour.rs` and found:
```rust
fn message_id(message: &gossipsub::Message) -> gossipsub::MessageId {
    let mut hasher = SeaHasher::new();
    message.hash(&mut hasher); // includes sequence_number
    gossipsub::MessageId::new(hasher.finish().to_be_bytes().as_slice())
}
```
The hash includes `sequence_number`, which libp2p auto-increments per publish. Every message therefore gets a unique ID, so dedup never fires. The attacker gets unique message IDs for free.
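The effect of hashing the sequence number is easy to reproduce in isolation. In this sketch `ToyMessage` is a hypothetical stand-in for `gossipsub::Message`, and the standard library's `DefaultHasher` is swapped in for SeaHash so the example needs no dependencies. `content_only_id` is not from the codebase; it is there only for contrast.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for `gossipsub::Message`; the real struct has more fields.
#[derive(Hash)]
struct ToyMessage {
    data: Vec<u8>,
    sequence_number: Option<u64>,
}

// Hashes the whole message, sequence number included.
// `DefaultHasher` replaces SeaHash here purely to stay dependency-free.
fn message_id(message: &ToyMessage) -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    message.hash(&mut hasher);
    hasher.finish().to_be_bytes()
}

// For comparison only: an ID derived from the payload alone.
fn content_only_id(message: &ToyMessage) -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    message.data.hash(&mut hasher);
    hasher.finish().to_be_bytes()
}
```

Two messages with the same payload but different sequence numbers get different IDs from `message_id`, so an ID-based duplicate cache treats them as unrelated. The payload-only variant maps them to the same ID.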
### What happened on HackerOne

My report was marked as a duplicate of report #3578199, filed on 2026-02-28, about 45 days before my submission.
The triage response was encouraging:
"The proof-of-concept tests, memory growth measurements, and code references demonstrate excellent research quality."
Being a duplicate means I was right, just not first. The vulnerability was real, my analysis was correct, and my PoC was independently confirmed. That's a win for a first submission.
## What I Learned

### On Rust

**`Vec<T>` has no automatic size limit.** This is the most important thing to internalize for security research. If you come from Java or Python you might expect collections to have some safety mechanism, but growable collections there are unbounded too, and Rust is no different: `Vec::push()` keeps succeeding until the process runs out of memory. Any codebase that stores attacker-influenced data in a `Vec` without a size check is potentially vulnerable.
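Here is a minimal sketch of the difference a length check makes. `ToyQueue`, its methods, and the `MAX_INPUTS_PER_HEIGHT` limit are all invented for illustration; this is neither Malachite's `BoundedQueue` nor Circle's actual fix.

```rust
use std::collections::BTreeMap;

// Hypothetical limit chosen for illustration; not a Malachite parameter.
const MAX_INPUTS_PER_HEIGHT: usize = 1_000;

// Toy queue keyed by height. Field and method names are illustrative only.
struct ToyQueue {
    max_heights: usize,
    inner: BTreeMap<u64, Vec<Vec<u8>>>,
}

impl ToyQueue {
    fn new(max_heights: usize) -> Self {
        Self { max_heights, inner: BTreeMap::new() }
    }

    // Unbounded variant: once a height exists, every push succeeds.
    fn push_unbounded(&mut self, height: u64, input: Vec<u8>) -> bool {
        if let Some(bucket) = self.inner.get_mut(&height) {
            bucket.push(input);
            return true;
        }
        if self.inner.len() >= self.max_heights {
            return false;
        }
        self.inner.insert(height, vec![input]);
        true
    }

    // Capped variant: the same logic plus a per-height length check.
    fn push_capped(&mut self, height: u64, input: Vec<u8>) -> bool {
        if let Some(bucket) = self.inner.get_mut(&height) {
            if bucket.len() >= MAX_INPUTS_PER_HEIGHT {
                return false; // drop instead of growing without limit
            }
            bucket.push(input);
            return true;
        }
        if self.inner.len() >= self.max_heights {
            return false;
        }
        self.inner.insert(height, vec![input]);
        true
    }

    // Number of inputs currently buffered for `height`.
    fn len_at(&self, height: u64) -> usize {
        self.inner.get(&height).map_or(0, Vec::len)
    }
}
```

Pushing 5,000 inputs for one height leaves 5,000 entries in the unbounded variant and 1,000 in the capped one. A cap like this bounds memory, but it also lets an attacker crowd out honest inputs, so it is not a full fix without authentication.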
**The `?` operator is a security signal.** When you see `?` in Rust, it means "propagate this error immediately." When you see `if let Ok(...)` with no `else` branch, it means "handle the success case, silently drop the error." In security-critical code, silent error dropping is almost always a finding worth investigating.
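The two patterns look like this side by side. `verify` is a hypothetical verifier written for this example and has nothing to do with Malachite's signing crate.

```rust
// Hypothetical verifier used only for illustration.
fn verify(signature: &[u8]) -> Result<(), String> {
    if signature.is_empty() {
        Err("empty signature".to_string())
    } else {
        Ok(())
    }
}

// `?` propagates the failure: the caller is forced to see it.
fn accept_with_question_mark(signature: &[u8]) -> Result<&'static str, String> {
    verify(signature)?;
    Ok("accepted")
}

// `if let Ok(..)` handles success only: the error value is dropped silently,
// with no log line and no signal to the caller about why nothing happened.
fn accept_with_if_let(signature: &[u8]) -> Option<&'static str> {
    if let Ok(()) = verify(signature) {
        return Some("accepted");
    }
    None
}
```

Both functions reject an empty signature, but only the first one tells the caller why. In the second, the error message never leaves the function.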
**`async fn` + effects hide I/O cost.** Malachite emits `Effect::GetDecidedValues` instead of directly reading the database. This is clean architecture, but it means every emission of that effect triggers a real disk read in the host. Rate limiting needs to happen at the effect emission site, not at the I/O layer.
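One way to picture rate limiting at the emission site is a small per-peer budget that is checked before the effect is created. This is a sketch under my own assumptions: `EffectLimiter`, the fixed-window scheme, and the string peer IDs are all hypothetical, and only the `GetDecidedValues` name comes from the codebase.

```rust
use std::collections::HashMap;

// Hypothetical effect type; only the variant name is taken from the post.
#[derive(Debug, PartialEq)]
enum Effect {
    GetDecidedValues { height: u64 },
}

// Fixed-window limiter keyed by peer. Limits and names are illustrative.
// A real implementation would also need to bound the size of this map.
struct EffectLimiter {
    max_per_window: u32,
    window_secs: u64,
    // peer id -> (window start in seconds, requests seen in that window)
    seen: HashMap<String, (u64, u32)>,
}

impl EffectLimiter {
    fn new(max_per_window: u32, window_secs: u64) -> Self {
        Self { max_per_window, window_secs, seen: HashMap::new() }
    }

    // Returns the effect only while the peer is under its budget, so an
    // over-budget request never reaches the host's database.
    fn maybe_emit(&mut self, peer: &str, height: u64, now_secs: u64) -> Option<Effect> {
        let entry = self.seen.entry(peer.to_string()).or_insert((now_secs, 0));
        if now_secs.saturating_sub(entry.0) >= self.window_secs {
            *entry = (now_secs, 0); // start a fresh window
        }
        if entry.1 >= self.max_per_window {
            return None;
        }
        entry.1 += 1;
        Some(Effect::GetDecidedValues { height })
    }
}
```

Passing the clock in as `now_secs` keeps the limiter as pure as the state machine around it, so it can be tested without sleeping.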
### On bug hunting

**Read the tests before the code.** Production test suites document intended behaviour. I found the `push_to_full_queue_succeeds_for_existing_index` test in `bounded_queue.rs`, which explicitly proved the unbounded behaviour. The developers had tested and documented it, just never considered it in an adversarial context.

**Retract fast, retract early.** I initially identified 3 "critical" bugs that turned out to be non-issues after tracing the code more carefully. Submitting them would have damaged my credibility. Every retraction made the remaining reports stronger.

**CVSS is a conversation, not a formula.** Every metric needs a justification. "AC=Low because `sequence_number` auto-increments" is a stronger argument than just asserting Low. Triagers respect researchers who show their reasoning.

**Commit history is evidence.** Two of my reports include specific git commits that prove Circle recently modified the vulnerable files without fixing the bugs. This is powerful evidence that the vulnerability survived code review.
### On the process

**The two-agent workflow worked.** Separating "analyst who reasons about exploitability" from "code reader who traces execution paths" prevented me from making assumptions. Every claim had to be backed by a source file and line number.

**Severity honesty builds trust.** The triager who marked F1 as a duplicate explicitly praised the report quality. Writing Medium severity when it's Medium (not inflating to Critical) signals you understand the system. Triagers remember researchers who don't overclaim.
## The Vulnerability Classes I Found

Beyond F1, I identified several other vulnerability classes in the codebase that are currently under active review by Circle's security team. I'll publish full technical details once the fixes land. At a high level, they fall into:

| Class | Location | Type |
|---|---|---|
| Resource exhaustion | Sync subsystem | No rate limit on DB reads |
| Silent error handling | Signing verification | Errors discarded without logging |
| Unbounded accumulation | Vote evidence tracking | No cap on evidence entries |
| Self-reported trust | Sync peer scoring | Peer-reported values used without validation |

Each represents a different security principle:
- Rate limiting is a network-layer concern, not just an application concern
- Every error in a security-critical path should be logged and acted on
- Any data structure that grows based on attacker-influenced input needs a cap
- Peer-reported values are always adversarial inputs
## Resources for New Bug Hunters

### Learn Rust security

- The Rust Programming Language (free at doc.rust-lang.org/book)
- Rustonomicon: unsafe Rust and the memory model
- Search for `Vec::push`, `HashMap::insert`, and `unwrap()` in any codebase: these are your first targets

### Learn BFT / blockchain consensus

- The original Tendermint paper (2018), in the Malachite `docs/` folder
- Malachite's `ARCHITECTURE.md`: the best explanation of the input/effect model
- CometBFT source code (Go): same algorithm, more mature codebase for comparison

### Learn HackerOne reporting

- HackerOne's publicly disclosed reports: read 50 before writing your first
- CVSS v3.1 calculator at first.org/cvss
- CWE list at cwe.mitre.org: search by keyword

### Tools I used

- `cargo test -- --nocapture`: run tests with `println!` output visible
- `grep -rn "pattern" crate/src/`: find all usages of a function
- `git log --oneline --stat -- path/to/file`: see what changed and when
- Claude Code CLI: for tracing execution paths across large codebases
## What's Next
I have several reports under active review with Circle's security team. Once the fixes land and the program authorizes disclosure, I'll publish:
- Full technical writeups for each vulnerability
- The exact PoC code with real terminal output
- The git commit evidence that proved the bugs survived code review
- Lessons from the triage process on each report
Follow me here on Medium to get notified when that post goes live.
## Final Thoughts
Bug hunting a consensus engine as a beginner is hard. But "hard" doesn't mean "inaccessible." The Malachite codebase is clean, well-structured, and has excellent documentation. The Circle bug bounty program has fast triage and honest feedback.
The biggest lesson: security bugs in consensus engines are rarely exotic cryptography failures. They're usually the same boring categories (unbounded buffers, missing rate limits, silent error handling) that appear in every codebase. The dangerous part is the context: a boring `Vec::push()` with no size cap becomes a chain-halting vulnerability when it sits in the path of an unauthenticated network message.
If you can read code and think adversarially, you can find these bugs too.