My First Bug Bounty: Finding Memory DoS Vulnerabilities in Circle's Arc Blockchain Consensus Engine

A beginner's journey through Rust, BFT consensus, and the HackerOne disclosure process

Veríssimo

April 15, 2026

## Who I Am

I'm a beginner bug hunter with limited Rust experience. I had never audited a blockchain consensus engine before this. I used an AI-assisted workflow to bridge the gap between "I can read code" and "I understand what this code does under adversarial conditions." This post is for people like me: technically curious, willing to learn, but not yet a Rust or cryptography expert.

## The Target: Arc Network and Malachite  
**Arc** is Circle's Layer-1 blockchain - the same Circle that created USDC (the second-largest stablecoin by market cap). Arc uses USDC as its native gas token and is designed for institutional financial use cases, with future participants including major financial institutions.

**Malachite** is the consensus engine underneath Arc. It implements Tendermint BFT in Rust - a battle-tested algorithm used across dozens of blockchains. Circle open-sourced it at: `https://github.com/circlefin/malachite`

The bug bounty program runs on HackerOne at `hackerone.com/circle-bbp`.

### Why audit a consensus engine?

Consensus engines are the most security-critical component of any blockchain. If you can crash or manipulate the consensus layer:

- The chain stops producing blocks
- All transactions freeze
- In Arc's case: USDC transfers halt

A single bug in the right place can have financial consequences at institutional scale.

---  

## Setting Up: The Research Environment  

### What I cloned  

```bash
git clone https://github.com/circlefin/malachite.git
cd malachite/code
cargo build --workspace
```

### The codebase structure

Malachite is split into focused Rust crates:

- `core-consensus`: pure BFT state machine (no I/O)
- `core-votekeeper`: vote aggregation and evidence tracking
- `core-types`: shared types: Height, Round, VotingPower
- `sync`: block synchronization between peers
- `network`: libp2p peer-to-peer networking
- `signing`: cryptographic signature verification
- `engine`: wires everything together
- `config`: default parameters

The key design principle: the consensus library is pure — it takes inputs and emits Effect values (instructions to the host), never performing I/O directly. This makes it very testable in isolation.

### The input/output model

Network message arrives → `process!()` macro → Effect emitted (e.g., `Effect::GetDecidedValues`, `Effect::Publish`) → Host application handles the effect

Understanding this model was the key to understanding every vulnerability I found.
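To make the input/effect model concrete, here is a minimal sketch of a pure transition function. The types `Input`, `State`, and the function `process` are hypothetical and far simpler than Malachite's real ones; only the effect names `GetDecidedValues` and `Publish` are taken from the text above.

```rust
// Hypothetical, simplified types: not Malachite's actual API.
#[derive(Debug, PartialEq)]
enum Input {
    Vote { height: u64 },
    SyncRequest { from_height: u64 },
}

#[derive(Debug, PartialEq)]
enum Effect {
    Publish(String),
    GetDecidedValues { from_height: u64 },
}

struct State {
    height: u64,
}

/// Pure transition: inspects the input and returns instructions for the host.
/// No network or disk access happens inside this function.
fn process(state: &State, input: Input) -> Vec<Effect> {
    match input {
        Input::Vote { height } if height == state.height => {
            vec![Effect::Publish(format!("vote accepted at height {height}"))]
        }
        // Votes for other heights produce no effect in this toy model.
        Input::Vote { .. } => vec![],
        Input::SyncRequest { from_height } => {
            vec![Effect::GetDecidedValues { from_height }]
        }
    }
}

fn main() {
    let state = State { height: 10 };
    for effect in process(&state, Input::SyncRequest { from_height: 3 }) {
        // The host decides how (and whether) to perform the actual I/O.
        println!("host must handle: {effect:?}");
    }
}
```

Because `process` only returns values, a test can feed it adversarial inputs and count the effects that come back, without a network or a database.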

## What is BFT Consensus? (Plain English)

Imagine 100 computers (validators) that need to agree on the same ordered list of transactions — without trusting each other completely.

Tendermint BFT works in rounds:

1. One validator proposes a block
2. All validators broadcast Prevote messages
3. If 2/3+ agree, all broadcast Precommit messages
4. If 2/3+ Precommit: block is finalized (irreversible)
5. Move to next height (block number)

The critical property: as long as fewer than 1/3 of validators are malicious, the chain is safe and live.

This means: if you can crash validators holding 1/3 or more of the voting power, the remaining validators can no longer reach the 2/3+ quorum and the chain halts. That's the threat model I was hunting in.
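The quorum arithmetic behind that threat model can be sketched in a few lines. This is an illustration only: `has_quorum` is a hypothetical helper, and the example assumes 100 validators with equal voting power.

```rust
/// True if `online` voting power is strictly more than 2/3 of `total`.
/// Integer form of online / total > 2/3, which avoids floating point.
fn has_quorum(online: u64, total: u64) -> bool {
    3 * online > 2 * total
}

fn main() {
    let total = 100; // assumption: 100 validators with equal voting power
    for crashed in [0u64, 33, 34] {
        let online = total - crashed;
        println!(
            "crashed={crashed} online={online} quorum={}",
            has_quorum(online, total)
        );
    }
}
```

With 100 equal validators, 33 crashed nodes still leave a quorum (67 online), while 34 crashed nodes do not (66 online), so no block can be finalized.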

## My Methodology: The Two-Agent Approach

I used a two-agent AI workflow to compensate for my limited Rust experience:

- Agent 1 (Claude Workbench, this conversation) → Security analyst
  - Reads briefings, reasons about exploitability
  - Drafts HackerOne reports
  - Issues research requests
- Agent 2 (Claude Code, local CLI) → Code reader
  - Opens actual source files
  - Traces execution paths
  - Answers specific questions about reachability

All communication happened through markdown files in two folders:

- `workbench_inbox/`: my requests to Claude Code
- `briefings_output/`: Claude Code's source-backed answers

This separation forced me to be precise: every claim in my reports had to be traced to a specific file and line number.

### Example research request I sent to Claude Code

    # Request - GossipSub Dedup Scope

    ## Context
    Does libp2p GossipSub deduplicate vote messages before they reach the
    consensus handler? This affects whether F1 requires a validator key.

    ## Files Needed
    - network/src/behaviour.rs - look for message_id_fn
    - network/src/pubsub.rs - custom topic encoding

    ## Questions
    1. What is the message_id_fn hash function?
    2. Does sequence_number get included in the hash?
    3. Is there a consensus-level signature check before delivery?

Claude Code read the actual files and reported back:

- `message_id_fn` uses SeaHash over the full `gossipsub::Message` struct
- `sequence_number` IS included → auto-increments → every message unique
- No consensus-level check before buffering

This confirmed Attack Complexity = Low for my CVSS score.

## The Vulnerability I Found (F1): Memory Exhaustion via Vote Flooding

⚠️ This vulnerability has been marked as resolved by Circle (HackerOne report #3578199). Full technical details are safe to share.

### The bug in plain English

When a Malachite node receives a vote for a future block height (height H+1 when you're currently at H), it stores the vote in a buffer to process later.

The buffer (BoundedQueue) limits how many distinct heights can be buffered (default: 10). But it has no limit on how many votes per height can be stored.

More importantly: signature verification happens AFTER buffering.

This means any network peer can:

1. Connect to a validator
2. Send millions of fake vote messages claiming height H+1
3. Each message gets buffered before any signature check
4. The validator's RAM fills up
5. Node crashes

### The code (simplified)

    // handle/vote.rs - the entry point
    if consensus_height < vote_height {
        // ⚠️ Buffered HERE - before signature check
        state.buffer_input(vote_height, Input::Vote(signed_vote), metrics);
        return Ok(());
    }
    // Signature verification only happens AFTER this early return
    verify_signed_vote(co, &signed_vote, &state.params).await?;

    // bounded_queue.rs - the data structure
    pub fn push(&mut self, height: Height, input: Input) -> bool {
        if let Some(bucket) = self.inner.get_mut(&height) {
            bucket.push(input); // ← Vec grows without bound ⚠️
            return true;
        }
        // Capacity check only applies to NEW height keys
        // ... (rest omitted in this simplified excerpt)
    }
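The pattern can be reproduced in isolation. The sketch below is a hypothetical stand-in, not Malachite's `BoundedQueue`: `HeightBuffer`, its fields, and the optional `max_per_height` cap are names invented for illustration, and the cap is one possible mitigation, not a description of Circle's fix.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the real queue: heights are u64, inputs are raw bytes.
struct HeightBuffer {
    max_heights: usize,
    max_per_height: Option<usize>, // None reproduces the unbounded behaviour
    inner: BTreeMap<u64, Vec<Vec<u8>>>,
}

impl HeightBuffer {
    fn new(max_heights: usize, max_per_height: Option<usize>) -> Self {
        Self { max_heights, max_per_height, inner: BTreeMap::new() }
    }

    /// Returns true if the input was stored.
    fn push(&mut self, height: u64, input: Vec<u8>) -> bool {
        if let Some(bucket) = self.inner.get_mut(&height) {
            // Existing height: only the optional per-height cap can reject.
            if self.max_per_height.is_some_and(|cap| bucket.len() >= cap) {
                return false;
            }
            bucket.push(input);
            return true;
        }
        // New height: the limit on distinct heights applies only here.
        if self.inner.len() >= self.max_heights {
            return false;
        }
        self.inner.insert(height, vec![input]);
        true
    }

    fn len_at(&self, height: u64) -> usize {
        self.inner.get(&height).map_or(0, Vec::len)
    }
}

fn main() {
    let mut unbounded = HeightBuffer::new(10, None);
    let mut capped = HeightBuffer::new(10, Some(1_000));
    for _ in 0..100_000 {
        unbounded.push(11, vec![0u8; 32]);
        capped.push(11, vec![0u8; 32]);
    }
    println!("unbounded bucket: {} entries", unbounded.len_at(11));
    println!("capped bucket:    {} entries", capped.len_at(11));
}
```

A cap alone does not stop an attacker from filling the bucket with junk and crowding out honest votes, which is why the ordering of signature verification and buffering matters as well.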

### The proof of concept

I wrote a unit test that proved the unbounded growth:

    cargo test -p arc-malachitebft-core-consensus \
        f1_per_height_vec_is_unbounded -- --nocapture

Output:

    [F1-POC-3] Messages pushed: 500000
    [F1-POC-3] Heap consumed: 100 MB
    [F1-POC-3] Growth rate: 675 MB/s
    [F1-POC-3] 4 GB exhausted: ~6 seconds from a single attacker

675 MB/s memory growth. A node with 4 GB RAM exhausted in 6 seconds.
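The numbers in that output can be sanity-checked with simple arithmetic. The helpers below are hypothetical, and the sketch assumes decimal units (1 MB = 1,000,000 bytes) and a constant growth rate.

```rust
const MB: f64 = 1_000_000.0; // assumption: decimal megabytes
const GB: f64 = 1_000.0 * MB;

/// Average heap bytes consumed per buffered message.
fn bytes_per_message(heap_bytes: f64, messages: f64) -> f64 {
    heap_bytes / messages
}

/// Seconds until `ram_bytes` is consumed at a constant growth rate.
fn seconds_to_exhaust(ram_bytes: f64, bytes_per_second: f64) -> f64 {
    ram_bytes / bytes_per_second
}

fn main() {
    let per_msg = bytes_per_message(100.0 * MB, 500_000.0);
    let secs = seconds_to_exhaust(4.0 * GB, 675.0 * MB);
    println!("~{per_msg:.0} bytes per buffered message");
    println!("~{secs:.1} s to fill 4 GB at 675 MB/s");
}
```

Under those assumptions each buffered message costs about 200 bytes of heap, and 4 GB divided by 675 MB/s comes to just under 6 seconds, which matches the PoC output.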

### Why GossipSub didn't save it

I initially thought libp2p's GossipSub deduplication might prevent this attack. Claude Code read network/src/behaviour.rs and found:

    fn message_id(message: &gossipsub::Message) -> gossipsub::MessageId {
        let mut hasher = SeaHasher::new();
        message.hash(&mut hasher); // includes sequence_number
        gossipsub::MessageId::new(hasher.finish().to_be_bytes().as_slice())
    }

The hash includes sequence_number, which libp2p auto-increments per publish. Every message is guaranteed unique — dedup never fires. The attacker gets unique message IDs for free with zero effort.
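The effect of hashing the sequence number is easy to demonstrate in isolation. This sketch uses a hypothetical two-field `Message` struct instead of `gossipsub::Message`, and the standard library's `DefaultHasher` as a stand-in for SeaHash; `id_payload_only` is shown purely for contrast and is not a claim about how the issue was fixed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for gossipsub::Message with only the fields that matter here.
#[derive(Hash)]
struct Message {
    data: Vec<u8>,
    sequence_number: Option<u64>,
}

/// ID over the whole struct, sequence number included.
fn id_full(message: &Message) -> u64 {
    let mut hasher = DefaultHasher::new();
    message.hash(&mut hasher);
    hasher.finish()
}

/// ID over the payload only: identical payloads collapse to one ID.
fn id_payload_only(message: &Message) -> u64 {
    let mut hasher = DefaultHasher::new();
    message.data.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let a = Message { data: b"same vote".to_vec(), sequence_number: Some(1) };
    let b = Message { data: b"same vote".to_vec(), sequence_number: Some(2) };
    println!("full-struct IDs equal:  {}", id_full(&a) == id_full(&b));
    println!("payload-only IDs equal: {}", id_payload_only(&a) == id_payload_only(&b));
}
```

Even a payload-only ID would not stop a determined attacker, who can vary the payload bytes instead; the point is only that the existing ID function gives uniqueness away for free.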

### What happened on HackerOne

My report was marked as a duplicate of report #3578199, filed on 2026-02-28, about 45 days before my submission.

The triage response was encouraging:

"The proof-of-concept tests, memory growth measurements, and code references demonstrate excellent research quality."

Being a duplicate means I was right — just not first. The vulnerability was real, my analysis was correct, and my PoC was independently confirmed. That's a win for a first submission.

## What I Learned

### On Rust

`Vec<T>` has no automatic size limit. This is the most important thing to internalize for security research. Like growable collections in most languages (Java's `ArrayList`, Python's `list`), a Rust `Vec` keeps growing for as long as the allocator can supply memory: `Vec::push()` never returns an error, and when allocation finally fails the process aborts or is killed by the OS. Any codebase that stores attacker-influenced data in a `Vec` without a size check is potentially vulnerable.

The `?` operator is a security signal. When you see `?` in Rust, it means "propagate this error to the caller immediately." When you see `if let Ok(...)` with no `else` branch, it means "handle the success case, silently drop the error." In security-critical code, silent error dropping is almost always a finding worth investigating.
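A small contrast makes the difference visible. Everything here is hypothetical: `verify`, `BadSignature`, and the two `accept_*` functions are illustrative names, not code from Malachite.

```rust
#[derive(Debug, PartialEq)]
struct BadSignature;

// Hypothetical verifier: accepts only the byte string "valid".
fn verify(sig: &[u8]) -> Result<(), BadSignature> {
    if sig == b"valid" { Ok(()) } else { Err(BadSignature) }
}

/// `?` propagates the failure, so the caller has to deal with it.
fn accept_strict(sig: &[u8], accepted: &mut u32) -> Result<(), BadSignature> {
    verify(sig)?;
    *accepted += 1;
    Ok(())
}

/// `if let Ok` without an `else`: a failed check leaves no trace.
fn accept_silent(sig: &[u8], accepted: &mut u32) {
    if let Ok(()) = verify(sig) {
        *accepted += 1;
    }
}

fn main() {
    let mut accepted = 0;
    println!("strict: {:?}", accept_strict(b"forged", &mut accepted));
    accept_silent(b"forged", &mut accepted);
    println!("silent: no error surfaced, accepted = {accepted}");
}
```

Both functions refuse the forged signature, but only the strict one tells anyone it happened. When auditing, the question to ask about the silent form is what an operator would never find out.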

`async fn` + effects hide I/O cost. Malachite emits `Effect::GetDecidedValues` instead of directly reading the database. This is clean architecture, but it means every emitted effect makes the host perform a real disk read, even though nothing at the emission site looks like I/O. Rate limiting needs to happen at the effect emission site, not at the I/O layer.
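As a rough sketch of what limiting at the emission site could look like, the code below gates a sync effect behind a per-peer budget. `SyncBudget`, its methods, and the `u64` peer identifier are assumptions made for illustration; this is not Malachite's API or Circle's mitigation.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Effect {
    GetDecidedValues { from_height: u64 }, // the host answers this with a disk read
}

/// Hypothetical per-peer budget: at most `max_per_window` sync effects per window.
struct SyncBudget {
    max_per_window: u32,
    used: HashMap<u64, u32>, // peer id -> effects emitted in the current window
}

impl SyncBudget {
    fn new(max_per_window: u32) -> Self {
        Self { max_per_window, used: HashMap::new() }
    }

    /// Emits the effect only while the peer still has budget left.
    fn on_sync_request(&mut self, peer: u64, from_height: u64) -> Option<Effect> {
        let used = self.used.entry(peer).or_insert(0);
        if *used >= self.max_per_window {
            return None; // dropped before any I/O is requested
        }
        *used += 1;
        Some(Effect::GetDecidedValues { from_height })
    }

    /// Called by the host when a new time window starts.
    fn reset_window(&mut self) {
        self.used.clear();
    }
}

fn main() {
    let mut budget = SyncBudget::new(3);
    let emitted = (0..10).filter_map(|h| budget.on_sync_request(42, h)).count();
    println!("10 requests from one peer -> {emitted} disk reads requested");
    budget.reset_window();
}
```

The design choice is that the decision to drop a request is made before the effect exists, so the host never sees work it would have to throttle later.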

### On bug hunting

Read the tests before the code. Production test suites document intended behaviour. I found the push_to_full_queue_succeeds_for_existing_index test in bounded_queue.rs that explicitly proved the unbounded behaviour — the developers had tested and documented it, just never considered it in an adversarial context.

Retract fast, retract early. I initially identified 3 "critical" bugs that turned out to be non-issues after tracing the code more carefully. Submitting them would have damaged my credibility. Every retraction made the remaining reports stronger.

CVSS is a conversation, not a formula. Every metric needs a justification. "AC=Low because sequence_number auto-increments" is a stronger argument than just asserting Low. Triagers respect researchers who show their reasoning.

Commit history is evidence. Two of my reports include specific git commits that prove Circle recently modified the vulnerable files without fixing the bugs. This is powerful evidence that the vulnerability survived code review.

### On the process

The two-agent workflow worked. Separating "analyst who reasons about exploitability" from "code reader who traces execution paths" prevented me from making assumptions. Every claim had to be backed by a source file and line number.

Severity honesty builds trust. The triager who marked F1 as a duplicate explicitly praised the report quality. Writing Medium severity when it's Medium (not inflating to Critical) signals you understand the system. Triagers remember researchers who don't overclaim.

## The Vulnerability Classes I Found

Beyond F1, I identified several other vulnerability classes in the codebase that are currently under active review by Circle's security team. I'll publish full technical details once the fixes land. At a high level, they fall into:

| Class | Location | Type |
| --- | --- | --- |
| Resource exhaustion | Sync subsystem | No rate limit on DB reads |
| Silent error handling | Signing verification | Errors discarded without logging |
| Unbounded accumulation | Vote evidence tracking | No cap on evidence entries |
| Self-reported trust | Sync peer scoring | Peer-reported values used without validation |

Each represents a different security principle:

- Rate limiting is a network-layer concern, not just an application concern
- Every error in a security-critical path should be logged and acted on
- Any data structure that grows based on attacker-influenced input needs a cap
- Peer-reported values are always adversarial inputs

## Resources for New Bug Hunters

### Learn Rust security

- The Rust Programming Language (free at doc.rust-lang.org/book)
- The Rustonomicon: unsafe Rust and the memory model
- Search for `Vec::push`, `HashMap::insert`, and `unwrap()` in any codebase: these are your first targets

### Learn BFT / blockchain consensus

- The original Tendermint paper (2018), in the Malachite `docs/` folder
- Malachite's `ARCHITECTURE.md`: best explanation of the input/effect model
- CometBFT source code (Go): same algorithm, more mature codebase for comparison

### Learn HackerOne reporting

- HackerOne's public disclosed reports: read 50 before writing your first
- CVSS v3.1 calculator at first.org/cvss
- CWE list at cwe.mitre.org: search by keyword

### Tools I used

- `cargo test -- --nocapture`: run tests with `println!` output visible
- `grep -rn "pattern" crate/src/`: find all usages of a function
- `git log --oneline --stat -- path/to/file`: see what changed and when
- Claude Code CLI: for tracing execution paths across large codebases

## What's Next

I have several reports under active review with Circle's security team. Once the fixes land and the program authorizes disclosure, I'll publish:

- Full technical writeups for each vulnerability
- The exact PoC code with real terminal output
- The git commit evidence that proved the bugs survived code review
- Lessons from the triage process on each report

Follow me here on Medium to get notified when that post goes live.

## Final Thoughts

Bug hunting a consensus engine as a beginner is hard. But "hard" doesn't mean "inaccessible." The Malachite codebase is clean, well-structured, and has excellent documentation. The Circle bug bounty program has fast triage and honest feedback.

The biggest lesson: security bugs in consensus engines are rarely exotic cryptography failures. They're usually the same boring categories — unbounded buffers, missing rate limits, silent error handling — that appear in every codebase. The dangerous part is the context: a boring Vec::push() with no size cap becomes a chain-halting vulnerability when it sits in the path of an unauthenticated network message.

If you can read code and think adversarially, you can find these bugs too.
