Using ChaCha20 as a Full CSPRNG for Provably Fair Gambling

Why most provably fair implementations are incomplete, and how a stream cipher solves what hashing cannot.

Emma Paułowicz

~13 min read · March 16, 2026 (Updated: March 24, 2026)

Most provably fair crypto casinos advertise cryptographic fairness. They commit to a server seed, combine it with a player-provided client seed, hash the result, and derive a number. It works - for simple outcomes. But it breaks down the moment you need to prove a card shuffle was fair, a weighted lootbox draw was honest, or a crash multiplier followed the stated distribution.

This post describes a different architecture: using ChaCha20 - a stream cipher standardized in RFC 8439 - as a full deterministic CSPRNG that generates an arbitrarily long, verifiable random byte stream from which every game operation draws its randomness. Not just one number. The entire sequence.

Why This Matters for Licensing and Regulation

Gambling regulators don't accept "trust us". The GLI-19 standard (Gaming Laboratories International, Standards for Interactive Gaming Systems) requires that game outcomes use a Random Number Generator meeting specific criteria: statistical randomness, unpredictability, non-repeatability of sequences, and monitoring. NIST SP 800-90A specifies approved constructions for Deterministic Random Bit Generators (DRBGs).

ChaCha20 aligns with these requirements structurally:

Deterministic given the seed: same key + nonce = same output, always. This satisfies the DRBG requirement that outputs are reproducible from a known initial state - enabling third-party audit and player verification.
Cryptographically secure: ChaCha20 is a stream cipher with a 256-bit key, designed by Daniel J. Bernstein. It is adopted in TLS 1.3 (as ChaCha20-Poly1305), generates the output of the Linux kernel's CSPRNG (/dev/urandom), and has been formally analyzed for multi-user security. It's not an experimental choice - Google deployed ChaCha20-Poly1305 for TLS on Android devices that lack AES hardware acceleration.
No timing side channels: Unlike AES software implementations that rely on lookup tables vulnerable to cache-timing attacks, ChaCha20's ARX (add-rotate-XOR) operations run in constant time by design. There is no key-dependent memory access. This matters when the RNG runs on shared cloud infrastructure where a co-tenant could theoretically observe cache behavior.
Stream output: ChaCha20 produces an arbitrary-length byte stream, not a fixed-size hash. This is the critical difference - you can draw as many random bytes as a game round requires, all from one deterministic stream, all independently verifiable.

The regulatory argument is straightforward: a ChaCha20-based DRBG is cryptographically no weaker than HMAC-SHA512-to-modulo, produces verifiable deterministic output, resists cache-timing side channels, and uses a cipher already trusted by IETF, Google, Cloudflare, and the Linux kernel. If an auditor accepts HMAC-SHA512, there is a strong case for accepting ChaCha20 as well - it offers the same reproducibility from a committed seed, plus a native stream output.

Why Hash-Based Provably Fair Is Limited

The standard provably fair approach works like this:

outcome = HMAC-SHA512(server_seed, client_seed + ":" + nonce)
number  = parseInt(outcome.substring(0, 8), 16)
result  = number % range

This proves one thing: a single integer in a range was predetermined before the round started. For a dice roll or a coin flip, that's sufficient.

But casino games aren't coin flips. Consider:

Blackjack requires a full deck shuffle. A 52-card deck has 52! ≈ 8 × 10⁶⁷ possible permutations. To prove the shuffle was fair, you need to prove that each of the 51 random selections in a Fisher-Yates shuffle was drawn from an unbiased source. A single HMAC output gives you one number - not 51 sequential independent values.

Lootbox draws use weighted probabilities. An item has a 0.1% drop rate, another has 45%. The draw must sample from a weighted distribution, and the player must be able to verify that the stated weights were applied correctly. A hash-to-modulo approach can be forced to work, but verifying the weight mapping requires publishing the full probability table AND proving the mapping from random bytes to item selection.

Crash games follow a geometric distribution. The multiplier at which the game crashes must be drawn from a specific probability curve - the house edge is embedded in the curve's parameters. A single hash output mapped to a float doesn't let a verifier confirm the distribution shape.

Multi-action rounds generate multiple random events. Each reveal needs an independent random value, but all values must derive from the same committed seed. Hashing once per action requires a chain - and each chain link must be independently verifiable.

The root problem: hash-based systems prove a point. Stream-based systems prove a sequence. When a game round requires multiple random operations - shuffles, weighted picks, geometric samples, sequential reveals - you need a verifiable random stream, not a verifiable random number.

ChaCha20 as a Deterministic PRNG

ChaCha20 takes three inputs: a 256-bit key, a 96-bit nonce, and a 32-bit counter. It produces a stream of pseudorandom bytes that is computationally indistinguishable from true randomness to anyone who doesn't know the key.

In a provably fair context, the mapping is:

key   = HMAC-SHA256(server_seed, domain_parameters)
nonce = derive_from(server_seed, domain_parameters)
stream = ChaCha20(key, nonce, counter=0)

Where domain_parameters includes the game identifier, a hash of the client seed, and the round nonce. This is domain separation - ensuring that two different games or rounds with the same server seed produce completely independent streams.

The key derivation through HMAC-SHA256 serves two purposes:

Independence: even if an attacker knows one derived key, they cannot recover the server seed or derive keys for other rounds.
Standard construction: HMAC-based key derivation is a well-studied pattern (see HKDF, RFC 5869).
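
The mapping above could be sketched in Python as follows. This is a minimal, unoptimized illustration rather than a production implementation: the block function follows the layout in RFC 8439, but the encoding of domain_parameters, the "nonce:" label, and the names ChaCha20Stream and derive_stream are assumptions made for the example. A real deployment would more likely call an audited ChaCha20 library.

```python
import hashlib
import hmac
import struct

MASK32 = 0xFFFFFFFF

def _rotl32(v: int, n: int) -> int:
    return ((v << n) & MASK32) | (v >> (32 - n))

def _quarter_round(s: list, a: int, b: int, c: int, d: int) -> None:
    s[a] = (s[a] + s[b]) & MASK32; s[d] = _rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = _rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32; s[d] = _rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = _rotl32(s[b] ^ s[c], 7)

def chacha20_block(key: bytes, counter: int, nonce: bytes) -> bytes:
    """One 64-byte keystream block (RFC 8439 state layout, 20 rounds)."""
    assert len(key) == 32 and len(nonce) == 12
    init = list(struct.unpack("<4I", b"expand 32-byte k"))
    init += struct.unpack("<8I", key)
    init.append(counter)
    init += struct.unpack("<3I", nonce)
    s = init[:]
    for _ in range(10):  # 10 double rounds: 4 column + 4 diagonal quarter rounds
        _quarter_round(s, 0, 4, 8, 12); _quarter_round(s, 1, 5, 9, 13)
        _quarter_round(s, 2, 6, 10, 14); _quarter_round(s, 3, 7, 11, 15)
        _quarter_round(s, 0, 5, 10, 15); _quarter_round(s, 1, 6, 11, 12)
        _quarter_round(s, 2, 7, 8, 13); _quarter_round(s, 3, 4, 9, 14)
    return struct.pack("<16I", *((x + y) & MASK32 for x, y in zip(s, init)))

class ChaCha20Stream:
    """Sequential reader over the keystream, starting at counter = 0."""

    def __init__(self, key: bytes, nonce: bytes) -> None:
        self._key, self._nonce = key, nonce
        self._counter, self._buf = 0, b""

    def read(self, n: int) -> bytes:
        while len(self._buf) < n:
            if self._counter > MASK32:
                raise OverflowError("keystream exhausted for this key/nonce")
            self._buf += chacha20_block(self._key, self._counter, self._nonce)
            self._counter += 1
        out, self._buf = self._buf[:n], self._buf[n:]
        return out

def derive_stream(server_seed: bytes, game_id: str,
                  client_seed: str, round_nonce: int) -> ChaCha20Stream:
    # Assumed encoding of domain_parameters: game id, client seed hash, round nonce.
    client_hash = hashlib.sha256(client_seed.encode()).hexdigest()
    domain = f"{game_id}:{client_hash}:{round_nonce}".encode()
    key = hmac.new(server_seed, domain, hashlib.sha256).digest()
    # Hypothetical nonce derivation: a separately labelled HMAC, truncated to 96 bits.
    nonce = hmac.new(server_seed, b"nonce:" + domain, hashlib.sha256).digest()[:12]
    return ChaCha20Stream(key, nonce)
```

Whatever encoding is chosen for the domain parameters, it has to be published and frozen: the player's verifier must build byte-identical HMAC inputs to reconstruct the stream.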

Once the stream is initialized, every random operation in the round draws bytes sequentially:

// All operations consume from the same deterministic stream.
// next_int_in_range(low, high) is inclusive on both ends.
dice_roll    = stream.next_int_in_range(1, 100)
card_index   = stream.next_int_in_range(0, 51)
lootbox_tier = stream.next_weighted_pick(probability_table)
crash_point  = stream.next_geometric(house_edge_parameter)

The critical property: the byte stream is fully deterministic. Given the server seed, client seed, and nonce, a verifier can reconstruct the exact same ChaCha20 stream and replay every random operation byte-for-byte. This is not possible with hash-based approaches that produce isolated outputs rather than a continuous stream.
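
As one illustration of a draw built on the stream, here is a sketch of a weighted pick. Two things are assumptions made for the example: weights are expressed as integers (parts per thousand here) so that no floating-point rounding can differ between server and verifier, and the stream is a standard-library stand-in (SHAKE-256 output wrapped in BytesIO) used only so the snippet runs on its own. In the architecture described here it would be the ChaCha20 stream.

```python
import hashlib
import io
from typing import BinaryIO, Sequence

def next_int_below(stream: BinaryIO, n: int) -> int:
    """Uniform integer in [0, n), rejection-sampled from whole bytes."""
    nbytes = max(1, ((n - 1).bit_length() + 7) // 8)
    limit = (256 ** nbytes // n) * n  # largest multiple of n that fits
    while True:
        raw = stream.read(nbytes)
        if len(raw) != nbytes:
            raise EOFError("stream exhausted")
        value = int.from_bytes(raw, "big")
        if value < limit:  # values in the biased tail are discarded
            return value % n

def next_weighted_pick(stream: BinaryIO, weights: Sequence[int]) -> int:
    """Return the index of the chosen item; weights are non-negative integers."""
    ticket = next_int_below(stream, sum(weights))
    for index, weight in enumerate(weights):
        if ticket < weight:
            return index
        ticket -= weight
    raise AssertionError("unreachable: ticket is always below sum(weights)")

# Stand-in byte source for the demo only (not ChaCha20).
stream = io.BytesIO(hashlib.shake_256(b"demo-seed").digest(1 << 16))
weights = [1, 9, 90, 300, 600]  # 0.1%, 0.9%, 9%, 30%, 60%
tier = next_weighted_pick(stream, weights)
```

A verifier needs only the published weight table and the revealed seed to replay the same ticket and land on the same index.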

Why No Bias

A common pitfall in random number generation is modulo bias. When you map a random value to a range using the modulo operator (random_value % range), the result is biased unless the random value's range is an exact multiple of the target range.

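For example, mapping a byte (0-255) to 0-99 using modulo: each value in 0-55 is produced by three different bytes, while each value in 56-99 is produced by only two, because 256 is not a multiple of 100. The low values are therefore 50% more likely (3/256 versus 2/256). Wider inputs shrink the bias but do not remove it, and in gambling, where millions of rounds are played, even a 0.01% bias is exploitable.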
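
The byte-to-percent example can be checked by brute force. This small sketch simply counts how many of the 256 byte values land on each result:

```python
from collections import Counter

# How many byte values map to each result under byte % 100?
counts = Counter(byte % 100 for byte in range(256))

low = {counts[v] for v in range(0, 56)}     # results 0-55
high = {counts[v] for v in range(56, 100)}  # results 56-99
print(low, high)  # {3} {2}
```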

The correct approach is rejection sampling: draw bytes from the ChaCha20 stream, map to the target range, and if the raw value falls in the biased tail, discard it and draw again. Since ChaCha20 can produce up to 256 GiB of keystream per key and nonce (2³² blocks of 64 bytes), rejection sampling never runs out of material in practice. When the raw value is only as wide as the range requires, more than half of all raw values are accepted, so the expected number of rejections is less than 1 per draw and the performance cost is negligible.
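
A minimal sketch of rejection sampling, assuming the stream exposes a read(n) method and that next_int_in_range is inclusive on both ends. The byte width and big-endian decoding are choices made for this example; whatever is chosen must be identical in the server and in the verifier.

```python
import io
from typing import BinaryIO

def next_int_in_range(stream: BinaryIO, low: int, high: int) -> int:
    """Uniform integer in [low, high], both ends inclusive."""
    span = high - low + 1
    if span <= 0:
        raise ValueError("high must be >= low")
    nbytes = max(1, ((span - 1).bit_length() + 7) // 8)
    limit = (256 ** nbytes // span) * span  # largest multiple of span that fits
    while True:
        raw = stream.read(nbytes)
        if len(raw) != nbytes:
            raise EOFError("stream exhausted")
        value = int.from_bytes(raw, "big")
        if value < limit:       # accept: every residue is equally likely
            return low + value % span
        # otherwise the value is in the biased tail: discard and draw again

# With span = 100 the limit is 200, so byte 250 is rejected and byte 7 is used.
demo = io.BytesIO(bytes([250, 7]))
print(next_int_in_range(demo, 0, 99))  # 7
```

Note that rejected bytes are still consumed from the stream, so the verifier must apply exactly the same rejection rule to stay aligned with the server.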

This is not a new technique - it's the standard recommendation for unbiased random integer generation. But it's easier to implement correctly with a stream cipher than with a hash, because a stream naturally provides sequential independent bytes while a hash requires re-hashing with an incremented counter.

Standards Compliance

ChaCha20 as a DRBG aligns with the principles in NIST SP 800-90A for deterministic random bit generators:

Deterministic output from a seed (key + nonce)
Backtracking resistance - compromising one stream doesn't reveal the master seed
Forward secrecy - each round's key is independently derived
Resistance to state compromise - the server seed is encrypted at rest, and derived keys are ephemeral

While NIST SP 800-90A specifically approves Hash_DRBG, HMAC_DRBG, and CTR_DRBG (the last based on a block cipher such as AES), and ChaCha20 is not on that list, ChaCha20 is increasingly recognized as an equivalent or superior construction for software implementations. The Linux kernel replaced its previous CSPRNG with a ChaCha20-based design. Google and Cloudflare prefer ChaCha20-Poly1305 for TLS on clients without AES hardware acceleration. It is specified in RFC 8439. A gambling RNG auditor evaluating ChaCha20 against GLI-19 criteria would likely find it meets the core requirements: statistical randomness, unpredictability without the key, determinism for replay, and resistance to manipulation.

Proving a Shuffle Is Fair

The Fisher-Yates shuffle is the standard correct general-purpose shuffling algorithm. It produces each of the n! possible permutations with equal probability - but only if each random selection is truly uniform over the correct range.

Here's how it works on a 52-card deck, powered by a ChaCha20 stream:

deck = [0, 1, 2, ..., 51]

for i from 51 down to 1:
    j = stream.next_int_in_range(0, i)   // uniform, rejection-sampled
    swap(deck[i], deck[j])

The shuffle consumes 51 random integers from the stream, each in a different range. The stream is deterministic. Therefore, the shuffle is deterministic.
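
In Python, the loop above might look like the sketch below. The helper next_int_in_range is a hypothetical rejection sampler written for this example (inclusive bounds, one byte per draw since the span never exceeds 52), and the stream is any object with read(n). The all-zero stream in the usage lines is a stand-in that makes the swaps easy to follow by hand.

```python
import io
from typing import BinaryIO, List

def next_int_in_range(stream: BinaryIO, low: int, high: int) -> int:
    """Uniform integer in [low, high] via rejection sampling."""
    span = high - low + 1
    nbytes = max(1, ((span - 1).bit_length() + 7) // 8)
    limit = (256 ** nbytes // span) * span
    while True:
        raw = stream.read(nbytes)
        if len(raw) != nbytes:
            raise EOFError("stream exhausted")
        value = int.from_bytes(raw, "big")
        if value < limit:
            return low + value % span

def shuffle_deck(stream: BinaryIO, size: int = 52) -> List[int]:
    """Fisher-Yates: position i is swapped with a uniform j in [0, i]."""
    deck = list(range(size))
    for i in range(size - 1, 0, -1):
        j = next_int_in_range(stream, 0, i)
        deck[i], deck[j] = deck[j], deck[i]
    return deck

# An all-zero stream always yields j = 0, so each step swaps with the front.
print(shuffle_deck(io.BytesIO(bytes(16)), size=4))  # [1, 2, 3, 0]
```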

Verification by the player:

After the round, the server seed is revealed (see the "Seed Rotation Flow" section).
The verifier derives the same HMAC-SHA256 key and nonce.
They initialize an identical ChaCha20 stream.
They replay the Fisher-Yates shuffle, consuming bytes in the same order.
If the resulting deck order matches what was dealt, the shuffle was fair.

This is what hash-based provably fair cannot do. To verify a shuffle with HMAC-SHA512, you'd need to either:

Hash 51 times with incremented nonces (and prove the nonce sequence was predetermined), or
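Extract 51 values from a single hash output (which is only 64 bytes - in principle more than the roughly 226 bits needed to index 52! permutations, but not enough to guarantee 51 rejection-sampled draws, since every rejected value eats into a fixed budget).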

With a 256-bit key, ChaCha20 has 2²⁵⁶ possible states - vastly more than the 2²²⁶ needed for 52! permutations. The state space is sufficient for a cryptographically correct shuffle.
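
The permutation figure is easy to reproduce. A quick check of how many bits are needed to index every ordering of a 52-card deck:

```python
import math

# Bits required to index all 52! orderings of a deck.
bits = math.log2(math.factorial(52))
print(round(bits, 2))  # 225.58

# A 256-bit key space exceeds this; a 128-bit one would not.
print(bits < 256, bits < 128)  # True False
```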

The Naive Shuffle Trap

A critical implementation mistake is the "naive shuffle" - selecting j from the full range [0, n-1] at every step instead of [0, i]. This produces a biased distribution where some permutations are dramatically more likely than others. With 3 cards, a naive shuffle produces 27 possible outcomes (3³) mapped onto 6 permutations - an uneven mapping that favors certain orderings.
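
The 3-card case is small enough to enumerate exhaustively. The sketch below walks all 27 choice sequences of the naive shuffle (every position swapped with a j drawn from the full range) and counts the resulting orderings:

```python
from collections import Counter
from itertools import product

def naive_shuffle(choices):
    """Swap each position i with choices[i], drawn from the full range."""
    deck = [0, 1, 2]
    for i, j in enumerate(choices):
        deck[i], deck[j] = deck[j], deck[i]
    return tuple(deck)

# Every possible sequence of three draws from {0, 1, 2}: 3^3 = 27 in total.
counts = Counter(naive_shuffle(c) for c in product(range(3), repeat=3))
print(sorted(counts.values()))  # [4, 4, 4, 5, 5, 5]
```

Since 27 is not divisible by 6, no assignment could be even: three orderings come up 5 times out of 27 and the other three only 4 times.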

In a gambling context, this bias could systematically favor certain hands or card sequences. Using the Fisher-Yates algorithm correctly, with each draw consuming from the ChaCha20 stream via rejection sampling, eliminates both the algorithmic bias (naive shuffle) and the numeric bias (modulo arithmetic).

The Seed Rotation Flow

The provably fair contract between player and operator rests on a commit-reveal protocol. The design must satisfy competing requirements: the player must not know the server seed before the round (preventing prediction), but must be able to verify it after (proving fairness). The server seed must be protected from insiders while active, and revealed cleanly when rotated.

The Lifecycle

                 ┌─────────────┐
                 │  Generate   │
                 │ server seed │  (32 bytes from OS CSPRNG)
                 └──────┬──────┘
                        │
                        ▼
                 ┌─────────────┐
                 │   Encrypt   │  (AES-256-CBC with random IV)
                 │  at rest    │  (versioned encryption key)
                 └──────┬──────┘
                        │
                        ▼
                 ┌─────────────┐
                 │  Publish    │
                 │ SHA-256     │  Player sees the hash.
                 │   hash      │  Cannot reverse it.
                 └──────┬──────┘
                        │
                   [rounds played -> nonce increments]
                        │
                        ▼
                 ┌─────────────┐
                 │  Player     │
                 │  requests   │  Triggers rotation.
                 │  rotation   │
                 └──────┬──────┘
                        │
                        ▼
              ┌──────────────────┐
              │  Decrypt old     │
              │  server seed,    │  Old seed is revealed.
              │  generate new    │  New seed pair begins.
              └──────────────────┘
                        │
                        ▼
              ┌──────────────────┐
              │  Player verifies │  hash(revealed_seed) == committed_hash?
              │  all past rounds │  Replay ChaCha20 stream for each nonce.
              └──────────────────┘
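
The commit and verify steps of this lifecycle reduce to a hash comparison. A minimal sketch, assuming the commitment is the hex SHA-256 of the raw server seed (the function names are illustrative, not part of any published API):

```python
import hashlib
import hmac
import secrets

def commit(server_seed: bytes) -> str:
    """Hash published to the player before any round is played."""
    return hashlib.sha256(server_seed).hexdigest()

def verify_reveal(revealed_seed: bytes, committed_hash: str) -> bool:
    """Player-side check after rotation: does the reveal match the commitment?"""
    return hmac.compare_digest(commit(revealed_seed), committed_hash)

# Operator side: 32 bytes from the OS CSPRNG, committed up front.
server_seed = secrets.token_bytes(32)
committed_hash = commit(server_seed)

# Player side, after rotation.
ok = verify_reveal(server_seed, committed_hash)
```

Only if this check passes does it make sense to replay the ChaCha20 stream for each nonce; a mismatch means the revealed seed is not the one that was committed.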

Guardrails

The nonce must be monotonically increasing. Each round increments the nonce by exactly one. If the operator could choose nonces, they could try different values until a favorable outcome appeared. Monotonic nonces, published alongside each round result, prevent this. A verifier who sees nonces 1, 2, 3, 5 (skipping 4) knows something was suppressed.
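
A verifier-side gap check can be very small. This sketch assumes the player has the list of nonces published with their round results, and that the sequence is meant to be contiguous between the first and last published nonce:

```python
from typing import Iterable, List

def missing_nonces(published: Iterable[int]) -> List[int]:
    """Nonces absent between the lowest and highest published value."""
    seen = set(published)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]

print(missing_nonces([1, 2, 3, 5]))  # [4]
```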

The client seed must be player-controllable. If the server provides both seeds, the operator could pre-compute server seeds that produce favorable outcomes for a given client seed. The player must be able to set their own client seed at any time, after the server seed hash has already been committed. The client seed should be hashed before use in key derivation: using SHA-256(client_seed) as the key derivation input rather than the raw client seed normalizes arbitrary player input to a fixed length and adds a layer of indirection. Hashing alone does not stop precomputation; the prior commitment to the server seed does.

Seed rotation must be player-initiated. The server must never unilaterally reveal a seed - that would allow the operator to reveal seeds selectively, hiding unfavorable sequences. Only the player's rotation request triggers decryption and reveal. Once revealed, the old seed is immutable and publicly verifiable.

Insider Threat Prevention

The server seed is the crown jewel. Anyone who knows an active server seed can compute every future round's outcome until the player rotates. This makes at-rest encryption and key management critical.

Encrypt the server seed the moment it's generated. The raw seed should exist in memory only during generation and during decryption for reveal. At rest, it's encrypted with a symmetric key (AES-256-CBC with a random IV per seed). The encryption key should never be stored alongside the encrypted seeds.

Version the encryption keys. Use a key identifier stored alongside each encrypted seed. When rotating encryption keys (which should happen on a regular schedule), re-encrypt all active seeds with the new key. This way, a compromised old key doesn't expose newly created seeds, and you can audit which key version each seed was encrypted with.

Rate-limit and audit the reveal endpoint. Seed reveals should be logged with timestamps, requester identity, and the seed pair identifier. Unusual patterns (mass reveals, reveals without corresponding player requests, reveals during active rounds) should trigger alerts. The reveal endpoint should enforce that a seed can only be decrypted after the player has explicitly requested rotation and a new seed pair has been generated.

Never log or cache the raw server seed. Ensure application logs, error tracking systems, and caching layers never contain the plaintext server seed. The encrypted form can be logged safely; the plaintext cannot. This is the most common insider threat vector - not a malicious database query, but someone grepping production logs.

What to Test and How

A provably fair RNG implementation requires testing at multiple levels: determinism, statistical correctness, bias absence, and verification correctness. Statistical tests alone are insufficient - a system could pass NIST SP 800-22 randomness tests while still being vulnerable to prediction if the key derivation is flawed.

Determinism Tests

Pin the inputs, assert exact outputs.

server_seed = "known_test_value_a"
client_seed = "known_test_value_b"
nonce       = 42

stream = initialize_chacha20(server_seed, client_seed, nonce)
assert stream.next_bytes(32) == expected_hex_a
assert stream.next_int_in_range(0, 51) == expected_card
assert stream.next_weighted_pick([0.1, 0.3, 0.6]) == expected_tier

These are snapshot tests. If the key derivation, ChaCha20 initialization, or byte-consumption order changes even slightly, the outputs change. Any code modification that breaks determinism breaks provably fair verification for all historical rounds. These tests are your regression safety net.

Test determinism across implementations. If you provide a verification tool (which you should), test that your server-side implementation and the client-side verification tool produce identical outputs for the same inputs. Cross-implementation determinism is the entire point of provably fair - if the player's verifier disagrees with the server, the system is broken regardless of who is correct.

Distribution Tests

Run at least 1 million iterations; up to 1 billion is better for catching subtle non-uniformity in the distribution. For each random operation, generate the samples (1M in the examples here) and verify the distribution:

// Uniform distribution (dice roll, card selection)
counts = histogram(1_000_000 calls to next_int_in_range(0, 5))
for each bucket:
    assert abs(count - 166_667) < 6_sigma_threshold

// Weighted distribution (lootbox rarity)
weights = [0.001, 0.009, 0.09, 0.3, 0.6]
counts  = histogram(1_000_000 calls to next_weighted_pick(weights))
for each bucket:
    assert abs(observed_frequency - expected_frequency) < tolerance

The 6-sigma threshold (6 standard deviations from the expected mean) means a false positive occurs roughly once in 500 million checks per assertion, assuming bucket counts are approximately normal. For gambling, where any detectable bias is a regulatory and financial risk, 6-sigma is the appropriate bar.
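
The threshold itself follows from the binomial standard deviation of a bucket count. A short sketch for the six-sided example, using the normal approximation (the function name is illustrative):

```python
import math

def sigma_threshold(samples: int, probability: float, sigmas: float = 6.0) -> float:
    """Allowed deviation of a bucket count from its expected value."""
    return sigmas * math.sqrt(samples * probability * (1.0 - probability))

# One bucket of a fair six-way draw over 1M samples: about 2236 counts.
threshold = sigma_threshold(1_000_000, 1 / 6)

# Two-sided probability of exceeding 6 sigma by chance: about 2e-9.
false_positive = math.erfc(6 / math.sqrt(2))
```

For very rare buckets, such as the 0.1% lootbox tier, the normal approximation is weaker, so it is worth either raising the sample count or using an exact binomial bound there.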

Test the shuffle specifically. Fisher-Yates correctness requires testing that every position in the shuffle receives every value with equal probability:

position_counts = matrix(deck_size, deck_size, init=0)

repeat 1_000_000 times:
    deck = shuffle(new_deck(), new_stream())
    for each position p:
        position_counts[p][deck[p]] += 1

// After all shuffles have been counted:
for each cell in position_counts:
    expected = 1_000_000 / deck_size
    assert abs(cell - expected) < 6_sigma_threshold

This catches the naive shuffle bug - if j is drawn from the wrong range, certain positions will show statistically significant bias even though the overall value distribution appears uniform.

Geometric Distribution Tests

For crash-style games where the multiplier follows a geometric distribution:

samples = [stream.next_geometric(house_edge) for _ in range(1_000_000)]

assert abs(median(samples) - expected_median) < tolerance
assert abs(mean(samples) - expected_mean) < tolerance

// Verify the tail: P(multiplier > X) should match (1 - house_edge)^X
for threshold in [2, 5, 10, 50, 100]:
    observed_tail = count(s > threshold for s in samples) / 1_000_000
    expected_tail = (1 - house_edge) ^ threshold
    assert abs(observed_tail - expected_tail) < tolerance

The tail test is critical. The house edge in a crash game is entirely determined by the geometric distribution's parameters. If the tail is thinner than stated, the operator is taking a larger edge than advertised.

Verification Round-Trip Test

The most important test. Simulate a complete player experience:

// Server side: generate seed pair, play rounds, reveal
server_seed  = generate_random(32)
client_seed  = "player_chosen_seed"
committed_hash        = SHA256(server_seed)   // commit to the raw seed, not the ciphertext
encrypted_server_seed = encrypt(server_seed)  // at-rest storage only

for nonce in 1..100:
    stream = initialize_chacha20(server_seed, client_seed, nonce)
    outcome = play_round(stream)
    recorded_outcome[nonce] = outcome

revealed_seed = decrypt(encrypted_server_seed)

// Verification side: reconstruct and compare
assert SHA256(revealed_seed) == committed_hash

for nonce in 1..100:
    stream = initialize_chacha20(revealed_seed, client_seed, nonce)
    verified_outcome = play_round(stream)
    assert verified_outcome == recorded_outcome[nonce]

If this test passes for every game type and every edge case (empty rounds, maximum bets, minimum bets, rounds with free bets), the provably fair contract is intact.

What Not to Test

Don't test ChaCha20 itself. It's a published, peer-reviewed, formally analyzed cipher. Testing whether ChaCha20 produces random-looking output means re-testing widely accepted work - your time is better spent testing your key derivation, your byte-consumption order, and your game logic.

Conclusion

The provably fair ecosystem is stuck on hash-and-modulo for historical reasons - it was the first approach that worked, and it's simple to explain. But as casino games grow more complex (multiplayer rounds, multi-step games, weighted distributions, full deck shuffles), the limitations of hash-based approaches become untenable.

A ChaCha20 stream cipher gives you what hashing cannot: an arbitrarily long, deterministic, verifiable byte stream from which every random operation in a round draws its values. The player doesn't verify one number - they verify the entire random sequence.

The cryptographic foundations are not experimental. ChaCha20 is RFC 8439, deployed in TLS 1.3, trusted by the Linux kernel, and resistant to the cache-timing attacks that plague AES software implementations. Using it as a gambling CSPRNG isn't a leap - it's a natural application of a cipher designed for exactly this kind of deterministic, high-throughput random generation.

The house edge should be in the math, not in the opacity.

I build provably fair gambling systems and financial backend infrastructure. If you're working on iGaming compliance, RNG architecture, or distributed financial systems - connect with me on LinkedIn.

#igaming #information-security #casino-games #software-architecture #programming