I Played 1,379 Chess Games in 3 Months. The Data Explains Why I Keep Losing.

Most losses don't come from bad moves. They come from patterns you don't notice.

Mostafa Nabil

~9 min read · March 30, 2026 (Updated: April 2, 2026) · Free: Yes

Try it with your own chess.com username from here

The clock hits 12:00. Again.

I'm staring at a lost position I could have saved. My hand is already on the mouse. Queue next game.

Three months later, that habit adds up to 1,379 games. Bullet, blitz, rapid. Around 15 a day.

Chess.com's free version gives you a rating graph and a win rate. It doesn't answer the questions that actually matter. I wanted to know:

Do I actually get worse when I'm on a losing streak?
What time of day should I be playing?
How many games is too many in one sitting?

So I built a pipeline to answer these questions. I pulled all my games from the chess.com API, loaded them into DuckDB, transformed them with dbt using a medallion architecture, orchestrated everything with Dagster, and built a Streamlit dashboard to visualize the results.

This article is about what the data told me. I'll also cover some of the SQL that was interesting to write.

Repo: github.com/MostafaNabilll/chesslens

The numbers at a glance

Here's where I stand across time controls:

Format    | Games | Win Rate | Peak | Current | Timeout Loss
----------|-------|----------|------|---------|-------------
Bullet    | 521   | 51.4%    | 1007 | 805     | 18.2%
Blitz     | 292   | 50.7%    | 917  | 859     | 7.2%
Rapid     | 562   | 48.2%    | 1084 | 1058    | 0.4%

Three things jump out.

1. Bullet. Nearly one in five games ends on time. Not lost on the board. Lost on the clock.

2. Rapid rating dropped from 1084 to 865, then recovered to 1058, which tells me I've been playing too much bullet instead of slower games.

3. Four daily games with a 100% win rate, but let's not pretend that means anything.

There are smaller details too. A 14-game win streak I did not remember. An 8-game losing streak I probably should.

Overview

Finding #1: I don't tilt (apparently)

"Tilt" is when you keep playing after losing and your performance drops because you're frustrated. Everyone assumes they tilt. I assumed I tilt. Nope.

I expected this to confirm the usual pattern. Lose a game → get frustrated → play worse.

Here's my win rate based on how many games I'd lost in a row right before:

Consecutive Losses | Games | Win Rate
-------------------|-------|----------
0                  | 752   | 50.4%
1                  | 333   | 43.8%
2                  | 173   | 55.5%
3                  | 74    | 59.5%
4                  | 28    | 53.6%
5                  | 13    | 61.5%

After one loss my win rate drops to 43.8%. Makes sense. But after two or more? It climbs back up.

After three straight losses I'm winning nearly 60%. After five, 61.5%.

It does not feel like that while playing. It feels like things are getting worse.

What is probably happening is simpler. One loss does not change how I play. A streak, however, forces me to slow down and pay attention.

Tilt Tracker

The SQL behind this was the trickiest part of the whole project. I needed to count how many losses happened in a row right before each game. The approach: create "groups" using a running SUM that increments every time I win or draw. All consecutive losses share the same group. Then ROW_NUMBER inside each group gives the streak count, and LAG shifts it to the next game. If that sounds confusing, the full query is in the repo.
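For readers who want the shape of it without opening the repo, here is a minimal sketch of the same three steps (running SUM, ROW_NUMBER, LAG). It is not the query from the repo: it runs on Python's built-in sqlite3 rather than DuckDB, and the `games` table, its `game_no` and `result` columns, and the function name are all assumptions made for the example.

```python
import sqlite3

# Assumed toy schema: games(game_no, result), result in 'win' / 'loss' / 'draw'.
STREAK_SQL = """
WITH grouped AS (
    SELECT
        game_no,
        result = 'loss' AS is_loss,
        -- Running count of non-losses: every win or draw starts a new group.
        SUM(CASE WHEN result = 'loss' THEN 0 ELSE 1 END)
            OVER (ORDER BY game_no) AS grp
    FROM games
),
streaks AS (
    SELECT
        game_no,
        -- Number the losses inside each group; wins and draws count as 0.
        CASE WHEN is_loss
             THEN ROW_NUMBER() OVER (PARTITION BY grp, is_loss ORDER BY game_no)
             ELSE 0
        END AS streak_after_game
    FROM grouped
)
SELECT
    game_no,
    -- Shift by one row: the streak after game N-1 is the streak before game N.
    COALESCE(LAG(streak_after_game) OVER (ORDER BY game_no), 0) AS losses_before
FROM streaks
ORDER BY game_no
"""


def losses_before_each_game(results):
    """Return, for each game in order, how many losses came right before it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE games (game_no INTEGER PRIMARY KEY, result TEXT)")
    conn.executemany(
        "INSERT INTO games VALUES (?, ?)", list(enumerate(results, start=1))
    )
    rows = conn.execute(STREAK_SQL).fetchall()
    conn.close()
    return [losses for _, losses in rows]


history = ["win", "loss", "loss", "win", "loss", "loss", "loss", "draw", "win"]
print(losses_before_each_game(history))
```

Partitioning by both the group and the loss flag keeps the win or draw that opens each group from being counted as the first row of the streak.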

Finding #2: Stop playing at night

I already knew this. Now I have a heatmap proving it.

My worst win rates, filtered to time slots where I played at least 5 games:

Day       | Time      | Format | Games | Win Rate
----------|-----------|--------|-------|----------
Monday    | afternoon | rapid  | 7     | 14.3%
Monday    | night     | blitz  | 7     | 14.3%
Saturday  | night     | rapid  | 6     | 16.7%
Saturday  | afternoon | bullet | 5     | 20.0%
Tuesday   | night     | blitz  | 5     | 20.0%

14.3% win rate on Monday nights playing blitz. That means I'm winning 1 out of 7 games and still queuing up for another one. Monday afternoon rapid is just as bad. Saturday night isn't great either.
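The "at least 5 games" filter matters more than it looks, because a slot with one game is always 0% or 100%. A rough sketch of the aggregation, assuming each game has already been labeled with a day, a part of day, and a format (the tuple layout and the function name are made up for this example, not taken from the repo):

```python
from collections import defaultdict


def worst_slots(games, min_games=5, limit=5):
    """games: iterable of (day, part_of_day, time_class, won) tuples.

    Returns up to `limit` rows of (slot, games_played, win_rate_percent),
    lowest win rate first, skipping slots with fewer than `min_games` games.
    """
    totals = defaultdict(lambda: [0, 0])  # slot -> [games played, wins]
    for day, part_of_day, time_class, won in games:
        entry = totals[(day, part_of_day, time_class)]
        entry[0] += 1
        entry[1] += int(won)

    rows = [
        (slot, played, 100.0 * wins / played)
        for slot, (played, wins) in totals.items()
        if played >= min_games  # drop tiny samples before ranking
    ]
    rows.sort(key=lambda row: row[2])
    return rows[:limit]


# Toy data only: 7 Monday-night blitz games with 1 win, plus a slot too small to keep.
sample = [("Monday", "night", "blitz", i == 0) for i in range(7)]
sample += [("Tuesday", "night", "blitz", False) for _ in range(3)]
for slot, played, rate in worst_slots(sample):
    print(slot, played, round(rate, 1))
```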

The dashboard has a heatmap that shows this at a glance. Green cells = good times to play. Red cells = close the laptop.

When to play

Finding #3: My Scotch Game is working

I've been studying the Scotch Game opening for rapid over the past few months.

January: 44.4% win rate
February: 51.9% win rate
March: 61.4% win rate

Three months with steady improvement.

Scandinavian Defense in rapid went from 50% to 40% over the same period.
Caro-Kann in bullet crashed to 12.5% in February, down from an already bad 25% in January.

Some openings just don't work at faster time controls due to time pressure.

Opening Analysis

This section exposed a bug that took longer to find than it should have. I was filtering out openings with fewer than five games after applying a window function, so trends were sometimes calculated against rows that the filter removed afterwards. The filter has to run before the window function, not after it.
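This class of bug is easy to reproduce on a toy table. The sketch below is not the project's model: it uses Python's built-in sqlite3 instead of DuckDB, a made-up `monthly` table for a single opening, and invented numbers. It only shows how the order of the filter and the LAG changes the result.

```python
import sqlite3

# Buggy order: compute the month-over-month change first, filter small samples after.
BUGGY_SQL = """
WITH trended AS (
    SELECT month, games,
           win_rate - LAG(win_rate) OVER (ORDER BY month) AS change
    FROM monthly
)
SELECT month, change FROM trended WHERE games >= 5 ORDER BY month
"""

# Fixed order: drop small samples first, so LAG only sees rows that survive.
FIXED_SQL = """
WITH kept AS (
    SELECT month, win_rate FROM monthly WHERE games >= 5
)
SELECT month,
       win_rate - LAG(win_rate) OVER (ORDER BY month) AS change
FROM kept
ORDER BY month
"""

# Invented rows: (month, games, win_rate). February is a 2-game sample.
TOY_ROWS = [("2026-01", 10, 40.0), ("2026-02", 2, 0.0), ("2026-03", 10, 60.0)]


def monthly_changes(sql, rows):
    """Load the toy rows into an in-memory table and run one of the queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE monthly (month TEXT, games INTEGER, win_rate REAL)")
    conn.executemany("INSERT INTO monthly VALUES (?, ?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


print(monthly_changes(BUGGY_SQL, TOY_ROWS))  # March is compared with the hidden February row
print(monthly_changes(FIXED_SQL, TOY_ROWS))  # March is compared with January
```

In the buggy version March shows a 60-point jump measured against a February row that never appears in the output. In the fixed version it shows a 20-point change against January.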

Finding #4: I crush lower-rated opponents but fall apart against higher ones (OBVIOUSLY!)

I knew I'd struggle against stronger opponents. I didn't know it was this bad.

Opponent Strength | Bullet Win Rate | Rapid Win Rate
------------------|-----------------|---------------
Lower rated       | 83.1%           | 89.3%
Equal rated       | 48.2%           | 46.2%
Higher rated      | 21.3%           | 12.5%

Against lower rated players, I win most games.

Against higher rated players in rapid, I win one out of eight.

Equal rated games land close to fifty percent. Which is exactly what a rating system is supposed to do.

In bullet I do slightly better at 21%, probably because bullet has more chaos and time pressure leads to mistakes from both sides.

Opponent Analysis

Finding #5: Sessions of 6–10 games are the sweet spot

I defined a "session" as games played with less than 30 minutes between them, grouped by time control (mixing bullet and rapid ratings in one session would be useless).
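A minimal Python sketch of that session rule, assuming one timestamp per game and measuring the gap between consecutive timestamps within the same time control. The pipeline itself does this in dbt, so the function name and tuple layout here are illustrative only.

```python
from datetime import datetime, timedelta


def assign_sessions(games, max_gap=timedelta(minutes=30)):
    """games: list of (time_class, played_at) tuples, in any order.

    Returns a list of session ids aligned with the input order. A game joins
    the previous session of the same time control if it was played less than
    `max_gap` after that session's latest game; otherwise it starts a new one.
    """
    session_ids = [0] * len(games)
    last_seen = {}  # time_class -> (timestamp of latest game, its session id)
    next_id = 0

    # Walk the games chronologically but remember their original positions.
    for idx in sorted(range(len(games)), key=lambda i: games[i][1]):
        time_class, played_at = games[idx]
        previous = last_seen.get(time_class)
        if previous is not None and played_at - previous[0] < max_gap:
            session_id = previous[1]
        else:
            session_id = next_id
            next_id += 1
        session_ids[idx] = session_id
        last_seen[time_class] = (played_at, session_id)
    return session_ids


day = datetime(2026, 3, 1)
sample = [
    ("bullet", day.replace(hour=12, minute=0)),
    ("bullet", day.replace(hour=12, minute=10)),  # 10 min later: same session
    ("rapid", day.replace(hour=12, minute=15)),   # other time control: new session
    ("bullet", day.replace(hour=12, minute=45)),  # 35 min after last bullet: new session
    ("rapid", day.replace(hour=12, minute=40)),   # 25 min after last rapid: same session
]
print(assign_sessions(sample))
```

Keeping a separate "last game" per time control is what stops a rapid game played between two bullet games from splitting the bullet session.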

Here's how session length affects my rating:

Session Length | Sessions | Avg Rating Change
--------------|----------|------------------
1-2 games     | 258      | -0.9
3-5 games     | 95       | -1.5
6-10 games    | 36       | +7.1
11-20 games   | 20       | +15.1
20+ games     | 4        | +97.0

Short sessions lose rating. In anything under 6 games I'm probably quitting after a loss, which makes the session look negative.

6 to 10 games is where I warm up enough to play well without burning out.

The 20+ sessions show +97 average but that's 4 sessions total, and one was my first day when ratings swing wildly. So ignore that row.

Best session I ever had: 33 blitz games, went from 465 to 711.

Worst: 3 bullet games, 653 down to 510. 143 points gone in 3 games.

Both happened on day one.

Session Analysis

Finding #6: White is slightly better for me

Color | Games | Win Rate
------|-------|----------
White | 688   | 51.5%
Black | 691   | 48.8%

A 2.7-point gap. Consistent with the general principle that white has a slight advantage from moving first.

I play almost exactly 50/50 between the two colors, which is what you'd expect from random matchmaking.

Some smaller findings

18% of my bullet games are timeout losses. Nearly one in five. I'm either playing openings that are too slow for bullet or I just need to move faster.

Longest win streak: 14 games. Had no idea until I ran the query.

White vs black: 51.5% win rate as white, 48.8% as black. A 2.7-point gap, which is about what most players see since white moves first.

My rapid rating started around 923, peaked at 1084, then fell to 865 in late February, then climbed back to 1058 by late March. That recovery is probably the thing I'm most proud of in this whole dataset.

Rating Progression

How it's built

The ingestion is a Python script that pulls games from the chess.com public API with a 1 request per second rate limit (chess.com requires this). It can do a full backfill of all historical games or an incremental run that only grabs the current month. Raw JSON goes straight into DuckDB.
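Roughly, the ingestion loop could look like the sketch below. The two endpoints (the archives list and the per-month games URL) are part of the chess.com public API; everything else, including the function names, the User-Agent string, and the injectable `get_json` and `sleep` parameters, is an assumption for this example rather than the code in the repo. Loading into DuckDB is left out.

```python
import json
import time
import urllib.request
from datetime import date

API_ROOT = "https://api.chess.com/pub/player"


def http_get_json(url):
    """Fetch one URL and decode the JSON body."""
    request = urllib.request.Request(
        url,
        # Placeholder identification string; replace with your own contact info.
        headers={"User-Agent": "chess-stats-example (contact: you@example.com)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_games(username, incremental=False, get_json=http_get_json,
                sleep=time.sleep, delay=1.0, today=None):
    """Return raw game dicts: all months (backfill) or only the current month."""
    base = f"{API_ROOT}/{username.lower()}/games"
    if incremental:
        today = today or date.today()
        urls = [f"{base}/{today:%Y/%m}"]  # e.g. .../games/2026/03
    else:
        urls = get_json(f"{base}/archives")["archives"]  # one URL per month played

    games = []
    for url in urls:
        sleep(delay)  # pause before every monthly request (about 1 request/second)
        games.extend(get_json(url).get("games", []))
    return games
```

Passing `get_json` and `sleep` as parameters is only there so the loop can be exercised without touching the network. A real run would call `fetch_games("your_username")` and write each game's JSON to the bronze table.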

For transformation I used dbt-core with bronze, silver, and gold layers. Bronze keeps the raw JSON untouched. Silver parses it into typed columns and determines which side I was playing. The chess.com API returns data for both white and black, so I figure out my perspective once in a CTE and reuse it for every column instead of checking the username ten times. Gold has six analytics models. 18 schema tests across all layers.
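The "figure out my perspective once" idea is done in SQL in the silver layer, but it translates directly to Python. The sketch below assumes the game JSON shape the chess.com API returns (`white` and `black` objects, each with `username`, `rating`, and `result`). The output field names and the set of draw codes are my own choices for the example, not the silver model's columns.

```python
# Result codes that count as a draw for the player who received them (assumed list).
DRAW_CODES = {
    "agreed", "repetition", "stalemate",
    "insufficient", "50move", "timevsinsufficient",
}


def player_perspective(game, username):
    """Resolve which side `username` played once, then derive every field from it."""
    is_white = game["white"]["username"].lower() == username.lower()
    me, opponent = (
        (game["white"], game["black"]) if is_white
        else (game["black"], game["white"])
    )

    code = me["result"]
    if code == "win":
        outcome = "win"
    elif code in DRAW_CODES:
        outcome = "draw"
    else:
        outcome = "loss"  # checkmated, resigned, timeout, abandoned, ...

    return {
        "color": "white" if is_white else "black",
        "outcome": outcome,
        "my_rating": me["rating"],
        "opponent_rating": opponent["rating"],
        "lost_on_time": code == "timeout",
    }


sample_game = {
    "white": {"username": "Alice", "rating": 1000, "result": "timeout"},
    "black": {"username": "bob", "rating": 1050, "result": "win"},
}
print(player_perspective(sample_game, "alice"))
```

Once `me` and `opponent` are bound, no later line has to compare usernames again, which is the same saving the CTE gives in SQL.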

Dagster runs the orchestration. Ingestion and dbt models are registered as assets with dependencies, so it knows to pull games first and then transform. Daily schedule at 6 AM, whole pipeline takes about 30 seconds.

The dashboard is Streamlit with Plotly. Six pages, one per question, time control filters everywhere.

Why these tools: DuckDB because 1,379 games fit in memory and I didn't need a server. Dagster over Airflow because the asset model maps well to the medallion layers and I can develop locally without Docker. dbt because it's what data teams actually use.

dbt Lineage

Dagster

The Dagster setup had one issue that wasted about an hour. The pipeline kept failing with exit code 1 and completely empty logs. No error message at all. Eventually I figured out that I had both dbt Cloud CLI and dbt-core installed, and Dagster's subprocess was picking up the Cloud CLI instead of the dbt-core in my virtual environment. Fixed it by pointing Dagster to the exact dbt binary path using sys.executable. If you're wiring up dagster-dbt and getting silent failures, check which dbt is actually being called.
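One way to derive that path, assuming dbt-core was installed into the same virtual environment as the interpreter running Dagster (the helper name is made up for this sketch):

```python
import os
import sys
from pathlib import Path


def venv_dbt_executable():
    """Path where the dbt entry point is expected to sit, next to the running Python.

    In a virtual environment, console scripts are installed in the same
    directory as the interpreter (bin/ on POSIX, Scripts\\ on Windows).
    """
    name = "dbt.exe" if os.name == "nt" else "dbt"
    return Path(sys.executable).parent / name


dbt_path = venv_dbt_executable()
print(dbt_path, "(exists)" if dbt_path.exists() else "(not found)")
```

The resulting string can then be handed to whatever launches dbt. In dagster-dbt that would typically be the `dbt_executable` argument of `DbtCliResource`, but check it against the version you have installed.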

Anyone can use it

I didn't want this to be a project that only works for my account. So I added multi-user support.

When you open the dashboard, enter your chess.com username. If ChessLens hasn't seen that user before, it pulls all their games from the API, runs the full transformation pipeline, and builds their dashboard. Takes about a minute depending on how many games they have.

The data model handles this with a username column in every table. Silver, gold, all of it partitioned by username. Every query in the dashboard filters by whoever is currently logged in. Switching users is one button click in the sidebar.

Stockfish analysis

The last feature I added was on-demand Stockfish evaluation. On the game replay page, there's an "Analyze with Stockfish" button.

Click it, and Stockfish evaluates the game at depth 20. It takes around thirty to forty seconds.

After the analysis finishes, you get:

An accuracy percentage based on average centipawn loss
A count of best moves, excellent moves, good moves, inaccuracies, mistakes, and blunders
An evaluation bar next to the board that shows who's winning at each position
Colored dots next to moves that were inaccuracies, mistakes, or blunders

The results get cached in DuckDB so if you look at the same game again it loads instantly.

The accuracy formula is an approximation of the one used by chess.com. Their exact formula is proprietary, but the exponential decay based on average centipawn loss gets close. My results are usually within 5–10% of what chess.com shows for the same game.
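To make "exponential decay based on average centipawn loss" concrete, here is a sketch of the general shape. The decay constant is a placeholder chosen for illustration, not the value ChessLens or chess.com uses. The evaluation lists are assumed to be centipawn scores from the mover's point of view, one pair per move.

```python
import math


def average_centipawn_loss(best_evals, played_evals):
    """Mean of (eval of best move - eval of played move), floored at 0 per move."""
    losses = [max(0, best - played) for best, played in zip(best_evals, played_evals)]
    return sum(losses) / len(losses) if losses else 0.0


def accuracy_from_acpl(acpl, decay=0.01):
    """Map average centipawn loss to a 0-100 score with exponential decay.

    `decay` is a hypothetical tuning constant: 0 loss gives 100, and the
    score falls smoothly toward 0 as the average loss grows.
    """
    return 100.0 * math.exp(-decay * acpl)


# Toy evaluations for three moves: perfect, 30 cp worse, 60 cp worse than best.
acpl = average_centipawn_loss([50, 20, 0], [50, -10, -60])
print(acpl, round(accuracy_from_acpl(acpl), 1))
```

Tuning `decay` against a handful of games with known chess.com accuracy scores is one way to get an approximation like this into the right range.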

Try ChessLens with your own games

Find me on LinkedIn if you want to talk about analytics engineering, dbt, or chess.

#data-science #data-engineering #data-analysis #chess #python
