Beyond the Blend: A Ground-Truth Analysis of Bitcoin Mixer User Patterns

Employing machine learning to unravel the relationship between pre- and post-mixing transactions of Bitcoin mixer users

Master Thesis (2025)

Author(s)

P.H.M. de Haan (TU Delft - Technology, Policy and Management)

Contributor(s)

Rolf S. van Wegberg – Mentor (TU Delft - Organisation & Governance)

F. d'Hont – Graduation committee member (TU Delft - Policy Analysis)

K.J.M. Lubbertsen – Graduation committee member (Fiscale inlichtingen- en opsporingsdienst (FIOD))

Faculty

Technology, Policy and Management

To reference this document use:

https://resolver.tudelft.nl/uuid:9139c4a6-8bfe-4244-935c-6f63c46e0a2c

Publication Year

2025

Language

English

Graduation Date

28-08-2025

Awarding Institution

Delft University of Technology

Programme

Engineering and Policy Analysis

Abstract

Bitcoin mixers break the visible trail between incoming and outgoing transactions. By severing the link between pre-mixing and post-mixing addresses, they provide anonymity that is attractive for laundering illicit funds. For investigators, this creates two obstacles: the sheer number of mixer outputs overwhelms investigative capacity, and without knowledge of a mixer's internal mechanics they must rely on external transaction signals.

This thesis investigates whether transaction patterns before and after mixing can reduce the pool of possible post-mixing addresses linked to a pre-mixing address. The aim is not to prove exact one-to-one links but to narrow the search space so investigators can focus on the most likely outcomes.

We use a unique dataset seized from Bestmixer.io, a centralised mixer dismantled in 2019, containing thousands of verified pre- and post-mixing addresses. The analysis proceeds in two stages. First, we cluster wallets on address-level attributes using HDBSCAN, which yields only coarse profiles. Second, we build transaction graphs capturing how funds move through the mixer, learn graph embeddings with a Graph Autoencoder, and cluster them with k-means. This graph-based view reveals clearer transaction patterns. Pre-mixing, we identify consolidators pooling funds, straightforward depositors from exchanges, aggregator funnels combining smaller inputs, and higher-risk users whose funds arrive via unregulated services. Post-mixing, we find splitters dispersing funds, large distributors sending bigger amounts to fewer addresses, and straightforward users with minimal redistribution.
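
The graph-based view can be illustrated with a toy example. The sketch below is not the thesis pipeline, which learns embeddings with a Graph Autoencoder and clusters them with k-means. It only assumes a hypothetical list of (sender, receiver, amount) edges and counts fan-in and fan-out per address, the kind of structural signal that could separate a consolidator from a splitter. All addresses, amounts, and the `degree_profile` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical toy edges: (sender, receiver, amount in BTC). Not real data.
edges = [
    ("a1", "pre", 0.4), ("a2", "pre", 0.3), ("a3", "pre", 0.3),  # three inputs pooled into "pre"
    ("pre", "mixer", 1.0),
    ("mixer", "post", 1.0),
    ("post", "b1", 0.25), ("post", "b2", 0.25),                  # "post" spreads funds out
    ("post", "b3", 0.25), ("post", "b4", 0.25),
]

def degree_profile(edges):
    """Return {address: (fan_in, fan_out)}, counting distinct counterparties."""
    senders = defaultdict(set)    # address -> addresses that paid it
    receivers = defaultdict(set)  # address -> addresses it paid
    for src, dst, _amount in edges:
        receivers[src].add(dst)
        senders[dst].add(src)
    addresses = set(senders) | set(receivers)
    return {a: (len(senders[a]), len(receivers[a])) for a in addresses}

profile = degree_profile(edges)
print(profile["pre"])   # many inputs, one output: consolidator-like
print(profile["post"])  # one input, many outputs: splitter-like
```

Hand-built counts like these are far cruder than learned embeddings, but they show why the neighbourhood of an address can carry more information than address-level attributes alone.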

We then test whether pre-mixing patterns can predict post-mixing outcomes. Using tree-based ensemble models (Random Forest and Gradient Boosting) with graph embeddings and the original deposit amount, the best model achieves 48 percent accuracy across five classes, more than double the 20 percent baseline. This demonstrates that transaction graph signals can probabilistically reduce the investigative search space.
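
The comparison with the baseline can be checked with a few lines of arithmetic. This sketch assumes the 20 percent baseline corresponds to uniform random guessing over the five classes; the `lift_over_chance` helper is a hypothetical name used only here.

```python
def lift_over_chance(accuracy, n_classes):
    """Ratio of model accuracy to the accuracy of uniform random guessing."""
    chance = 1 / n_classes  # assumption: uniform guess over equally likely classes
    return accuracy / chance

# Figures taken from the abstract: 48 percent accuracy, five classes.
lift = lift_over_chance(accuracy=0.48, n_classes=5)
print(f"lift over chance: {lift:.1f}x")
```

A lift above 2 matches the statement that the best model more than doubles the baseline.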

The study provides the first ground-truth typology of mixer transaction patterns and shows that probabilistic “de-mixing” is feasible. Rather than pinpointing a single post-mixing address, the method highlights a smaller set of likely candidates, offering law enforcement a way to prioritise leads without access to a mixer’s internal mechanics.
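
One way such prioritisation might look in practice is sketched below. It assumes a classifier has already produced a probability per post-mixing pattern for one pre-mixing address, and that each candidate post-mixing address carries a pattern label. The `shortlist` helper, the three labels, the probabilities, and the addresses are all hypothetical, and the thesis itself reports five classes rather than three.

```python
def shortlist(class_probs, candidates, top_k=2):
    """Keep candidates whose pattern is among the top_k most probable classes."""
    ranked = sorted(class_probs, key=class_probs.get, reverse=True)
    keep = set(ranked[:top_k])
    kept = [(addr, label) for addr, label in candidates if label in keep]
    # Most probable pattern first; the sort is stable within a pattern.
    kept.sort(key=lambda pair: class_probs[pair[1]], reverse=True)
    return kept

# Hypothetical model output for a single pre-mixing address.
class_probs = {"splitter": 0.55, "large_distributor": 0.30, "straightforward": 0.15}

# Hypothetical candidate post-mixing addresses with their pattern labels.
candidates = [
    ("addr_A", "straightforward"),
    ("addr_B", "splitter"),
    ("addr_C", "large_distributor"),
    ("addr_D", "splitter"),
]

result = shortlist(class_probs, candidates)
print(result)
```

The output is a reduced, ordered candidate list rather than a single match, which mirrors the probabilistic framing above.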

Files

Thesis_PHM_de_Haan_Final.pdf

(pdf | 9.81 MB)

License info not available