Alejandro Cuevas

2 records found

Conference paper (2022) - Alejandro Cuevas, F.E.G. Miedema, Kyle Soska, Nicolas Christin, R.S. van Wegberg
A number of recent studies have investigated online anonymous (“dark web”) marketplaces. Almost all leverage a “measurement-by-proxy” design, in which researchers scrape market public pages, and take buyer reviews as a proxy for actual transactions, to gain insights into market size and revenue. Yet, we do not know if and how this method biases results. We build a framework to reason about marketplace measurement accuracy, and use it to contrast estimates projected from scrapes of Hansa Market with data from a back-end database seized by the police. We further investigate, by simulation, the impact of scraping frequency, consistency and rate-limits. We find that, even with a decent scraping regimen, one might miss approximately 46% of objects – with scraped listings differing significantly from not-scraped listings on price, views and product categories. This bias also impacts revenue calculations. We find Hansa’s total market revenue to be US $50M, which projections based on our scrapes underestimate by a factor of four. Simulations further show that studies based on one or two scrapes are likely to suffer from a very poor coverage (on average, 14% to 30%, respectively). A high scraping frequency is crucial to achieve reliable coverage, even without a consistent scraping routine. When high-frequency scraping is difficult, e.g., due to deployed anti-scraping countermeasures, innovative scraper design, such as scraping most popular listings first, helps improve coverage. Finally, abundance estimators can provide insights on population coverage when population sizes are unknown. ...
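
As a rough illustration of the last point, an abundance estimator can turn repeated sightings of listings across scrapes into an estimate of how many listings exist in total, and therefore how much of the market the scrapes covered. The sketch below uses the bias-corrected Chao1 estimator as one common choice; the abstract does not say which estimators the paper uses, so the choice of Chao1, the function names, and the sample listing IDs are all assumptions made for illustration.

```python
from collections import Counter


def chao1_estimate(sightings):
    """Bias-corrected Chao1 estimate of the total number of distinct listings.

    `sightings` holds one listing ID per time a listing appeared in a scrape
    (hypothetical input format, not taken from the paper).
    """
    counts = Counter(sightings)
    s_obs = len(counts)                                  # distinct listings seen
    f1 = sum(1 for c in counts.values() if c == 1)       # seen exactly once
    f2 = sum(1 for c in counts.values() if c == 2)       # seen exactly twice
    # Listings seen only once hint at listings never seen at all.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def estimated_coverage(sightings):
    """Fraction of the estimated population that the scrapes observed."""
    estimate = chao1_estimate(sightings)
    return len(set(sightings)) / estimate if estimate else 0.0


# Made-up sightings: listings "a" and "d" were seen twice, the rest once.
sample = ["a", "a", "b", "c", "d", "d", "e"]
print(chao1_estimate(sample))       # 5 observed + 3*2 / (2*3) = 6.0
print(estimated_coverage(sample))   # 5 / 6
```

Many singletons relative to doubletons push the estimate well above the observed count, which signals low coverage; when nearly every listing has been seen several times, the estimate stays close to the observed count.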

Systematizing advances in network measurements for protecting organizations

Conference paper (2021) - Mathew Vermeer, Jonathan West, Alejandro Cuevas, Shuonan Niu, Nicolas Christin, Michel Van Eeten, Carlos Gañán, Tyler Moore, T. Fiebig
Asset discovery is fundamental to any organization's cybersecurity efforts. Indeed, one must accurately know which assets belong to an IT infrastructure before the infrastructure can be secured. While practitioners typically rely on a relatively small set of well-known techniques, the academic literature on the subject is voluminous. In particular, the Internet measurement research community has devised a number of asset discovery techniques to support many measurement studies over the past five years. In this paper, we systematize asset discovery techniques by constructing a framework that comprehensively captures how network identifiers and services are found. We extract asset discovery techniques from recent academic literature in security and networking and place them into the systematized framework. We then demonstrate how to apply the framework to several case studies of asset discovery workflows, which could aid research reproducibility. These case studies further suggest opportunities for researchers and practitioners to uncover and identify more assets than might be possible with traditional techniques. ...