Clustering Payloads

Grouping Randomized Scan Probes Into Campaign Templates

Abstract

Over the past decade, the scanning landscape has changed significantly. Powerful tools such as Masscan and ZMap allow anyone to scan the entire Internet in a matter of hours. At the same time, stealthy scanners have emerged that map the Internet from thousands of vantage points at a low rate in an attempt to evade detection. As scanning is typically the first step towards a later intrusion, organizations need to track, understand, and draw intelligence from these scan campaigns. Insight into what adversaries are currently looking for may reveal new vulnerabilities, and relating the IP addresses that participate in the same scan campaign provides valuable insight into an adversary's capabilities. In this paper, we describe a protocol-agnostic approach that extracts commonalities and patterns from UDP scan traffic and relates individual scan packets to each other, regardless of whether they carry static data or randomize their payloads across destinations. The approach obtains 97% pattern accuracy with a data coverage of 96%. We apply our methodology to seven years of NTP and DNS scan traffic, demonstrating that our automatic clustering provides stable tracking of strategies over time and effectively identifies groups of source IPs that share these behavioral characteristics.
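
To make the idea of relating probes with partly randomized payloads concrete, the following minimal sketch derives a byte-level template from a set of equal-length UDP payloads and then groups source IPs whose probes fit it. It is an illustrative simplification and not the clustering method developed in the paper. The names `derive_template`, `matches`, `group_sources`, and `WILDCARD`, the sample payloads, and the documentation-range IP addresses are all assumptions made for this example.

```python
WILDCARD = None  # hypothetical marker for a byte position that varies across probes


def derive_template(payloads):
    """Keep the byte value at positions where all payloads agree, WILDCARD elsewhere."""
    length = len(payloads[0])
    if any(len(p) != length for p in payloads):
        raise ValueError("this sketch only handles equal-length payloads")
    template = []
    for i in range(length):
        values = {p[i] for p in payloads}
        template.append(values.pop() if len(values) == 1 else WILDCARD)
    return tuple(template)


def matches(template, payload):
    """True if the payload agrees with the template at every fixed position."""
    if len(payload) != len(template):
        return False
    return all(t is WILDCARD or t == b for t, b in zip(template, payload))


def group_sources(template, packets):
    """packets is an iterable of (src_ip, payload); return the sorted matching sources."""
    return sorted({src for src, payload in packets if matches(template, payload)})


# Three made-up probes: the first two bytes look randomized, the rest is static.
samples = [b"\x12\x34\x01\x00", b"\xab\xcd\x01\x00", b"\x9f\x00\x01\x00"]
template = derive_template(samples)

observed = [
    ("192.0.2.1", b"\x00\x01\x01\x00"),
    ("192.0.2.2", b"\x00\x01\x02\x00"),
    ("198.51.100.7", b"\xff\xff\x01\x00"),
]
campaign_sources = group_sources(template, observed)
```

Here `template` marks the first two positions as variable and fixes the last two, so `campaign_sources` contains the two sources whose payloads end in the same static bytes. A real system would additionally need to handle variable-length payloads and decide which probes to compare in the first place, which this sketch leaves out.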