Detection of algorithmically-generated domains

An adversarial machine learning approach

More Info
expand_more

Abstract

Domain name detection techniques are widely used to detect Algorithmically Generated Domain names (AGD) applied by Botnets. A major difficulty with these algorithms is to detect those generated names which are meaningful. In this way, Command and Control (C2) servers are detected. Machine learning techniques have been of great use to generalize the attributes of the meaningful names, generated algorithmically. To resist such techniques, the distribution of characters is used as a basis to generate meaningful domain names. Such techniques are called adversarial attacks attempting to fool machine learning methods. However, our experiments with more than 252757 samples show that in addition to character distribution of domain names, randomness property and pronounceability attributes are of great use to detect such meaningful names. Using these additional attributes, we have been able to identify malicious domain names with an accuracy of 98.19%.