Analysis of sequential feature engineering and statistical features for malware behavior discovery

Malware Packet-sequence Clustering and Analysis (MalPaCA) is an unsupervised clustering application for malicious network behavior. It currently uses solely sequential features to characterize network behavior. In this paper, an extensive comparison between those sequential features and statistical features is performed. The comparison shows that better clustering performance is achievable with statistical features for longer connection sequences, and it yields advice on which features can be added to MalPaCA.