Detecting compromised devices that send spam is difficult for two reasons: spam-related network connections are usually short-lived and occur at unpredictable times, and legitimate corporate email campaigns or activity peaks on email servers can be mistaken for spam.
We use a supervised machine-learning method to automatically analyze outgoing messages and determine whether they are legitimate or spam. The method analyzes the frequency of outgoing messages from a given device, several properties of outgoing messages, and other variables related to the principle of locality.
For example, a server performing legitimate activity tends to connect to a predictable number of endpoints, and an anomalous increase might indicate that the device has been compromised. The method also inspects email subject fields for suspicious terminology, and recipient lists for unusually large numbers of recipients and for addresses at non-corporate or generic free email services.
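The signals described above could be turned into numeric features for a supervised classifier roughly as follows. This is a minimal sketch, not the method's actual feature set: the `OutgoingMessage` record, the `extract_features` function, the term list, the free-mail domain list, and the corporate domain are all hypothetical names and values chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Illustrative assumptions only; none of these values come from the source.
SUSPICIOUS_TERMS = {"winner", "prize", "urgent", "free"}
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}
CORPORATE_DOMAIN = "example.com"


@dataclass
class OutgoingMessage:
    """Hypothetical record of one outgoing message seen for a device."""
    device: str
    subject: str
    recipients: List[str]
    endpoint: str  # destination mail server the device connected to


def extract_features(messages: List[OutgoingMessage],
                     window_seconds: float = 60.0) -> Dict[str, float]:
    """Summarize one device's messages from a single observation window."""
    n_msgs = len(messages)
    recipients = [r.lower() for m in messages for r in m.recipients]
    domains = [r.rsplit("@", 1)[-1] for r in recipients]
    n_rcpt = len(recipients)

    # Count messages whose subject contains at least one suspicious term.
    suspicious = sum(
        1 for m in messages
        if SUSPICIOUS_TERMS & set(re.findall(r"[a-z]+", m.subject.lower()))
    )

    return {
        # Sending frequency for the device.
        "messages_per_minute": n_msgs * 60.0 / window_seconds,
        # Locality: how many distinct endpoints the device contacted.
        "distinct_endpoints": float(len({m.endpoint for m in messages})),
        # Recipient-list properties.
        "mean_recipients": n_rcpt / n_msgs if n_msgs else 0.0,
        "free_mail_ratio": (
            sum(d in FREE_MAIL_DOMAINS for d in domains) / n_rcpt
            if n_rcpt else 0.0
        ),
        "non_corporate_ratio": (
            sum(d != CORPORATE_DOMAIN for d in domains) / n_rcpt
            if n_rcpt else 0.0
        ),
        # Subject-field property.
        "suspicious_subject_ratio": suspicious / n_msgs if n_msgs else 0.0,
    }
```

A feature dictionary like this, computed per device and per time window, could then be paired with a legitimate/spam label and fed to any supervised learner. The choice of learner is left open here because the text does not name one.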
Preliminary results show that the method detects compromised devices sending spam with high accuracy, is computationally efficient, and flags outgoing spam almost immediately.