Consequently, checking for previous uploads to the same endpoint is not sufficient to determine whether an external data upload is unusual, because an external service may have additional endpoints that have been observed but not yet associated with it. The service itself therefore needs to be characterized.
We can characterize a service by finding data-upload connections with common properties, such as identical or similar hostnames, identical JA3 client hashes (TLS client fingerprints), or identical autonomous system numbers (ASNs). We first identify the dominant transfer endpoints involved in a single upload event. We then successively restrict the remaining external connections made by the device according to properties of that dominant group, until a plausible characterization of the external service emerges.
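The characterization step can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the method described above: the `Connection` record, the `characterize_service` function, the 50% byte-share rule used to pick the dominant group, and the order in which properties are applied (ASN first, then JA3). The hostname comparison is deliberately naive and keeps only the last two DNS labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One external connection seen during an upload event (hypothetical record)."""
    hostname: str
    ja3: str
    asn: int
    bytes_out: int


def hostname_suffix(hostname: str, labels: int = 2) -> str:
    # Naive stand-in for a registered-domain lookup: keep the last `labels` labels.
    return ".".join(hostname.lower().split(".")[-labels:])


def characterize_service(event, dominance=0.5):
    """Build a service profile from the connections of a single upload event.

    `dominance` is an assumed threshold: the largest uploads are added to the
    dominant group until they cover this share of the bytes sent.
    """
    total = sum(c.bytes_out for c in event)
    if total == 0:
        return None

    # Step 1: find the dominant transfer endpoints.
    dominant, covered = [], 0
    for conn in sorted(event, key=lambda c: c.bytes_out, reverse=True):
        dominant.append(conn)
        covered += conn.bytes_out
        if covered >= dominance * total:
            break

    asns = {c.asn for c in dominant}
    ja3s = {c.ja3 for c in dominant}

    # Step 2: successively restrict the remaining connections using
    # properties of the dominant group.
    remaining = [c for c in event if c not in dominant]
    remaining = [c for c in remaining if c.asn in asns]
    remaining = [c for c in remaining if c.ja3 in ja3s]

    # Step 3: the survivors are treated as further endpoints of the same service.
    members = dominant + remaining
    return {
        "asns": asns,
        "ja3s": ja3s,
        "suffixes": {hostname_suffix(c.hostname) for c in members},
        "hostnames": {c.hostname for c in members},
    }
```

In this sketch an endpoint with an unrelated hostname is still attributed to the service when it shares the ASN and JA3 hash of the dominant group. A real deployment would likely weight the properties differently and use a proper registered-domain lookup.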
We can then search for previous connections and uploads to the same service using these properties rather than specific hostnames or endpoints, and associate observed endpoints with the service even if their hostnames differ.
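A property-based history search of this kind might look like the following sketch. The record layout (dictionaries with `hostname`, `ja3`, `asn`, and `bytes_out` keys), the profile layout (`asns`, `ja3s`, and `suffixes` sets), the function names, and the rule that a connection must share at least two properties with the profile are all assumptions chosen for illustration.

```python
def matches_service(conn, profile, min_shared=2):
    """Return True if `conn` shares at least `min_shared` properties with the profile."""
    host = conn["hostname"].lower()
    checks = [
        conn["asn"] in profile["asns"],
        conn["ja3"] in profile["ja3s"],
        # Hostname counts as similar if it equals or is a subdomain of a known suffix.
        any(host == s or host.endswith("." + s) for s in profile["suffixes"]),
    ]
    return sum(checks) >= min_shared


def prior_uploads(history, profile, min_bytes=1):
    """Select past connections that uploaded data to the profiled service."""
    return [
        conn for conn in history
        if conn["bytes_out"] >= min_bytes and matches_service(conn, profile)
    ]


def is_new_to_service(history, profile):
    """An upload is a candidate anomaly only if the service was never uploaded to before."""
    return not prior_uploads(history, profile)
```

Requiring two shared properties rather than one is a design choice in this sketch: a single shared ASN is weak evidence, since many unrelated services are hosted by the same provider.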
This prevents legitimate uploads to commonly used services from being flagged as possible exfiltration, even when those uploads do not always go to the same endpoint, while also enabling precise characterization of malicious exfiltration patterns.