Abstract
Considerable attention has been given to the vulnerability of machine learning
to adversarial samples. This is particularly critical in anomaly detection; applications such
as fraud, intrusion, and malware detection must assume a malicious adversary. We
specifically address poisoning attacks, where the adversary injects carefully crafted benign
samples into the data, leading to concept drift that causes the anomaly detection
to misclassify the actual attack as benign. Our goal is to estimate the vulnerability
of an anomaly detection method to an unknown attack, in particular the expected
minimum number of poison samples the adversary would need to succeed. Such an
estimate is a necessary step in risk analysis: do we expect the anomaly detection to
be suciently robust to be useful in the face of attacks? We analyze DBSCAN, LOF,
one-class SVM as an anomaly detection method, and derive estimates for robustness
to poisoning attacks. The analytical estimates are validated against the number of
poison samples needed to make the actual anomalies appear benign in standard anomaly detection test
datasets. We then develop a defense mechanism, based on the concept drift caused by
the poison samples, to identify that an attack is underway. We show that while it
is possible to detect the attacks, it leads to a degradation in the performance of the
anomaly detection method. Finally, we investigate whether adversarial samples generated
for one anomaly detection method transfer to another.