Track Chair: Tom Longstaff
Within a compromised environment, anomalous behavior has been used to identify indicators of a wide variety of faults, both natural and malicious. Much of the R&D on analyzing anomalies and attributing the underlying behavior has focused on individual deviations from specified or learned normal behavior along a single attribute or sensor (e.g., anomalous network traffic patterns). Human analysts, by contrast, often identify a series of intersecting anomalies, that is, anomalies that may be related to the same behavior of interest, to gain a gestalt of the activity. These insights are particularly difficult to identify and combine automatically, but techniques may exist that could be applied if we frame the problem as a wide variety of potential sources intersecting around a common behavior. With our ability to handle “big data,” we have the opportunity to discover anomalies from a wider variety of sources.
The analysis of intersecting anomalies need not be strictly passive: where intersections of anomalies are hypothesized, we may generate specific stimulus events that help the anomalies converge and make it easier to associate many anomalies with common adversary behavior. For example, cluster analytics may be appropriate for partitioning these intersecting anomalies and may help guide the generation of appropriate events within the compromised system toward convergence.
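As a rough illustration of what partitioning intersecting anomalies could look like, the following minimal Python sketch groups anomalies reported by different sensors when they occur on the same host and close together in time. It is not a method proposed by this track. The `Anomaly` record, the sensor names, the 60-second window, and the two-source threshold are all hypothetical choices made for the example, and the simple time-chaining rule stands in for whatever cluster analytic a real system would use.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    source: str   # hypothetical sensor name, e.g. "netflow"
    host: str     # entity on which the anomaly was observed
    time: float   # observation time in seconds


def intersecting_groups(anomalies, window=60.0, min_sources=2):
    """Chain anomalies on the same host whose consecutive times are within
    `window` seconds, and keep only groups reported by at least
    `min_sources` distinct sources (assumed thresholds, for illustration)."""
    by_host = defaultdict(list)
    for a in anomalies:
        by_host[a.host].append(a)

    groups = []
    for host in sorted(by_host):
        events = sorted(by_host[host], key=lambda a: a.time)
        current = [events[0]]
        for a in events[1:]:
            if a.time - current[-1].time <= window:
                current.append(a)        # still part of the same burst
            else:
                groups.append(current)   # gap too large: close the group
                current = [a]
        groups.append(current)

    # An "intersection" here means agreement across distinct sensors.
    return [g for g in groups if len({a.source for a in g}) >= min_sources]


# Made-up observations, purely for demonstration.
observations = [
    Anomaly("netflow", "host-a", 100.0),
    Anomaly("auth_log", "host-a", 130.0),
    Anomaly("process", "host-a", 170.0),
    Anomaly("netflow", "host-a", 900.0),
    Anomaly("netflow", "host-b", 120.0),
]
groups = intersecting_groups(observations)
```

With these invented observations, only the three anomalies on host-a between 100 and 170 seconds form a group, since the remaining two are each seen by a single sensor. A group of this kind is the sort of hypothesized intersection for which a stimulus event might then be generated.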
Some questions related to this track are:
· What are the optimal sets of anomaly sources and types that can be combined to identify an embedded adversary?
· What algorithms are appropriate for discovering the relationships between different anomaly types?
· How does “big data” help to enrich our use of anomaly identification and analysis? Can we use scale to our benefit?
· Do intersecting anomalies help us to uniquely identify adversary behavior when most of what the adversary does is identical to normal behavior?
· How can we “game” the system to help deconflict and cluster anomalies in ways that clarify normal and anomalous system behavior?
· Can we combine multiple detected anomalies at scale and in near real time to address attacks in the real world? What are the barriers to successful application of such an approach?