VI Reflections: AI-Powered Behavior Analysis for Cybersecurity

By dgoff

Artificial Intelligence (AI)-powered behavioral analysis leverages AI to learn and predict adversarial behavior patterns. It is becoming increasingly necessary and widespread; according to one source, 93% of Security Operations Centers currently use some form of AI to conduct behavioral analytics. 

Behavioral analysis observes activity within a system to distinguish normal behavior from atypical or anomalous activity and identify potential threats. Traditional cybersecurity methods relied on predefined, rules-based systems or signature-based detection. These methods can be adept at identifying known threats, but they struggle to detect new, previously unseen cyberattacks, such as zero-day exploits or sophisticated, slow-moving threats. Adversaries constantly evolve their methods and tactics to evade detection, stealthily blending into the noise of everyday activity and finding ways to gain access to internal environments at ever-increasing speed. Finding them is more complicated when adversaries embrace malware-free attacks or use stolen credentials to impersonate valid users. The sheer volume of data generated by modern networked systems can overwhelm traditional security technologies, making it difficult to rapidly analyze telemetry against emerging threat intelligence and detect early signs of adversary presence.

In machines, AI input processing is exemplified by natural language processing, speech recognition, visual recognition, and more. Machines store what they learn in knowledge representations such as graph databases and ontologies. Just as humans make decisions or draw inferences, machines can make predictions, optimize for a target outcome, and determine the best next steps to meet a specific goal. Machines can be taught using methods analogous to human learning.

Supervised Machine Learning (ML) is much like learning by example. The computer is given a dataset with "labels" that act as answers and eventually learns to distinguish between the labeled classes. One example is a dataset that contains photos labeled as either "dog" or "cat." With enough examples, the computer will notice that dogs generally have longer tails and less pointy ears than cats.
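The idea can be sketched with a toy nearest-centroid classifier. The feature values below (tail length, ear pointiness) and the training examples are invented for illustration; a real system would learn from far richer data.

```python
# Toy supervised learning: learn from labeled examples, then classify a
# new, unlabeled sample by proximity to each label's average example.

def centroid(points):
    """Average each feature across a list of feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def classify(sample, centroids):
    """Assign the label whose centroid is nearest (squared distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(sample, centroids[label]))

# Labeled training data: (tail_length_cm, ear_pointiness_0_to_1)
training = {
    "dog": [(30.0, 0.2), (35.0, 0.3), (28.0, 0.25)],
    "cat": [(25.0, 0.8), (23.0, 0.9), (26.0, 0.85)],
}
centroids = {label: centroid(pts) for label, pts in training.items()}

print(classify((32.0, 0.25), centroids))  # long tail, round ears -> dog
```

The labels supply the "answers"; the algorithm only has to learn which feature combinations go with which label.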

Unsupervised ML is more like learning by observation. The computer observes unlabeled data and learns to distinguish groups and patterns on its own. In the animal example, it might notice that the samples split into two groups separable by the sounds they make: dogs bark and cats meow. Unsupervised learning requires no labels and can therefore be preferable when datasets are limited and unlabeled.
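A minimal sketch of this grouping-by-observation idea is two-cluster 1-D k-means. The "vocalization pitch" numbers are invented; the point is that no labels are supplied, and the two groups emerge from the data alone.

```python
# Toy unsupervised learning: split unlabeled 1-D measurements into two
# clusters around two moving centers (a simple 2-means loop).

def two_means(values, iters=10):
    """Partition 1-D values into two clusters by nearest center."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = sum(a) / len(a), sum(b) / len(b)  # update centers
    return sorted(a), sorted(b)

# Unlabeled pitch measurements: low values (barks) vs. high values (meows)
pitches = [150, 160, 145, 700, 650, 720, 155, 680]
barks, meows = two_means(pitches)
print(barks)   # low-pitch cluster
print(meows)   # high-pitch cluster
```

Nothing told the algorithm which sounds were barks or meows; the separation fell out of the structure of the data, which is exactly the property that makes unsupervised methods useful when labels are unavailable.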

Learning by algorithm occurs when a programmer instructs a computer exactly what to do, step by step, in a software program. In practice, the most accurate and efficient AI results come from combining learning methods; both supervised and unsupervised ML can be effective when matched to the right problem.

In cybersecurity, behavioral analysis provides a layer of defense that activates at runtime to review activity that may have evaded earlier defenses, such as sensor-based ML, memory scans, or signatures. One of the biggest hindrances to applying this at enterprise scale has been acquiring the computing resources and high-fidelity telemetry necessary to conduct behavioral analyses effectively.

In one method, unsupervised algorithms establish a baseline for normal behavior from unlabeled data sets based on the intrinsic characteristics of each value rather than any predetermined examples of normality. This approach enables unsupervised algorithms to detect previously unseen anomalies, such as complex network problems.
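A minimal sketch of this baselining idea, assuming a single metric per account: compute what "normal" looks like from the unlabeled observations themselves (here a median/MAD robust z-score), then flag values that deviate sharply. The daily login counts are invented for illustration.

```python
# Unsupervised baselining sketch: no labeled examples of "normal" --
# the baseline is derived from the data's own distribution.
import statistics

def find_anomalies(values, threshold=3.5):
    """Flag values whose robust z-score exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values) or 1.0
    return [v for v in values if abs(v - med) / (1.4826 * mad) > threshold]

# Daily login counts for one account (hypothetical telemetry)
logins = [12, 14, 11, 13, 15, 12, 14, 96, 13]
print(find_anomalies(logins))  # [96]
```

The median and MAD are robust to the outliers being hunted, so a single anomalous day does not distort the baseline the way a mean and standard deviation would.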

The most common analytics involve predictive models, which identify where risks might lie within large amounts of data, as in anomaly detection. Predictive modeling combines historical data with real-time behavior to understand or forecast future behavior.

Historically, cybersecurity has used rule-driven frameworks to detect potential cyber threats. One example of such a rule flags a large amount of data downloaded in the middle of the night; this action triggers a rule violation, which alerts the security team. Behavioral analytics enables a people-centric defense by using complex ML algorithms to analyze user and entity data across an enterprise to identify unexpected behavior that may indicate a security breach. The rule-based approach is still an important part of a layered analytics security method, but skilled attackers can avoid triggering many of the rules set up in these systems, making their malicious activity difficult to find.
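The off-hours download rule described above can be sketched as a simple predicate. The field names and the size threshold are hypothetical, not drawn from any particular product.

```python
# Sketch of a static detection rule: alert when a download exceeds a
# size limit outside business hours. Thresholds are illustrative only.

def large_offhours_download(event, max_bytes=1_000_000_000):
    """True when a download exceeds the limit between 22:00 and 06:00."""
    off_hours = event["hour"] >= 22 or event["hour"] < 6
    return off_hours and event["bytes_out"] > max_bytes

event = {"user": "jdoe", "hour": 2, "bytes_out": 5_000_000_000}
print(large_offhours_download(event))  # True -> raise an alert
```

The weakness is visible in the code itself: an attacker who keeps each transfer just under the threshold, or operates during business hours, never trips the rule, which is why behavioral baselines are layered on top of rules like this one.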

One of the biggest applications of behavioral analytics in security is detecting insider threats: attacks carried out by people inside the organization, motivated by ego, monetary gain, or retribution. Since employees already have access to the sensitive information they use in their jobs, no hacking is required to steal it, and security rules are often never triggered. Behavioral analytics goes a step further, using ML to identify the anomalous behavior such insiders exhibit and alert the security team.

Behavioral analytics can also detect Advanced Persistent Threats (APTs). APTs occur when a hacker gains access to an organization's systems for an extended period of time. These attacks are especially difficult to detect using conventional methods because APTs are consciously designed to avoid triggering common rules to ensure long-term access. Behavioral analytics, however, can detect APTs, since its algorithms can flag the out-of-the-ordinary activity that APTs exhibit.

Behavioral analytics is often called user and entity behavior analytics, or UEBA. One application of UEBA software is detecting zero-day attacks. Zero-day attacks are new attacks that have not been seen before, so no rules exist to detect them. Because behavioral analysis uses previous behavioral data to evaluate what is not normal, these new attacks can often be detected, since they generally use new and unusual executables and methods to breach security.

The large volume of behavioral data collected makes it difficult to analyze effectively at an individual level. ML uses data to train algorithms to predict a value or classify data, and it is especially helpful for big data: algorithms can process vast amounts of data, whereas humans can only process small amounts at a time.

Some recent articles on AI and Behavioral Analytics:

J. Kaur, K. Kaur, S. Kant and S. Das, "UEBA with Log Analytics," 2022 3rd International Conference on Computing, Analytics and Networks (ICAN), Rajpura, Punjab, India, 2022, pp. 1-7, doi: 10.1109/ICAN56228.2022.10007245. 

J. Squillace and M. Bantan, "A Taxonomy of Privacy, Trust, and Security Breach Incidents of Internet-of-Things Linked to F(M).A.A.N.G. Corporations," 2022 IEEE World AI IoT Congress (AIIoT), Seattle, WA, USA, 2022, pp. 591-596, doi: 10.1109/AIIoT54504.2022.9817225. 

S. Bains, S. Gupta, K. Joshi, B. Kothapalli, S. Sharma and A. Dutt, "Quantum Computing in Cybersecurity: An in-Depth Analysis of Risks and Solutions," 2023 3rd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida, India, 2023, pp. 1651-1654, doi: 10.1109/ICACITE57410.2023.10183060. 

P. Khatarkar, D. P. Singh and A. Sharma, "Machine Learning Protocols for Enhanced Cloud Network Security," 2023 IEEE International Conference on ICT in Business Industry & Government (ICTBIG), Indore, India, 2023, pp. 1-6, doi: 10.1109/ICTBIG59752.2023.10456016. 

M. Ramish, A. Sinha, J. Desai, A. Raj, Y. S. Rajawat and P. Punia, "IT Attack Detection and Classification using Users Event Log Feature And Behavior Analytics through Fourier EEG Signal," 2022 IEEE 11th International Conference on Communication Systems and Network Technologies (CSNT), Indore, India, 2022, pp. 577-582, doi: 10.1109/CSNT54456.2022.9787637. 

M. Majdalawieh, A. B. Hani, H. Al-Sabbah, O. Adedugbe and E. Benkhelifa, "A Cloud-Native Knowledge Management Framework for Patient-Generated Health Data," 2023 Tenth International Conference on Social Networks Analysis, Management and Security (SNAMS), Abu Dhabi, United Arab Emirates, 2023, pp. 1-7, doi: 10.1109/SNAMS60348.2023.10375469. 

A. M. Awadallah, E. Damiani, J. Zemerly and C. Y. Yeun, "Identity Threats in the Metaverse and Future Research Opportunities," 2023 International Conference on Business Analytics for Technology and Security (ICBATS), Dubai, United Arab Emirates, 2023, pp. 1-6, doi: 10.1109/ICBATS57792.2023.10111122. 

G. M. S. Hossain, K. Deb, H. Janicke and I. H. Sarker, "PDF Malware Detection: Toward Machine Learning Modeling With Explainability Analysis," in IEEE Access, vol. 12, pp. 13833-13859, 2024, doi: 10.1109/ACCESS.2024.3357620.

A. Kumari, R. Dubey and I. Sharma, "ShNP: Shielding Nuclear Plants from Cyber Attacks Using Artificial Intelligence Techniques," 2023 Annual International Conference on Emerging Research Areas: International Conference on Intelligent Systems (AICERA/ICIS), Kanjirapally, India, 2023, pp. 1-6, doi: 10.1109/AICERA/ICIS59538.2023.10420386. 

To see previous articles, please visit the VI Reflections Archive.

Submitted by Gregory Rigby on