Topic Modeling for Cyber Threat Intelligence (CTI)
Author
Abstract

Topic modeling algorithms from natural language processing (NLP) have been used in a variety of applications, for instance in product recommendation systems for e-commerce. In this paper, we briefly review topic modeling applications and then describe our proposed approach to using topic modeling for cyber threat intelligence (CTI). We improve on previous work by implementing the BERTopic and Top2Vec approaches, letting users select their preferred pre-trained text or sentence embedding model, and supporting multiple languages. We implemented the proposal as the new topic modeling module of the Open Web Application Security Project (OWASP) Maryam open-source intelligence (OSINT) framework. We also report experimental results on a leaked hacker forum dataset (nulled.io), with the aim of attracting more researchers and open-source communities to the OWASP Foundation's Maryam project.

Year of Publication
2022
Conference Name
2022 Seventh International Conference on Informatics and Computing (ICIC)