Communications of the IIMA



Cyber threat intelligence (CTI) is actionable information or insight that an organization uses to understand the vulnerabilities it has and the threats it faces. One important kind of CTI for proactive cyber defense is exploit type, with possible values system, web, network, website, or mobile. This study compares the performance of machine learning algorithms in predicting exploit types from forum posts, a semi-structured dataset collected from the dark web. The study follows the CRISP data science approach. The results show that function-based machine learning algorithms, including support vector machines and deep learning with artificial neural networks, are more accurate than tree-based algorithms, including Random Forest and Decision Tree, for CTI discovery from a semi-structured dataset. Future research will include the use of high-performance computing and advanced deep-learning algorithms.