I. Pejić
Please Note
This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.
2 records found
Organisations are becoming more security-conscious and are deploying more and more security tools to ensure they are adequately protected against cyber-attacks. This has two consequences: (i) the extra tools inherently enlarge the company's attack surface, and (ii) the Security Operations Centre (SOC) is overwhelmed by the number of false positives those tools generate, leading to attack fatigue. In many cases, the SOC team cannot work through all alerts properly, so potential attacks go unnoticed or are caught much later. Moreover, within a typical CISO organisation, "attack" and "defence" data are analysed largely in silos: vulnerability data, red-team exercises, and the data from the various defence tools are not examined together.
Our work proposes an innovative way to use MITRE ATT&CK tactics to bridge the gap between vulnerability data (CVEs) and security alert data originating from multiple security tools that protect servers. This gives the alerts more context, which helps classify them as attacks or false positives. We use DeBERTa (Decoding-enhanced BERT with Disentangled Attention), a state-of-the-art deep-learning model, to map CVE descriptions to MITRE ATT&CK tactics. We then map security alerts to MITRE ATT&CK tactics, and use both mappings as input to a machine-learning model enriched with this context (CVEs and tactics). That machine-learning model classifies security alerts as malicious or benign. We tested our approach on over 5.5 million security alerts combined with red-team exercise attacks and incident response labelling from the company, a large international organization with over 60,000 employees. Our CVE+tactic model (without hyperparameter tuning) detects 64% more true positives than the machine-learning model without that information. In addition, the SOC needs to investigate fewer than 1400 alerts to catch the red-team attacks in our test set, compared to more than 5500 generated by the model without CVEs and tactics. Assuming a standard response time of 8 minutes per alert, this improved model would save the SOC team up to 550 person hours. The result is a model that catches red-team attacks without overwhelming the SOC with false positives.
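The time saving quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and because the abstract gives the alert counts as bounds ("more than 5500", "less than 1400"), the inputs are rounded figures rather than exact values.

```python
def triage_hours_saved(alerts_baseline, alerts_enriched, minutes_per_alert):
    """Person hours saved when fewer alerts need manual investigation."""
    alerts_avoided = alerts_baseline - alerts_enriched
    return alerts_avoided * minutes_per_alert / 60


# Rounded figures from the abstract: about 5500 alerts without CVE and
# tactic context, about 1400 with it, and 8 minutes of analyst time each.
hours = triage_hours_saved(5500, 1400, 8)
print(f"{hours:.1f} person hours saved")  # 546.7 person hours saved
```

Roughly 4100 avoided alerts at 8 minutes each comes to about 547 hours, which is consistent with the "up to 550 person hours" stated in the abstract.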
A Generative Adversarial Network (GAN) is a deep-learning generative model in the field of Machine Learning (ML) that involves training two Neural Networks (NN) using a sizable data set. In certain fields, such as medicine, the training data may be hospital patient records that are stored across different hospitals. The classic centralized implementation would involve sending the data to a centralized server where the model would be trained. However, that would breach the privacy and confidentiality of the patients and their data, and would be unacceptable. Therefore, Federated Learning (FL), an ML technique that trains ML models in a distributed setting without data ever leaving the host device, would be a better alternative to the centralized option. In this technique, only parameters and certain metadata are communicated. In spite of that, there still exist attacks that can infer user data from the parameters and metadata. A fully privacy-preserving solution involves encrypting the communicated data with homomorphic encryption (HE). This paper focuses on the performance loss of training an FL-GAN with three different types of homomorphic encryption: Partial Homomorphic Encryption (PHE), Somewhat Homomorphic Encryption (SHE), and Fully Homomorphic Encryption (FHE). We also test the performance loss of Multi-Party Computation (MPC), as it has homomorphic properties. The performance is compared to that of training an FL-GAN without encryption. Our experiments show that the more complex the encryption method is, the longer training takes, with the extra time taken for HE being quite significant in comparison to the base case of FL.
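To make the idea of aggregating encrypted parameters concrete, here is a minimal sketch of an additively homomorphic scheme. The abstract does not say which PHE scheme was used, so the choice of Paillier here is an assumption, as are the tiny hard-coded primes, the integer-quantised updates, and all function names. It is a toy for illustration only and offers no real security.

```python
import random
from math import gcd

# Toy Paillier keypair. The primes are tiny and purely illustrative;
# they offer no security whatsoever.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # valid because the generator is g = N + 1


def encrypt(m):
    """Encrypt an integer 0 <= m < N under the public key (N, g = N + 1)."""
    r = random.randrange(1, N)
    while gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(N + 1, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Recover the plaintext with the private key (LAM, MU)."""
    return ((pow(c, LAM, N_SQ) - 1) // N * MU) % N


def aggregate(ciphertexts):
    """Multiply ciphertexts, which adds the underlying plaintexts."""
    total = 1
    for c in ciphertexts:
        total = (total * c) % N_SQ
    return total


# Three hypothetical clients each hold one integer-quantised parameter update.
client_updates = [12, 30, 7]
encrypted = [encrypt(u) for u in client_updates]
encrypted_sum = aggregate(encrypted)  # the server never sees a plaintext
print(decrypt(encrypted_sum))  # 49
```

Each encryption and each aggregation step costs modular exponentiations or multiplications on numbers the size of N squared, which hints at why encrypted training is slower than the unencrypted FL baseline. A real deployment would use a vetted cryptographic library, much larger keys, and a fixed-point encoding for floating-point parameters.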