"uuid","repository link","title","author","contributor","publication year","abstract","subject topic","language","publication type","publisher","isbn","issn","patent","patent status","bibliographic note","access restriction","embargo date","faculty","department","research group","programme","project","coordinates"
"uuid:49f88587-c44b-46a5-b363-3dc2b6f21865","http://resolver.tudelft.nl/uuid:49f88587-c44b-46a5-b363-3dc2b6f21865","Anomaly Detection in WAAM Deposition of Nickel Alloys: Single-Material and Cross-Material Analysis","Rajesh, Aditya (TU Delft Mechanical, Maritime and Materials Engineering; TU Delft Materials Science and Engineering)","Hermans, M.J.M. (mentor); Ya, Wei (graduation committee); Delft University of Technology (degree granting institution)","2023","The current research work investigates the possibility of using machine learning models to deduce the relationship between WAAM (wire arc additive manufacturing) sensor responses and defect presence in the printed part. The work specifically focuses on three materials from the nickel alloy family – Inconel 718, Invar 36 and Inconel 625, and uses three sensor responses (welding voltage, welding current and welding audio) for predictions. A variety of types of prints, including ramp tests, single bead depositions, and walls were explored. Three different machine learning models are used – artificial neural networks (ANNs), K-Means clustering and random forests (RF), and the performances are compared. In addition to separate material analysis, cross-material predictions are conducted using two supervised models to investigate the prediction capabilities of such an approach. The results indicate that models are indeed capable of finding connections between welding parameters and defect formation, and the accuracies range from 60% to 90% and the correlation coefficient is less than 0.5 (indicating weak positive correlation) depending on the model and material. The cross-material predictions are significantly worse, with accuracies ranging from 20% to 27% and very weak correlation coefficients (less than 0.1). 
Analysis of the results indicates that the importance of the audio sensor response depends on the nature of the defect, and that additional sensors such as spectrometers could provide a wider range of information to cover more types of defects, potentially raising the performance of cross-material predictions. Between the models, random forest is found to perform the best overall, with ANNs coming in a close second. The versatility of ANNs indicates that increasing the dataset size and resolving the class imbalance could potentially tip the scales in favor of ANNs.","Anomaly Detection; Nickel Alloys; GMAW; Wire arc additive manufacturing; Machine Learning","en","master thesis","","","","","","","","","","","","Materials Science and Engineering","",""
"uuid:df016cc5-bd42-4c01-b16d-6d4889246861","http://resolver.tudelft.nl/uuid:df016cc5-bd42-4c01-b16d-6d4889246861","Covert DNS Storage Channel Detection: Uncovering surreptitious data exchange using the phonebook of the internet","van Hal, Sven (TU Delft Electrical Engineering, Mathematics and Computer Science)","Verwer, S.E. (mentor); Lagendijk, R.L. (graduation committee); Aniche, Maurício (graduation committee); Vermeulen, B. (mentor); Delft University of Technology (degree granting institution)","2021","The cyber arms race has red and blue teams continuously at their toes to keep ahead. Increasingly capable cyber actors breach secure networks at a worrying scale. While network monitoring and analysis should identify blatant data exfiltration attempts, covert channels bypass these measures and facilitate surreptitious information extraction. The many legitimate uses and widespread availability of DNS, the ""phone book"" of the internet, make it an attractive protocol for such covert channels. Covert DNS storage channels encode information in the payload of outbound DNS queries.
This thesis aims to assess the effectiveness of using machine learning methods to detect covert DNS storage channels. Our literature survey identified distinct differences in 1) algorithm type, either unsupervised anomaly detection or supervised classification, and 2) the information source for features, either isolated DNS queries or query sequences.
We performed experiments with (Extended) Isolation Forest algorithms for anomaly detection and Random Forests for classification, combined with different feature set compositions to evaluate their relative performance. Payload-only features were derived from isolated queries and behavioral features were extracted from time-based or fixed-length sliding windows over per-domain query sequences. We evaluated our models using a large-scale corporate DNS dataset of real-world proportions and a novel dataset of connection tunneling traffic and simulated credit card exfiltration malware.
We found that the majority of experiments were able to achieve high detection rates of 98.6% or more on a variety of storage channel threats, at low false positive rates. Classification models significantly outperform anomaly detection models on threats seen during training. Evaluation on unseen threats, however, revealed that generalization is difficult given the limited set of training threats, and showed anomaly detection models to be more capable of detecting a variety of threats than classification models. We furthermore showed that feature sets with a behavioral component consistently outperform payload-only features, although our experiments were inconclusive regarding the relative performance between composite feature sets.
Given the prevalence of benign storage channels misusing DNS for legitimate data transfer, we recommend rigorous filtering of training data beforehand to improve model optimization and evaluation. Furthermore, extending the malicious training set with DNS command-and-control (C2) malware is a promising future research direction to improve generalization of classification models.","Machine Learning; DNS storage channels; Anomaly Detection; Classification","en","master thesis","","","","","","","","","","","","","",""
"uuid:e7d67dee-2091-4b40-94af-a8be6f2ee66a","http://resolver.tudelft.nl/uuid:e7d67dee-2091-4b40-94af-a8be6f2ee66a","A personalized approach for communicating found anomalies in Netflow data to end-users","de Hoog, Dion (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Applied Sciences)","Panichella, A. (mentor); Wehrmann, C. (mentor); Kalmar, E. (graduation committee); Venkatesha Prasad, R.R. (graduation committee); Delft University of Technology (degree granting institution)","2021","In this research, we use different supervised and unsupervised machine learning techniques to detect anomalies in NetFlow data. We aim to create a system for home or small-business use where the user is in control. We use WEKA for the machine learning models and feature selection. The UGR’16 dataset is used to train and test the models. We create three different models for each method where the model is trained on one day and tested on another. We find that supervised models perform better than unsupervised ones. Random Forest has the highest F1-score (0.9165) of the supervised models. However, Random Forest is statistically similar to 6 out of 8 different classifiers according to the Friedman and Nemenyi tests. One challenge with supervised methods is that there is a need for a third party to make sure the models are updated for new attacks. However, it seems feasible to create a system for network monitoring for home usage. Finally, we argue that the approach to research in machine learning should, in some cases, take a different direction. Instead of chasing the highest accuracy, we should look at which factors allow a user to work with a system. A set of rules is much easier to interpret than complex models. Especially if these models are statistically similar we could look at factors other than a single metric. After these results, we look at presenting warning messages to end-users. We want to motivate users to take action after they read a message. 
For this, we first create a theoretical framework based on behaviour, motivation and uncertainties. We want to find out if we can use uncertainties to group users, which allows us to create a semi-personal approach. Based on this theoretical framework, we create different versions of a warning message that focus on different uncertainties. Through 22 interviews, we asked the participants to rank the different versions and asked them questions about their stance on network security and preferences for warning messages. We found that uncertainties cannot be used to group users but that they do influence them. Including a solution and which steps to take offers users peace of mind, even if they are not able to complete the steps. The results indicate that a more personal approach is necessary where every person has the choice to customize the message to fit their preference.","NetFlow; Machine Learning; Science Communication; Interviews; Anomaly Detection; Computer Science; Monitoring system; Uncertainties; Behaviour; Motivation","en","master thesis","","","","","","degree of Master of Science in Computer Science & Science Communication","","","","","","Computer Science","",""
"uuid:8f6269fb-03c2-4067-a52e-0237387ddcef","http://resolver.tudelft.nl/uuid:8f6269fb-03c2-4067-a52e-0237387ddcef","Probabilistic Quantification of Airspace Resilience","Janowski, Jakub (TU Delft Aerospace Engineering)","Hoekstra, J.M. (mentor); Ellerbroek, J. (mentor); Sharpanskykh, Alexei (graduation committee); Udluft, Heiko (graduation committee); Sirigu, Giuseppe (graduation committee); Delft University of Technology (degree granting institution)","2020","The resilience of the Air Traffic Management (ATM) system to disturbances is required to maintain high operation performance. Before it can be improved, the resilience of the ATM system must be quantified. The measurement of resilience requires knowledge of a system reference state. This thesis proposes a novel methodology to detect disruptions without a pre-specified reference state and to quantify airspace resilience to disturbances. The method utilises residual-based anomaly detection to model a reference state based on historical values and detect deviations from it. The method has been tested in assessing the resilience of arrival time (airspace state)to high winds (disturbance) in 9 airports worldwide for a year. The results have shown that the method is capable of detecting disruptions as well as airports experiencing high wind conditions tend to be more resilient to them.","Resilience; Anomaly Detection; Disruptions; Bayesian; Regression; Air Traffic Management; Machine Learning; Key Performance Indicators; Robustness; Time in airspace; Arrival Time","en","master thesis","","","","","","","","2022-01-16","","","","Aerospace Engineering","",""
"uuid:b6bad7a5-0afd-4268-873d-32a4a18b4281","http://resolver.tudelft.nl/uuid:b6bad7a5-0afd-4268-873d-32a4a18b4281","Advanced Set Bounding Methods for Fault Detection","Ritsma, Folkert (TU Delft Mechanical, Maritime and Materials Engineering)","Ferrari, Riccardo (mentor); Al-Ars, Zaid (mentor); Delft University of Technology (degree granting institution)","2019","Performance of set based fault detection is highly dependent on the complexity of the set bounding methods used to bound the healthy residual set. Existing methods achieve robust performance with complex set bounding that narrowly define healthy system behavior, yet at the cost of higher computation times. In this thesis a major improvement is reached in both accuracy and computation time by applying machine learning methods to set bounding. A method is developed which achieves fault detection at several orders of magnitude the speed of an existing set based fault detection method without sacrificing a robust performance.","Fault Detection; Machine Learning; Anomaly Detection; Outlier Detection; Support Vector Machines; Model Based Fault Detection; Set Based Fault Detection","en","master thesis","","","","","","","","","","","","Mechanical Engineering | Systems and Control","",""
"uuid:dbf3e77c-e624-47ef-b951-3f1948b1609a","http://resolver.tudelft.nl/uuid:dbf3e77c-e624-47ef-b951-3f1948b1609a","Scalability Analysis of Predictive Maintenance Using Machine Learning in Oil Refineries","Helmiriawan, Helmi (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Quantum & Computer Engineering)","Al-Ars, Zaid (mentor); Delft University of Technology (degree granting institution)","2018","Modern refineries typically use a high number of sensors that generate an enormous amount of data about the condition of the plants. This generated data can be used to perform predictive maintenance, an approach to predict impending failures and mitigate downtime in refineries. This research analyzes the scalability of machine learning methods for predictive maintenance solution in an oil refinery. It can be done by modeling the normal behavior of the plant and use the prediction error to identify anomalies which might potentially become failures. Several methods and learning algorithms are explored in this research to model the normal behavior of multiple components in the plant. The experiments are performed by using historical process data from a crude distiller unit at Shell Pernis Refinery. The results show that the proposed approach using multiple targets model is able to predict multiple components in the plant. It is not only able to detect anomalies but also identify the faulty component. Furthermore, it reduces the required time to model the normal behavior of the plant which improves the scalability of the predictive maintenance approach in the refinery.","Machine Learning; Predictive Maintenance; Anomaly Detection; Deep Learning","en","master thesis","","","","","","","","2018-12-31","","","","Computer Science","",""
"uuid:9a35364f-89dc-4f31-84bd-072738b9c4e8","http://resolver.tudelft.nl/uuid:9a35364f-89dc-4f31-84bd-072738b9c4e8","Monitoring Release Logs at Adyen: Feature Extraction and Anomaly Detection","Lan, Yikai (TU Delft Electrical Engineering, Mathematics and Computer Science)","van Deursen, A. (mentor); Verwer, S.E. (mentor); Tax, D.M.J. (graduation committee); Huibers, Pieter (mentor); Delft University of Technology (degree granting institution)","2018","Monitoring the release logs of modern online software is a challenging topic because of the enormous amount of release logs and the complicated release process. The goal of this thesis is to develop a pipeline that can monitor the release logs and find anomalous logs, automating this step with anomaly detection and reducing the required manual effort. We improve the pipeline from the recent work of Microsoft, enabling it to monitor logs with different severity levels and extremely long sequences.
We first use IPLoM and its reconciling step for raw logs to obtain log events and then use log event sets, a simplified version of log sequences, for anomaly detection. The outlier scores of log event sets are calculated using anomaly detection algorithms, and those with an outlier score higher than the threshold are clustered to reduce the number of outputs. In the final output result, we propose two ranking functions to sort the potentially anomalous clusters and only show the top 10 results. Another complementary step besides anomaly detection is designed to capture recurrent anomalies in known clusters that have been seen before. By finding the optimal parameters for hierarchical clustering, nearest neighbor distance, and LOF, we test the performance of the pipeline on Adyen log data and make our suggestions. Finally, we also test the robustness of the pipeline with two types of artificial data sets.","Log Analysis; Anomaly Detection; Machine Learning","en","master thesis","","","","","","","","","","","","Computer Science | Software Technology","",""