A. Dekhovich

8 records found

Conference paper (2025) - Aleksandr Dekhovich, Oleg Soloviev
Monitoring manufacturing processes plays an important role in chip production. Current state-of-the-art approaches use the entire surface to classify defects with CNN- or Transformer-based models, resulting in considerable measurement costs. Therefore, new advanced techniques are required to reduce the cost of inspection. In this work, we advocate for a reinforcement learning-based feedback loop with a classifier trained with supervised contrastive loss. In contrast to previous work along these lines, our approach is not limited to only one type of defect but can identify multiple defects on one wafer. We tested our algorithm on the publicly available WM-811k and MixedWM38 datasets, showing a significant reduction in scanning time compared to CNN-based approaches while maintaining similar accuracy. We demonstrate a reduction of up to 40% in costs associated with wafer scanning in defect classification tasks, even if multiple defects are on the surface. Moreover, we demonstrate that in the multi-defect scenario, the trained model can be directly used to detect outliers, requiring only about 12.5% of the surface to find at least one type of defect. ...
Wafer map defect recognition is a vital part of the semiconductor manufacturing process that requires a high level of precision. Measurement tools in such manufacturing systems can scan only a small region (patch) of the map at a time. Measuring the full wafer map in this way can be resource-intensive and lead to unnecessary additional costs. Instead, selective sparse measurements of the image save a considerable amount of resources (e.g. scanning time). Therefore, in this work, we propose a feedback loop approach for wafer map defect recognition. The algorithm aims to sequentially find the most informative regions in the image based on previously acquired ones and to predict the defect type from these partial observations alone, without scanning the full wafer map. To achieve our goal, we introduce a reinforcement learning-based measurement acquisition process and a recurrent neural network-based classifier that takes the sequence of these measurements as input. Additionally, we employ an ensemble technique to increase the accuracy of the prediction. As a result, we reduce the need for scanned patches by 38% while achieving higher accuracy than the conventional convolutional neural network-based approach on the publicly available WM-811k dataset. ...
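The feedback loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the reinforcement learning policy and the recurrent classifier are replaced by hypothetical stand-in callables (`choose_next`, `classify`), and the confidence-based stopping rule is an assumption made only for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Patch = Tuple[int, float]  # (patch index, measured value)


def acquire_and_classify(
    measure: Callable[[int], float],
    n_patches: int,
    choose_next: Callable[[List[Patch], Sequence[int]], int],
    classify: Callable[[List[Patch]], Tuple[str, float]],
    confidence_threshold: float = 0.9,
    max_scans: Optional[int] = None,
) -> Tuple[str, List[Patch]]:
    """Scan patches one at a time until the classifier is confident enough.

    choose_next stands in for a learned acquisition policy and classify for a
    sequence classifier; both are hypothetical placeholders.
    """
    budget = n_patches if max_scans is None else min(max_scans, n_patches)
    observed: List[Patch] = []
    remaining = list(range(n_patches))
    label, confidence = classify(observed)
    while remaining and len(observed) < budget and confidence < confidence_threshold:
        idx = choose_next(observed, remaining)   # pick the next patch to scan
        remaining.remove(idx)
        observed.append((idx, measure(idx)))     # perform the (costly) measurement
        label, confidence = classify(observed)   # update the prediction
    return label, observed


# Toy data (assumed): eight patches, one of which contains a defect.
wafer = [0.0, 0.1, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0]


def toy_policy(observed: List[Patch], remaining: Sequence[int]) -> int:
    # Trivial stand-in policy: scan patches in index order.
    return remaining[0]


def toy_classifier(observed: List[Patch]) -> Tuple[str, float]:
    # Confident as soon as one strongly defective patch has been seen.
    if any(value > 0.5 for _, value in observed):
        return "defect", 1.0
    return "none", len(observed) / len(wafer)


label, scanned = acquire_and_classify(
    measure=lambda i: wafer[i],
    n_patches=len(wafer),
    choose_next=toy_policy,
    classify=toy_classifier,
)
```

In this toy run the loop stops after four of the eight patches, because the fourth measurement makes the stand-in classifier confident. A learned policy would replace the fixed scan order with a choice that depends on the patches seen so far.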
Journal article (2024) - Aleksandr Dekhovich, Marcel H.F. Sluiter, David M.J. Tax, Miguel A. Bessa
Physics-informed neural networks (PINNs) have recently become a powerful tool for solving partial differential equations (PDEs). However, finding a set of neural network parameters that fulfill a PDE at the boundary and within the domain of interest can be challenging and non-unique due to the complexity of the loss landscape that needs to be traversed. Although a variety of multi-task learning and transfer learning approaches have been proposed to overcome these issues, no incremental training procedure has been proposed for PINNs. As demonstrated herein, by developing incremental PINNs (iPINNs) we can effectively mitigate such training challenges and learn multiple tasks (equations) sequentially without additional parameters for new tasks. Interestingly, we show that this also improves performance for every equation in the sequence. Our approach learns multiple PDEs starting from the simplest one by creating its own subnetwork for each PDE and allowing each subnetwork to overlap with previously learned subnetworks. We demonstrate that previous subnetworks are a good initialization for a new equation if PDEs share similarities. We also show that iPINNs achieve lower prediction error than regular PINNs for two different scenarios: (1) learning a family of equations (e.g., 1-D convection PDE); and (2) learning PDEs resulting from a combination of processes (e.g., 1-D reaction–diffusion PDE). The ability to learn all problems with a single network together with learning more complex PDEs with better generalization than regular PINNs will open new avenues in this field. ...
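For orientation, a PINN is typically trained by minimizing a composite loss over the PDE residual and the boundary and initial conditions. The block below shows one common form from the PINN literature, together with textbook forms of the two example equation families named in the abstract. The weights $\lambda$, the collocation points, and the coefficients $\beta$, $\nu$, $\rho$ are notation assumed here; the abstract does not state the exact equations or loss weighting used in the paper.

```latex
\begin{align}
\mathcal{L}(\theta) &= \lambda_{r}\,\mathcal{L}_{\mathrm{PDE}}(\theta)
  + \lambda_{b}\,\mathcal{L}_{\mathrm{BC}}(\theta)
  + \lambda_{0}\,\mathcal{L}_{\mathrm{IC}}(\theta), \\
\mathcal{L}_{\mathrm{PDE}}(\theta) &= \frac{1}{N_{r}} \sum_{i=1}^{N_{r}}
  \bigl| \mathcal{N}[u_{\theta}](x_{i}, t_{i}) \bigr|^{2}, \\
\text{1-D convection:}\quad \mathcal{N}[u] &= \frac{\partial u}{\partial t}
  + \beta\,\frac{\partial u}{\partial x}, \\
\text{1-D reaction--diffusion:}\quad \mathcal{N}[u] &= \frac{\partial u}{\partial t}
  - \nu\,\frac{\partial^{2} u}{\partial x^{2}} - \rho\,u\,(1 - u).
\end{align}
```

Under this notation, a "family of equations" corresponds to varying a coefficient such as $\beta$, while a "combination of processes" corresponds to adding terms (reaction and diffusion) to the operator $\mathcal{N}$.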
Journal article (2024) - Aleksandr Dekhovich, Miguel A. Bessa
We introduce a new continual (or lifelong) learning algorithm called LDA-CP&S that performs segmentation tasks without undergoing catastrophic forgetting. The method is applied to two different surface defect segmentation problems that are learned incrementally, i.e., providing data about one type of defect at a time, while still being capable of predicting every defect that was seen previously. Our method creates a defect-related subnetwork for each defect type via iterative pruning and trains a classifier based on linear discriminant analysis (LDA). At the inference stage, we first predict the defect type with LDA and then predict the surface defects using the selected subnetwork. We compare our method with other continual learning methods showing a significant improvement – mean Intersection over Union better by a factor of two when compared to existing methods on both datasets. Importantly, our approach shows comparable results with joint training when all the training data (all defects) are seen simultaneously. ...
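The two-stage inference described in this abstract (first predict the defect type, then run the matching subnetwork) can be sketched as follows. This is an illustrative simplification, not the paper's method: the classifier is a nearest-class-mean rule, which coincides with LDA only under the assumption of identity shared covariance and equal class priors, and the defect names and subnetwork callables are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def class_means(features: List[Vector], labels: List[str]) -> Dict[str, List[float]]:
    """Compute the per-class mean feature vector."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict_task(x: Vector, means: Dict[str, List[float]]) -> str:
    """Nearest class mean; equals LDA only for identity covariance and equal priors."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(means, key=lambda y: sq_dist(x, means[y]))


def two_stage_predict(
    x: Vector,
    means: Dict[str, List[float]],
    subnetworks: Dict[str, Callable[[Vector], str]],
) -> Tuple[str, str]:
    """Stage 1: select the task. Stage 2: run that task's subnetwork."""
    task = predict_task(x, means)
    return task, subnetworks[task](x)


# Toy features and hypothetical defect types.
features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = ["scratch", "scratch", "pit", "pit"]
means = class_means(features, labels)

# Stand-ins for pruned, task-specific segmentation subnetworks.
subnetworks = {
    "scratch": lambda x: "segmentation from scratch subnetwork",
    "pit": lambda x: "segmentation from pit subnetwork",
}

task, output = two_stage_predict([5.0, 4.0], means, subnetworks)
```

The design point is that the task selector is cheap and separate from the segmentation model, so adding a new defect type only requires a new class in the selector and a new subnetwork.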
Doctoral thesis (2024) - A. Dekhovich, M.H.F. Sluiter, D.M.J. Tax
Deep learning models have made enormous strides over the past decade. However, they still have some disadvantages when dealing with changing data streams. One of these flaws is the phenomenon called catastrophic forgetting. It occurs when a model learns multiple tasks sequentially, having access only to the data of the current task. However, this scenario has strong implications for real-world machine learning and engineering problems where new information is introduced into the system over time. Continual learning is a subfield of deep learning that aims to work in this scenario. Therefore, this thesis presents a general continual learning paradigm to tackle the catastrophic forgetting issue in deep learning models, regardless of architecture.

Following ideas from the neuroscience literature, we create task-specific regions in the network, i.e. subnetworks, to encode information there. Thus, only some parameters are responsible for solving a given task, which mitigates forgetting compared to conventional training where the trainable parameters are simultaneously assigned to all tasks. The appropriate subnetwork should then be selected by the algorithm to make a prediction, or information about the correct subnetwork must be given by the user. The subnetworks can share some connections to transfer knowledge between each other and facilitate future learning.

In the first part of the thesis, we describe the proposed methodology: task-specific subnetwork creation during training and the proper subnetwork selection during inference. We examine different subnetwork prediction strategies, outlining their advantages and disadvantages. We validate the proposed algorithms on a series of well-known image datasets in computer vision in classification and semantic segmentation tasks. The proposed solution significantly outperforms current state-of-the-art methods by 10-20% in accuracy.

The second part of the thesis illustrates the benefits of cooperative learning via continual learning in physical sciences and solid mechanics examples. We demonstrate that by sharing parameters, the following subnetwork can be trained either with lower prediction error, requiring fewer training data points, or both, compared to conventional training with one network per task. Importantly, the model does not forget any of the acquired knowledge since once a parameter is assigned to a subnetwork, it is not changed when training new tasks. We would like to highlight the potential importance of further development of continual learning methods in engineering to improve the generalization capabilities of the models.

The thesis concludes by discussing the main results and findings. We also outline the main limitations of the work and directions for improvement. Further development of continual learning models will lead to more advanced artificial intelligence systems that should contribute to solving a wider range of problems. ...
Journal article (2024) - Aleksandr Dekhovich, David M.J. Tax, Marcel H.F. Sluiter, Miguel A. Bessa
Current deep neural networks (DNNs) are overparameterized and use most of their neuronal connections during inference for each task. The human brain, however, developed specialized regions for different tasks and performs inference with a small fraction of its neuronal connections. We propose an iterative pruning strategy introducing a simple importance-score metric that deactivates unimportant connections, tackling overparameterization in DNNs and modulating the firing patterns. The aim is to find the smallest number of connections that is still capable of solving a given task with comparable accuracy, i.e. a simpler subnetwork. We achieve comparable performance for LeNet architectures on MNIST, and significantly higher parameter compression than state-of-the-art algorithms for VGG and ResNet architectures on CIFAR-10/100 and Tiny-ImageNet. Our approach also performs well for the two different optimizers considered—Adam and SGD. The algorithm is not designed to minimize FLOPs when considering current hardware and software implementations, although it performs reasonably when compared to the state of the art. ...
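The iterative pruning idea in this abstract can be sketched in a few lines. The abstract does not define its importance-score metric, so this sketch substitutes absolute weight magnitude as a stand-in. The `evaluate` callable, the pruning fraction, and the tolerated accuracy drop are hypothetical parameters, and the retraining that would normally follow each pruning round is omitted.

```python
from typing import Callable, List


def iterative_prune(
    weights: List[float],
    evaluate: Callable[[List[float], List[bool]], float],
    prune_fraction: float = 0.2,
    max_drop: float = 0.05,
    max_rounds: int = 20,
) -> List[bool]:
    """Return a keep-mask found by repeatedly deactivating low-importance weights.

    Importance here is |w| (a stand-in, not the paper's metric). Pruning stops
    when a round would lower the score by more than max_drop from the baseline.
    """
    mask = [True] * len(weights)
    baseline = evaluate(weights, mask)
    for _ in range(max_rounds):
        active = [i for i, keep in enumerate(mask) if keep]
        n_prune = int(len(active) * prune_fraction)
        if n_prune == 0:
            break
        active.sort(key=lambda i: abs(weights[i]))  # least important first
        candidate = mask.copy()
        for i in active[:n_prune]:
            candidate[i] = False
        # A real pipeline would retrain the remaining weights here.
        if baseline - evaluate(weights, candidate) > max_drop:
            break
        mask = candidate
    return mask


# Toy weights (assumed) and a toy score: fraction of total |w| still active.
weights = [0.9, -0.8, 0.01, 0.02, -0.03, 0.7, 0.005, 0.04, -0.6, 0.001]


def toy_score(w: List[float], mask: List[bool]) -> float:
    total = sum(abs(v) for v in w)
    return sum(abs(v) for v, keep in zip(w, mask) if keep) / total


mask = iterative_prune(weights, toy_score)
kept = [i for i, keep in enumerate(mask) if keep]
```

With these toy values only the four large-magnitude weights survive, which mirrors the stated aim of finding a small subnetwork that still performs comparably.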
Data-driven modeling in mechanics is evolving rapidly based on recent machine learning advances, especially in artificial neural networks. As the field matures, new data and models created by different groups become available, opening possibilities for cooperative modeling. However, artificial neural networks suffer from catastrophic forgetting, i.e. they forget how to perform an old task when trained on a new one. This hinders cooperation because adapting an existing model for a new task affects the performance on a previous task trained by someone else. The authors developed a continual learning method that addresses this issue, applying it here for the first time to solid mechanics. In particular, the method is applied to recurrent neural networks to predict history-dependent plasticity behavior, although it can be used on any other architecture (feedforward, convolutional, etc.) and to predict other phenomena. This work intends to spur future developments in continual learning that will foster cooperative strategies among the mechanics community to solve increasingly challenging problems. We show that the chosen continual learning strategy can sequentially learn several constitutive laws without forgetting them, using less data to achieve the same error as standard (non-cooperative) training of one law per model. ...
Journal article (2023) - Aleksandr Dekhovich, David M.J. Tax, Marcel H.F. Sluiter, Miguel A. Bessa
The human brain is capable of learning tasks sequentially mostly without forgetting. However, deep neural networks (DNNs) suffer from catastrophic forgetting when learning one task after another. We address this challenge considering a class-incremental learning scenario where the DNN sees test data without knowing the task from which this data originates. During training, Continual Prune-and-Select (CP&S) finds a subnetwork within the DNN that is responsible for solving a given task. Then, during inference, CP&S selects the correct subnetwork to make predictions for that task. A new task is learned by training available neuronal connections of the DNN (previously untrained) to create a new subnetwork by pruning, which can include previously trained connections belonging to other subnetwork(s) because it does not update shared connections. This eliminates catastrophic forgetting by creating specialized regions in the DNN that do not conflict with each other while still allowing knowledge transfer across them. The CP&S strategy is implemented with different subnetwork selection strategies, revealing superior performance to state-of-the-art continual learning methods tested on various datasets (CIFAR-100, CUB-200-2011, ImageNet-100 and ImageNet-1000). In particular, CP&S is capable of sequentially learning 10 tasks from ImageNet-1000 keeping an accuracy around 94% with negligible forgetting, a first-of-its-kind result in class-incremental learning. To the best of the authors’ knowledge, this represents an improvement in accuracy above 10% when compared to the best alternative method. ...
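The statement that a new subnetwork may reuse previously trained connections without updating them can be made concrete with a small sketch. It is a simplified illustration under assumed names (`train_step`, `freeze`, flat weight lists, hand-written masks), not the CP&S implementation: connections owned by an earlier subnetwork are frozen, and a later task updates only the free connections inside its own mask.

```python
from typing import List


def train_step(
    weights: List[float],
    grads: List[float],
    task_mask: List[bool],
    frozen: List[bool],
    lr: float = 0.5,
) -> List[float]:
    """One gradient step that touches only unfrozen weights inside the task mask."""
    return [
        w - lr * g if (in_task and not is_frozen) else w
        for w, g, in_task, is_frozen in zip(weights, grads, task_mask, frozen)
    ]


def freeze(frozen: List[bool], task_mask: List[bool]) -> List[bool]:
    """After a task is learned, its connections become read-only for later tasks."""
    return [f or m for f, m in zip(frozen, task_mask)]


# Toy network with four connections (assumed values).
weights = [1.0, 1.0, 1.0, 1.0]
grads = [1.0, 1.0, 1.0, 1.0]
frozen = [False, False, False, False]

# Task A uses connections 0 and 1.
mask_a = [True, True, False, False]
weights = train_step(weights, grads, mask_a, frozen)
frozen = freeze(frozen, mask_a)
weights_after_a = list(weights)

# Task B uses connections 1 and 2; connection 1 is shared with task A.
mask_b = [False, True, True, False]
weights = train_step(weights, grads, mask_b, frozen)
frozen = freeze(frozen, mask_b)
```

After task B is trained, the weights used by task A are unchanged, so task A's predictions are preserved, while task B still reads the shared connection and can benefit from what task A learned.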