Circular Image

L. Miranda da Cruz

info

Please Note

37 records found

Journal article (2027) - Santiago del Rey, Luís Cruz, Xavier Franch, Silverio Martínez-Fernández
To raise awareness of the environmental impact of deep learning (DL), numerous studies have estimated the energy consumption of DL systems. However, energy estimates during DL training often rely on unverified assumptions. This work addresses that gap by investigating how model architecture and training environment affect energy consumption. We train a variety of computer vision models and collect energy consumption and accuracy metrics to analyze their trade-offs across configurations. Our results show that selecting the right model–training environment combination can reduce training energy consumption by up to 80.68% with less than 2% loss in F1 score. We find a significant interaction effect between model and training environment: energy efficiency improves when GPU computational power scales with model complexity. Moreover, we demonstrate that common estimation practices, such as using FLOPs or GPU TDP, fail to capture these dynamics and can lead to substantial errors. To address these shortcomings, we propose the Stable Training Epoch Projection (STEP) and the Pre-training Regression-based Estimation (PRE) methods. Our evaluation demonstrates that STEP and PRE achieve reductions in Root Mean Squared Error (RMSE) up to 97% and 84%, respectively, when compared to existing estimation tools. ...
Journal article (2026) - Luis Cruz, Patricia Lago, Henry Muccini, Eoin Woods

The Convergence of Software Engineering and Green AI

Journal article (2025) - Luís Cruz, Xavier Franch, Silverio Martínez-Fernández
The latest advancements in machine learning, specifically in foundation models, are revolutionizing the frontiers of existing software engineering (SE) processes. This is a bi-directional phenomenon, where (1) software systems are now challenged to provide AI-enabled features to their users, and (2) AI is used to automate tasks within the software development lifecycle. In an era where sustainability is a pressing societal concern, our community needs to adopt a long-term plan enabling a conscious transformation that aligns with environmental sustainability values. In this article, we reflect on the impact of adopting environmentally friendly practices to create AI-enabled software systems and make considerations on the environmental impact of using foundation models for software development. ...
Failure prediction models can be significantly beneficial for managing large-scale complex software systems, but their trustworthiness is severely affected by changes in the data over time, also known as concept drift. Thus, monitoring these models against concept drift and retraining them when the data changes becomes crucial in designing reliable failure prediction models. In this work, we evaluate the effects of monitoring failure prediction models over time using label-independent (unsupervised) drift detectors. We show that retraining based on unsupervised drift detectors instead of periodically reduces the cost of acquiring true labels without compromising accuracy. Furthermore, we propose a novel feature reduction for unsupervised drift detectors and an evaluation pipeline that practitioners can employ to select the most suitable unsupervised drift detector for their application. ...

Adapting AIOps Capacity Forecasting Models to Data Changes

Conference paper (2025) - Lorena Poenaru-Olaru, Wouter Van't Hof, Adrian Stańdo, Arkadiusz P. Trawiński, Eileen Kapel, Jan S. Rellermeyer, Luis Cruz, Arie Van Deursen
Capacity management is critical for software organizations to allocate resources effectively and meet operational demands. An important step in capacity management is predicting future resource needs often relies on data-driven analytics and machine learning (ML) forecasting models, which require frequent retraining to stay relevant as data evolves. Continuously retraining the forecasting models can be expensive and difficult to scale, posing a challenge for engineering teams tasked with balancing accuracy and efficiency. Retraining only when the data changes appears to be a more computationally efficient alternative, but its impact on accuracy requires further investigation. In this work, we investigate the effects of retraining capacity forecasting models for time series based on detected changes in the data compared to periodic retraining. Our results show that drift-based retraining achieves comparable forecasting accuracy to periodic retraining in most cases, making it a costeffective strategy. However, in cases where data is changing rapidly, periodic retraining is still preferred to maximize the forecasting accuracy. These findings offer actionable insights for software teams to enhance forecasting systems, reducing retraining overhead while maintaining robust performance. ...

Optimizing Energy Efficiency Without Compromising Accuracy

Conference paper (2025) - Lorena Poenaru-Olaru, June Sallou, Luis Cruz, Jan S. Rellermeyer, Arie Van Deursen
The reliability of machine learning (ML) software systems is heavily influenced by changes in data over time. For that reason, ML systems require regular maintenance, typically based on model retraining. However, retraining requires significant computational demand, which makes it energy-intensive and raises concerns about its environmental impact. To understand which retraining techniques should be considered when designing sustainable ML applications, in this work, we study the energy consumption of common retraining techniques. Since the accuracy of ML systems is also essential, we compare retraining techniques in terms of both energy efficiency and accuracy. We showcase that retraining with only the most recent data compared to all available data reduces energy consumption by up to 25%, being a sustainable alternative to the status quo. Furthermore, our findings show that retraining a model only when there is evidence that updates are necessary, rather than on a fixed schedule, can reduce energy consumption by up to 40%, provided a reliable data change detector is in place. Our findings pave the way for better recommendations for ML practitioners, guiding them toward more energy-efficient retraining techniques when designing sustainable ML software systems. ...
Book chapter (2025) - Luís Cruz, Petra Heck
Technology brings exciting opportunities to improve our interactions with the natural surroundings. However, that same technological development might also negatively impact the environment. Every new technology has a carbon footprint, whether from its construction or operation. And most technological developments require software systems, and more recently AI-based software systems. For these software systems to positively impact our environment, they need to be developed and operated with sustainability in mind, also called 'green' in the discipline of software engineering. This chapter explores various dimensions of sustainability in software system development, drawing on existing software quality frameworks. We highlight green software best practices for development and knowledge transfer. We examine AI-based software systems, emphasising the importance of energy efficiency and carbon impact in the next generation of intelligent systems. This entails considering decisions at different stages of the AI lifecycle, ranging from underlying design choices in training pipelines to selecting optimal hardware for training and serving models. This chapter presents the intersection of green software, sustainable software engineering, and green AI as of major importance for future innovation. By prioritising sustainability in software development and AI, we can foster a more sustainable and eco-friendly future, with the potential to reduce energy consumption and mitigate the environmental impact of technology. ...

Energy Debugging And Testing for Android

Conference paper (2025) - Erik Blokland, Luís Cruz, Arie van Deursen
Energy consumption of software is becoming increasingly important in today’s mobile-focused world, but knowledge and techniques with which to measure energy consumption have lagged behind. This paper introduces a methodology for measuring the energy consumption of Android apps at the method level, and a concrete implementation of this methodology: EDATA. We evaluate EDATA by revisiting several Android code smells found in prior work to increase energy consumption, and by using a novel evaluation technique which allows us to generate a ground-truth for energy consumption using run-time as a proxy. Finally, we perform a case study on a real-world energy bug found in Adyen’s Android point of sale (POS) software. Our findings show that EDATA is able to accurately order methods by their energy consumption, and distinguish between different versions of a method. We also observed that debug mode has inconsistent effects on energy consumption, and that energy efficiency may not be consistent between devices. Finally, our case study shows that while developers and stakeholders agree that energy consumption is important, a lack of awareness and easy-to-use profiling prevents it from becoming a first-class metric in the development process. ...
Conference paper (2024) - Arumoy Shome, Luís Cruz, Arie Van Deursen
Although several fairness definitions and bias mitigation techniques exist in the literature, all existing solutions evaluate fairness of Machine Learning (ML) systems after the training stage. In this paper, we take the first steps towards evaluating a more holistic approach by testing for fairness both before and after model training. We evaluate the effectiveness of the proposed approach and position it within the ML development lifecycle, using an empirical analysis of the relationship between model dependent and independent fairness metrics. The study uses 2 fairness metrics, 4 ML algorithms, 5 real-world datasets and 1600 fairness evaluation cycles. We find a linear relationship between data and model fairness metrics when the distribution and the size of the training data changes. Our results indicate that testing for fairness prior to training can be a "cheap" and effective means of catching a biased data collection process early; detecting data drifts in production systems and minimising execution of full training cycles thus reducing development time and costs. ...
Effective change management is crucial for businesses heavily reliant on software and services to minimise incidents induced by changes. Unfortunately, in practice it is often difficult to effectively use artificial intelligence for IT Operations (AIOps) to enhance service management, primarily due to inadequate data quality. Establishing reliable links between changes and the induced incidents is crucial for identifying patterns, improving change deployment, identifying high-risk changes, and enhancing incident response. In this research, we investigate the enhancement of traceability between changes and incidents through AIOps methods. Our approach involves a close examination of incident-inducing changes, the replication of methods linking incidents to the changes that caused them, introducing an adapted method, and demonstrating its results using historical data and practical evaluations. Our findings reveal that incident-inducing changes exhibit different characteristics dependent on context. Furthermore, a significant disparity exists between assessments based on historical data and real-world observation, with an increased occurrence of false positives when identifying links between unlabeled changes and incidents. This study highlights the complex nature of identifying links between changes and incidents, emphasising the contextual influence on AIOps method effectiveness. While we are actively working on improving the quality of current data through AIOps approaches, it remains apparent that further measures are necessary to address issues like data imbalances and promote a postmortem culture that brings attention to the value of properly administrating tickets. A better overview of change failure rates contributes to improved risk compliance and reliable change management. ...

A Tool for Efficient Deep Learning Component Selection

Conference paper (2024) - Jai Kannan, Scott Barnett, Luis Cruz, Anj Simmons, Taylan Selvi
For software that relies on machine-learned functionality, model selection is key to finding the right model for the task with desired performance characteristics. Evaluating a model requires developers to i) select from many models (e.g. the Hugging face model repository), ii) select evaluation metrics and training strategy, and iii) tailor trade-offs based on the problem domain. However, current evaluation approaches are either ad-hoc resulting in sub-optimal model selection or brute force leading to wasted compute. In this work, we present GreenRunner, a novel tool to automatically select and evaluate models based on the application scenario provided in natural language. We leverage the reasoning capabilities of large language models to propose a training strategy and extract desired trade-offs from a problem description. GreenRunner features a resource-efficient experimentation engine that integrates constraints and trade-offs based on the problem into the model selection process. Our preliminary evaluation demonstrates that GreenRunner is both efficient and accurate compared to ad-hoc evaluations and brute force. This work presents an important step toward energy-efficient tools to help reduce the environmental impact caused by the growing demand for software with machine-learned functionality. Our tool is available at Figshare GreenRunner. ...

Insights from a Case Study at ING

Conference paper (2024) - Eileen Kapel, Luís Cruz, Diomidis Spinellis, Arie Van Deursen
An incident management process is necessary in businesses that depend strongly on software and services. A proper process is essential to guarantee that incidents are well-handled, especially in a financial software-defined business needing to adhere to guidelines and regulations. This paper aims to enhance understanding of the current state of practice through a single-case exploratory case study, at the international bank ING, by interviewing 15 subject matter experts on the incident management process. The research identifies eight core observations on tool usage, the challenges experienced and future opportunities. Core challenges include monitoring data quality, the complexity of the environment, and the balance between minimising incident resolution time and following procedural guidelines. Future opportunities can lessen these challenges by making better use of available tooling and employing machine learning approaches. This requires tight supervision on the use of best practices and good monitoring data quality. The findings emphasise the need for a strengthened focus on improving the quality of monitoring data, handling environment complexity, incident clustering, and better support for regulatory compliance. ...

An Exploratory Study

Conference paper (2024) - Pooja Rani, Jonas Zellweger, Veronika Kousadianos, Luis Cruz, Timo Kehrer, Alberto Bacchelli
As the energy footprint generated by software is increasing at an alarming rate, understanding how to develop energy-efficient applications has become a necessity. Previous work has introduced catalogs of coding practices, also known as energy patterns. These patterns are yet limited to Mobile or third-party libraries. In this study, we focus on the Web domain-a main source of energy consumption. First we investigated whether and how Mobile energy patterns can be ported to this domain and found that 20 patterns could be ported. Then, we interviewed six expert web developers from different companies to challenge the ported patterns. Most developers expressed concerns for antipatterns, specifically with functional antipatterns, and were able to formulate guidelines to locate these patterns in the source code. Finally, to quantify the effect of Web energy patterns on energy consumption, we set up an automated pipeline to evaluate two ported patterns: 'Dynamic Retry Delay' (DRD) and 'Open Only When Necessary' (OOWN). With this, we found no evidence that the DRD pattern consumes less energy than its antipattern, while the opposite is true for OOWN. Data and Material: https://doi.org/10.5281/zenodo.8404487. ...
Anomaly detection techniques are essential in automating the monitoring of IT systems and operations. These techniques imply that machine learning algorithms are trained on operational data corresponding to a specific period of time and that they are continuously evaluated on newly emerging data. Operational data is constantly changing over time, which affects the performance of deployed anomaly detection models. Therefore, continuous model maintenance is required to preserve the performance of anomaly detectors over time. In this work, we analyze two different anomaly detection model maintenance techniques in terms of the model update frequency, namely blind model retraining and informed model retraining. We further investigate the effects of updating the model by retraining it on all the available data (full-history approach) and only the newest data (sliding window approach). Moreover, we investigate whether a data change monitoring tool is capable of determining when the anomaly detection model needs to be updated through retraining. ...

Strategic Model Selection for Ensembles in Production

Conference paper (2024) - Nienke Nijkamp, June Sallou, Niels van der Heijden, Luís Cruz
Integrating Artificial Intelligence (AI) into software systems has significantly enhanced their capabilities while escalating energy demands. Ensemble learning, combining predictions from multiple models to form a single prediction, intensifies this problem due to cumulative energy consumption. This paper presents a novel approach to model selection that addresses the challenge of balancing the accuracy of AI models with their energy consumption in a live AI ensemble system. We explore how reducing the number of models or improving the efficiency of model usage within an ensemble during inference can reduce energy demands without substantially sacrificing accuracy. This study introduces and evaluates two model selection strategies, Static and Dynamic, for optimizing ensemble learning systems' performance while minimizing energy usage. Our results demonstrate that the Static strategy improves the F1 score beyond the baseline, reducing average energy usage from 100% from the full ensemble to 62%. The Dynamic strategy further enhances F1 scores, using on average 76% compared to 100% of the full ensemble. Moreover, we propose an approach that balances accuracy with resource consumption, significantly reducing energy usage without substantially impacting accuracy. This method decreased the average energy usage of the Static strategy from approximately 62% to 14%, and for the Dynamic strategy, from around 76% to 57%. Our field study of Green AI using an operational AI system developed by a large professional services provider shows the practical applicability of adopting energy-conscious model selection strategies in live production environments. ...
Literature review (2023) - Roberto Verdecchia, June Sallou, Luís Cruz
With the ever-growing adoption of artificial intelligence (AI)-based systems, the carbon footprint of AI is no longer negligible. AI researchers and practitioners are therefore urged to hold themselves accountable for the carbon emissions of the AI models they design and use. This led in recent years to the appearance of researches tackling AI environmental sustainability, a field referred to as Green AI. Despite the rapid growth of interest in the topic, a comprehensive overview of Green AI research is to date still missing. To address this gap, in this article, we present a systematic review of the Green AI literature. From the analysis of 98 primary studies, different patterns emerge. The topic experienced a considerable growth from 2020 onward. Most studies consider monitoring AI model footprint, tuning hyperparameters to improve model sustainability, or benchmarking models. A mix of position papers, observational studies, and solution papers are present. Most papers focus on the training phase, are algorithm-agnostic or study neural networks, and use image data. Laboratory experiments are the most common research strategy. Reported Green AI energy savings go up to 115%, with savings over 50% being rather common. Industrial parties are involved in Green AI studies, albeit most target academic readers. Green AI tool provisioning is scarce. As a conclusion, the Green AI research field results to have reached a considerable level of maturity. Therefore, from this review emerges that the time is suitable to adopt other Green AI research strategies, and port the numerous promising academic results to industrial practice. This article is categorized under: Technologies > Machine Learning. ...
Conference paper (2023) - T.E.R. Yarally, Luis Cruz, Daniel Feitosa, J. Sallou, A. van Deursen
The batch size is an essential parameter to tune during the development of new neural networks. Amongst other quality indicators, it has a large degree of influence on the model’s accuracy, generalisability, training times and parallelisability. This fact is generally known and commonly studied. However, during the application phase of a deep learning model, when the model is utilised by an end-user for inference, we find that there is a disregard for the potential benefits of introducing a batch size. In this study, we examine the effect of input batching on the energy consumption and response times of five fully-trained neural networks for computer vision that were considered state-of-the-art at the time of their publication. The results suggest that batching has a significant effect on both of these metrics. Furthermore, we present a timeline of the energy efficiency and accuracy of neural networks over the past decade. We find that in general, energy consumption rises at a much steeper pace than accuracy and question the necessity of this evolution. Additionally, we highlight one particular network, ShuffleNetV2 (2018), that achieved a competitive performance for its time while maintaining a much lower energy consumption. Nevertheless, we highlight that the results are model dependent. ...
Conference paper (2023) - Wander Siemers, June Sallou, Luis Cruz
Artificial intelligence is bringing ever new functionalities to the realm of mobile devices that are now considered essential (e.g., camera and voice assistants, recommender systems). Yet, operating artificial intelligence takes up a substantial amount of energy. However, artificial intelligence is also being used to enable more energy-efficient solutions for mobile systems. Hence, artificial intelligence has two faces in that regard, it is both a key enabler of desired (efficient) mobile functionalities and a major power draw on these devices, playing a part in both the solution and the problem. In this paper, we present a review of the literature of the past decade on the usage of artificial intelligence within the realm of green mobile computing. From the analysis of 34 papers, we highlight the emerging patterns and map the field into 13 main topics that are summarized in details. Our results showcase that the field is slowly increasing in the past years, more specifically, since 2019. Regarding the double impact AI has on the mobile energy consumption, the energy consumption of AI-based mobile systems is under-studied in comparison to the usage of AI for energy-efficient mobile computing, and we argue for more exploratory studies in that direction. We observe that although most studies are framed as solution papers (94%), the large majority do not make those solutions publicly available to the community. Moreover, we also show that most contributions are purely academic (28 out of 34 papers) and that we need to promote the involvement of the mobile software industry in this field. ...
Deployed machine learning systems often suffer from accuracy degradation over time generated by constant data shifts, also known as concept drift. Therefore, these systems require regular maintenance, in which the machine learning model needs to be adapted to concept drift. The literature presents plenty of model adaptation techniques. The most common technique is periodically executing the whole training pipeline with all the data gathered until a particular point in time, yielding a massive energy footprint. In this paper, we propose a research path that uses concept drift detection and adaptation to enable sustainable AI systems. ...
AIOps solutions enable faster discovery of failures in operational large-scale systems through machine learning models trained on operation data. These models become outdated during the occurrence of concept drift, a term used to describe shifts in data distributions. In operation data concept drift is inevitable and it impacts the performance of AIOps solutions over time. Therefore, concept drift should be closely monitored and immediate maintenance to prevent erroneous predictions is required. In this work, we propose an automated maintenance pipeline for AIOps models that monitors the occurrence of concept drift and chooses the most appropriate model retraining technique according to the drift type. ...