G. Iosifidis | TU Delft Repository

Constrained Online Convex Optimization with Memory and Predictions

Journal article (2026) - Mohammed Abdullah, George Iosifidis, Salah Eddine Elayoubi, Tijani Chahed

We study Constrained Online Convex Optimization with Memory (COCO-M), where both the loss and the constraints depend on a finite window of past decisions made by the learner. This setting extends the previously studied unconstrained online optimization with memory framework and captures practical problems such as the control of constrained dynamical systems and scheduling with reconfiguration budgets. For this problem, we propose the first algorithms that achieve sublinear regret and sublinear cumulative constraint violation under time-varying constraints, both with and without predictions of future loss and constraint functions. Without predictions, we introduce an adaptive penalty approach that guarantees sublinear regret and constraint violation. When short-horizon and potentially unreliable predictions are available, we reinterpret the problem as online learning with delayed feedback and design an optimistic algorithm whose performance improves as prediction accuracy improves, while remaining robust when predictions are inaccurate. Our results bridge the gap between classical constrained online convex optimization and memory-dependent settings, and provide a versatile learning toolbox with diverse applications. ...

Meta-Learning-Based Handover Management in NextG O-RAN

Journal article (2026) - Michail Kalntis, George Iosifidis, Jose Suarez-Varela, Andra Lutu, Fernando A. Kuipers

While traditional handovers (THOs) have served as a backbone for mobile connectivity, they increasingly suffer from failures and delays, especially in dense deployments and high-frequency bands. To address these limitations, 3GPP introduced Conditional Handovers (CHOs) that enable proactive cell reservations and user-driven execution. However, both handover (HO) types present intricate trade-offs in signaling, resource usage, and reliability. This paper presents unique, countrywide mobility management datasets from a top-tier mobile network operator (MNO) that offer fresh insights into these issues and call for adaptive and robust HO control in next-generation networks. Motivated by these findings, we propose CONTRA, a framework that, for the first time, jointly optimizes THOs and CHOs within the O-RAN architecture. We study two variants of CONTRA: one where users are a priori assigned to one of the HO types, reflecting distinct service or user-specific requirements, as well as a more dynamic formulation where the controller decides on-the-fly the HO type, based on system conditions and needs. To this end, it relies on a practical meta-learning algorithm that adapts to runtime observations and guarantees performance comparable to an oracle with perfect future information (universal no-regret). CONTRA is specifically designed for near-real-time deployment as an O-RAN xApp and aligns with the 6G goals of flexible and intelligent control. Extensive evaluations leveraging crowdsourced datasets show that CONTRA improves user throughput and reduces both THO and CHO switching costs, outperforming 3GPP-compliant and Reinforcement Learning (RL) baselines in dynamic and real-world scenarios. ...

CHOMET: Conditional Handovers via Meta-Learning

Conference paper (2025) - M. Kalntis, F. A. Kuipers, G. Iosifidis

Handovers (HOs) are the cornerstone of modern cellular networks for enabling seamless connectivity to a vast and diverse number of mobile users. However, as mobile networks become more complex with more diverse users and smaller cells, traditional HOs face significant challenges, such as prolonged delays and increased failures. To mitigate these issues, 3GPP introduced conditional handovers (CHOs), a new type of HO that enables the preparation (i.e., resource allocation) of multiple cells for a single user to increase the chance of HO success and decrease the delays in the procedure. Despite its advantages, CHO introduces new challenges that must be addressed, including efficient resource allocation and managing signaling/communication overhead from frequent cell preparations and releases. This paper presents a novel framework aligned with the O-RAN paradigm that leverages meta-learning for CHO optimization, providing robust dynamic regret guarantees and demonstrating at least 180% superior performance than other 3GPP benchmarks in volatile signal conditions. ...

On the Dynamic Regret of Following the Regularized Leader

Optimism with History Pruning

Journal article (2025) - Naram Mhaisen, George Iosifidis

We revisit the Follow the Regularized Leader (FTRL) framework for Online Convex Optimization (OCO) over compact sets, focusing on achieving dynamic regret guarantees. Prior work has highlighted the framework’s limitations in dynamic environments due to its tendency to produce “lazy” iterates. However, building on insights showing FTRL’s ability to produce “agile” iterates, we show that it can indeed recover known dynamic regret bounds through optimistic composition of future costs and careful linearization of past costs, which can lead to pruning some of them. This new analysis of FTRL against dynamic comparators yields a principled way to interpolate between greedy and agile updates and offers several benefits, including refined control over regret terms, optimism without cyclic dependence, and the application of minimal recursive regularization akin to AdaFTRL. More broadly, we show that it is not the “lazy” projection style of FTRL that hinders (optimistic) dynamic regret, but the decoupling of the algorithm’s state (linearized history) from its iterates, allowing the state to grow arbitrarily. Instead, pruning synchronizes these two when necessary. ...

Cooperative Edge Inferences With Online Learning

Journal article (2025) - Mengyuan Li, R. Venkatesha Prasad, George Iosifidis

The efficient execution of inferences at the edge is becoming increasingly critical for communication systems that are expected to provide users with fast and accurate mobile data analytics. These inference tasks are inherently latency-sensitive and computationally demanding, whereas edge nodes are limited by energy budgets and heterogeneous resources. This article studies how a set of edge nodes can collaborate in executing demanding streaming inference tasks to optimize their aggregate performance. Such collaborative task exchange schemes enable the sharing of scarce computing resources and machine learning (ML) models (which perform the inferences) and constitute a scalable approach to this intricate problem. We formulate this exchange process as an online convex optimization (OCO) problem and design a dynamic task assignment algorithm, which is proven to have optimality guarantees even when the network and service parameters (resources and task properties) are unknown and vary arbitrarily over time. The algorithm aims to maximize inference accuracy while minimizing overall task latency and energy (including for data transfers) and simultaneously ensures that collaborating nodes do not suffer imbalanced energy costs. Through a series of data-driven experiments, we quantify the cooperation benefits under different weight combinations and validate the convergence and adaptability of the proposed learning algorithm across diverse conditions, including variations of system parameters, as well as heterogeneity across nodes and tasks. ...

Adaptive Resource Allocation for Virtualized Base Stations in O-RAN With Online Learning

Journal article (2025) - Michail Kalntis, George Iosifidis, Fernando A. Kuipers

Open RAN systems, with their virtualized base stations (vBSs), offer increased flexibility and reduced costs, vendor diversity, and interoperability. However, optimizing the allocation of radio resources in such systems raises new challenges due to the volatile vBS operation, and the dynamic network conditions and user demands they are called to support. Leveraging the novel O-RAN multi-tier control architecture, we propose a new set of resource allocation threshold policies with the aim of balancing the vBSs' performance and energy consumption in a robust and provably optimal fashion. To that end, we introduce an online learning algorithm that operates under minimal assumptions and without requiring knowledge of the environment, hence being suitable even for "challenging" environments with non-stationary or adversarial demands and conditions. We also develop a meta-learning scheme that utilizes other available algorithmic schemes, e.g., tailored for more "easy" environments, by choosing dynamically the best-performing algorithm, thus enhancing the system's effectiveness. We prove that the proposed solutions achieve sub-linear regret (zero optimality gap), and characterize their dependence on the main system parameters. The performance of the algorithms is evaluated with real-world data from a testbed, in stationary and adversarial conditions, indicating energy savings of up to 64.5% compared with several state-of-the-art benchmarks. ...

Network-cycle motif participation is associated with individual and collective wealth in Honduran villages

Journal article (2025) - Shivkumar Vishnempet Shridhar, Selena T. Lee, Yanick Charette, George Iosifidis, Nicholas A. Christakis

Geodesic cycles, or loops of nodes connected in a sequence within a network, are an important if under-studied network motif, and their prominence or deficiency is associated with both beneficial and detrimental properties in diverse kinds of networks. Here, we examine cycles formed by people’s reports of informal borrowing/lending and friendship ties among 22,551 rural Hondurans (in 174 isolated villages), and we explore their association with personal and community wealth across two time points. We find that cycles of different lengths (i.e., 3 or 4 ties in a loop) constitute an over-represented motif, and their quantity is strongly associated with individual wealth, i.e., richer individuals are involved in more cycles. Furthermore, we introduce a new metric of cycle composition, defined as the average of some measure (e.g., wealth) of a node’s alters in its cycles, and find that this metric outperforms cycle quantity as an indicator of both current and future wealth. A longitudinal analysis also reflects a higher participation rate in future cycles among wealthier individuals. When benchmarking cycles with eigenvector centrality, we find that cycle participation offers distinctive insights. Finally, cycle composition is a strong indicator of overall village wealth. In sum, the potential for the flow of money in a village through structural social network cycles may relate to both individual-level and village-level wealth. ...
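The cycle composition metric described above (the average of some measure, e.g., wealth, of a node's alters in its cycles) can be illustrated with a minimal sketch. The function names, the toy graph, and the wealth values are hypothetical, and two simplifications are assumptions rather than the authors' procedure: cycles are taken to be simple cycles of 3 or 4 ties, and an alter that appears in several cycles is counted once per cycle.

```python
def cycles_through(adj, node, lengths=(3, 4)):
    """Return the simple cycles of the given lengths that pass through `node`.

    Each cycle is stored as the tuple of the other nodes on it (the alters),
    in a canonical orientation so the two traversal directions count once.
    """
    found = set()
    max_len = max(lengths)

    def extend(path):
        last = path[-1]
        for nxt in adj[last]:
            if nxt == node and len(path) in lengths:
                # Closing the loop: keep one orientation of the alter sequence.
                rest = tuple(path[1:])
                found.add(min(rest, rest[::-1]))
            elif nxt not in path and len(path) < max_len:
                extend(path + [nxt])

    extend([node])
    return found


def cycle_composition(adj, measure, node, lengths=(3, 4)):
    """Average `measure` over the alters in the node's cycles (None if no cycle)."""
    alters = [a for cycle in cycles_through(adj, node, lengths) for a in cycle]
    if not alters:
        return None
    return sum(measure[a] for a in alters) / len(alters)


# Toy village: a triangle A-B-C with a pendant node D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
wealth = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 10.0}

quantity_a = len(cycles_through(adj, "A"))         # cycle quantity of A
composition_a = cycle_composition(adj, wealth, "A")  # mean wealth of B and C
composition_d = cycle_composition(adj, wealth, "D")  # D lies on no cycle
```

In this toy example, cycle quantity and cycle composition are separate quantities: the first counts the loops a node belongs to, while the second summarizes who else is in those loops.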

Minimization of the Training Makespan in Hybrid Federated Split Learning

Journal article (2025) - Joana Tirana, Dimitra Tsigkari, George Iosifidis, Dimitris Chatzopoulos

Parallel Split Learning (SL) allows resource-constrained devices that cannot participate in Federated Learning (FL) to train deep neural networks (NNs) by splitting the NN model into parts. In particular, such devices (clients) may offload the processing task of the largest model part to a computationally powerful helper, and multiple helpers may be employed and work in parallel. In hybrid federated and split learning (HFSL), on the other hand, devices can participate in the training process through any of the two protocols (SL and FL), depending on the system's characteristics. This could considerably reduce the maximum training time over all clients (makespan), especially in highly heterogeneous scenarios. In this paper, we study the joint problem of the training protocol selection, client-helper assignments, and scheduling decisions, to minimize the training makespan. We prove this problem is NP-hard and propose two solution methods: one based on the decomposition of the problem by leveraging its inherent symmetry, and a second fully scalable one. Through numerical evaluations using our testbed's measurements, we build a solution strategy comprising these methods. Moreover, this strategy finds a near-optimal solution and achieves a shorter makespan than the baseline schemes by up to 71%. ...

Multi-Objective Reverse Offloading in Edge Computing for AI Tasks

Journal article (2025) - Petros Amanatidis, George Michailidis, Dimitris Karampatzakis, Vasileios Kalenteridis, George Iosifidis, Thomas Lagkas

Offloading tasks between edge nodes is a subject that has drawn a lot of attention since edge computing first emerged. A large number of edge IoT devices utilizing increased computing resources, such as autonomous vehicles and UAVs, can be used to execute AI tasks close to users. We present a novel approach that deviates from the conventional edge computing offloading concept, namely offloading computationally intensive tasks from cloudlets to nearby end nodes. Specifically, we enhance a scenario where end nodes assist more powerful nodes (like cloudlets) in executing AI inference tasks. In edge computing networks, as end nodes grow in number, they build an idle computing capacity which can solve and provide efficient solutions. Our goal is to solve a defined Multi-Objective optimization problem with three objectives, namely the overall execution time (slowest subtasks), the execution accuracy, and the total energy consumption. We address this challenging optimization problem using a novel method with our released Multi-Objective Edge AI-Adaptive Reverse Offloading, or MOEAI-ARO, algorithm. Using an edge computing testbed and a representative AI service, we demonstrate the effectiveness of our reverse offloading proposal and method. The results indicate that our method further optimizes the system's performance compared to baseline algorithms. ...

Adaptive reverse task offloading in edge computing for AI processes

Journal article (2024) - Petros Amanatidis, Dimitris Karampatzakis, Georgios Michailidis, Thomas Lagkas, George Iosifidis

Nowadays, we witness the proliferation of edge IoT devices, ranging from smart cameras to autonomous vehicles, with increasing computing capabilities, used to implement AI-based services in users’ proximity, right at the edge. As these services are often computationally demanding, the popular paradigm of offloading their tasks to nearby cloud servers has gained much traction and been studied extensively. In this work, we propose a new paradigm that departs from the above typical edge computing offloading idea. Namely, we argue that it is possible to leverage these end nodes to assist larger nodes (e.g., cloudlets) in executing AI tasks. Indeed, as more and more end nodes are deployed, they create an abundance of idle computing capacity, which, when aggregated and exploited in a systematic fashion, can be proved beneficial. We introduce the idea of reverse offloading and study a scenario where a powerful node splits an AI task into a group of subtasks and assigns them to a set of nearby edge IoT nodes. The goal of each node is to minimize the overall execution time, which is constrained by the slowest subtask, while adhering to predetermined energy consumption and AI performance constraints. This is a challenging MINLP (Mixed Integer Non-Linear Problem) optimization problem that we tackle with a novel approach through our newly introduced EAI-ARO (Edge AI-Adaptive Reverse Offloading) algorithm. Furthermore, a demonstration of the efficacy of our reverse offloading proposal using an edge computing testbed and a representative AI service is performed. The findings suggest that our method optimizes the system’s performance significantly when compared with a greedy and a baseline task offloading algorithm. ...

Fair resource allocation in virtualized O-RAN platforms

Journal article (2024) - Fatih Aslan, George Iosifidis, Jose A. Ayala-Romero, Andres Garcia-Saavedra, Xavier Costa-Perez

O-RAN systems and their deployment in virtualized general-purpose computing platforms (O-Cloud) constitute a paradigm shift expected to bring unprecedented performance gains. However, these architectures raise new implementation challenges and threaten to worsen the already-high energy consumption of mobile networks. This paper presents first a series of experiments which assess the O-Cloud's energy costs and their dependency on the servers' hardware, capacity and data traffic properties which, typically, change over time. Next, it proposes a compute policy for assigning the base station data loads to O-Cloud servers in an energy-efficient fashion; and a radio policy that determines at near-real-time the minimum transmission block size for each user so as to avoid unnecessary energy costs. The policies balance energy savings with performance, and ensure that both of them are dispersed fairly across the servers and users, respectively. To cater for the unknown and time-varying parameters affecting the policies, we develop a novel online learning framework with fairness guarantees that apply to the entire operation horizon of the system (long-term fairness). The policies are evaluated using trace-driven simulations and are fully implemented in an O-RAN compatible system where we measure the energy costs and throughput in realistic scenarios. ...

Deep Reinforcement Learning for Orchestrating Cost-Aware Reconfigurations of vRANs

Journal article (2024) - Fahri Wisnu Murti, Samad Ali, George Iosifidis, Matti Latva-aho

Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at a low cost over commodity platforms to enable network management flexibility. In this paper, a novel vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the base stations (BSs), locations of the virtualized central units (vCUs) and distributed units (vDUs), their resources, and the routing for each BS data flow. The objective is to minimize the long-term total network operation cost while adapting to the varying traffic demands and resource availability. In the first step, testbed measurements are performed to study the relationship between the traffic demands and computing resources, which reveals high variance and depends on the platform and its load. Consequently, finding the perfect model of the underlying system is non-trivial. Therefore, to solve the proposed problem, a deep reinforcement learning (RL)-based framework is proposed and developed using model-free RL approaches. Moreover, the problem consists of multiple BSs sharing the same resources, which results in a multi-dimensional discrete action space and leads to a combinatorial number of possible actions. To overcome this curse of dimensionality, an action branching architecture, which is an action decomposition method with a shared decision module followed by a neural network, is combined with the Dueling Double Deep Q-network (D3QN) algorithm. Simulations are carried out using an O-RAN compliant model and real traces of the testbed. Our numerical results show that the proposed framework successfully learns the optimal policy that adaptively selects the vRAN configurations, where its learning convergence can be further expedited through transfer learning even in different vRAN systems. It also offers significant cost savings by up to 59% of a static benchmark, 35% of Deep Deterministic Policy Gradient with discretization, and 76% of non-branching D3QN. ...

Adaptive Online Non-stochastic Control

Conference paper (2024) - Naram Mhaisen, George Iosifidis

We tackle the problem of Non-stochastic Control (NSC) with the aim of obtaining algorithms whose policy regret is proportional to the difficulty of the controlled environment. Namely, we tailor the Follow The Regularized Leader (FTRL) framework to dynamical systems by using regularizers that are proportional to the actual witnessed costs. The main challenge arises from using the proposed adaptive regularizers in the presence of a state, or equivalently, a memory, which couples the effect of the online decisions and requires new tools for bounding the regret. Via new analysis techniques for NSC and FTRL integration, we obtain novel disturbance action controllers (DAC) with sub-linear data adaptive policy regret bounds that shrink when the trajectory of costs has small gradients, while staying sub-linear even in the worst case. ...

Workflow Optimization for Parallel Split Learning

Conference paper (2024) - Joana Tirana, Dimitra Tsigkari, George Iosifidis, Dimitris Chatzopoulos

Split learning (SL) has been recently proposed as a way to enable resource-constrained devices to train multi-parameter neural networks (NNs) and participate in federated learning (FL). In a nutshell, SL splits the NN model into parts, and allows clients (devices) to offload the largest part as a processing task to a computationally powerful helper. In parallel SL, multiple helpers can process model parts of one or more clients, thus considerably reducing the maximum training time over all clients (makespan). In this paper, we focus on orchestrating the workflow of this operation, which is critical in highly heterogeneous systems, as our experiments show. In particular, we formulate the joint problem of client-helper assignments and scheduling decisions with the goal of minimizing the training makespan, and we prove that it is NP-hard. We propose a solution method based on the decomposition of the problem by leveraging its inherent symmetry, and a second one that is fully scalable. A wealth of numerical evaluations using our testbed’s measurements allows us to build a solution strategy comprising these methods. Moreover, we show that this strategy finds a near-optimal solution, and achieves a shorter makespan than the baseline scheme by up to 52.3%. ...

Balancing Energy Preservation and Performance in Energy-Harvesting Sensor Networks

Journal article (2024) - Jernej Hribar, Ryoichi Shinkuma, Kuon Akiyama, George Iosifidis, Ivana Dusparic

The development of environmentally friendly, green communications is at the forefront of designing future Internet of Things (IoT) networks, although many opportunities to improve energy conservation from energy-harvesting (EH) sensors remain unexplored. Ubiquitous computing power, available in the form of cloudlets, enables the processing of the collected observations at the network edge. Often, the information that the Artificial Intelligence of Things (AIoT) application obtains by processing observations from one sensor can also be obtained by processing observations from another sensor. Consequently, a sensor can take advantage of the correlation between processed observations to avoid unnecessary transmissions and save energy. For example, when two cameras monitoring the same intersection detect the same vehicles, the system can recognize this overlap and reduce redundant data transmissions. This approach allows the network to conserve energy while still ensuring accurate vehicle detection, thereby maintaining the overall performance of the AIoT task. In this article, we consider such a system and develop a novel solution named balancing energy efficiency in sensor networks with multiagent reinforcement learning (BEES-MARL). Our proposed solution is capable of taking advantage of correlations in a system with multiple EH-powered sensors observing the same scene and transmitting their observations to a cloudlet. We evaluate the proposed solution in two data-driven use cases to verify its benefits and in a general setting to demonstrate scalability. Our solution improves task performance, measured by recall, by up to 16% over a heuristic approach, while minimizing latency and preventing outages. ...

Through the Telco Lens

A Countrywide Empirical Study of Cellular Handovers

Conference paper (2024) - Michail Kalntis, José Suárez-Varela, Jesús Omaña Iglesias, Anup Kiran Bhattacharjee, George Iosifidis, Fernando A. Kuipers, Andra Lutu

Cellular networks rely on handovers (HOs) as a fundamental element to enable seamless connectivity for mobile users. A comprehensive analysis of HOs can be achieved through data from Mobile Network Operators (MNOs); however, the vast majority of studies employ data from measurement campaigns within confined areas and with limited end-user devices, thereby providing only a partial view of HOs. This paper presents the first countrywide analysis of HO performance, from the perspective of a top-tier MNO in a European country. We collect traffic from approximately 40M users for 4 weeks and study the impact of the radio access technologies (RATs), device types, and manufacturers on HOs across the country. We characterize the geo-temporal dynamics of horizontal (intra-RAT) and vertical (inter-RATs) HOs, at the district level and at millisecond granularity, and leverage open datasets from the country's official census office to associate our findings with the population. We further delve into the frequency, duration, and causes of HO failures, and model them using statistical tools. Our study offers unique insights into mobility management, highlighting the heterogeneity of the network and devices, and their effect on HOs. ...

Optimistic Online Non-stochastic Control via FTRL

Conference paper (2024) - Naram Mhaisen, George Iosifidis

This paper brings the concept of 'optimism' to the new and promising framework of online Non-stochastic Control (NSC). Namely, we study how NSC can benefit from a prediction oracle of unknown quality responsible for forecasting future costs. The posed problem is first reduced to an optimistic learning with delayed feedback problem, which is handled through the Optimistic Follow the Regularized Leader (OFTRL) algorithmic family. This reduction enables the design of OptFTRL-C, the first Disturbance Action Controller (DAC) with optimistic policy regret bounds. These new bounds are commensurate with the oracle's accuracy, ranging from O(1) for perfect predictions to the order-optimal O(√T) even when all predictions fail. By addressing the challenge of incorporating untrusted predictions into online control, this work contributes to the advancement of the NSC framework and paves the way toward effective and robust learning-based controllers. ...

Reservation of Virtualized Resources with Optimistic Online Learning

Conference paper (2023) - Jean-Baptiste Monteil, Georgios Iosifidis, Ivana Dusparic

The virtualization of wireless networks enables new services to access network resources made available by the Network Operator (NO) through a Network Slicing market. The different service providers (SPs) have the opportunity to lease the network resources from the NO to constitute slices that address the demand of their specific network service. The goal of any SP is to maximize its service utility and minimize costs from leasing resources while facing uncertainties of the prices of the resources and the users' demand. In this paper, we propose a solution that allows the SP to decide its online reservation policy, which aims to maximize its service utility and minimize its cost of reservation simultaneously. We design the Optimistic Online Learning for Reservation (OOLR) solution, a decision algorithm built upon the Follow-the-Regularized Leader (FTRL), that incorporates key predictions to assist the decision-making process. Our solution achieves a O(√T) regret bound where T represents the horizon. We integrate a prediction model into the OOLR solution and we demonstrate through numerical results the efficacy of the combined models' solution against the FTRL baseline. ...
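To illustrate how predictions enter an FTRL-style update, the following is a minimal sketch of a generic optimistic FTRL step for linear losses over a Euclidean ball. It is not the OOLR algorithm itself: the fixed step size `eta`, the quadratic regularizer, the ball-shaped decision set, and all function names are assumptions made for illustration only.

```python
import math


def project_ball(x, radius):
    """Euclidean projection of vector x onto the ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def optimistic_ftrl(gradients, predictions, eta, radius=1.0):
    """Play x_t = argmin_x <G_{t-1} + m_t, x> + ||x||^2 / (2 * eta) over the ball.

    G_{t-1} is the sum of the gradients observed so far and m_t is the
    prediction of the upcoming gradient. With a quadratic regularizer the
    minimizer is the projection of -eta * (G_{t-1} + m_t) onto the ball.
    """
    dim = len(gradients[0])
    grad_sum = [0.0] * dim
    decisions = []
    for g, m in zip(gradients, predictions):
        x = project_ball([-eta * (s + p) for s, p in zip(grad_sum, m)], radius)
        decisions.append(x)
        # The true gradient is revealed only after the decision is made.
        grad_sum = [s + gi for s, gi in zip(grad_sum, g)]
    return decisions


def total_loss(gradients, decisions):
    """Cumulative linear loss sum_t <g_t, x_t>."""
    return sum(sum(gi * xi for gi, xi in zip(g, x))
               for g, x in zip(gradients, decisions))


grads = [[1.0, 0.0]] * 3
perfect = optimistic_ftrl(grads, grads, eta=0.5)           # accurate predictions
blind = optimistic_ftrl(grads, [[0.0, 0.0]] * 3, eta=0.5)  # no predictions (plain FTRL)
loss_perfect = total_loss(grads, perfect)
loss_blind = total_loss(grads, blind)
```

On this toy sequence, the run with accurate predictions accumulates a lower loss than the run with zero predictions, which reduces to plain FTRL.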

Enabling Long-term Fairness in Dynamic Resource Allocation

Journal article (2023) - Tareq Si Salem, Georgios Iosifidis, Giovanni Neglia

We study fairness in the dynamic resource allocation problem under the α-fairness criterion. We recognize two different fairness objectives that naturally arise in this problem: the well-understood slot-fairness objective that aims to ensure fairness at every timeslot, and the less explored horizon-fairness objective that aims to ensure fairness across utilities accumulated over a time horizon. We argue that horizon-fairness comes at a lower price in terms of social welfare. We study horizon-fairness with the regret as a performance metric and show that vanishing regret cannot be achieved in the presence of an unrestricted adversary. We propose restrictions on the adversary's capabilities corresponding to realistic scenarios and an online policy that indeed guarantees vanishing regret under these restrictions. ...
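For reference, the two objectives can be written with the standard α-fair welfare function. The notation below (agents i, utilities u_{t,i} at timeslot t, horizon T) is an assumption chosen for illustration and may differ from the paper's.

```latex
% Standard alpha-fair welfare of a utility vector u = (u_1, ..., u_n), u_i > 0
F_\alpha(u) =
\begin{cases}
  \sum_{i=1}^{n} \dfrac{u_i^{1-\alpha}}{1-\alpha}, & \alpha \ge 0,\ \alpha \neq 1, \\[2ex]
  \sum_{i=1}^{n} \log u_i, & \alpha = 1.
\end{cases}

% Slot-fairness: fairness is evaluated at every timeslot, then averaged
\frac{1}{T} \sum_{t=1}^{T} F_\alpha(u_t)

% Horizon-fairness: fairness is evaluated on the time-averaged utilities
F_\alpha\!\left( \frac{1}{T} \sum_{t=1}^{T} u_t \right)
```

Since F_α is concave, Jensen's inequality implies that the horizon-fairness value is at least the slot-fairness value for the same sequence of utilities.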

EdgeBOL

A Bayesian Learning Approach for the Joint Orchestration of vRANs and Mobile Edge AI

Journal article (2023) - Jose A. Ayala-Romero, Andres Garcia-Saavedra, Xavier Costa-Perez, George Iosifidis

Future mobile networks need to support intelligent services which collect and process data streams at the network edge, so as to offer real-time and accurate inferences to users. However, the widespread deployment of these services is hindered by the unprecedented energy cost they induce to the network, and by the difficulties in optimizing their end-to-end operation. To address these challenges, we propose a Bayesian learning framework for jointly configuring the service and the Radio Access Network (RAN), aiming to minimize the total energy consumption while respecting accuracy and latency service requirements. Using a fully-fledged prototype with a software-defined base station (vBS) and a GPU-enabled edge server, we profile a typical video analytics service and identify new performance trade-offs and optimization opportunities. Accordingly, we tailor the proposed learning framework to account for the (possibly varying) network conditions, user needs, and service metrics, and apply it to a range of experiments with real traces. Our findings suggest that this approach effectively adapts to different hardware platforms and service requirements, and outperforms state-of-the-art benchmarks based on neural networks. ...