System developers and automotive manufacturers have proposed and deployed advanced driver assistance systems to enable next-generation applications in connected vehicles, such as cooperative perception and vehicle-to-everything communication, while addressing autonomy-related challenges. Deployments of these applications and systems, even at the proof-of-concept level, generally depend on vehicle sensor suites, communication units, large-scale memory systems, and high-performance computing (HPC) units, which together sense the vehicle's environment, process data efficiently, and run applications using machine learning models or rule-based algorithms. These models and algorithms are computationally complex, as they process sensed and transformed data with statistical algorithms and deep learning models, such as those built on convolutional operations and attention mechanisms, for tasks including decision-making, actuation, system analysis, environment sensing, monitoring, and infotainment. Their computational complexity grows further with scale, primarily because high data volumes demand deep convolutions or similar operations for data processing. Meeting an application's operational and performance requirements, including latency, throughput, and accuracy, therefore requires high-performance computing units.
Generally, the AI models used in connected vehicle applications are designed with a primary focus on model performance metrics; their high energy consumption and resulting carbon footprint are often overlooked. Recent studies have shown that the computing requirements of autonomous and connected vehicles can themselves become a significant component of overall energy consumption. For example, large-scale deployment of on-board AI across a global vehicle fleet could generate carbon emissions comparable to those of today's entire data center infrastructure. The computing hardware inside autonomous and connected vehicles can consume hundreds of watts to over a kilowatt when running multiple perception and decision models simultaneously. Since these vehicles are battery-powered, this computing energy directly reduces driving range and increases operational cost, and at fleet scale it translates into a substantial carbon footprint even when the vehicles are electric. Energy efficiency is therefore a primary design requirement, not only for sustainability but also for vehicle usability, battery longevity, and cost-efficiency. This thesis addresses the disparity between the strong research focus on model accuracy and the limited focus on energy usage by developing and evaluating an energy-aware adaptive framework for AI-driven vehicular services, such as non-safety-critical perception and high-definition mapping applications. The framework achieves energy and runtime improvements through energy-aware training, resource allocation, and adaptive deployment of computationally intensive models.
Previous research has proposed energy-efficient solutions in both the hardware and software domains. Hardware-oriented solutions include transitioning from high-end graphics processing units to specialized AI accelerators and integrated circuits that process neural networks and related operations more efficiently. Architectural and software-level solutions include moving from centralized computing to dedicated edge computing and tiny machine learning. However, most of this work remains within the scope of hardware-software co-design and optimization. Targeting software-level optimizations specifically, this thesis explores approximate computing (AxC) as a mechanism to exploit the error resilience of AI models in perception and latency-tolerant applications of the vehicle-edge computing ecosystem. By balancing the trade-off between quality of experience and energy efficiency, AxC offers opportunities to reduce the on-board energy demands and resulting carbon footprint of vehicle and edge devices while maintaining acceptable application performance. To explore and optimize the trade-off between model performance and energy consumption for connected autonomous vehicle applications, the following research questions are addressed:
1) What are the requirements for enabling energy efficiency in data-intensive vehicular services?
2) Which components can enable energy-efficient and collaborative task deployment in vehicle-edge-cloud computing?
3) How can energy-efficient components be integrated into an energy-aware adaptive software framework?
4) Can the framework effectively balance the trade-off between energy efficiency and performance in vehicle-edge-cloud computing scenarios?
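The AxC trade-off underlying these questions can be illustrated with a minimal sketch. The truncation-based approximate multiplier below is a generic textbook example, not the thesis's actual approximation scheme: dropping low-order operand bits lets a narrower, lower-energy multiplier suffice, at the cost of a small, bounded relative error.

```python
def approx_multiply(a: int, b: int, truncate_bits: int = 4) -> int:
    """Truncation-based approximate multiplication: zero out the
    low-order bits of both operands so a narrower (cheaper,
    lower-energy) multiplier circuit suffices."""
    a_t = (a >> truncate_bits) << truncate_bits
    b_t = (b >> truncate_bits) << truncate_bits
    return a_t * b_t

exact = 1234 * 5678                                    # 7006652
approx = approx_multiply(1234, 5678, truncate_bits=4)  # 6978048
rel_error = abs(exact - approx) / exact                # under 1% quality loss
```

Raising `truncate_bits` shrinks the effective multiplier but widens the error, which is exactly the quality-versus-energy knob that AxC exposes to error-resilient workloads.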
Building upon existing research on energy-efficient computing, this thesis addresses the above-mentioned questions. First, addressing (RQ1), the research identifies the technical and operational requirements for enabling and integrating energy efficiency into data-intensive vehicular services; these functional requirements include performing high-level computations with minimal energy use and efficiently processing large data streams on edge devices. Second, to design and develop energy-saving components (RQ2), the thesis proposes software-level approximation schemes combined with variational inference for both training-time and post-training model optimization and acceleration. Third, contributing to (RQ2) and (RQ3), the research explores ML model partitioning and computing resource allocation mechanisms that exploit the distributed and heterogeneous nature of the vehicle-edge-cloud environment for distributed training and inference; these mechanisms use lookup tables to meet service-level objectives at deployment time. Finally, addressing (RQ3) and (RQ4), the thesis integrates these components into an energy-aware adaptive software framework that provides optimized model training and deployment strategies for distributed training and inference on heterogeneous computing resources, while effectively balancing the trade-off between energy efficiency and on-device application performance.
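A lookup-table-based deployment mechanism of this kind can be sketched as follows. The table entries, split-point names, and device labels are hypothetical placeholders standing in for offline-profiled measurements, not the thesis's actual profiles.

```python
# Hypothetical LUT: (model split point, executing device) ->
# offline-profiled latency and energy estimates.
PROFILE_LUT = {
    ("split_layer_4", "vehicle"): {"latency_ms": 120, "energy_j": 9.0},
    ("split_layer_4", "edge"):    {"latency_ms": 60,  "energy_j": 5.5},
    ("split_layer_8", "edge"):    {"latency_ms": 45,  "energy_j": 6.8},
    ("split_layer_8", "cloud"):   {"latency_ms": 90,  "energy_j": 4.2},
}

def select_deployment(latency_slo_ms: float) -> tuple:
    """Return the lowest-energy (partition, device) pair whose
    profiled latency satisfies the service-level objective."""
    feasible = [(cfg, prof) for cfg, prof in PROFILE_LUT.items()
                if prof["latency_ms"] <= latency_slo_ms]
    if not feasible:
        raise ValueError("no configuration meets the latency SLO")
    return min(feasible, key=lambda item: item[1]["energy_j"])[0]

select_deployment(100)  # cheapest feasible: ("split_layer_8", "cloud")
select_deployment(50)   # only option under 50 ms: ("split_layer_8", "edge")
```

A table lookup keeps the runtime decision cheap: the expensive profiling of each partition/device combination happens once, offline, and the deployed system only filters by the SLO and minimizes energy.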
This thesis follows the design science methodology, adapting principles from the Information Systems Research Framework (design-as-a-search-process). This approach ensures research rigor in developing artefacts grounded in the application domain and existing theoretical knowledge, and it feeds the resulting design knowledge back into the knowledge base of the application domain. Because the development cycle of this methodology includes tests and experiments, the effectiveness of the developed artefacts is demonstrated through experimental evaluation. Applying the proposed software approximation schemes, model partitioning, resource allocation, and adaptive deployment strategies to state-of-the-art models yields up to 40% energy savings with less than 7% degradation in quality or model performance, compared to full-precision, centralized computing baselines. The software approximation schemes include the design of approximate multipliers, probabilistic approximation mechanisms, and approximation of convolutional and fully connected layers in CNNs/DNNs. For next-generation, memory- and compute-intensive vision transformer models, this work proposes software-level approximation schemes based on variational inference, combined with post-training quantization and quantization-aware training, which achieve up to 35% improvements in energy efficiency for 6-8% quality loss. As the backbones of these next-generation vision models also mix operand precisions such as 8-bit, 16-bit, and 32-bit across layers and channels, the research further explores mixed-precision operation to balance the trade-off between a model's energy usage and accuracy.
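As one concrete instance of the post-training techniques mentioned above, the sketch below shows symmetric per-tensor int8 quantization. This is a generic, widely used scheme rather than the thesis's specific variational-inference-based method: it shrinks weight storage fourfold and enables cheaper integer arithmetic, with a reconstruction error bounded by roughly half the quantization scale.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor post-training quantization to int8.
    Returns the quantized tensor and the per-tensor scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Toy weight tensor standing in for a real model layer.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
max_err = np.abs(w - dequantize(q, scale)).max()  # bounded by ~scale / 2
```

Quantization-aware training goes one step further by simulating this round-trip during training, so the model learns weights that tolerate the rounding; mixed-precision schemes apply a different bit width per layer or channel instead of a single global one.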
This research is among the first to investigate energy-aware requirements for application deployment beyond the traditional approach, which focuses on cloud-based offloading mechanisms and model compression, in the context of connected vehicle services and systems. Evaluating the energy-aware framework on popular edge devices demonstrates the thesis's contribution to distributed model computing with edge AI and to sustainable computing practices. The research aligns with tiny machine learning and green AI principles. Future research can develop adaptive algorithms that dynamically optimize energy use in real time and investigate predictive models under varying conditions. Additionally, integrating approximate computing with emerging technologies such as neuromorphic computing can further improve processing efficiency in vehicular systems.