J.L. Buijnsters
Please Note
This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.
2 records found
The growth of Industry 4.0 and the Industrial Internet of Things (IIoT) will result in an explosion of data generated by connected devices. Adopting 5G and 6G technology could be the leading enabler for connecting IIoT devices en masse. However, edge computing solutions have some disadvantages, such as the loss of resource elasticity compared to cloud solutions. The research questions of this thesis are whether Deep Neural Network (DNN) porting can solve the accuracy-performance trade-off of edge computing solutions, and how to implement an edge computing system based on open-source, container-orchestrated DNN model inference platforms that enables vertical model autoscaling.
The thesis shows how porting techniques such as structured pruning of DNNs enable the accuracy-performance trade-off in hardware-constrained settings. These techniques generate models with reduced complexity and size while minimally degrading accuracy. By using these ported models in the proposed inference platform, the thesis demonstrates how an edge computing system can achieve vertical model autoscaling, enabling efficient use of computational resources. The research focuses on CPU hardware and Real-Time (RT) request scenarios, where the latency Service Level Objective (SLO) and the current demand are crucial factors. When the resources of an inference system are depleted, the latency of individual requests can increase significantly due to queuing. The results show how an orchestrator can select model versions live, based on the available model versions and the demand. The proposed system increases the maximum possible throughput compared to the state of the art while avoiding queue build-up in the RT scenario, and it improves system accuracy when CPU resources are available. Additionally, this work proposes a design to implement these benefits in industry-adopted open-source DNN inference platforms.
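The live model version selection described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the `ModelVersion` type, the `select_version` function, the capacity model (CPU cores divided by per-request service time), and all numbers are hypothetical assumptions chosen only to show the idea of picking the most accurate pruned variant that still meets the demand and the latency SLO.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelVersion:
    """One ported (e.g. pruned) variant of a model. All fields are illustrative."""
    name: str
    accuracy: float        # accuracy in [0, 1]
    service_time_s: float  # assumed time to serve one request on one CPU core


def max_throughput(version: ModelVersion, cpu_cores: int) -> float:
    """Assumed capacity model: requests per second sustainable without queuing."""
    return cpu_cores / version.service_time_s


def select_version(
    versions: Sequence[ModelVersion],
    demand_rps: float,
    cpu_cores: int,
    latency_slo_s: float,
) -> ModelVersion:
    """Pick the most accurate version that meets the SLO and the current demand.

    If no version can absorb the demand, fall back to the fastest one to keep
    queuing as small as possible.
    """
    feasible = [
        v for v in versions
        if v.service_time_s <= latency_slo_s
        and max_throughput(v, cpu_cores) >= demand_rps
    ]
    if feasible:
        return max(feasible, key=lambda v: v.accuracy)
    return min(versions, key=lambda v: v.service_time_s)


# Hypothetical model versions; the numbers are made up for illustration.
VERSIONS = [
    ModelVersion("full", accuracy=0.92, service_time_s=0.040),
    ModelVersion("pruned-50", accuracy=0.90, service_time_s=0.020),
    ModelVersion("pruned-75", accuracy=0.86, service_time_s=0.010),
]

if __name__ == "__main__":
    for demand in (80, 150, 350, 1000):
        chosen = select_version(VERSIONS, demand, cpu_cores=4, latency_slo_s=0.05)
        print(f"demand={demand} rps -> {chosen.name}")
```

Under these assumptions, low demand leaves room for the full model, while rising demand shifts the choice to smaller pruned variants, so throughput is traded against accuracy without building a queue.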
Bachelor thesis
(2020)
J.L. Buijnsters, D. Hofman, J.G.P. Klein Kranenbarg, C. El Moussaoui, K. Zheng, B.H.M. Gerritsen, K.F. Chan, H. Wang, O.W. Visser
ScenWise is an innovative company that specializes in data science for traffic management. ScenWise strives to use the newest and best technologies and practices in web applications, data science and traffic management, because it provides tools to analyse and visualise a variety of situations that occur in traffic management. One such tool is SmartRoads 1.0, which allows users to analyse traffic data and situations via a web application. Unfortunately, SmartRoads 1.0 does not perform as desired. Additionally, ScenWise is unable to integrate products previously made by student groups into its own existing products. During the research aimed at resolving these problems, another issue arose: the software development life cycle of ScenWise is lacking. Research on the SmartRoads 1.0 performance problem showed that the front-end is the performance bottleneck. The outdated SmartRoads 1.0 front-end was therefore replaced with a new SmartRoads 2.0 front-end. The integration problem and the development life cycle problem are both addressed in the Long-term evolution (LTE) design found in Appendix I. This LTE design contains the architecture migration plan, which will transform the current software architecture into a Service-oriented architecture (SOA), providing a solution for the current integration problems. A result of the first steps of this architecture migration plan is the Application Programming Interface (API) Gateway, which has been implemented in the aforementioned SmartRoads 2.0. In addition to the migration plan, the LTE design elaborates guidelines for ScenWise to improve its software development life cycle. In this report, the identified problems, their solutions and their execution are explained, discussed and evaluated.