TY - GEN
T1 - Contention-aware Performance Modeling for Heterogeneous Edge and Cloud Systems
AU - Dagli, Ismet
AU - Depke, Andrew
AU - Mueller, Andrew
AU - Hassan, Md Sahil
AU - Akoglu, Ali
AU - Belviranli, Mehmet Esat
N1 - Publisher Copyright: © 2023 Owner/Author.
PY - 2023/8/14
Y1 - 2023/8/14
N2 - Diversely Heterogeneous System-on-Chips (DH-SoC) are increasingly popular computing platforms in many fields, such as autonomous driving and AR/VR applications, due to their ability to effectively balance performance and energy efficiency. Having multiple target accelerators for multiple concurrent workloads requires careful runtime analysis of scheduling. In this study, we examine a scenario that mandates several concerns to be carefully addressed: 1) exploring the mapping of various workloads to heterogeneous accelerators to optimize the system for better performance, 2) analyzing data from the physical world at runtime to minimize the response time of the system, 3) accurately estimating the resource contention among workloads during runtime, since concurrent operations will be running on the same die, and 4) deferring computationally more demanding operations, such as continuous learning or real-time rendering, to the cloud, depending on the complexity of the computation. We demonstrate our analysis and approach on a VR project as a case study, using an NVIDIA Xavier NX edge DH-SoC and a server equipped with an NVIDIA GeForce RTX 3080 GPU and an AMD EPYC 7402 CPU.
AB - Diversely Heterogeneous System-on-Chips (DH-SoC) are increasingly popular computing platforms in many fields, such as autonomous driving and AR/VR applications, due to their ability to effectively balance performance and energy efficiency. Having multiple target accelerators for multiple concurrent workloads requires careful runtime analysis of scheduling. In this study, we examine a scenario that mandates several concerns to be carefully addressed: 1) exploring the mapping of various workloads to heterogeneous accelerators to optimize the system for better performance, 2) analyzing data from the physical world at runtime to minimize the response time of the system, 3) accurately estimating the resource contention among workloads during runtime, since concurrent operations will be running on the same die, and 4) deferring computationally more demanding operations, such as continuous learning or real-time rendering, to the cloud, depending on the complexity of the computation. We demonstrate our analysis and approach on a VR project as a case study, using an NVIDIA Xavier NX edge DH-SoC and a server equipped with an NVIDIA GeForce RTX 3080 GPU and an AMD EPYC 7402 CPU.
KW - edge-cloud systems
KW - heterogeneous accelerators
KW - performance modeling
UR - http://www.scopus.com/inward/record.url?scp=85171459547&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85171459547&partnerID=8YFLogxK
U2 - 10.1145/3589010.3594889
DO - 10.1145/3589010.3594889
M3 - Conference contribution
T3 - FRAME 2023 - Proceedings of the 3rd Workshop on Flexible Resource and Application Management on the Edge
SP - 27
EP - 31
BT - FRAME 2023 - Proceedings of the 3rd Workshop on Flexible Resource and Application Management on the Edge
PB - Association for Computing Machinery, Inc
T2 - 3rd Workshop on Flexible Resource and Application Management on the Edge, FRAME 2023, held in conjunction with the HPDC 2023 and the FCRC 2023 Conferences
Y2 - 20 June 2023
ER -