Automating Deep Neural Network Model Selection for Edge Inference

Prof. Shaolei Ren, Department of Electrical and Computer Engineering, UCR
ABSTRACT –

The ever-increasing size of deep neural network (DNN) models once meant that runtime inference was confined to cloud data centers. The recent wave of DNN model compression techniques has overcome this limit, making it possible to run DNN-based inference on numerous resource-constrained edge devices, including mobile phones, drones, robots, medical devices, wearables, and Internet of Things devices, among many others. Naturally, edge devices are highly heterogeneous in terms of hardware specifications and usage scenarios. At the same time, compressed DNN models are so diverse that they exhibit different tradeoffs in a multi-dimensional space, and no single model is optimal across all important metrics such as accuracy, latency, and energy consumption. Consequently, how to automatically select a compressed DNN model for an edge device to run inference with optimal quality of experience (QoE) arises as a new challenge. State-of-the-art approaches either choose a common model for all or most devices, which is optimal for a small fraction of edge devices at best, or apply device-specific DNN model optimization, which is not scalable. Moreover, existing DNN model selection approaches are "open loop": they optimize objective metrics without end users' feedback, and improvements in those metrics do not necessarily translate into improved QoE for end users.

In this talk, I present an automated device-level DNN model selection engine for QoE-optimal edge inference that leverages the predictive power of machine learning and keeps end users in a closed loop. More concretely, I formulate DNN model selection as a contextual online learning problem, in which features of edge devices and DNN models are incorporated as contextual information and end users' QoE feedback serves as the reward. I develop an efficient online learning algorithm to balance exploration and exploitation. Our evaluation results, based on a prototype and simulations, validate the design and highlight the potential of machine learning for automating DNN model selection to achieve QoE-optimal edge inference.
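
To make the contextual online learning formulation more concrete, the following is a minimal sketch and not the algorithm presented in the talk, which the abstract does not specify. It assumes a simple epsilon-greedy strategy with one linear QoE predictor per candidate model. The class name `EpsilonGreedySelector`, the two-feature device context, the two candidate models, and the QoE values in `TRUE_QOE` are all hypothetical and chosen only for illustration. For brevity, the context here contains device features only, and each model gets its own weight vector instead of explicit model features.

```python
import random


class EpsilonGreedySelector:
    """Hypothetical epsilon-greedy contextual selector (illustrative only)."""

    def __init__(self, n_models, n_features, epsilon=0.1, lr=0.05, seed=0):
        self.epsilon = epsilon  # probability of exploring a random model
        self.lr = lr            # step size for the online weight update
        self.rng = random.Random(seed)
        # One linear QoE predictor (weight vector) per candidate DNN model.
        self.weights = [[0.0] * n_features for _ in range(n_models)]

    def predict(self, model, context):
        """Predicted QoE of `model` on a device described by `context`."""
        return sum(w * x for w, x in zip(self.weights[model], context))

    def greedy(self, context):
        """Model with the highest predicted QoE (pure exploitation)."""
        scores = [self.predict(m, context) for m in range(len(self.weights))]
        return max(range(len(scores)), key=scores.__getitem__)

    def select(self, context):
        """Explore with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weights))
        return self.greedy(context)

    def update(self, model, context, reward):
        """One stochastic-gradient step toward the observed QoE feedback."""
        error = reward - self.predict(model, context)
        w = self.weights[model]
        for i, x in enumerate(context):
            w[i] += self.lr * error * x


# Assumed device contexts: [bias term, normalized compute capability].
SLOW_DEVICE = [1.0, 0.0]
FAST_DEVICE = [1.0, 1.0]

# Assumed mean QoE per (model, device kind); invented for this sketch.
# Model 0 is a small model, model 1 a larger one that only pays off
# on the fast device.
TRUE_QOE = {
    0: {"slow": 0.6, "fast": 0.6},
    1: {"slow": 0.2, "fast": 0.9},
}


def simulate(rounds=3000, seed=1):
    """Run the closed loop: pick a model, observe noisy QoE, update."""
    env = random.Random(seed)
    selector = EpsilonGreedySelector(n_models=2, n_features=2, seed=seed)
    for _ in range(rounds):
        kind = env.choice(["slow", "fast"])
        context = SLOW_DEVICE if kind == "slow" else FAST_DEVICE
        model = selector.select(context)
        reward = TRUE_QOE[model][kind] + env.gauss(0.0, 0.05)
        selector.update(model, context, reward)
    return selector


selector = simulate()
print("slow device ->", selector.greedy(SLOW_DEVICE))
print("fast device ->", selector.greedy(FAST_DEVICE))
```

In this toy setting the selector learns to prefer the small model on the slow device and the larger model on the fast device, which illustrates why a single common model cannot be optimal for all devices. A more sample-efficient exploration rule (for example, an upper-confidence-bound method) could replace epsilon-greedy without changing the overall loop.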
