RESOURCE-GUARANTEEING DEEP LEARNING

Deep Neural Networks (DNNs) are increasingly deployed in highly resource-constrained environments such as autonomous drones and wearable devices, which impose specific energy budgets and/or real-time requirements. While much recent work has studied empirical techniques to reduce the energy consumption and latency of DNNs, our research is the first to propose an end-to-end DNN training framework that provides quantitative resource guarantees.

Without loss of generality, we focus on energy constraints. Specifically, our learning algorithm directly trains a DNN model that meets a given energy budget while maximizing model accuracy, without incremental hyper-parameter tuning. The training algorithm uses network sparsity as the knob to control energy consumption, but in principle it also supports other techniques such as quantization.

The key idea is to formulate the DNN training process as a constrained optimization problem in which the energy budget imposes a previously unconsidered optimization constraint. Formally, the optimization problem is formulated as follows:

$$\min_{\theta} \; \mathrm{Loss}(\theta) \quad \text{subject to} \quad \mathrm{Energy}(\theta) \le E_{\mathrm{budget}}$$
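
To make the formulation concrete, the toy sketch below trains a single linear layer by projected gradient descent: each iteration takes an ordinary descent step on the loss and, whenever the budget is exceeded, projects the weights back onto the energy-feasible set. The names (`energy_of`, `project_onto_budget`) and the unit-cost energy proxy are illustrative assumptions, not our actual implementation.

```python
# Minimal sketch of training under an energy budget via projected gradient descent.
# The energy model and projection here are toy stand-ins for illustration only.
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(256, 32)), rng.normal(size=(256,))
W = rng.normal(size=(32,)) * 0.1          # toy "network": one linear layer
E_BUDGET = 20.0                            # energy budget (arbitrary units)

def energy_of(w, cost_per_weight=1.0):
    """Proxy energy model: energy grows with the number of nonzero weights."""
    return cost_per_weight * np.count_nonzero(w)

def project_onto_budget(w, budget):
    """Keep the largest-magnitude weights that fit within the budget
    (a stand-in for the knapsack projection described below)."""
    k = int(budget)                        # with unit costs, budget = #nonzeros
    keep = np.argsort(-np.abs(w))[:k]
    out = np.zeros_like(w)
    out[keep] = w[keep]
    return out

lr = 1e-2
for step in range(200):
    grad = 2 * X.T @ (X @ W - y) / len(y)  # gradient of the squared loss
    W = W - lr * grad                      # unconstrained descent step
    if energy_of(W) > E_BUDGET:            # enforce the energy constraint
        W = project_onto_budget(W, E_BUDGET)

print("final loss:", float(np.mean((X @ W - y) ** 2)), "energy:", energy_of(W))
```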

Crucially, our technique works both for platforms whose hardware architecture details are known and for platforms whose architectures are closed and must be treated as black boxes.

Platform-Specific Learning Algorithm

Our ICLR'19 work addresses the former case, in which the energy consumption of a DNN inference can be analytically modeled as a function of network sparsity. Given this energy model, we propose an optimization algorithm that approximately solves the constrained problem. A key step in the optimization is the projection onto the energy constraint. We prove that this projection can be cast as a 0/1 knapsack problem and show that it can be solved very efficiently.
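
As an illustration of the reduction, the sketch below treats each weight as a knapsack item: keeping weight i has value w_i^2 (the squared distance saved by not zeroing it) and costs its energy contribution c_i, with the energy budget as the knapsack capacity. The greedy value-per-cost heuristic shown here is only a stand-in, not the exact solver analyzed in the paper.

```python
# Sketch of projecting weights onto an energy constraint as a 0/1 knapsack problem.
# Greedy ratio heuristic for illustration only.
import numpy as np

def knapsack_projection(w, costs, budget):
    values = w ** 2                              # benefit of keeping each weight
    order = np.argsort(-values / costs)          # greedy: best value per unit energy
    keep, used = [], 0.0
    for i in order:
        if used + costs[i] <= budget:
            keep.append(i)
            used += costs[i]
    projected = np.zeros_like(w)
    projected[keep] = w[keep]                    # kept weights stay, the rest go to 0
    return projected

w = np.array([0.9, -0.1, 0.5, 0.05, -0.7])
costs = np.array([1.0, 1.0, 2.0, 0.5, 1.5])      # per-weight energy costs
print(knapsack_projection(w, costs, budget=3.0))
```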

Under the same energy budget, our approach achieves noticeably higher accuracy than prior state-of-the-art platform-aware learning algorithms such as EAP.

Platform-Independent Learning Algorithm

Our CVPR'19 work proposes a framework called ECC that addresses the latter case. ECC has two phases: an offline energy modeling phase and an online training/compression phase. Given a network to compress, the offline component profiles it on the target platform and constructs an energy estimation model without requiring any knowledge of the underlying hardware. The online component then uses this energy model to solve the constrained optimization problem, followed by an optional fine-tuning phase, before producing the compressed model.

[Figure: overview of the ECC framework, with the offline energy modeling phase feeding the online training/compression phase.]
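
A minimal sketch of the black-box profiling idea is shown below: sweep each layer's density, measure energy on the device, and fit a simple per-layer regression. Here `measure_energy_on_device` is a hypothetical placeholder (faked so the example runs), and the linear fit is an assumption for illustration rather than ECC's actual energy model.

```python
# Sketch of offline, black-box energy modeling: profile, then fit a regression.
import numpy as np

def measure_energy_on_device(layer, density):
    """Placeholder for an on-device power/latency measurement (faked here)."""
    base = [5.0, 12.0, 8.0][layer]               # fake per-layer scale
    return base * density + 0.3 + np.random.normal(scale=0.05)

def fit_energy_model(num_layers=3, densities=np.linspace(0.1, 1.0, 10)):
    """Fit per-layer linear models energy ~ a * density + b from measurements."""
    coeffs = []
    for layer in range(num_layers):
        energies = [measure_energy_on_device(layer, d) for d in densities]
        a, b = np.polyfit(densities, energies, deg=1)
        coeffs.append((a, b))
    return coeffs

model = fit_energy_model()
total = sum(a * d + b for (a, b), d in zip(model, [0.5, 0.3, 0.8]))
print("estimated energy at densities (0.5, 0.3, 0.8):", round(total, 2))
```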

The optimization problem, however, has nontrivial constraints, so existing deep learning solvers do not apply directly. We propose an optimization algorithm that combines the essence of the Alternating Direction Method of Multipliers (ADMM) framework with gradient-based learning algorithms.
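
The sketch below illustrates the kind of ADMM-style split this suggests (not ECC's exact algorithm): the weights W are updated by a gradient step on the loss plus a quadratic penalty, an auxiliary copy Z is obtained by projecting onto the energy-feasible set, and a dual variable U accumulates the disagreement between the two, driving W and Z to agree.

```python
# Sketch of an ADMM-style loop: gradient step on W, projection step for Z, dual update for U.
# The toy problem and greedy projection are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
X, y = rng.normal(size=(256, 32)), rng.normal(size=(256,))
W = rng.normal(size=(32,)) * 0.1
Z, U = W.copy(), np.zeros_like(W)
costs, budget, rho, lr = np.ones(32), 12.0, 1.0, 1e-2

def project(v):
    """Greedy stand-in for the projection onto the energy-feasible set."""
    order = np.argsort(-(v ** 2) / costs)
    out, used = np.zeros_like(v), 0.0
    for i in order:
        if used + costs[i] <= budget:
            out[i], used = v[i], used + costs[i]
    return out

for step in range(300):
    grad = 2 * X.T @ (X @ W - y) / len(y) + rho * (W - Z + U)
    W = W - lr * grad                            # gradient step on the augmented loss
    Z = project(W + U)                           # constraint step (projection)
    U = U + W - Z                                # dual update

print("nonzeros:", np.count_nonzero(Z), "loss:", float(np.mean((X @ Z - y) ** 2)))
```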

Under the same energy budget, our approach achieves 17.5% and 43.4% higher accuracy than NetAdapt and AMC, respectively, the two previously best platform-agnostic learning algorithms.