9 RELATED WORK
Adaptive object detection systems. Many computer vision systems designed for resource-constrained edge devices achieve efficient resource utilization by adapting to video content or computing budgets [14, 15, 19, 22, 50]. In parallel with these advances, several works have focused specifically on adaptive object detection [3, 28, 49], enabling applications such as edge video analytics and mobile augmented reality. For instance, ApproxDet [49] presented a multi-branch framework that switches between a detector and a tracker based on awareness of video content characteristics and resource contention. Remix [28] achieves content adaptiveness by partitioning images and selectively applying neural networks of different scales. However, existing adaptive methods fall short of supporting resource-efficient omnidirectional 3D detection, which requires processing each camera view with different 3D perception capabilities, such as enhanced estimation of objects' 3D location and velocity. To process each view optimally, Panopticus predicts the expected performance for each view based on the short-term future dynamics of the spatial object distribution, which is crucial in mobile scenarios. Although Remix [28] also estimates performance from object distributions, it relies on long-term historical distributions and does not account for diverse 3D object characteristics, making it unsuitable for our system's setting.
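For illustration only, the following is a minimal Python sketch of the per-view, prediction-guided branch selection idea described above; it is not the authors' implementation. The branch names and latencies, the toy predict_accuracy() stand-in for a learned performance predictor, and the exhaustive search in schedule() are assumptions made solely for this example.

# Hypothetical sketch: assign one detection branch per camera view so that the
# total predicted accuracy is maximized under a per-frame latency budget.
from itertools import product

# Assumed candidate branches: (name, per-view latency in milliseconds).
BRANCHES = [("light", 12.0), ("medium", 25.0), ("heavy", 48.0)]

def predict_accuracy(view_stats, branch_name):
    # Toy stand-in for a learned performance predictor. `view_stats` holds the
    # expected object count and mean object distance for a view; heavier
    # branches are assumed to help more when objects are numerous or far away.
    count, mean_dist = view_stats
    base = {"light": 0.45, "medium": 0.60, "heavy": 0.72}[branch_name]
    return base * min(1.0, 10.0 / max(mean_dist, 1.0)) * min(count, 5)

def schedule(views, budget_ms):
    # Exhaustively try every branch assignment and keep the best one that fits
    # within the latency budget; this is fine for a handful of views.
    best_plan, best_score = None, -1.0
    for plan in product(range(len(BRANCHES)), repeat=len(views)):
        latency = sum(BRANCHES[b][1] for b in plan)
        if latency > budget_ms:
            continue
        score = sum(predict_accuracy(v, BRANCHES[b][0]) for v, b in zip(views, plan))
        if score > best_score:
            best_plan, best_score = plan, score
    return [BRANCHES[b][0] for b in best_plan], best_score

# Six surrounding views: (expected object count, mean distance in meters).
views = [(3, 8.0), (0, 30.0), (5, 15.0), (1, 40.0), (2, 10.0), (4, 20.0)]
plan, score = schedule(views, budget_ms=150.0)
print("per-view branches:", plan, "predicted score:", round(score, 2))

For a small number of surrounding views (e.g., six), the exhaustive search is cheap; a practical scheduler would replace it with a faster search and a predictor trained on real detection outcomes.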
3D object detection on edge devices. With the rapid advancement of edge computing, there is an increasing demand for deploying 3D object detection on edge devices. DeepMix [18] addressed the limitations of edge resources by delegating compute-intensive 2D detection tasks to a server equipped with high-performance GPUs, while lightweight tasks, such as estimating the 3D locations of detected objects using a depth sensor, are handled efficiently on the mobile device. Another solution, PointSplit [37], proposed a parallel processing technique that utilizes an edge NPU and GPU to enable on-device RGB-D camera-based 3D detection, exemplifying the trend of harnessing specialized AI accelerators to meet the demands of edge computing [26, 27, 33, 59]. VIPS [44], designed for self-driving vehicles, introduced an edge-based system that collaborates with outdoor infrastructure equipped with computing units and LiDAR sensors, effectively extending the vehicle's perception range by fusing data from the onboard system and the infrastructure. In contrast to these efforts, Panopticus provides a self-contained, comprehensive perception system for resource-constrained edge devices. It introduces camera-based omnidirectional 3D perception optimized for edge computing capabilities, eliminating the need for depth sensors or computation offloading.