Selected article for: "local global and long range"

Author: Vo, X. T.; Tran, T. D.; Nguyen, D. L.; Jo, K. H.
Title: Regression-Aware Classification Feature for Pedestrian Detection and Tracking in Video Surveillance Systems
  • Cord-id: d44uqs9g
  • Document date: 2021_1_1
  • ID: d44uqs9g
    Snippet: Pedestrian detection and tracking in video surveillance systems is a complex task in computer vision research, which has widely used in many applications such as abnormal action detection, human pose, crowded scenes, fall detection in elderly humans, social distancing detection in the Covid-19 pandemic. This task is categorized into two sub-tasks: detection, and re-identification task. Previous methods independently treat two sub-tasks, only focusing on the re-identification task without employi
    Document: Pedestrian detection and tracking in video surveillance systems is a complex task in computer vision research, which has widely used in many applications such as abnormal action detection, human pose, crowded scenes, fall detection in elderly humans, social distancing detection in the Covid-19 pandemic. This task is categorized into two sub-tasks: detection, and re-identification task. Previous methods independently treat two sub-tasks, only focusing on the re-identification task without employing re-detection. Since the performance of pedestrian detection directly affects the results of tracking, leveraging the detection task is crucial for improving the re-identification task. The total inference time is computed in both the detection and re-identification process, quite far from real-time speed. This paper joins both sub-tasks in a single end-to-end network based on Convolutional Neural Networks (CNNs). Moreover, the detection includes the classification and regression task. As both tasks have a positive correlation, separately learning classification and regression hurts the overall performance. Hence, this work introduces the Regression-Aware Classification Feature (RACF) module to improve feature representation. The convolutional layer is the core component of CNNs, which extracts local features without modeling global features. Therefore, the Cross-Global Context (CGC) is proposed to form long-range dependencies for learning appearance embedding of re-identification features. The proposed model is conducted on the challenging benchmark datasets, MOT17, which surpasses the state-of-the-art online trackers. © 2021, Springer Nature Switzerland AG.

    Search related documents: