AR, Person Re-ID, Deep Metric Learning

Video-based Human Tracking Robust to Dynamic Camera Position and Orientation Changes

In recent years, RGB and depth cameras integrated into Augmented Reality (AR) glasses and smart glasses have been used to recognize people and objects in real-world environments, and various applications have been proposed that overlay digital information onto the wearer's view of the real world through the glasses' display. For instance, in sports, systems have been developed that detect and identify players during games or training sessions and present relevant player information to spectators and coaches wearing smart glasses. Realizing such smart glasses applications requires tracking the people who appear in the captured video.
Numerous object tracking methods have been proposed. For example, DeepSORT detects target objects in each frame with the object detector YOLO and associates the detections across consecutive frames of a 2D image sequence using a Kalman filter, which assumes smooth, roughly constant-velocity motion. However, most existing person tracking methods, including DeepSORT, are designed for footage from fixed cameras. When the camera's position and orientation change, occlusions lasting several frames, unexpected entries into and exits from the image frame, and other disruptions occur frequently, and tracking accuracy deteriorates significantly when relying on a Kalman filter that assumes smooth motion.
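To make the smooth-motion assumption concrete, the sketch below shows a toy constant-velocity Kalman filter over a detected bounding-box centre, in the spirit of the motion model used by such trackers. It is only an illustration: the class name, noise settings, and state layout are our assumptions and do not come from DeepSORT's implementation.

```python
import numpy as np

class ConstantVelocityKF:
    """Toy constant-velocity Kalman filter over a detection's centre (cx, cy).

    State vector: [cx, cy, vx, vy]. This illustrates the smooth-motion
    assumption behind Kalman-filter tracking; it is not DeepSORT's code,
    and the noise settings below are arbitrary placeholders.
    """

    def __init__(self, cx, cy, dt=1.0):
        self.x = np.array([cx, cy, 0.0, 0.0])        # state estimate
        self.P = np.eye(4) * 10.0                     # state covariance
        self.F = np.array([[1.0, 0.0, dt, 0.0],       # constant-velocity model
                           [0.0, 1.0, 0.0, dt],
                           [0.0, 0.0, 1.0, 0.0],
                           [0.0, 0.0, 0.0, 1.0]])
        self.H = np.array([[1.0, 0.0, 0.0, 0.0],      # only position is observed
                           [0.0, 1.0, 0.0, 0.0]])
        self.Q = np.eye(4) * 0.01                     # process noise
        self.R = np.eye(2) * 1.0                      # measurement noise

    def predict(self):
        """Propagate the state one frame ahead and return the predicted centre."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Correct the prediction with an observed centre z = (cx, cy) from the detector."""
        y = np.asarray(z, dtype=float) - self.H @ self.x   # innovation
        S = self.H @ self.P @ self.H.T + self.R            # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)           # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

In each frame, predict() gives the expected location used for associating new detections, and update() refines the track once a detection is matched. When the camera itself moves abruptly, this prediction no longer matches where the person actually appears in the image, which is exactly the failure mode addressed below.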
This study proposes a method to determine whether a newly detected person in a video corresponds to a previously tracked individual, extending conventional multiple object tracking (MOT) methods designed for fixed-camera footage. By applying person re-identification (Re-ID) when a previously detected individual reappears, the method enables continuous tracking of people in variable-viewpoint video captured by smart glasses. Furthermore, our approach records each detected person's last known location and disappearance time to predict their likely position as time passes. By using the position and gaze direction of the smart glasses, we estimate whether each such person lies within the current field of view, suppressing unnecessary re-identification of individuals who are unlikely to be present and thereby improving re-identification accuracy.
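The following sketch illustrates one possible form of this visibility check: given the glasses' position and gaze direction, lost tracks whose last known position falls outside the estimated field of view, or that disappeared too long ago, are excluded from Re-ID matching. All function names, the field-of-view angle, and the time threshold are illustrative assumptions, not parameters of the actual system.

```python
import math
import time

def likely_in_view(glasses_pos, gaze_dir, person_pos,
                   half_fov_deg=35.0, max_range_m=15.0):
    """Rough ground-plane visibility test (all thresholds are placeholders).

    glasses_pos, person_pos: (x, y) in a shared world frame.
    gaze_dir: (x, y) viewing direction of the glasses (need not be unit length).
    """
    gx, gy = gaze_dir
    gnorm = math.hypot(gx, gy)
    dx = person_pos[0] - glasses_pos[0]
    dy = person_pos[1] - glasses_pos[1]
    dist = math.hypot(dx, dy)
    if gnorm < 1e-6 or dist < 1e-6 or dist > max_range_m:
        return False
    cos_angle = (dx * gx + dy * gy) / (dist * gnorm)
    return cos_angle >= math.cos(math.radians(half_fov_deg))


def reid_candidates(lost_tracks, glasses_pos, gaze_dir, max_absence_s=60.0):
    """Select lost tracks worth re-identifying in the current view.

    lost_tracks: dict mapping track_id -> (last_position, last_seen_timestamp).
    Tracks that disappeared too long ago or whose last known position lies
    outside the estimated field of view are skipped, so Re-ID matching is
    only attempted for plausible candidates.
    """
    now = time.time()
    candidates = []
    for track_id, (last_pos, last_seen) in lost_tracks.items():
        if now - last_seen > max_absence_s:
            continue
        if likely_in_view(glasses_pos, gaze_dir, last_pos):
            candidates.append(track_id)
    return candidates
```

Appearance features from a Re-ID model (for example, deep metric learning embeddings) would then be compared only against the returned candidate IDs rather than against every previously seen person.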



Publications

  • Takahashi, N., Amano, T., Yamaguchi, H., & Higashino, T. (2021). A person identification method for AR devices using deep metric learning. Proceedings of the Multimedia, Distributed, Cooperative, and Mobile Symposium 2021, 2021(1), 388-394. (in Japanese)
  • Takahashi, N., Amano, T., & Yamaguchi, H. (2021). Person tracking in discontinuous videos using the visible-area information of smart glasses. IPSJ SIG Technical Reports: Mobile Computing and New Social Systems (MBL), 2021(29), 1-8. (in Japanese)
  • Takahashi, N., Amano, T., & Yamaguchi, H. (2023, June). Multi-Person Tracking Method Robust to Dynamic Viewport Changes for AR apps. In 2023 19th International Conference on Intelligent Environments (IE) (pp. 1-4). IEEE.