Defining a useful performance evaluation method for object tracking algorithms is difficult. Algorithms that score highly on general-purpose performance metrics may perform poorly in a specific scenario. Moreover, algorithm developers frequently face a question that is difficult to answer: will the algorithm satisfy the needs of a system that is still in the design phase? Even when dedicated time and resources can be allocated to collect reasonably representative data for the scenarios of interest, the answer usually remains ambiguous. Often, insufficient performance is only discovered by the user during field tests or operational use, and the algorithm then needs to be revised. In this study, we propose an approach to this problem based on iterative improvement of the evaluation process. The performance requirements are determined by field experts or system designers. A standard set of questions is posed to the user or system developer, and the test dataset is assembled in cooperation with them. Each video segment in the dataset is assigned several tags describing scenario type, difficulty, and importance. Whenever a novel failure case is encountered, representative videos are added to the dataset. In this way, quantitative results can be organized to be more informative for the user, and improvements to the algorithms can be evaluated more systematically.
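To make the tag-based organization of results concrete, the following is a minimal sketch of how such a tagged evaluation dataset and per-tag summary might be implemented. The record fields, tag values, scoring metric, and weighting scheme shown here are illustrative assumptions, not definitions taken from the proposed method.

```python
from dataclasses import dataclass
from collections import defaultdict


# Hypothetical record for one video segment in the evaluation dataset.
# The tag names and the generic per-segment "score" are assumptions
# made for illustration only.
@dataclass
class Segment:
    name: str
    scenario: str      # e.g. "maritime", "urban", "aerial"
    difficulty: str    # e.g. "easy", "medium", "hard"
    importance: float  # weight assigned by the field expert / system designer
    score: float       # per-segment tracking score produced by the evaluator


def summarize(segments):
    """Group per-segment scores by (scenario, difficulty) and weight them
    by the importance assigned by the user, so the report reflects the
    requirements of the target system rather than a single global average."""
    report = defaultdict(lambda: {"weighted_sum": 0.0, "weight": 0.0})
    for s in segments:
        key = (s.scenario, s.difficulty)
        report[key]["weighted_sum"] += s.importance * s.score
        report[key]["weight"] += s.importance
    return {
        key: vals["weighted_sum"] / vals["weight"]
        for key, vals in report.items()
        if vals["weight"] > 0
    }


if __name__ == "__main__":
    dataset = [
        Segment("seq_001", "maritime", "hard", importance=2.0, score=0.61),
        Segment("seq_002", "maritime", "easy", importance=1.0, score=0.88),
        Segment("seq_003", "urban",    "hard", importance=1.5, score=0.54),
    ]
    # When a novel failure case is found in the field, a new Segment is
    # simply appended to the dataset and the report is regenerated.
    for (scenario, difficulty), score in sorted(summarize(dataset).items()):
        print(f"{scenario:10s} {difficulty:6s} weighted score = {score:.2f}")
```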