BA/MA Thesis: Single Shot Multi-Object Detection and 6D Pose Estimation

Roland Jung
Published in Dronehub K
Nov 29, 2021

Context

The Control of Networked Systems Group (CNS) is researching the autonomous inspection of infrastructure objects with unmanned aerial vehicles (UAVs). As a first step of the inspection pipeline, the relative 6D pose between a camera mounted on the UAV and each object currently visible in the image must be estimated. At CNS, we have developed an artificial intelligence (AI)-based framework for multi-object 6D pose estimation. The student’s task is to extend this framework further. This thesis offers an opportunity to get hands-on experience with AI and deep learning. If you are interested and want to know more about this thesis, do not hesitate to contact me at: thomas.jantos@aau.at
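For readers new to the term: a 6D pose combines a 3D rotation and a 3D translation, which together map points from the object frame into the camera frame. The following is a minimal sketch of that idea in plain Python. The function names and the example pose are illustrative assumptions and are not taken from the CNS framework.

```python
import math


def rotation_z(yaw_rad):
    """Return a 3x3 rotation matrix for a rotation about the z axis."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform_point(rotation, translation, point):
    """Map a point from the object frame to the camera frame: p_cam = R * p_obj + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


# Hypothetical pose: object rotated by 90 degrees about the camera's z axis
# and located 2 m in front of the camera.
R = rotation_z(math.pi / 2)
t = [0.0, 0.0, 2.0]

# A point on the object's x axis ends up on the camera's y axis, 2 m away.
p_cam = transform_point(R, t, [1.0, 0.0, 0.0])
```

Three rotational and three translational degrees of freedom give the "6D" in the name. A pose estimator predicts `R` and `t` for every object in the image.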

Description

The goal of this thesis is to incorporate deep-learning-based object detection into our pose estimation framework. Detecting objects in an image is one of the main disciplines of computer vision and is crucial for object pose estimation. Currently, our framework works in a two-stage manner: First, an object detector localizes all objects in the image. Second, these detections are passed to our framework, which uses them to estimate the pose of every object present. The student’s task is to use state-of-the-art neural networks to combine object detection and pose estimation into a single step.

Tasks

At the beginning, the student will read current literature on object detection and pose estimation. Afterwards, the student will get acquainted with our framework and how to use it. This also includes refreshing their knowledge of Python and, in particular, PyTorch. The main task is to extend our framework to perform object detection and pose estimation in a single shot. Finally, the student will train the framework on a state-of-the-art dataset and evaluate its performance with common metrics. For a master’s thesis, the student will additionally implement data augmentation for the training process.
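The announcement does not name the evaluation metrics. One metric widely used in the 6D pose estimation literature is ADD, the average distance between the object's model points transformed by the ground-truth pose and by the predicted pose. The sketch below is an illustration under that assumption. All names and values are hypothetical and do not come from the CNS framework.

```python
import math


def transform_point(rotation, translation, point):
    """Apply p' = R * p + t to a single 3D point."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def add_metric(model_points, r_gt, t_gt, r_pred, t_pred):
    """Mean Euclidean distance between model points under the two poses."""
    total = 0.0
    for p in model_points:
        total += math.dist(
            transform_point(r_gt, t_gt, p),
            transform_point(r_pred, t_pred, p),
        )
    return total / len(model_points)


# Hypothetical example: same rotation, prediction shifted by 0.1 m along x.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.2, 0.0]]
add_value = add_metric(points, identity, [0.0, 0.0, 1.0], identity, [0.1, 0.0, 1.0])
```

A common convention is to count a pose as correct when its ADD value is below 10% of the object's diameter and to report the fraction of correct poses per object class.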

Milestones and Extensions

Milestones M1 to M3 apply to bachelor students. M4 is an additional milestone for master students.

  • M1: Literature study on object detection and pose estimation.
  • M2: Extending the current pose estimation framework with object detection.
  • M3: Training a single-shot object detector and pose estimator and evaluating its performance.
  • (M4: Implementing data augmentation for our framework.)
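To give an idea of what M4 could involve, here is a minimal sketch of one photometric augmentation, random brightness jitter, in plain Python. It assumes a grayscale image stored as nested lists with pixel values in [0, 1]. The function name and parameters are illustrative. The actual augmentation pipeline for the framework is for the student to design and would likely operate on PyTorch tensors.

```python
import random


def jitter_brightness(image, max_delta=0.2, rng=random):
    """Shift all pixel values by one random offset and clamp them to [0, 1]."""
    delta = rng.uniform(-max_delta, max_delta)
    return [[min(1.0, max(0.0, px + delta)) for px in row] for row in image]


# Hypothetical 2x2 grayscale image with values in [0, 1].
image = [[0.0, 0.5], [0.9, 1.0]]
augmented = jitter_brightness(image, max_delta=0.2, rng=random.Random(0))
```

Photometric changes like this one leave the 6D pose labels untouched. Geometric augmentations such as rotating or cropping the image change the apparent pose, so the labels would have to be adjusted as well.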

Preferred Skill Set

  • Very good knowledge of Python and, preferably, initial experience with PyTorch.
  • Knowledge of computer vision.
  • Preferably, knowledge of artificial intelligence, neural networks, and deep learning.
  • Most importantly: self-driven motivation to learn new topics.

Period and Contacts

Time period: 6 months, beginning as soon as possible
Supervisors: Thomas Jantos (thomas.jantos@aau.at), Jan Steinbrener (jan.steinbrener@aau.at)

Roland Jung
Dronehub K

Senior Scientist | PhD Candidate @ Networked Autonomous Aerial Vehicles (NAV) Karl Popper Kolleg, University of Klagenfurt