Markerless Tracking

From Wikiversity

The Markerless Tracking project conducts original research in the field of computer-based object recognition and tracking. The aim of this project is to achieve a general and widely accepted technique for tracking and recognizing the real environment without using any specially placed markers, e.g. fiducials.

In order to achieve this goal, we analyze the captured reality by comparing the data to a huge number of on-the-fly generated candidate configurations. This approach is called Analysis-by-Synthesis (AbS).


Motivation

A big challenge in Computer Vision is the recognition and tracking of real objects through sensor-data streams. This is needed, for example, in Robotics and Augmented Reality to gather information about the surrounding environment. Standard Computer Vision techniques give good results if the objects are very simple and the sensor data is not too biased. Generally speaking, we have a large number of computer systems that work well at recognizing and tracking special fiducial markers, which must be distributed in the environment. Unfortunately, there are only a few special cases where markerless tracking and recognition can be achieved with today's techniques.

About Tracking

In its simplest form, tracking can be defined as the problem of estimating the trajectory of an object in the image plane as it moves around a scene. In other words, a tracker assigns consistent labels to the tracked objects in different frames of a video. Additionally, depending on the tracking domain, a tracker can also provide object-centric information, such as orientation, area, or shape of an object.[1]
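The label-assignment part of this definition can be illustrated with a minimal sketch. The function below is a naive nearest-neighbour tracker (not a method from this project, just an illustrative assumption): each unlabeled detection in the current frame inherits the label of the closest object from the previous frame.

```python
import math

def assign_labels(prev_objects, detections, max_dist=50.0):
    """Give each detection the label of the nearest previously tracked
    object; detections farther than max_dist from everything stay None."""
    labels = {}
    for det_id, (x, y) in detections.items():
        best_label, best_dist = None, max_dist
        for label, (px, py) in prev_objects.items():
            d = math.hypot(x - px, y - py)
            if d < best_dist:
                best_label, best_dist = label, d
        labels[det_id] = best_label
    return labels

# Frame t-1: object "A" at (10, 10), object "B" at (100, 100)
prev = {"A": (10.0, 10.0), "B": (100.0, 100.0)}
# Frame t: two unlabeled detections that have moved slightly
dets = {0: (12.0, 11.0), 1: (98.0, 103.0)}
print(assign_labels(prev, dets))  # {0: 'A', 1: 'B'}
```

Real trackers additionally use motion models and appearance cues, but the core task — keeping labels consistent across frames — is the same.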

Idea

Illustration of the idea using image datasets.

The basic idea is very simple. Just

  1. capture the reality through sensor data (e.g. a camera image),
  2. generate all possible virtual sensor datasets with the computer (e.g. computer graphics rendering) and
  3. select the virtual dataset that is most similar to the captured one to obtain the current configuration in reality (e.g. position and orientation of the camera).
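The three steps above can be sketched in a few lines. In this toy version (all names and the 1-D "renderer" are illustrative assumptions, not part of the project), the configuration to recover is a single scalar pose, synthesis is a cheap function instead of a graphics renderer, and similarity is the sum of squared differences.

```python
import numpy as np

def render(pose):
    """Stand-in for a renderer: produces a synthetic 'image'
    (here just a 1-D signal) for a candidate camera pose."""
    x = np.linspace(0.0, 1.0, 64)
    return np.sin(2.0 * np.pi * (x + pose))

def analysis_by_synthesis(captured, candidate_poses):
    """Steps 2 and 3: synthesize a dataset for every candidate
    configuration and keep the one most similar to the capture
    (smallest sum of squared differences)."""
    best_pose, best_error = None, float("inf")
    for pose in candidate_poses:
        error = float(np.sum((render(pose) - captured) ** 2))
        if error < best_error:
            best_pose, best_error = pose, error
    return best_pose

# Step 1: "capture" reality at an unknown pose of 0.30
captured = render(0.30)
candidates = np.arange(0.0, 1.0, 0.05)
print(round(analysis_by_synthesis(captured, candidates), 2))  # 0.3
```

A real system would search a high-dimensional configuration space (6-DoF pose, lighting, geometry), which is exactly why the performance question below matters.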

This does not sound very efficient and may seem insufficiently accurate for some purposes; a practical implementation may also look questionable at first. Please read the research questions and answers below before you abandon the whole idea.

Assumptions

As far as is known, this technique works under three assumptions:

  1. the reality can be captured by the computer through sensors,
  2. the computer can simulate every configuration of reality that could possibly be captured, and
  3. captured and generated datasets can be compared in a meaningful way.
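Assumption 3 is where a concrete choice has to be made. Two common similarity measures (shown here as an illustrative sketch, not a measure prescribed by this project) are the sum of squared differences, which is sensitive to brightness changes, and normalized cross-correlation, which is invariant to linear intensity changes between the captured and the rendered image.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical datasets,
    grows with any intensity difference."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation: 1.0 for datasets that match
    up to a linear brightness/contrast change."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    return float(np.sum(a0 * b0) / (np.linalg.norm(a0) * np.linalg.norm(b0)))

rng = np.random.default_rng(0)
img = rng.random((8, 8))
brighter = img * 2.0 + 0.5        # same content, different exposure
print(ssd(img, img))              # 0.0
print(round(ncc(img, brighter), 3))  # 1.0
```

Which measure is "meaningful" depends on how faithfully the synthesis step can reproduce sensor characteristics such as exposure and noise.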

Research questions and answers

  • How to increase the performance of the process?
  • Which assumptions will introduce errors and what is the overall accuracy?
  • Could this idea be realized with today's software and hardware?
  • What is the best method to compare special kinds of data sets for selection?
  • Which research articles are helpful and what kind of work is related to this project?
  • ...

How to contribute

Like all research projects, this one really needs collaboration between people. So the first thing you can do is be bold while reading these wiki pages. Go ahead and edit the content, especially if you have questions or disagree with the given statements. Don't forget to add your name to the list of contributors below.

There are also a lot of places which need to be filled with more thoughts, comments and ideas. Look at the research questions above for a start.

Contributors

References

  1. Alper Yilmaz, Omar Javed and Mubarak Shah: "Object Tracking: A Survey", ACM Computing Surveys, web: ACM Portal

External links