Object detection and localization using approximate nearest neighbor search: random tree implementation
Bekele, Esubalew Tamirat
:
2009-04-21
Abstract
A machine’s intelligence is related to its autonomy. Making robots autonomous is perhaps the most difficult current challenge in the state-of-the art of robotics research. Robots are said to be autonomous, if they can learn and represent percepts through their own ways and internal representations, and as a result, can undertake tasks in a real, dynamic physical world without significant human intervention. The major problems of autonomy involve acquiring, representing, and learning percepts and using them for autonomous control. Today’s robots are far from complete autonomy. There is extensive human intervention in the design and construction of the perceptual and control systems. Rather than allowing the robot to learn percepts and control itself, human programmers encode their own ideas about knowledge and control algorithms in to the robots. As it is cumbersome and practically impossible to encode all knowledge about a dynamic and unstructured environment, autonomously learning the percepts may be the only way for a robot to acquire complete autonomy. Visual percepts would be among those to be learned by the robot autonomously. The large amounts of data associated with vision necessitate an efficient representation for visual percepts. The visual features to use, the data structures to represent them, and the algorithms used to learn them are critical to the success of autonomous perception.
This work builds on existing algorithms for computer vision that learn a specific percept with minimal intervention. It combines two features, a color probability density function and a texture measure gleaned from overlapping portions of an image. It uses an approximate nearest neighbor learning algorithm implemented with a random 3-way search tree to learn the visual input features and to associate them with labeled percepts. Once learned, the search tree is used to segment the input images into classes. The time and space complexities of the algorithm are studied and compared with a pure (naïve) nearest neighbor search implementation. The accuracy of the system is then tested.
The results of the experiments suggest that the random tree implementation is an efficient and accurate algorithm. The classification running time of the algorithm is found to be approximately logarithmic with linear space requirements, whereas the run time of the pure algorithm is approximately quadratic for a very high dimensional feature vectors like the ones used in this project.