Learning a Sequential Search for Landmarks


Teaser Image

An algorithm that automatically learns to detect landmarks in an image-specific order.

People

Saurabh Singh, Derek Hoiem, and David Forsyth

Abstract

We propose a general method to find landmarks in images of objects using both appearance and spatial context. This method is applied without changes to two problems: parsing human body layouts, and finding landmarks in images of birds. Our method learns a sequential search for localizing landmarks, iteratively detecting new landmarks given the appearance and contextual information from the already detected ones. The choice of landmark to be added is opportunistic and depends on the image; for example, in one image a head-shoulder group might be expanded to a head-shoulder-hip group but in a different image to a head-shoulder-elbow group. The choice of initial landmark is similarly image dependent. Groups are scored using a learned function, which is used to expand them greedily. Our scoring function is learned from data labelled with landmarks but without any labeling of a detection order. Our method represents a novel spatial model for the kinematics of groups of landmarks, and displays strong performance on two different model problems.
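To make the greedy expansion described above concrete, here is a minimal, hypothetical Python sketch; it is not the authors' implementation. It assumes a per-landmark list of candidate locations and a learned `score_fn` that scores a partial group of detections (a stand-in for the scoring function learned in the paper), and it greedily commits whichever landmark/location expansion the scorer prefers, so both the initial landmark and the detection order depend on the image.

```python
def sequential_landmark_search(image_features, candidates, score_fn):
    """Illustrative greedy sequential search over landmark groups.

    image_features : per-image appearance features (any representation).
    candidates     : dict mapping landmark name -> list of candidate locations.
    score_fn       : learned function scoring a partial group of detections
                     (hypothetical stand-in for the paper's learned scorer).
    """
    group = {}                   # landmarks detected so far: name -> location
    remaining = set(candidates)  # landmarks not yet localized

    while remaining:
        best = None
        # Opportunistically try every (landmark, location) expansion of the
        # current group and keep the one the learned scorer rates highest.
        for name in remaining:
            for loc in candidates[name]:
                trial = dict(group)
                trial[name] = loc
                score = score_fn(image_features, trial)
                if best is None or score > best[0]:
                    best = (score, name, loc)
        _, name, loc = best
        group[name] = loc        # commit the highest-scoring expansion
        remaining.remove(name)

    return group
```

In the paper, the scoring function is learned from data labelled with landmarks but without any annotated detection order; the sketch simply shows how such a scorer could drive an image-dependent greedy search.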

Paper

Paper: CVPR 2015 PDF (2.9 MB)


Citation

Saurabh Singh, Derek Hoiem, and David Forsyth. Learning a Sequential Search for Landmarks. In Computer Vision and Pattern Recognition (CVPR), 2015.

BibTeX

@inproceedings{Singh2015lsslandmark,
  author = {Saurabh Singh and Derek Hoiem and David Forsyth},
  title = {Learning a Sequential Search for Landmarks},
  booktitle = {Computer Vision and Pattern Recognition},
  year = {2015},
  url = {http://vision.cs.uiuc.edu/projects/lssland/}
}

Code

Coming soon.

Data

Predictions on the test images of the Leeds Sports Pose and Fashion Pose datasets are [here].

Funding

This material is based upon work supported in part by the National Science Foundation under Grant Nos. IIS 09-16014, IIS-1421521, and IIS-1029035, ONR MURI Award N00014-10-10934, and a Sloan Fellowship. We would also like to thank NVIDIA for donating some of the GPUs used in this work.