Method and Apparatus for Image Annotation and Multimodal Image Retrieval
The technology is based on a probabilistic semantic model in which visual features and textual words are connected through a hidden layer of semantic concepts, discovered so as to explicitly exploit the synergy between the two modalities. The association between visual features and textual words is determined in a Bayesian framework, so that a confidence for each association can be provided. In the proposed probabilistic model, the hidden concept layer connecting the visual-feature layer and the word layer is discovered by fitting a generative model to the training images and their annotation words. An Expectation-Maximization (EM) based iterative learning procedure determines the conditional probabilities of the visual features and the textual words given each hidden concept class. Based on the discovered hidden concept layer and the corresponding conditional probabilities, image annotation and text-to-image retrieval are performed within the same Bayesian framework.
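The EM procedure described above can be sketched as a PLSA-style generative model in which a hidden concept variable links discrete visual tokens and annotation words. The following is a minimal illustrative sketch, not the patented implementation: the toy data, variable names, and the choice of quantized visual tokens are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy corpus: each image is a bag of quantized visual tokens
# and a bag of annotation words, represented as count matrices.
n_images, n_vis, n_words, n_concepts = 6, 8, 5, 2
V = rng.integers(0, 4, size=(n_images, n_vis)).astype(float)    # visual-token counts
W = rng.integers(0, 3, size=(n_images, n_words)).astype(float)  # word counts

def norm(a, axis):
    """Normalize along an axis so rows/columns form probability distributions."""
    return a / a.sum(axis=axis, keepdims=True)

# Random initialization of P(z|image), P(visual|z), P(word|z).
Pz_d = norm(rng.random((n_images, n_concepts)), 1)
Pv_z = norm(rng.random((n_concepts, n_vis)), 1)
Pw_z = norm(rng.random((n_concepts, n_words)), 1)

for _ in range(50):  # EM iterations
    # E-step: posterior responsibilities P(z | image, feature) per modality.
    Rv = norm(Pz_d[:, :, None] * Pv_z[None, :, :], 1)  # shape (image, z, visual)
    Rw = norm(Pz_d[:, :, None] * Pw_z[None, :, :], 1)  # shape (image, z, word)
    # M-step: re-estimate conditionals from expected counts.
    Pv_z = norm((Rv * V[:, None, :]).sum(axis=0), 1)
    Pw_z = norm((Rw * W[:, None, :]).sum(axis=0), 1)
    Pz_d = norm((Rv * V[:, None, :]).sum(axis=2)
                + (Rw * W[:, None, :]).sum(axis=2), 1)

# Annotation: rank words for an image via P(w|image) = sum_z P(w|z) P(z|image).
Pw_d = Pz_d @ Pw_z
top_word_per_image = Pw_d.argmax(axis=1)
```

Text-to-image retrieval proceeds analogously by scoring images against a query word through the shared concept layer, which is what allows the two modalities to reinforce each other.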
- Improved image retrieval performance
- Improved image annotation performance
U.S. 7,814,040; 8,204,842; 9,280,562
Binghamton University RB209