Explore MINets: Redantant Exploring Multimedia
Information Networks
ABSTRACT:
The abundance of multimedia data on the Web presents both challenges (how
to annotate, search, and mine it?) and opportunities (crawling
the Web to create large, structured, networked multimedia knowledge bases that
can be used to conduct inference effectively). The proposed research has the
ambitious aim of building a unified, structured, networked database
representation called Multimedia Information Networks (MINets).
In MINets, concept nodes and multimedia data nodes
will be connected by ontological and cross-domain links that cover
both content and context knowledge. Characterizing a dense graph of relations
among all concepts will provide a strong graph-theoretical framework for studying
information fusion and inference strategies. Information fusion will then
occur as a natural consequence of having a complete linkage representation
among all kinds of nodes in the network.
In particular, we aspire to construct MINets and use them for recognition and
inference tasks based on the complementary cross-domain links between
multi-modal contents. Moreover, the resulting MINets representation will
enable cross-domain correspondence for effective information transfer and
inference, which would otherwise be much more difficult with traditional
single-domain ontology structures, and can therefore unleash much richer
knowledge resources for many fields.
SYSTEM ARCHITECTURE:
PROBLEM DEFINITION:
The goal of this approach is to annotate images with manually defined
concepts, using visual and contextual features to learn a latent space.
Specifically, by feeding the latent vectors into existing classification
models, the approach can be applied to multimedia annotation, one of the most
important problems in multimedia retrieval. Furthermore, we show a more
sophisticated algorithm that directly incorporates the discriminant
information in the training examples for multimedia annotation, without using
the mapping as a preliminary step. It jointly explores the context and content
information based on a latent structure in the semantic concept space.
EXISTING SYSTEM:
The general approach of learning a latent semantic space has been extensively
studied in the field of information retrieval. Popular techniques include
Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Indexing (PLSI),
and Latent Dirichlet Allocation (LDA). These algorithms have also been applied
to the multimedia domain for problems such as indexing and retrieval. For
example, latent feature vectors can be learned by LSI for natural scene
images, and the learned features can then be used effectively with
general-purpose SVM classifiers.
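As a minimal sketch of the LSI idea, assume a small hand-made image-by-tag
matrix. The tag names, matrix values, and helper names below are illustrative
assumptions, not taken from the referenced paper. The leading right singular
vectors of the matrix are approximated by power iteration, and each image is
projected onto them to obtain a low-dimensional latent vector. A real system
would normally call a library SVD routine instead.

```python
import math

# Hypothetical image-by-tag matrix: rows are images, columns are tags.
TAGS = ["sea", "beach", "wave", "sky", "cloud", "plane"]
A = [
    [1, 1, 1, 0, 0, 0],  # image 0: sea scene
    [1, 0, 1, 0, 0, 0],  # image 1: sea scene
    [1, 1, 0, 0, 0, 0],  # image 2: sea scene
    [0, 0, 0, 1, 1, 1],  # image 3: sky scene
    [0, 0, 0, 1, 1, 0],  # image 4: sky scene
]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(m, v):
    return [dot(row, v) for row in m]

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def top_right_singular_vectors(a, k, iters=300):
    """Approximate the k leading right singular vectors of `a` by power
    iteration on A^T A, deflating against the vectors already found."""
    a_t = [list(col) for col in zip(*a)]
    basis = []
    for _ in range(k):
        # Fixed, non-uniform start vector keeps the sketch deterministic.
        v = normalize([1.0 + 0.1 * j for j in range(len(a[0]))])
        for _step in range(iters):
            v = matvec(a_t, matvec(a, v))
            for b in basis:  # remove components along earlier vectors
                proj = dot(v, b)
                v = [x - proj * y for x, y in zip(v, b)]
            v = normalize(v)
        basis.append(v)
    return basis

def latent_vectors(a, k):
    """Project every row (image) onto the k leading singular directions."""
    basis = top_right_singular_vectors(a, k)
    return [[dot(row, b) for b in basis] for row in a]

def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

latent = latent_vectors(A, k=2)
```

In this toy matrix the sea images and the sky images share no tags, so the two
groups end up along different latent directions, and the resulting vectors
could be passed to any vector-based classifier.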
DISADVANTAGES OF EXISTING SYSTEM:
Some preliminary results have shown the effectiveness of these algorithms;
however, all of these methods suffer from the problem of sparse context links,
which we address by using content links.
PROPOSED SYSTEM:
In this paper, we show that a compact latent space can be discovered to
summarize the semantic structure in multimedia information networks (MINs),
and that it can be seamlessly applied in state-of-the-art multimedia
information retrieval systems. Specifically, the algorithm maps each
multimedia object (MO) into a latent feature vector that encodes both context
and content information. Based on these latent feature vectors, MOs can be
effectively classified, indexed, and retrieved in a vector space by many
mature, off-the-shelf vector-based multimedia retrieval methods, such as
clustering and re-ranking.
ADVANTAGES OF PROPOSED SYSTEM:
Our approach is a general-purpose technique that can be leveraged to improve
the effectiveness of a wide variety of techniques.
MODULES:
• Indexing and Training Module
• Annotation Construction Module
• Mapping of the multimedia objects
• Exploring Context Links
• Exploring Content Links
• Ambiguity Module
MODULES DESCRIPTION:
Indexing and Training Module
The first module of our project is the indexing and training module for the
whole set of images. Indexing is done using an implementation of the Document
Builder Interface. A simple approach is to use the Document Builder Factory,
which creates Document Builder instances for all available features as well as
popular combinations of features (e.g. all JPEG features or all available
features). In a content-based image retrieval (CBIR) system, target images are
sorted by feature similarity with respect to the query. In this module, we
index the images and train on them accordingly.
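The ranking step of CBIR can be sketched as follows. This is a minimal
illustration that assumes each indexed image is already represented by a
normalized color histogram. The image names, the histogram values, and the
choice of histogram intersection as the similarity measure are assumptions
made for the example and are not part of the module described above.

```python
def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def rank_by_similarity(query, index):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, histogram_intersection(query, hist))
              for name, hist in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical 4-bin color histograms (each sums to 1.0).
index = {
    "beach.jpg":  [0.60, 0.20, 0.10, 0.10],
    "forest.jpg": [0.10, 0.70, 0.10, 0.10],
    "sunset.jpg": [0.10, 0.10, 0.20, 0.60],
}
query = [0.50, 0.30, 0.10, 0.10]

ranking = rank_by_similarity(query, index)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Any other feature and distance could be substituted, provided the same feature
is extracted for the query and for the indexed images.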
Annotation Construction Module
Context links are virtual links created as a result of user feedback (e.g.,
tags), and may be represented as linkages between the MOs and contextual
objects (COs) such as tags. In real-world contextual links, the number of user
tags attached to an MO is usually quite small. In some extreme cases, only a
few tags, or even none, may be attached to an object, which leads to sparse
contextual links. In such cases, it is hard to derive meaningful latent
features for MOs, because determining the correlation structure in the latent
space requires a sufficient number of such contextual objects to occur
together. A reasonable solution to this problem is to exploit the content
links between MOs. In this paper, we will show how the content links can
effectively complement the sparse contextual links by incorporating acoustic
and/or visual information to discover the underlying latent semantic space.
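One simple way to picture how content links can compensate for missing tags is
to let an untagged MO borrow tag evidence from its content neighbors. The
sketch below is only an illustration under that assumption: the object names,
the link weights, and the propagate_tags helper are hypothetical, and this is
not the latent space algorithm of the referenced paper.

```python
def propagate_tags(tags, content_links):
    """Score tags for every MO from its own tags plus the tags of its
    content neighbors, weighted by content link strength."""
    scores = {}
    for mo, own_tags in tags.items():
        mo_scores = {tag: 1.0 for tag in own_tags}
        for neighbor, weight in content_links.get(mo, {}).items():
            for tag in tags.get(neighbor, ()):
                mo_scores[tag] = mo_scores.get(tag, 0.0) + weight
        scores[mo] = mo_scores
    return scores

# Hypothetical context links (user tags); "img_c" has no tags at all.
tags = {
    "img_a": {"sea"},
    "img_b": {"sky"},
    "img_c": set(),
}
# Hypothetical content links: similarity weights between MOs.
content_links = {
    "img_a": {"img_c": 0.9},
    "img_b": {"img_c": 0.2},
    "img_c": {"img_a": 0.9, "img_b": 0.2},
}

scores = propagate_tags(tags, content_links)
best_tag_for_c = max(scores["img_c"], key=scores["img_c"].get)
```

Here the untagged object receives most of its evidence from the neighbor it
resembles most closely, which is the intuition behind using content links to
fill in sparse context links.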
Mapping of the multimedia objects
In this paper, content links represent the content similarities between MOs;
that is, visually and/or acoustically similar objects are assumed to have
strong content links between them. Content links contain important knowledge
complementary to that embedded in context links. However, to the best of our
knowledge, the existing latent space methods (LSI, PLSI, and LDA) cannot
seamlessly incorporate content and context links in a unified framework. Some
attempts have been made to jointly model content and context information to
learn the latent space. They quantize the MOs into visual words, which are
then treated like COs by linking them to MOs. However, such approaches greatly
increase the number of parameters in the latent space model, and make it more
prone to quantization-induced noise and to overfitting due to the sparse
context links. In contrast, we will show that content and context links can be
seamlessly modeled to learn the underlying latent space. The content
information does not have to be quantized into discrete elements such as
visual words. Instead, the content link structure is leveraged directly,
together with the context links, to discover latent features. We therefore
propose a mapping of MINs to the latent space that supports an emerging
paradigm of multimedia retrieval unifying the information in context and
content links. In other words, the goal of this approach is to annotate the
images with manually defined concepts, using visual and contextual features to
learn a latent space.
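Content links of this kind can be sketched as a similarity graph built
directly on real-valued feature vectors, with no quantization into visual
words. The example assumes hypothetical feature vectors and a
k-nearest-neighbor rule with cosine similarity; neither choice is prescribed
by the text above.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def build_content_links(features, k):
    """Link each MO to its k most similar MOs; links are made symmetric."""
    links = {name: {} for name in features}
    for name, vec in features.items():
        sims = [(other, cosine_similarity(vec, other_vec))
                for other, other_vec in features.items() if other != name]
        sims.sort(key=lambda pair: pair[1], reverse=True)
        for other, sim in sims[:k]:
            links[name][other] = sim
            links[other][name] = sim
    return links

# Hypothetical low-level feature vectors (e.g. color or audio descriptors).
features = {
    "clip_1": [0.9, 0.1, 0.0],
    "clip_2": [0.8, 0.2, 0.1],
    "clip_3": [0.0, 0.2, 0.9],
    "clip_4": [0.1, 0.1, 0.8],
}

content_links = build_content_links(features, k=1)
```

The resulting weighted graph is the kind of structure that a latent space
method could use alongside the context links.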
Exploring Context Links
With the development of Web 2.0 infrastructures, rich context links are often
attached to MOs on media-rich web sites such as Flickr, YouTube, and Facebook.
In contrast to pure content information, these links provide extra semantic
information for retrieving and indexing MOs in the Web environment. As a
simple example, images of “sea” and “sky” have similar color features, so they
are difficult to distinguish by similarity in the content feature space.
However, by leveraging the user tags in their context links and mapping them
into a new latent space by LSI, PLSI, or LDA, they can be distinguished using
the semantics in their COs. Context-based Multimedia Retrieval (CxMR)
approaches have been widely used in many practical multimedia search engines
such as Google Images, which utilize context links such as surrounding text
and user tags. Although the information in the context links is useful in many
cases, it is often sparse and noisy. In some cases, this can lead to
questionable performance, when the context contains a large amount of
information that is irrelevant to the mining process. This is often evident in
Google Images results when the images do not match the corresponding search at
all.
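The “sea” versus “sky” case can be made concrete with a small sketch. It
assumes two hypothetical images whose color histograms are nearly identical
but whose user tags differ, and it uses Jaccard overlap between tag sets as a
simple stand-in for similarity in a context-derived space. The data and
function names are illustrative only.

```python
def l1_distance(h1, h2):
    """Content distance: sum of absolute bin differences."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def jaccard(tags_a, tags_b):
    """Context similarity: shared tags divided by all tags."""
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0

# Hypothetical images: both are dominated by blue, so content features are close.
sea_image = {"hist": [0.05, 0.10, 0.85], "tags": {"sea", "wave", "boat"}}
sky_image = {"hist": [0.05, 0.15, 0.80], "tags": {"sky", "cloud"}}
query = {"hist": [0.05, 0.12, 0.83], "tags": {"sea", "boat"}}

# Content alone barely separates the two candidates.
content_gap = abs(l1_distance(query["hist"], sea_image["hist"])
                  - l1_distance(query["hist"], sky_image["hist"]))

# Context separates them clearly.
context_sea = jaccard(query["tags"], sea_image["tags"])
context_sky = jaccard(query["tags"], sky_image["tags"])
```

The same sketch also hints at the weakness noted above: if the tags were
missing or wrong, the context score would be uninformative or misleading.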
Exploring Content Links
The Content-based Multimedia Retrieval (CMR) approach attempts to model
high-level concepts from low-level features extracted from the MOs. In a
typical multimedia retrieval system such as QBIC or Virage, the query is
formulated using example MOs and/or text-based keywords. The relevant MOs are
then retrieved based on their content features. The advantage of CMR is that
it is an automatic retrieval approach: once the concepts are modeled, no human
labels are required to maintain it.
Ambiguity Module
In this module, we contribute an ambiguity-resolution step intended to improve
the accuracy of the proposed system. The ambiguity arises in middle vision,
the stage in visual processing that combines all the basic features in the
scene into distinct, recognizable object groups. This stage of vision comes
after early vision (determining the basic features of an image) and before
high-level vision (understanding the scene). When perceiving and recognizing
images, mid-level vision comes into use when we need to classify the object we
are seeing. High-level vision is used when the classified object must then be
recognized as a specific member of its group. For example, through mid-level
vision we perceive a face, and through high-level vision we recognize it as
the face of a familiar person. Mid-level and high-level vision are crucial for
understanding a reality that is filled with ambiguous perceptual inputs. Thus,
in this module we resolve this ambiguity in order to improve the accuracy and
efficiency of the system.
HARDWARE REQUIREMENTS:
• System : Pentium IV 2.4 GHz
• Hard Disk : 40 GB
• Floppy Drive : 1.44 MB
• Monitor : 15-inch VGA Colour
• Mouse : Logitech
• RAM : 512 MB
SOFTWARE REQUIREMENTS:
• Operating System : Windows XP
• Coding Language : ASP.NET, C#.NET
• Database : SQL Server 2005
REFERENCE:
Guo-Jun Qi, Charu Aggarwal, Qi Tian, Heng Ji, and Thomas S. Huang, “Exploring
Context and Content Links in Social Media: A Latent Space Method,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 5,
May 2012.