Distributed
Processing of Probabilistic Top-k Queries in Wireless Sensor Networks
ABSTRACT:
In this paper, we introduce the notion
of sufficient set and necessary set for distributed processing of probabilistic
top-k queries in cluster-based wireless sensor networks. These two concepts
have very nice properties that can facilitate localized data pruning in
clusters. Accordingly, we develop a suite of algorithms, namely, sufficient
set-based (SSB), necessary set-based (NSB), and boundary-based (BB), for
intercluster query processing with bounded rounds of communications. Moreover,
in responding to dynamic changes of data distribution in the network, we
develop an adaptive algorithm that dynamically switches among the three
proposed algorithms to minimize the transmission cost. We show the
applicability of sufficient set and necessary set to wireless sensor networks with
both two-tier hierarchical and tree-structured network topologies. Experimental
results show that the proposed algorithms reduce data transmissions
significantly and incur only small constant rounds of data communications. The
experimental results also demonstrate the superiority of the adaptive
algorithm, which achieves a near-optimal performance under various conditions.
EXISTING SYSTEM
This new technology has resulted in
significant impacts on a wide array of applications in various fields,
including military, science, industry, commerce, transportation, and
health-care. However, the quality of sensors varies significantly in terms of
their sensing precision, accuracy, tolerance to hardware/external noise, and so
on. For example, studies show that the distribution of noise varies widely in
different photovoltic sensors, precision and accuracy of readings usually vary
significantly in humidity sensors, and the errors in GPS devices can be up to
several meters. Nevertheless, they have mostly been studied under a centralized
system setting. In this paper, we explore the problem of processing
probabilistic top-k queries in distributed wireless sensor networks. Here, we
first use an environmental monitoring application of wireless sensor network to
introduce some basics of probabilistic databases. Due to sensing imprecision
and environmental interferences, the sensor readings are usually noisy. Thus,
multiple sensors are deployed at certain zones in order to improve monitoring
quality. In this network, sensor nodes are grouped into clusters, within each
of which one of sensors is selected as the cluster head for performing
localized data processing.boudary based algorithm using hts to the data
processing.
DISADVANTAGE
OF EXISTING SYSTEM:
·
We explore the problem of processing
probabilistic top-k queries in distributed wireless sensor networks.
·
The wind station very slowly
·
Data is not accuracy purify
·
The one station to another station delay
the communication rate
PROPOSED SYSTEM
There are three proposed algorithms to
minimize the transmission cost. We show the applicability of sufficient set and
necessary set to wireless sensor networks with both two-tier hierarchical and
tree-structured network topologies. There are several top-k query semantics and
solutions proposed recently, including U-Topk and UkRanks in PT-Topk in PK-Topk
in expected rank in and so on. A common
way to process probabilistic top-k queries is to first sort all tuples based on
the scoring attribute, and then process tuples in the sorted order to compute the final answer set.
Nevertheless, while focusing on optimizing the transmission bandwidth, the proposed
techniques require numerous iterations of computation and communication,
introducing tremendous communication overhead and resulting in long latency. As
argued in this is not desirable for many
distributed applications, e.g., network monitoring, that require the queries to
be answered in a good response time, with a minimized energy consumption. In
this paper, we aim at developing energy efficient algorithms optimized for
fixed rounds of communications.
ADVANTAGE
OF PROPOSED SYSTEM:
·
Additionally, NSB and BB take advantage
of the skewed necessary sets and necessary boundaries among local clusters to
obtain their global boundaries, respectively, which are very effective for
intercluster pruning.
·
The transmission cost increases for all
algorithms because the number of tuples needed for query processing is
increased.
MODULES:
1. PT-Topk
Query Processing
2. Sensor
Networks
3. Data
pruning
4. Structured network topology
5. Data
transmission
6. Performance
evaluation
MODULE DESCRIPTION:
PT-Topk
Query Processing
The
PT-Topk queries in a centralized uncertain database, which provides a good
background for the targeted distributed processing problem. The query answer
can be obtained by examining the tuples in descending ranking order from the
sorted table (which is still denoted as T for simplicity). We can easily
determine that the highest ranked k tuples are definitely in the answer set as
long as their confidences are greater than p since their qualifications as
PT-Topk answers are not dependent on the existence of any other tuples.
Sensor
Networks
The
extensive number of research work in this area has appeared in the
literature. Due to the limited energy budget available at sensor nodes, the
primary issue is how to develop energy-efficient techniques to reduce communication
and energy costs in the networks. Approximate-based data aggregation techniques
have also been proposed. The idea is to tradeoff some data quality for improved
energy efficiency. Silberstein et al. develop a sampling-based approach to
evaluate approximate top-k queries in wireless sensor networks. Based on
statistical modeling techniques, a model-driven approach was proposed in to balance the confidence of the query answer
against the communication cost in the network. Moreover, continuous top-k queries
for sensor networks have been studied in
and . In addition, a distributed threshold join algorithm has been
developed for top-k queries. These studies, considering no uncertain data, have
a different focus from our study.
Data
pruning
The cluster heads are responsible for
generating uncertain data tuples from the collected raw sensor readings within
their clusters. To answer a query, it’s natural for the cluster heads to prune
redundant uncertain data tuples before delivery to the base station in order to
reduce communication and energy cost. The key issue here is how to derive a
compact set of tuples essential for the base station to answer the
probabilistic top-k queries.
Structured
network topology
To perform in-network query processing,
a routing tree is often formed among
sensor nodes and the base station. A query is issued at the root of the
routing tree and propagated along the tree to all sensor nodes. Although the
concepts of sufficient set and necessary set introduced earlier are based on
two-tier hierarchical sensor networks, they are applicable to tree-structured
sensor network.
Data
transmission
The total amount of data transmission as
the performance metrics. Notice that, response time is another important
metrics to evaluate query processing algorithms in wireless sensor networks.
All of those three algorithms, i.e., SSB, NSB, and BB, perform at most two
rounds of message exchange there is not much difference among SSB, NSB, and BB
in terms of query response time, thus we focus on the data transmission costin
the evaluation. Finally, we also conduct experiments to evaluate algorithms,
SSB-T, NSB-T, and NSB-T-Opt under the tree-structured network topology.
Performance
evaluation
The performance evaluation on the
distributed algorithms for processing PT-top k queries in two-tier hierarchical
cluster based wireless sensor monitoring system. As discussed, limited energy
budget is a critical issue for wireless sensor network and radio transmission
is the most dominate source of energy consumption. Thus, we measure the total
amount of data transmission as the performance metrics. Notice that, response time is another
important metrics to evaluate query processing algorithms in wireless sensor networks.
SYSTEM
FLOW:
SYSTEM CONFIGURATION:-
HARDWARE CONFIGURATION:-
ü Processor - Pentium –IV
ü Speed - 1.1
Ghz
ü RAM - 256
MB(min)
ü Hard
Disk - 20
GB
ü Key
Board - Standard
Windows Keyboard
ü Mouse - Two
or Three Button Mouse
ü Monitor - SVGA
SOFTWARE CONFIGURATION:-
ü Operating System : Windows XP
ü Programming Language : JAVA
ü Java Version : JDK
1.6 & above.
REFERENCE:
Mao Ye, Wang-Chien Lee, Dik Lun Lee, and
Xingjie Liu, “Distributed Processing of
Probabilistic Top-k Queries in Wireless Sensor Networks”, IEEE TRANSACTIONS
ON KNOWLEDGE AND DATA ENGINEERING, VOL. 25, NO. 1, JANUARY 2013.