m-Privacy for
Collaborative Data Publishing
ABSTRACT:
In this paper, we consider the collaborative data publishing problem
for anonymizing horizontally partitioned data at multiple data providers. We
consider a new type of “insider attack” by colluding data providers who may use
their own data records (a subset of the overall data) to infer the data records
contributed by other data providers. The paper addresses this new threat, and
makes several contributions. First, we introduce the notion of m-privacy,
which guarantees that the anonymized data satisfies a given privacy constraint
against any group of up to m colluding data providers. Second, we
present heuristic algorithms exploiting the monotonicity of privacy constraints
for efficiently checking m-privacy given a group of records. Third, we
present a data provider-aware
anonymization algorithm with adaptive m-privacy checking strategies to
ensure high utility and m-privacy of anonymized data with efficiency.
Finally, we propose secure multi-party computation protocols for collaborative
data publishing with m-privacy. All protocols are extensively analyzed
and their security and efficiency are formally proved. Experiments on real-life
datasets suggest that our approach achieves utility and efficiency better than or
comparable to existing and baseline algorithms while satisfying m-privacy.
EXISTING SYSTEM:
Most work has focused on a single data provider
setting and considered the data recipient as an attacker. A large body of
literature assumes limited background knowledge of the attacker, and defines
privacy using a relaxed adversarial notion by considering specific types
of attacks. Representative principles include k-anonymity, l-diversity,
and t-closeness. A few recent works have modeled instance-level
background knowledge as corruption, and studied perturbation techniques under
these syntactic privacy notions.
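As an illustration of the syntactic principles named above, the following is a minimal sketch of how k-anonymity and (distinct) l-diversity could be checked on a table. The class name SyntacticPrivacySketch, the Row type, and the reduction of a record to one generalized quasi-identifier string plus one sensitive value are assumptions made for this example; they are not taken from the paper.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative checks for two syntactic privacy principles (hypothetical names). */
final class SyntacticPrivacySketch {

    /** A record reduced to a generalized quasi-identifier and a sensitive value. */
    static final class Row {
        final String quasiId;
        final String sensitive;

        Row(String quasiId, String sensitive) {
            this.quasiId = quasiId;
            this.sensitive = sensitive;
        }
    }

    /** k-anonymity: every quasi-identifier group holds at least k rows. */
    static boolean isKAnonymous(List<Row> table, int k) {
        Map<String, Integer> groupSizes = new HashMap<>();
        for (Row r : table) {
            groupSizes.merge(r.quasiId, 1, Integer::sum);
        }
        return groupSizes.values().stream().allMatch(size -> size >= k);
    }

    /** Distinct l-diversity: every group holds at least l distinct sensitive values. */
    static boolean isDistinctLDiverse(List<Row> table, int l) {
        Map<String, Set<String>> valuesPerGroup = new HashMap<>();
        for (Row r : table) {
            valuesPerGroup.computeIfAbsent(r.quasiId, q -> new HashSet<>()).add(r.sensitive);
        }
        return valuesPerGroup.values().stream().allMatch(values -> values.size() >= l);
    }
}
```

A table can be 2-anonymous and still fail distinct 2-diversity: if both rows of a group carry the same disease, the group size hides the identity but not the sensitive value.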
DISADVANTAGES OF EXISTING SYSTEM:
1. Collaborative data publishing can be considered as a multi-party computation
problem, in which multiple providers wish to compute an anonymized view of
their data without disclosing any private and sensitive information.
2. The problem of inferring information from anonymized data has been widely studied
in a single data provider setting. A data recipient that is an attacker, e.g., P0,
attempts to infer additional information about data records using the published
data, T∗, and background knowledge, BK.
PROPOSED SYSTEM:
We consider the collaborative data publishing
setting with horizontally partitioned data across multiple data providers, each
contributing a subset of records Ti. As a special case, a data provider could
be the data owner itself who is contributing its own records. This is a very
common scenario in social networking and recommendation systems. Our goal is to
publish an anonymized view of the integrated data such that a data recipient,
including the data providers, will not be able to compromise the privacy of the
individual records provided by other parties.
ADVANTAGES OF PROPOSED SYSTEM:
Compared to our preliminary version, our new contributions extend the above results. First,
we adapt privacy verification and anonymization mechanisms to work for m-privacy
with respect to any privacy constraint, including nonmonotonic ones. We list
all necessary privacy checks and prove that no fewer checks are enough to
confirm m-privacy. Second, we propose SMC protocols for secure m-privacy
verification and anonymization. For all protocols we prove their security and
complexity, and experimentally confirm their efficiency.
Modules:
1. Patient Registration
2. Attacks by External Data Recipient Using Anonymized Data
3. Attacks by Data Providers Using Anonymized Data and Their Own Data
4. Doctor Login
5. Admin Login
Modules Description
Patient Registration:
In this module, a patient who needs treatment must
register his or her details, such as name, age,
disease, and email. These details are maintained in a
database by the hospital management. Only doctors can see all patient details. A patient
can see only his or her own record.
Based on this paper:
When the data are distributed among multiple data
providers or data owners, two main settings are used for anonymization. One
approach is for each provider to anonymize the data independently
(anonymize-and-aggregate, Figure 1A), which results in potential loss of
integrated data utility. A more desirable approach is collaborative data
publishing, which anonymizes data from all providers as if they came from
one source (aggregate-and-anonymize, Figure 1B), using either a trusted
third party (TTP) or Secure Multi-party Computation (SMC) protocols to do
the computations.
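The utility gap between the two settings can be illustrated with a small sketch. It assumes a deliberately simple anonymizer that suppresses every quasi-identifier group smaller than k and measures utility as the number of surviving records; the class and method names are hypothetical, and the pooling step stands in for what a TTP or an SMC protocol would do securely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Compares the two publishing settings under a toy suppression-based anonymizer. */
final class PublishingSettingsSketch {

    /** Number of records that survive when groups smaller than k are suppressed. */
    static int publishableCount(List<String> quasiIds, int k) {
        Map<String, Integer> groupSizes = new HashMap<>();
        for (String q : quasiIds) {
            groupSizes.merge(q, 1, Integer::sum);
        }
        int kept = 0;
        for (int size : groupSizes.values()) {
            if (size >= k) {
                kept += size;
            }
        }
        return kept;
    }

    /** Anonymize-and-aggregate: each provider anonymizes alone, results are pooled. */
    static int anonymizeAndAggregate(List<List<String>> providers, int k) {
        int kept = 0;
        for (List<String> providerData : providers) {
            kept += publishableCount(providerData, k);
        }
        return kept;
    }

    /** Aggregate-and-anonymize: records are pooled first, then anonymized once. */
    static int aggregateAndAnonymize(List<List<String>> providers, int k) {
        List<String> all = new ArrayList<>();
        for (List<String> providerData : providers) {
            all.addAll(providerData); // plain pooling; a real system would use a TTP or SMC
        }
        return publishableCount(all, k);
    }
}
```

With two providers that each hold one record in the "30-39" group and one in the "40-49" group, and k = 2, the first method keeps 0 records while the second keeps all 4.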
Attacks by External Data Recipient Using Anonymized Data:
A data recipient, e.g., P0, could be an attacker who
attempts to infer additional information about the records using the published
data (T∗)
and some background knowledge (BK), such as publicly available external data.
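One way to picture this attack is as a linkage between T∗ and BK. The sketch below assumes the attacker already knows the target's generalized quasi-identifier from external data and simply collects the sensitive values that remain possible in the published table. LinkageAttackSketch, PublishedRow, and both method names are hypothetical and only illustrate the idea.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy linkage attack by an external recipient holding background knowledge. */
final class LinkageAttackSketch {

    /** One row of the published table T*: generalized quasi-identifier and sensitive value. */
    static final class PublishedRow {
        final String quasiId;
        final String sensitive;

        PublishedRow(String quasiId, String sensitive) {
            this.quasiId = quasiId;
            this.sensitive = sensitive;
        }
    }

    /** Sensitive values the attacker cannot rule out for a target with a known quasi-identifier. */
    static Set<String> candidateValues(List<PublishedRow> published, String targetQuasiId) {
        Set<String> candidates = new HashSet<>();
        for (PublishedRow row : published) {
            if (row.quasiId.equals(targetQuasiId)) {
                candidates.add(row.sensitive);
            }
        }
        return candidates;
    }

    /** The target's sensitive value is disclosed when exactly one candidate remains. */
    static boolean isDisclosed(List<PublishedRow> published, String targetQuasiId) {
        return candidateValues(published, targetQuasiId).size() == 1;
    }
}
```

If every published row in the target's group carries the same disease, the attacker learns that disease even though the group hides which row belongs to the target.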
Attacks by Data Providers Using Anonymized Data and Their Own Data:
Each data provider, such as P1 in Figure 1, can also use the anonymized data T∗
and its own data (T1) to infer additional information about other records.
Compared to the attack by the external recipient in the first attack scenario,
each provider has additional knowledge of its own records, which can
help with the attack. This issue can be further worsened when multiple data
providers collude with each other.
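The collusion threat described above is what m-privacy is meant to rule out. The following brute-force sketch checks a simplified form of it for one group of records: after removing the records of any coalition of at most m providers, the remaining records must still satisfy the privacy constraint. It is not the paper's heuristic algorithm; the names MPrivacySketch, Rec, and isMPrivate are hypothetical, and the enumeration is exponential in the number of providers.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Predicate;

/** Brute-force check of a simplified m-privacy condition on one group of records. */
final class MPrivacySketch {

    /** A record tagged with the provider that contributed it. */
    static final class Rec {
        final String provider;
        final String sensitive;

        Rec(String provider, String sensitive) {
            this.provider = provider;
            this.sensitive = sensitive;
        }
    }

    /**
     * Returns true if the constraint holds on the records that remain after
     * removing the records of every coalition of at most m providers.
     */
    static boolean isMPrivate(List<Rec> group, int m, Predicate<List<Rec>> constraint) {
        LinkedHashSet<String> distinct = new LinkedHashSet<>();
        for (Rec r : group) {
            distinct.add(r.provider);
        }
        List<String> providers = new ArrayList<>(distinct);
        if (providers.size() > 30) {
            throw new IllegalArgumentException("sketch supports at most 30 providers");
        }
        // Each bit mask encodes one coalition; bit i set means providers.get(i) colludes.
        for (int mask = 0; mask < (1 << providers.size()); mask++) {
            if (Integer.bitCount(mask) > m) {
                continue; // coalition larger than m, outside the threat model
            }
            List<Rec> remaining = new ArrayList<>();
            for (Rec r : group) {
                int idx = providers.indexOf(r.provider);
                if ((mask & (1 << idx)) == 0) {
                    remaining.add(r);
                }
            }
            if (!constraint.test(remaining)) {
                return false;
            }
        }
        return true;
    }
}
```

For example, with the constraint "at least 3 records remain" and a group holding two records from P1 and one each from P2, P3, and P4, the check passes for m = 1 but fails for m = 2, because P1 and P2 together leave only two records. For a monotonic constraint of this kind, checking only the coalitions of exactly m providers would be enough, since every smaller coalition leaves a superset of records.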
FIGURE 1: Distributed data publishing settings: (A) anonymize-and-aggregate, (B) aggregate-and-anonymize
FIGURE 2
Doctor Login:
In this module, a doctor
can see all of his or her patients' details and thereby obtains background knowledge (BK).
The doctor may also see the horizontally partitioned data in the distributed database
of the hospital group and can see how many patients are affected, without
learning the individual records of the patients or sensitive information about
the individuals.
Admin Login:
In this module, the admin acts as the Trusted Third Party (TTP). The admin can see all individual
records and their sensitive information across the hospitals' distributed
database. Anonymization is performed by the admin, who collects
information from the various hospitals, groups the records together, and publishes them
as anonymized data.
SYSTEM ARCHITECTURE:
SYSTEM CONFIGURATION:-
HARDWARE CONFIGURATION:-
Processor - Pentium IV
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
SOFTWARE CONFIGURATION:-
Operating System : Windows XP
Programming Language : Java
Java Version : JDK 1.6 and above
REFERENCE:
Slawomir Goryczka, Li Xiong, and Benjamin C. M. Fung, "m-Privacy
for Collaborative Data Publishing," IEEE Transactions on Knowledge and Data
Engineering, 2013.