Crowdsourcing Predictors of Behavioral Outcomes
ABSTRACT:
Generating models from large data sets, and determining which subsets of data to mine, is
becoming increasingly automated. However, choosing what data to collect in the
first place requires human intuition or experience, usually supplied by a
domain expert. This paper describes a new approach to machine science that
demonstrates, for the first time, that non-domain experts can collectively
formulate features and provide values for those features such that they are
predictive of some behavioral outcome of interest. This was accomplished by
building a Web platform in which human groups interact to both respond to questions
likely to help predict a behavioral outcome and pose new questions to their
peers. The result is a dynamically growing online survey, and this
cooperative behavior also yields models that can predict users’
outcomes from their responses to the user-generated survey questions. Here,
we describe two Web-based experiments that instantiate this approach: The first
site led to models that can predict users’ monthly electric energy consumption,
and the other led to models that can predict users’ body mass index. As
exponential increases in content are often observed in successful online
collaborative communities, the proposed methodology may, in the future, lead to
similar exponential rises in discovery and insight into the causal factors of
behavioral outcomes.
EXISTING SYSTEM:
Statistical tools such as multiple regression or
neural networks provide mature methods for computing model parameters when the
set of predictive covariates and the model structure are prespecified.
Furthermore, recent research is providing new tools for inferring the
structural form of nonlinear predictive models, given good input and output
data.
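As a rough illustration of what "computing model parameters" means once the covariate and model structure are fixed, the sketch below fits a straight line with ordinary least squares in closed form. It is a minimal example under stated assumptions: the class name `SimpleOls`, its methods, and the sample numbers are hypothetical and do not come from the referenced paper.

```java
// Minimal sketch: ordinary least squares for ONE prespecified covariate.
// The class, method names, and sample data are hypothetical illustrations.
class SimpleOls {
    /** Returns {intercept, slope} minimizing squared error of y = a + b * x. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        // Accumulate the covariance and variance sums around the means.
        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("covariate has zero variance");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }

    /** Applies a fitted {intercept, slope} model to a new covariate value. */
    static double predict(double[] model, double x) {
        return model[0] + model[1] * x;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};   // made-up covariate values
        double[] y = {3, 5, 7, 9};   // made-up outcome values
        double[] model = fit(x, y);
        System.out.println("intercept = " + model[0] + ", slope = " + model[1]);
    }
}
```

The fit itself is mechanical. What the code cannot do is decide that `x` was the right variable to measure in the first place.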
DISADVANTAGES OF EXISTING SYSTEM:
There are many problems in which one seeks to develop predictive models to map between a set of
predictor variables and an outcome.
One aspect of the scientific method that has not yet
yielded to automation is the selection of variables for which data should be
collected to evaluate hypotheses. In the case of a prediction problem, machine
science is not yet able to select the independent variables that might predict
an outcome of interest, and for which data collection is required.
PROPOSED SYSTEM:
The goal of this research was to test an alternative
approach to modeling in which the wisdom of crowds is harnessed to both propose
which potentially predictive variables to study by asking questions and to
provide the data by responding to those questions. The result is a crowdsourced
predictive model.
This paper introduces, for the first time, a method
by which non-domain experts can be motivated to formulate independent variables
as well as populate enough of these variables for successful modeling. In
short, this is accomplished as follows. Users arrive at a Web site in which a
behavioral outcome [such as household electricity usage or body mass index
(BMI)] is to be modeled. Users provide their own outcome (such as their own
BMI) and then answer questions that may be predictive of that outcome (such as
“how often per week do you exercise”). Periodically, models are constructed
against the growing data set that predict each user’s behavioral outcome. Users
may also pose their own questions that, when answered by other users, become
new independent variables in the modeling process. In essence, the task of
discovering and populating predictive independent variables is outsourced to
the user community.
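The loop described above can be sketched in a few lines of Java. This is only an illustrative sketch: the class `CrowdSurvey` and all of its methods are hypothetical names, answers are assumed to be numeric, and ranking questions by Pearson correlation is a deliberately simple stand-in for the periodic modeling step, whose actual technique is not specified in this summary.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the crowdsourced survey loop. All names are hypothetical,
// and per-question Pearson correlation stands in for the real modeling step.
class CrowdSurvey {
    private final List<String> questions = new ArrayList<>();              // question id = index
    private final List<Double> outcomes = new ArrayList<>();               // user id = index
    private final List<Map<Integer, Double>> answers = new ArrayList<>();  // per user: question id -> value

    /** A user joins by reporting their own outcome (for example, BMI). */
    int addUser(double outcome) {
        outcomes.add(outcome);
        answers.add(new HashMap<>());
        return outcomes.size() - 1;
    }

    /** Any user may pose a question; it becomes a new candidate variable. */
    int poseQuestion(String text) {
        questions.add(text);
        return questions.size() - 1;
    }

    /** Records a numeric answer from one user to one question. */
    void answer(int userId, int questionId, double value) {
        if (questionId < 0 || questionId >= questions.size()) {
            throw new IllegalArgumentException("unknown question id");
        }
        answers.get(userId).put(questionId, value);
    }

    /** Pearson correlation between a question's answers and the outcome, over
     *  users who answered it; NaN with fewer than 3 answers or zero variance. */
    double correlation(int questionId) {
        List<double[]> pairs = new ArrayList<>();
        for (int u = 0; u < outcomes.size(); u++) {
            Double v = answers.get(u).get(questionId);
            if (v != null) {
                pairs.add(new double[] {v, outcomes.get(u)});
            }
        }
        int n = pairs.size();
        if (n < 3) {
            return Double.NaN;
        }
        double mx = 0, my = 0;
        for (double[] p : pairs) { mx += p[0]; my += p[1]; }
        mx /= n;
        my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (double[] p : pairs) {
            sxy += (p[0] - mx) * (p[1] - my);
            sxx += (p[0] - mx) * (p[0] - mx);
            syy += (p[1] - my) * (p[1] - my);
        }
        if (sxx == 0 || syy == 0) {
            return Double.NaN;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /** Periodic step: id of the question most correlated with the outcome, or -1. */
    int bestPredictor() {
        int best = -1;
        double bestAbs = -1;
        for (int q = 0; q < questions.size(); q++) {
            double r = correlation(q);
            if (!Double.isNaN(r) && Math.abs(r) > bestAbs) {
                bestAbs = Math.abs(r);
                best = q;
            }
        }
        return best;
    }

    String questionText(int questionId) {
        return questions.get(questionId);
    }

    public static void main(String[] args) {
        CrowdSurvey survey = new CrowdSurvey();
        int q = survey.poseQuestion("How often per week do you exercise?");
        double[] bmi = {30, 27, 24, 21};       // made-up outcomes
        double[] sessions = {0, 1, 2, 3};      // made-up answers
        for (int i = 0; i < bmi.length; i++) {
            survey.answer(survey.addUser(bmi[i]), q, sessions[i]);
        }
        int best = survey.bestPredictor();
        System.out.println("Best predictor so far: " + survey.questionText(best));
    }
}
```

A real deployment would persist users, questions, and answers in a database and would fit a multivariate model. The in-memory lists here only show how user-posed questions turn into candidate independent variables.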
ADVANTAGES OF PROPOSED SYSTEM:
Participants successfully uncovered at least one
statistically significant predictor of the outcome variable. For the BMI outcome,
the participants successfully formulated many of the correlates known to
predict BMI and provided sufficiently honest values for those correlates to
become predictive during the experiment. While our instantiations focus on
energy and BMI, the proposed method is general and might, as it improves,
help answer many difficult questions about why some outcomes
differ from others.
SYSTEM CONFIGURATION:-
HARDWARE CONFIGURATION:-
✓ Processor - Pentium IV
✓ Speed - 1.1 GHz
✓ RAM - 256 MB (min)
✓ Hard Disk - 20 GB
✓ Keyboard - Standard Windows Keyboard
✓ Mouse - Two or Three Button Mouse
✓ Monitor - SVGA
SOFTWARE CONFIGURATION:-
✓ Operating System : Windows XP
✓ Programming Language : Java/J2EE
✓ Java Version : JDK 1.6 and above
✓ Database : MySQL
REFERENCE:
Josh C. Bongard, Member, IEEE, Paul D. H. Hines, Member, IEEE, Dylan Conger, Peter Hurd, and Zhenyu Lu,
“Crowdsourcing Predictors of Behavioral Outcomes,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 43,
no. 1, January 2013.