Difference between revisions of "Research"

(Bayesian Modeling)
(Bayesian Modeling)
Line 27: Line 27:
 
| colspan="3" align="center" |
 
| colspan="3" align="center" |
  
While I agree with Reviewer 3 that the experiments involve an admirable diversity of datasets (as compared to a typical NIPS paper), I personally don't feel that the contribution has been compellingly described or validated, and suspect on the basis of the presentation that the actual improvements are pretty marginal. I attached a confidence of 4 to my review partly because of the difficulty I had in assessing the precise contribution --- I read and write lots of structured generative models and it took several tries for me to get a sense for how exactly this paper differed from previous efforts --- and partly because I'm not in the trenches of jointly modeling text and relational data, so maybe somebody closer to those datasets would be more informed and therefore more impressed by the results as reported.
+ The Web promises unprecedented access to the perspectives of an enormous number of people on a wide range of issues. Turning that still untamed cacophony into meaningful insights requires dealing with the linguistic diversity and scale of the Web. Most current research focuses on specialized tasks such as tracking consumer opinions, and virtually all current research treats the Web as both monolithic and monolingual, ignoring the variety of languages represented and the rich interplay between topics and issues under discussion.
+
+ This project moves the state of the art forward by focusing on two key challenges. The first is developing highly scalable MapReduce algorithms for linguistic modeling within a Bayesian framework, using variational inference to achieve a high degree of parallelization on Web-scale datasets. The second is designing novel Bayesian models that learn consistent interpretations of text across languages and across a wide range of response variables of interest (for example, views on an issue, strength of emotion relative to an event, and focus of attention).
+
+ The techniques developed in this project will be demonstrated on large crawls of Web pages and blogs. Potential applications for these technologies include helping a schoolchild learn that people in different countries may view some issues very differently, helping a politician understand how constituents are reacting to proposed legislation, or helping an intelligence analyst understand how public opinion is evolving in a hostile country.
  
 
|}
 
|}

Revision as of 01:13, 11 August 2010

Machine Translation

Summarization

Parsing and Tagging

Sentiment Analysis

Bayesian Modeling

Cross‐language Bayesian models for Web‐scale text analysis using MapReduce
PI: Jimmy Lin
Other Faculty: Jordan Boyd-Graber, Philip Resnik
Students: Lisa Simpson
Funding: NSF 1018625

The Web promises unprecedented access to the perspectives of an enormous number of people on a wide range of issues. Turning that still untamed cacophony into meaningful insights requires dealing with the linguistic diversity and scale of the Web. Most current research focuses on specialized tasks such as tracking consumer opinions, and virtually all current research treats the Web as both monolithic and monolingual, ignoring the variety of languages represented and the rich interplay between topics and issues under discussion.

This project moves the state of the art forward by focusing on two key challenges. The first is developing highly scalable MapReduce algorithms for linguistic modeling within a Bayesian framework, using variational inference to achieve a high degree of parallelization on Web-scale datasets. The second is designing novel Bayesian models that learn consistent interpretations of text across languages and across a wide range of response variables of interest (for example, views on an issue, strength of emotion relative to an event, and focus of attention).
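
To illustrate why variational inference lends itself to MapReduce, the following is a minimal in-memory sketch, not the project's actual implementation. It assumes an LDA-style topic model (the abstract does not name a specific model), and every function name and parameter here (`map_document`, `reduce_counts`, `m_step`, `alpha`, `eta`) is hypothetical. The point it shows is that the per-document variational updates are independent of one another, so they can run as mappers, while the reducer only has to sum expected counts per (topic, word) key.

```python
import math
from collections import defaultdict


def digamma(x):
    """Digamma for x > 0: shift x upward by recurrence, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def map_document(doc, beta, alpha=0.1, iters=20):
    """Mapper: variational E-step for one document (a list of word ids).

    Emits ((topic, word), expected_count) pairs. No other document is consulted,
    which is what makes this step parallel across documents.
    """
    num_topics = len(beta)
    gamma = [alpha + len(doc) / num_topics] * num_topics
    phi = []
    for _ in range(iters):
        weights = [math.exp(digamma(g)) for g in gamma]
        phi = []
        for word in doc:
            row = [beta[k][word] * weights[k] for k in range(num_topics)]
            total = sum(row)
            phi.append([r / total for r in row])
        gamma = [alpha + sum(p[k] for p in phi) for k in range(num_topics)]
    return [((k, word), p[k]) for word, p in zip(doc, phi) for k in range(num_topics)]


def reduce_counts(pairs):
    """Reducer: sum the expected counts that share a (topic, word) key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return totals


def m_step(totals, num_topics, vocab_size, eta=0.01):
    """Driver: turn summed counts into topic-word distributions.

    The additive smoothing by eta is a simplifying assumption of this sketch.
    """
    beta = []
    for k in range(num_topics):
        row = [totals.get((k, w), 0.0) + eta for w in range(vocab_size)]
        norm = sum(row)
        beta.append([v / norm for v in row])
    return beta


def init_beta(num_topics, vocab_size):
    """Deterministic, slightly asymmetric start so the topics can diverge."""
    beta = []
    for k in range(num_topics):
        row = [1.0 + 0.5 * ((w + k) % num_topics == 0) for w in range(vocab_size)]
        norm = sum(row)
        beta.append([v / norm for v in row])
    return beta


def run(docs, num_topics, vocab_size, rounds=10):
    """One MapReduce job per round: map over documents, reduce, then update beta."""
    beta = init_beta(num_topics, vocab_size)
    for _ in range(rounds):
        pairs = [pair for doc in docs for pair in map_document(doc, beta)]
        beta = m_step(reduce_counts(pairs), num_topics, vocab_size)
    return beta
```

In a real cluster setting, the list comprehension in `run` would be replaced by distributed map tasks and the current `beta` would be shipped to each mapper; here everything runs in one process purely to show the data flow.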

The techniques developed in this project will be demonstrated on large crawls of Web pages and blogs. Potential applications for these technologies include helping a schoolchild learn that people in different countries may view some issues very differently, helping a politician understand how constituents are reacting to proposed legislation, or helping an intelligence analyst understand how public opinion is evolving in a hostile country.