==Machine Translation==

==Summarization==

==Parsing and Tagging==

==Sentiment Analysis==

==Bayesian Modeling==

{| border="0" cellpadding="5" cellspacing="0" align="center"
|-
! colspan="3" style="background: #ffefef;" | Cross‐language Bayesian models for Web‐scale text analysis using MapReduce
|-
| PI
| Jimmy Lin
|-
| Other Faculty
| Jordan Boyd-Graber, Philip Resnik
|-
| Students
| Lisa Simpson
|-
| style="border-bottom: 3px solid grey;" | Funding
| style="border-bottom: 3px solid grey;" | NSF 1018625
|-
| colspan="3" align="left" |

The Web promises unprecedented access to the perspectives of an enormous number of people on a wide range of issues. Turning that still untamed cacophony into meaningful insights requires dealing with the linguistic diversity and scale of the Web. Most current research focuses on specialized tasks such as tracking consumer opinions, and virtually all current research treats the Web as both monolithic and monolingual, ignoring the variety of languages represented and the rich interplay between topics and issues under discussion.

This project moves the state of the art forward by focusing on two key challenges. The first is to develop highly scalable MapReduce algorithms for linguistic modeling within a Bayesian framework, making use of variational inference to achieve a high degree of parallelization on Web-scale datasets. The second is to build novel Bayesian models that learn consistent interpretations of text across languages and a wide range of response variables of interest (for example, views on an issue, strength of emotion relative to an event, and focus of attention).

The techniques developed in this project will be demonstrated on large crawls of Web pages and blogs. Potential applications for these technologies include helping a schoolchild learn that people in different countries may view some issues very differently, helping a politician understand how constituents are reacting to proposed legislation, or helping an intelligence analyst understand how public opinion is evolving in a hostile country.

|}

In the CLIP lab, we approach research on computational linguistics and information processing from a variety of angles. Some of our ongoing projects focus on the following challenges:
- Computational psycholinguistics
- Computational social science
- Cross-language information retrieval
- Data science for finance / social good
- Deep learning
- Pattern discovery in graphs / ranking and recommendation
- Human-in-the-loop machine learning
- Machine translation
- Mental health
- Privacy-aware information retrieval
- Speech retrieval
- Urban computing / smart environments
CLIP research has been supported by the following organizations: NSF, DARPA, ARL, IARPA, OFR (Treasury), NIST, IMLS, Google, Yahoo and the World Bank.