Research: Difference between revisions

Computational Linguistics and Information Processing

==Machine Translation and Paraphrasing==
{| border="0" cellpadding="5" cellspacing="0" align="center"
|-
| style="border-right: 1px solid grey; background:#ffefef" | <b>Faculty</b>
| [http://www.umiacs.umd.edu/~bonnie Bonnie Dorr] (interlingual and hybrid MT, semantically-informed syntactic MT) [http://umiacs.umd.edu/~mharper Mary Harper] (multilingual parsing, language modeling) [http://www.umiacs.umd.edu/~jimmylin/ Jimmy Lin] [http://www.umiacs.umd.edu/~resnik/ http://umiacs.umd.edu/~resnik/]
|-
| style="border-right: 1px solid grey; background:#ffefef" | <b>Postdocs </b>
|
|-
| style="border-bottom: 3px solid grey; border-right: 1px solid grey; background:#ffefef" | <b>Graduate Students </b>
| style="border-bottom: 3px solid grey;" | [http://www.cs.umd.edu/~hardisty/ Eric Hardisty] [http://www.umiacs.umd.edu/~asayeed/ Asad Sayeed]
|-
| colspan="3" align="left" |


The CLIP Laboratory's work in <b>machine translation</b> continues the lab's long tradition of research in translation.  Like most of the field, we work within the framework of statistical MT, but with an emphasis on taking appropriate advantage of knowledge-driven or linguistically informed model structures, features, and priors.  Some current areas of research include syntactically informed language models, linguistically informed translation model features, the use of unsupervised methods in translation modeling, exploitation of large-scale "cloud computing" methods, and human-machine collaborative translation via crowdsourcing.
 
<b>Paraphrase</b>, the ability to express the same meaning in multiple ways, is an active area of research within the NLP community and here in the CLIP Laboratory.  Our work in paraphrase includes the use of paraphrase in MT evaluation and parameter estimation, lattice and forest translation, and collaborative translation, as well as research on lexical and phrasal semantic similarity measures, meaning preservation in machine translation and summarization, and large-scale document similarity computation via cloud computing methods.
 
<b>Representative Publications and Project Pages:</b>
* Greene and Resnik, NAACL 2009: [http://umiacs.umd.edu/~resnik/pubs/greene_resnik_naacl2009.pdf More Than Words: Syntactic Packaging and Implicit Sentiment]
|}
 
==Summarization ==
 
==Parsing and Tagging==
 
==Computational Social Science==
 
 
{| border="0" cellpadding="5" cellspacing="0" align="center"
|-
| style="border-right: 1px solid grey; background:#ffefef" | <b>Faculty</b>
| [http://www.umiacs.umd.edu/~jbg/ Jordan Boyd-Graber] [http://www.umiacs.umd.edu/~bonnie Bonnie Dorr] [http://www.umiacs.umd.edu/~jimmylin/ Jimmy Lin] [http://www.umiacs.umd.edu/~oard/ Doug Oard] [http://www.umiacs.umd.edu/~weinberg Amy Weinberg]
|-
| style="border-right: 1px solid grey; background:#ffefef" | <b>Postdocs </b>
|
|-
| style="border-bottom: 3px solid grey; border-right: 1px solid grey; background:#ffefef" | <b>Graduate Students </b>
| style="border-bottom: 3px solid grey;" | [http://www.cs.umd.edu/~hardisty/ Eric Hardisty] [http://www.umiacs.umd.edu/~asayeed/ Asad Sayeed]
|-
| colspan="3" align="left" |
 
<b>Computational social science</b> involves the use of computational methods and models to leverage [http://www.sciencemag.org/cgi/content/summary/323/5915/721 "the capacity to collect and analyze data at a scale that may reveal patterns of individual and group behaviors"].  Research in the CLIP Laboratory is at the forefront of this emerging area, and includes sentiment analysis (computational modeling and prediction of opinions, perspective, and other private states), automatic analysis and visualization of the scientific literature, modeling the diffusion of technological innovations, and modeling and prediction of social goals and actions such as persuasion. 
 
<b>Representative Publications and Project Pages:</b>
* Greene and Resnik, NAACL 2009: [http://umiacs.umd.edu/~resnik/pubs/greene_resnik_naacl2009.pdf More Than Words: Syntactic Packaging and Implicit Sentiment]
 
|}
 
==Information Retrieval: From Tweets to Tomes ==
 
{| border="0" cellpadding="5" cellspacing="0" align="center"
|-
| style="border-right: 1px solid grey; background:#ffefef" | <b>Faculty</b>
| [http://www.umiacs.umd.edu/~jimmylin/ Jimmy Lin] [http://www.umiacs.umd.edu/~oard/ Doug Oard]
|-
| style="border-right: 1px solid grey; background:#ffefef" | <b>Postdocs </b>
|
|-
| style="border-bottom: 3px solid grey; border-right: 1px solid grey; background:#ffefef" | <b>Graduate Students </b>
| style="border-bottom: 3px solid grey;" |
|-
|  colspan="3" align="left" |
 
The goal of information retrieval is to help people find what they are looking for.  Information retrieval research in the CLIP lab focuses principally on retrieval based on the language contained in text, in speech, and in document images.  We work across a broad range of content types, from tweets to tomes, from talking to texting, and from Cebuano to Chinese.  Three perspectives inform our work:
* we integrate a broad range of computational linguistics techniques,
* we focus on scalable techniques that can accommodate very large collections, and
* we sometimes draw the boundaries of our “systems” very broadly to include both the automated tools that we create and the process by which users can best employ those tools.  
 
One example that illustrates these perspectives is our work with “cross-language information retrieval,” in which close coupling of machine translation and information retrieval techniques makes it possible for people to find and use information written in languages that they can neither read nor write.  Another example is our work on the design and evaluation of “question answering” systems that can automatically find and present answers to complex questions, which serves as a bridge between our work on information retrieval and summarization.
 
<b>Representative Publications and Project Pages:</b>
* Publication 1
* Publication 2
 
|}

Revision as of 01:51, 8 September 2017

In the CLIP lab, we approach research on computational linguistics and information processing from a variety of angles. Some of our ongoing projects focus on the following challenges:

  • Computational psycholinguistics
  • Computational social science
  • Cross-language information retrieval
  • Data science for finance / social good
  • Deep learning
  • E-discovery
  • Pattern discovery in graphs / ranking and recommendation
  • Human-in-the-loop machine learning
  • Machine translation
  • Mental health
  • Privacy-aware information retrieval
  • Speech retrieval
  • Urban computing / smart environments

CLIP research has been supported by the following organizations: NSF, DARPA, ARL, IARPA, OFR (Treasury), NIST, IMLS, Google, Yahoo and the World Bank.