CLIP Colloquium (Fall 2010)

October 20, Kristy Hollingshead: Search Errors and Model Errors in Pipeline Systems

Pipeline systems, in which data is sequentially processed in stages with the output of one stage providing input to the next, are ubiquitous in the field of natural language processing (NLP) as well as many other research areas. The popularity of the pipeline system architecture may be attributed to the utility of pipelines in improving scalability by reducing search complexity and increasing efficiency of the system. However, pipelines can suffer from the well-known problem of "cascading errors," where errors earlier in the pipeline propagate to later stages in the pipeline. In this talk I will make a distinction between two different types of cascading errors in pipeline systems. The first I will term "search errors," where there exists a higher-scoring candidate (according to the model), but that candidate has been excluded from the search space. The second type of error that I will address might be termed "model errors," where the highest-scoring candidate (according to the model) is not the best candidate (according to some gold standard). Statistical NLP models are imperfect by nature, resulting in model errors. Interestingly, the same pipeline framework that causes search errors can also resolve (or work around) model errors; in this talk I will demonstrate several techniques for detecting and resolving search and model errors, which can result in improved efficiency with no loss in accuracy. I will briefly mention the technique of pipeline iteration, introduced in my ACL'07 paper, and introduce some related results from my dissertation. I will then focus on work done with my PhD advisor Brian Roark on chart cell constraints, as published in our COLING'08 and NAACL'09 papers; this work provably reduces the complexity of a context-free parser to quadratic performance in the worst case (observably linear) with a slight gain in accuracy using the Charniak parser.
While much of this talk will be on parsing pipelines, I am currently extending some of this work to MT pipelines and would welcome discussion along those lines.
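
The distinction between search errors and model errors can be illustrated with a minimal sketch. It is not taken from the talk or the cited papers: the function name, candidate names, and scores are hypothetical, and the "pipeline" is reduced to a single pruning step that passes only some candidates downstream.

```python
from typing import Dict, List


def classify_errors(model_score: Dict[str, float],
                    gold_best: str,
                    kept: List[str]) -> Dict[str, object]:
    """Label one pruned search as having a search error and/or a model error.

    model_score: model score for every candidate in the full search space.
    gold_best:   the candidate a gold standard considers best.
    kept:        candidates that survived pruning by an earlier stage.
    """
    # Highest-scoring candidate over the full, unpruned space.
    model_best_full = max(model_score, key=lambda c: model_score[c])
    # Highest-scoring candidate among those the pipeline actually searched.
    model_best_kept = max(kept, key=lambda c: model_score[c])
    return {
        # Search error: a higher-scoring candidate exists but was pruned away.
        "search_error": model_score[model_best_full] > model_score[model_best_kept],
        # Model error: the model's top candidate is not the gold-standard best.
        "model_error": model_best_full != gold_best,
        # What the pipeline would actually output.
        "output": model_best_kept,
    }


# Hypothetical scores: the model prefers parse_a, the gold standard prefers parse_b.
scores = {"parse_a": 0.5, "parse_b": 0.3, "parse_c": 0.2}
result = classify_errors(scores, gold_best="parse_b", kept=["parse_b", "parse_c"])
```

In this toy case pruning removes the model's top candidate, which is a search error, yet the pipeline's output coincides with the gold-standard candidate. That is one way a pipeline constraint can work around a model error.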

Kristy Hollingshead earned her PhD in Computer Science and Engineering this year, from the Center for Spoken Language Understanding (CSLU) at the Oregon Health & Science University (OHSU). She received her B.A. in English-Creative Writing from the University of Colorado in 2000 and her M.S. in Computer Science from OHSU in 2004. Her research interests in natural language processing include parsing, machine translation, evaluation metrics, and assistive technologies. She is also interested in general techniques for improving system efficiency, to allow for richer contextual information to be extracted for use in downstream stages of a pipeline system. Kristy was a National Science Foundation Graduate Research Fellow from 2004 to 2007.

October 27, Stanley Kok: Structure Learning in Markov Logic Networks

Statistical learning handles uncertainty in a robust and principled way. Relational learning (also known as inductive logic programming) models domains involving multiple relations. Recent years have seen a surge of interest in the statistical relational learning (SRL) community in combining the two, driven by the realization that many (if not most) applications require both and by the growing maturity of the two fields.

A Markov logic network (MLN) is a statistical relational model that has gained traction within the AI community in recent years because of its robustness to noise and its ability to compactly model complex domains. MLNs combine probability and logic by attaching weights to first-order formulas, and viewing these as templates for features of Markov networks. Learning the structure of an MLN consists of learning both formulas and their weights.
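
As a rough illustration of what "attaching weights to first-order formulas" means, the sketch below enumerates every possible world for a tiny domain and scores each one using the standard MLN definition, in which a world's probability is proportional to the exponential of the weighted count of true formula groundings. The domain, the single formula Smokes(x) => Cancer(x), and its weight are hypothetical choices for illustration and do not come from the talk.

```python
import itertools
import math

PEOPLE = ["anna", "bob"]  # hypothetical constants
WEIGHT = 1.5              # hypothetical weight on Smokes(x) => Cancer(x)


def n_true_groundings(world):
    """Count the groundings of Smokes(x) => Cancer(x) that hold in `world`."""
    return sum(
        1 for p in PEOPLE
        if (not world[("Smokes", p)]) or world[("Cancer", p)]
    )


def all_worlds():
    """Yield every truth assignment to the ground atoms."""
    atoms = [(pred, p) for pred in ("Smokes", "Cancer") for p in PEOPLE]
    for values in itertools.product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))


def world_probabilities():
    """Return (world, probability) pairs, normalized over all worlds."""
    worlds = list(all_worlds())
    unnormalized = [math.exp(WEIGHT * n_true_groundings(w)) for w in worlds]
    z = sum(unnormalized)  # partition function
    return [(w, u / z) for w, u in zip(worlds, unnormalized)]
```

A world that violates a grounding of the formula is not impossible, only less probable, and a larger weight widens that gap. Exhaustive enumeration is feasible only for toy domains like this one.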

To obtain weighted MLN formulas, we could rely on human experts to specify them. However, this approach is error-prone and requires painstaking knowledge engineering. Further, it will not work on domains where there is no human expert. The ideal solution is to automatically learn MLN structure from data. However, this is a challenging task because of its super-exponential search space. In this talk, we present a series of algorithms that efficiently and accurately learn MLN structure.

November 1, Owen Rambow: Relating Language to Cognitive State

In the 80s and 90s of the last century, in subdisciplines such as planning, text generation, and dialog systems, there was considerable interest in modeling the cognitive states of interacting autonomous agents. Theories such as Speech Act Theory (Austin 1962), the belief-desire-intentions model of Bratman (1987), and Rhetorical Structure Theory (Mann and Thompson 1988) together provide a framework in which to link cognitive state with language use. However, in general natural language processing (NLP), little use was made of such theories, presumably because of the difficulty at the time of some underlying tasks (such as syntactic parsing). In this talk, I propose that it is time to again think about the explicit modeling of cognitive state for participants in discourse. In fact, that is the natural way to formulate what NLP is all about. The perspective of cognitive state can provide a context in which many disparate NLP tasks can be classified and related. I will present two NLP projects at Columbia which relate to the modeling of cognitive state:

Discourse participants need to model each other's cognitive states, and language makes this possible by providing special morphological, syntactic, and lexical markers. I present results in automatically determining the degree of belief of a speaker in the propositions in his or her utterance.

Bio: PhD from University of Pennsylvania, 1994, working on German syntax. My office mate was Philip Resnik. I worked at CoGentex, Inc (a small company) and AT&T Labs -- Research until 2002, and since then at Columbia as a Research Scientist. My research interests cover both the nuts-and-bolts of languages, specifically syntax, and how language is used in context.

November 10, Bob Carpenter: Whence Linguistic Data?

The empirical approach to linguistic theory involves collecting data and annotating it according to a coding standard. The ability of multiple annotators to consistently annotate new data reflects the applicability of the theory. In this talk, I'll introduce a generative probabilistic model of the annotation process for categorical data. Given a collection of annotated data, we can infer the true labels of items, the prevalence of some phenomenon (e.g. a given intonation or syntactic alternation), the accuracy and category bias of each annotator, and the codability of the theory as measured by the mean accuracy and bias of annotators and their variability. Hierarchical model extensions allow us to model item labeling difficulty and take into account annotator background and experience. I'll demonstrate the efficacy of the approach using expert and non-expert pools of annotators for simple linguistic labeling tasks such as textual inference, morphological tagging, and named-entity extraction. I'll discuss applications such as monitoring an annotation effort, selecting items with active learning, and generating a probabilistic gold standard for machine learning training and evaluation.

November 15, William Webber: Information retrieval effectiveness: measurably going nowhere?

Information retrieval works by heuristics; correctness cannot be formally proved, but must be empirically assessed. Test collections make this evaluation automated and repeatable. Collection-based evaluation has been standard for half a century. The IR community prides itself on the rigour of the experimental tradition that has been built upon this foundation; it is notoriously difficult to publish in the field without a thorough experimental validation. No attention, however, has been paid to the question of whether methodological rigour in evaluation has led to verifiable improvement in retrieval effectiveness. In this talk, we present a survey of retrieval results published over the past decade, which fails to find evidence that retrieval effectiveness is in fact improving. Rather, each experiment's impressive leap forward is preceded by a few careful steps back.

Bio: William Webber is a Research Associate in the Department of Computer Science and Software Engineering at the University of Melbourne, Australia. He has recently completed his PhD thesis, "Measurement in Information Retrieval Evaluation", under the supervision of Professors Alistair Moffat and Justin Zobel.

December 8, Michael Paul: Summarizing Contrastive Viewpoints in Opinionated Text

Performing multi-document summarization of opinionated text has unique challenges because it is important to recognize that the same information may be presented in different ways from different viewpoints. In this talk, we will present a special kind of contrastive summarization approach intended to highlight this phenomenon and to help users digest conflicting opinions. To do this, we introduce a new graph-based algorithm, Comparative LexRank, to score sentences in a summary based on a combination of both representativeness of the collection and comparability between opposing viewpoints. We then address the issue of how to automatically discover and extract viewpoints from unlabeled text, and we experiment with a novel two-dimensional topic model for the task of unsupervised clustering of documents by viewpoint. Finally, we discuss how these two stages can be combined to both automatically extract and summarize viewpoints in an interesting way. Results are presented on two political opinion data sets.

This project was joint work with ChengXiang Zhai and Roxana Girju.

Bio: Michael Paul is a first-year Ph.D. student in Computer Science at the Johns Hopkins University and a member of the Center for Language and Speech Processing. He earned a B.S. from the University of Illinois at Urbana-Champaign in 2009. He is currently a Graduate Research Fellow of the National Science Foundation and a Dean's Fellow of the Whiting School of Engineering.

Roger Levy

Considering the adversity of the conditions under which linguistic communication takes place in everyday life -- ambiguity of the signal, environmental competition for our attention, speaker error, our limited memory, and so forth -- it is perhaps remarkable that we are as successful at it as we are. Perhaps the leading explanation of this success is that (a) the linguistic signal is redundant, (b) diverse information sources are generally available that can help us infer something close to the intended message when comprehending an utterance, and (c) we use these diverse information sources very quickly and to the fullest extent possible. This explanation suggests a theory of language comprehension as a rational, evidential process. In this talk, I describe recent research on how we can use the tools of computational linguistics to formalize and implement such a theory, and to apply it to a variety of problems in human sentence comprehension, including classic cases of garden-path disambiguation as well as processing difficulty in the absence of structural ambiguity. In addition, I address a number of phenomena that remain clear puzzles for the rational approach, due to an apparent failure to use information available in a sentence appropriately in global or incremental inferences about the correct interpretation of a sentence. I argue that the apparent puzzle posed by these phenomena for models of rational sentence comprehension may derive from the failure of existing models to appropriately account for the environmental and cognitive constraints -- in this case, the inherent uncertainty of perceptual input, and humans' ability to compensate for it -- under which comprehension takes place. I present a new probabilistic model of language comprehension under uncertain input and show that this model leads to solutions to the above puzzles. I also present behavioral data in support of novel predictions made by the model.
More generally, I suggest that appropriately accounting for environmental and cognitive constraints in probabilistic models can lead to a more nuanced and ultimately more satisfactory picture of key aspects of human cognition.

Earl Wagner

Eugene Charniak

We present a new syntactic parser that works left-to-right and top down, thus maintaining a fully-connected parse tree for a few alternative parse hypotheses. All of the commonly used statistical parsers use context-free dynamic programming algorithms and as such work bottom up on the entire sentence. Thus they only find a complete fully connected parse at the very end. In contrast, both subjective and experimental evidence show that people understand a sentence word by word as they go along, or close to it. The constraint that the parser keeps one or more fully connected syntactic trees is intended to operationalize this cognitive fact. Our parser achieves a new best result for top-down parsers of 89.4%, a 20% error reduction over the previous single-parser best result for parsers of this type of 86.8% (Roark01). The improved performance is due to embracing the very large feature set available in exchange for giving up dynamic programming.

Eugene Charniak is University Professor of Computer Science and Cognitive Science at Brown University and past chair of the Department of Computer Science. He received his A.B. degree in Physics from the University of Chicago, and a Ph.D. from M.I.T. in Computer Science. He has published four books, the most recent being Statistical Language Learning. He is a Fellow of the American Association for Artificial Intelligence and was previously a Councilor of the organization. His research has always been in the area of language understanding or technologies which relate to it. Over the last 20 years he has been interested in statistical techniques for many areas of language processing including parsing and discourse.

Dave Newman

Ray Mooney

Current systems that learn to process natural language require laboriously constructed human-annotated training data. Ideally, a computer would be able to acquire language like a child by being exposed to linguistic input in the context of a relevant but ambiguous perceptual environment. As a step in this direction, we present a system that learns to sportscast simulated robot soccer games by example. The training data consists of textual human commentaries on Robocup simulation games. A set of possible alternative meanings for each comment is automatically constructed from game event traces. Our previously developed systems for learning to parse and generate natural language (KRISP and WASP) were augmented to learn from this data and then commentate novel games. Using this approach, the system has learned to sportscast in both English and Korean. The system has been evaluated based on its ability to properly match sentences to the events being described, parse sentences into correct meanings, and generate accurate linguistic descriptions of events. Human evaluation was also conducted on the overall quality of the generated sportscasts and compared to human-generated commentaries, demonstrating that its sportscasts are on par with those generated by humans.

Biographical Sketch: Raymond J. Mooney is a Professor in the Department of Computer Sciences at the University of Texas at Austin. He received his Ph.D. in 1988 from the University of Illinois at Urbana-Champaign. He is an author of over 150 published research papers, primarily in the areas of machine learning and natural language processing. He is the current President of the International Machine Learning Society, was program co-chair for the 2006 AAAI Conference on Artificial Intelligence, general chair of the 2005 Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, and co-chair of the 1990 International Conference on Machine Learning. He is a Fellow of the American Association for Artificial Intelligence and recipient of best paper awards from the National Conference on Artificial Intelligence, the SIGKDD International Conference on Knowledge Discovery and Data Mining, the International Conference on Machine Learning, and the Annual Meeting of the Association for Computational Linguistics. His recent research has focused on learning for natural-language processing, connecting language and perception, statistical relational learning, and transfer learning.