<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.umiacs.umd.edu/clip/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Jimmylin</id>
	<title>CLIP - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.umiacs.umd.edu/clip/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Jimmylin"/>
	<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php/Special:Contributions/Jimmylin"/>
	<updated>2026-04-18T14:30:53Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.43.7</generator>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=759</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=759"/>
		<updated>2013-10-25T23:51:43Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after their talks; contact the host if you&#039;d like to arrange a meeting.&lt;br /&gt;
&lt;br /&gt;
To get on the cl-colloquium@umiacs.umd.edu list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
== [https://talks.cs.umd.edu/lists/7 Talk Calendar] ==&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2013)|Fall 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=CLIP_Colloquium_(Fall_2013)&amp;diff=758</id>
		<title>CLIP Colloquium (Fall 2013)</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=CLIP_Colloquium_(Fall_2013)&amp;diff=758"/>
		<updated>2013-10-25T23:48:52Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=CLIP_Colloquium_(Spring_2013)&amp;diff=757</id>
		<title>CLIP Colloquium (Spring 2013)</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=CLIP_Colloquium_(Spring_2013)&amp;diff=757"/>
		<updated>2013-10-25T23:48:38Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specially, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis. For example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced, as people could agree on some topic, such as liking the movie The Matrix, but some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share with you our experiences with modeling both explicit and implicit attitude using high- and low-dimensional feature modeling. This work is the beginning of an interesting exploration into the realm of building computational models of some aspects of the sociopragmatics of human language, in the hope that this research could lead to a better understanding of human interaction.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park, with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low-resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction, and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low-resource languages, with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low-resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing and as the elected Secretary of the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. In 2012, Mona co-founded the yearly *SEM conference, which brings together all aspects of semantic processing under the same umbrella venue.&lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of `riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL TIME AND LOCATION!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges. But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown. This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000. Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors address a set of topics distinct from those of their counterparts in state capitols, but one that is surprisingly consistent across cities. Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, and the same is true for objective indicators such as economic conditions or the city&#039;s crime rate. At the state level, partisanship proves more predictive of the topics addressed by governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues. American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Corpora and Statistical Analysis of Non-Linguistic Symbol Systems ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Richard Sproat, Google New York&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 27, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We report on the creation and analysis of a set of corpora of non-linguistic symbol systems. &lt;br /&gt;
The resource, the first of its kind, consists of data from seven systems, both ancient and modern, &lt;br /&gt;
with four further systems under development, and several others planned. The systems represent &lt;br /&gt;
a range of types, including heraldic systems, formal systems, and systems that are mostly or purely &lt;br /&gt;
decorative. We also compare these systems statistically with a large set of linguistic systems, which &lt;br /&gt;
also range over both time and type.&lt;br /&gt;
&lt;br /&gt;
We show that none of the measures proposed in published work by Rao and colleagues (Rao et al., 2009a; Rao, 2010) &lt;br /&gt;
or Lee and colleagues (Lee et al., 2010a) works. In particular, Rao’s entropic measures are evidently useless when &lt;br /&gt;
one considers a wider range of examples of real non-linguistic symbol systems. And Lee’s measures, with the cutoff &lt;br /&gt;
values they propose, misclassify nearly all of our non-linguistic systems. However, we also show that one of Lee’s &lt;br /&gt;
measures, with different cutoff values, and another measure we develop here do seem useful. We further &lt;br /&gt;
demonstrate that they are useful largely because they are both highly correlated with a rather trivial feature: &lt;br /&gt;
mean text length. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Sproat received his Ph.D. in Linguistics from the Massachusetts&lt;br /&gt;
Institute of Technology in 1985. He has worked at AT&amp;amp;T Bell Labs, at&lt;br /&gt;
Lucent&#039;s Bell Labs and at AT&amp;amp;T Labs -- Research, before joining the faculty of&lt;br /&gt;
the University of Illinois. From there he moved to the Center for Spoken&lt;br /&gt;
Language Understanding at the Oregon Health &amp;amp; Science University. In the Fall of&lt;br /&gt;
2012 he moved to Google, New York as a Research Scientist.&lt;br /&gt;
&lt;br /&gt;
Sproat has worked in numerous areas relating to language and computational&lt;br /&gt;
linguistics, including syntax, morphology, computational morphology,&lt;br /&gt;
articulatory and acoustic phonetics, text processing, text-to-speech synthesis,&lt;br /&gt;
and text-to-scene conversion. Some of his recent work includes multilingual&lt;br /&gt;
named entity transliteration, the effects of script layout on readers&#039;&lt;br /&gt;
phonological awareness, and tools for automated assessment of child language. At&lt;br /&gt;
Google he works on multilingual text normalization and finite-state methods for&lt;br /&gt;
language processing. He also has a long-standing interest in writing systems and&lt;br /&gt;
symbol systems more generally.&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We (implicitly) corrupt existing data as a means to generate additional, infinitely many, training samples from a slightly different data distribution -- this is computationally tractable, because the corruption can be marginalized out in closed form. Our framework leads to machine learning algorithms that are fast, generalize well and naturally scale to very large data sets. We showcase this technology as regularization for general risk minimization and for marginalized deep learning for document representations. We provide experimental results on part of speech tagging as well as document and image classification. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR, and AISTATS. In 2011 he was awarded the AAAI senior program chair award, and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer and multi-task learning, as well as biomedical applications.&lt;br /&gt;
&lt;br /&gt;
== 04/17/2013: Recursive Deep Learning in Natural Language Processing and Computer Vision ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.socher.org/ Richard Socher],  Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 17, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hierarchical and recursive structure is commonly found in different&lt;br /&gt;
modalities, including natural language sentences and scene images. I&lt;br /&gt;
will introduce several recursive deep learning models that, unlike &lt;br /&gt;
standard deep learning methods, can learn compositional meaning vector &lt;br /&gt;
representations for phrases or images.&lt;br /&gt;
&lt;br /&gt;
These recursive neural network based models obtain state-of-the-art&lt;br /&gt;
performance on a variety of syntactic and semantic language tasks&lt;br /&gt;
such as parsing, sentiment analysis, paraphrase detection and relation&lt;br /&gt;
classification for extracting knowledge from the web. Because the models&lt;br /&gt;
often make no language-specific assumptions, the same architectures can be&lt;br /&gt;
used for visual scene understanding and object classification from 3D &lt;br /&gt;
images.&lt;br /&gt;
&lt;br /&gt;
Besides the good performance, the models capture interesting phenomena&lt;br /&gt;
in language, such as compositionality. For instance, the models learn&lt;br /&gt;
that “not good” has worse sentiment than “good” or that high level&lt;br /&gt;
negation can change the meaning of longer phrases with many positive &lt;br /&gt;
words. Furthermore, unlike most machine learning approaches that rely on &lt;br /&gt;
human designed feature sets, features are learned as part of the model.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Socher is a PhD student at Stanford working with Chris Manning&lt;br /&gt;
and Andrew Ng. His research interests are machine learning for NLP and&lt;br /&gt;
vision. He is interested in developing new models that learn useful &lt;br /&gt;
features, capture compositional and hierarchical structure in multiple&lt;br /&gt;
modalities and perform well across multiple tasks. He was awarded the &lt;br /&gt;
2011 Yahoo! Key Scientific Challenges Award, the Distinguished &lt;br /&gt;
Application Paper Award at ICML 2011 and a Microsoft Research PhD &lt;br /&gt;
Fellowship in 2012.&lt;br /&gt;
&lt;br /&gt;
== 05/01/2013: Probabilistic Soft Logic, Stephen Bach ==&lt;br /&gt;
&lt;br /&gt;
In this talk, we will give an overview of probabilistic soft logic (PSL), a tool being developed in the LINQS group at UMD for modeling, learning, and inference in structured and multi-relational domains. We&#039;ll describe the basic syntax and semantics of the language and then describe the underlying mathematical framework upon which efficient inference and learning are built. We refer to the underlying mathematical model as a hinge-loss Markov random field (HL-MRF). HL-MRFs have a number of nice properties, including the fact that most probable explanation (MPE) inference corresponds to a convex optimization problem. We present recent results showing that, using state-of-the-art optimization techniques, we can perform inference on problems with tens of thousands of random variables in seconds, and on problems with hundreds of thousands of random variables in minutes. We are currently working on several approaches for distributed inference in PSL, which promise even greater scalability. We will conclude by discussing applications of PSL to problems such as group identification in social media, activity recognition in videos, image reconstruction, knowledge graph identification, schema mapping, drug target prediction, and others as time permits.&lt;br /&gt;
&lt;br /&gt;
== 05/08/2013: The Foreseer: Integrative Retrieval and Mining of Information in Online Communities ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www-personal.umich.edu/~qmei/ Qiaozhu Mei], University of Michigan&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, May 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the growth of online communities, the Web has evolved from networks of shared documents into networks of knowledge-sharing groups and individuals. A vast amount of heterogeneous yet interrelated information is being generated, making existing information analysis techniques inadequate. Current data mining tools often neglect the actual context, creators, and consumers of information. Foreseer is a user-centric framework for the next generation of information retrieval and mining for online communities. It represents a new paradigm of information analysis through the integration of the four “C’s”: content, context, crowd, and cloud. &lt;br /&gt;
&lt;br /&gt;
In this talk, we will introduce our recent efforts in integrative analysis and mining of information in online communities. We will highlight the real-world problems in online communities to which the Foreseer techniques have been successfully applied. These topics include the identification of information needs from social media, the prediction of the adoption of hashtags in microblogging communities, and the prediction of social lending behaviors in microfinance communities.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Qiaozhu Mei is an assistant professor at the School of Information, the University of Michigan. He is widely interested in information retrieval, text mining, natural language processing, and their applications in web search, social computing, and health informatics. He has served on the program committees of almost all major conferences in these areas. He is also a recipient of the NSF CAREER Award, two runner-up best student paper awards at KDD, and a SIGKDD dissertation award.&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=CLIP_Colloquium_(Fall_2013)&amp;diff=756</id>
		<title>CLIP Colloquium (Fall 2013)</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=CLIP_Colloquium_(Fall_2013)&amp;diff=756"/>
		<updated>2013-10-25T23:48:19Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: Created page with &amp;quot; == 9/4/2013 and 9/11/2013: N-Minute Madness ==  The people of CLIP talk about what&amp;#039;s going on in N minutes.  &amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&amp;#039;ll be in AVW 4172. ...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=755</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=755"/>
		<updated>2013-10-25T23:47:24Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Development of a Term Weighting Formula for Search Result Ranking ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.isical.ac.in/~jia_r/ Jiaul Paik],  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The effectiveness of all well known search engines crucially &lt;br /&gt;
depends on the quality of the underlying term weighting mechanism. &lt;br /&gt;
In this talk, I will first briefly discuss the grand hypotheses &lt;br /&gt;
which build the foundation for effective term weighting, followed by the &lt;br /&gt;
limitations of the state of the art methods. I will then describe the &lt;br /&gt;
development of a novel TF-IDF term weighting scheme. Finally, I will &lt;br /&gt;
show the experimental results and compare them with the state of the &lt;br /&gt;
art term weighting schemes. The talk will conclude with some potential&lt;br /&gt;
future directions. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Jiaul Paik is a new CLIP postdoc.  He earned his PhD in Computer Science &lt;br /&gt;
from the Indian Statistical Institute, Kolkata, India. He has published a &lt;br /&gt;
number of papers in ACM TOIS, ACM TALIP and ACM SIGIR.  His research mainly &lt;br /&gt;
focuses on challenges in information retrieval. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipeline compression and summarization framework, I will show how to create a summary guided compression module and to use it for generating multiple compression candidates.  Compressed summary sentences are selected from these candidates by the application of a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for 3 years before she joined UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interest is in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Text Analysis and Social Science: Learning to Extract International Relations from the News ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What can text analysis tell us about society?  Enormous corpora of news, social media, and historical documents record events, beliefs, and culture.  Automated text analysis is interesting since it scales to large data sets, and can assist in discovering patterns and themes.  My research develops practical and scientifically rigorous text analysis methods that can help answer research questions in sociolinguistics and political science.&lt;br /&gt;
&lt;br /&gt;
For this talk I&#039;ll focus on our work on events and international politics.  Political scientists are interested in studying international relations through *event data*: time series records of who did what to whom, as described in news articles.  Rule-based information extraction systems have been used for 20 years to study these phenomena.  We develop a dynamic logistic normal statistical model for unsupervised learning of event classes and political dynamics from news text.  It learns what verbs and textual descriptions correspond to different types of diplomatic and military interactions between countries, and simultaneously infers the time-series of interactions between countries.  Unlike a topic model, it leverages syntactic parsing and argument structure, which is critical in this domain.  Using a parsed corpus of several million news articles over 15 years, we evaluate how well its learned event classes match ones defined by experts in previous work, how well its inferences about countries correspond to real-world conflict, and conduct a qualitative case study illustrating its inferences for the recent history of Israeli-Palestinian relations.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Brandon M. Stewart (Harvard University) and Noah A. Smith (CMU).  Publication (ACL 2013) and more information here: http://brenocon.com/irevents/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Brendan O&#039;Connor (http://brenocon.com/) is a 5th-year Ph.D. candidate in Carnegie Mellon University&#039;s Machine Learning Department.  He is interested in machine learning and natural language processing, especially when informed by or applied to the social sciences.  In the past he has interned in the Facebook Data Science group, and worked on crowdsourcing (Crowdflower / Dolores Labs) and &amp;quot;semantic&amp;quot; search (Powerset).  His undergraduate degree was in Symbolic Systems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/16/2013: Using Semantics to help learn Phonetic Categories ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Stella Frank, University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 16, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Computational models of language acquisition seek to replicate human&lt;br /&gt;
linguistic learning capabilities, such as an infant&#039;s ability to&lt;br /&gt;
identify the relevant sound categories in a language, given similar&lt;br /&gt;
inputs.  In this talk I will present some on-going work which extends a&lt;br /&gt;
Bayesian model of phonetic categorisation (Feldman et al., 2013).  The&lt;br /&gt;
original model learns a lexicon as well as phonetic categories,&lt;br /&gt;
incorporating the constraint that phonemes appear in word contexts.&lt;br /&gt;
However, it has trouble separating minimal pairs (such as&lt;br /&gt;
&#039;cat&#039;/&#039;caught&#039;/&#039;kite&#039;).  The proposed extension adds further information&lt;br /&gt;
via situational context information, a form of weak semantics or world&lt;br /&gt;
knowledge, to disambiguate potential minimal pairs.  I will present our&lt;br /&gt;
current results and discuss potential next steps.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Stella Frank is currently a postdoc at the University of Edinburgh,&lt;br /&gt;
where she received her PhD in Informatics in 2013.&lt;br /&gt;
Her research interests lie in computational modelling of language&lt;br /&gt;
acquisition using unsupervised Bayesian modelling techniques.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, computational treatment of style has largely remained unexplored. In this talk I will argue that elements of style such as the redundancy in text, the level of specificity or its entertaining effect, affect the performance of standard systems and that good approaches to computational style analysis will be beneficial for such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next, I will present our work on developing measures of one particular element of style, text specificity. I will discuss how we successfully developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that when summarizing, people often edit a specific sentence in the source into a general one for the summary, indicating that specificity is a suitable objective for compression systems, one that naturally leads to the need for compression. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarisation, affect recognition and text quality. She obtained her PhD degree in Computer Science from Columbia University in 2006. She also spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators are recipients of the  best student paper award at SIGDial in 2010 and best paper award at EMNLP in 2012.  She received an NSF CAREER award in 2010. The Penn team co-led by Ani won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/13/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, November 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2013)|Fall 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=754</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=754"/>
		<updated>2013-10-10T15:24:25Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Development of a Term Weighting Formula for Search Result Ranking ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.isical.ac.in/~jia_r/ Jiaul Paik],  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The effectiveness of all well known search engines crucially &lt;br /&gt;
depends on the quality of the underlying term weighting mechanism. &lt;br /&gt;
In this talk, I will first briefly discuss the grand hypotheses &lt;br /&gt;
which build the foundation for effective term weighting, followed by the &lt;br /&gt;
limitations of the state of the art methods. I will then describe the &lt;br /&gt;
development of a novel TF-IDF term weighting scheme. Finally, I will &lt;br /&gt;
show the experimental results and compare them with the state of the &lt;br /&gt;
art term weighting schemes. The talk will conclude with some potential&lt;br /&gt;
future directions. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Jiaul Paik is a new CLIP postdoc.  He earned his PhD in Computer Science &lt;br /&gt;
from the Indian Statistical Institute, Kolkata, India. He has published a &lt;br /&gt;
number of papers in ACM TOIS, ACM TALIP and ACM SIGIR.  His research mainly &lt;br /&gt;
focuses on challenges in information retrieval. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut-based method for joint compression and summarization. It is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interest is in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Text Analysis and Social Science: Learning to Extract International Relations from the News ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What can text analysis tell us about society?  Enormous corpora of news, social media, and historical documents record events, beliefs, and culture.  Automated text analysis is interesting since it scales to large data sets, and can assist in discovering patterns and themes.  My research develops practical and scientifically rigorous text analysis methods that can help answer research questions in sociolinguistics and political science.&lt;br /&gt;
&lt;br /&gt;
For this talk I&#039;ll focus on our work on events and international politics.  Political scientists are interested in studying international relations through *event data*: time series records of who did what to whom, as described in news articles.  Rule-based information extraction systems have been used for 20 years to study these phenomena.  We develop a dynamic logistic normal statistical model for unsupervised learning of event classes and political dynamics from news text.  It learns what verbs and textual descriptions correspond to different types of diplomatic and military interactions between countries, and simultaneously infers the time-series of interactions between countries.  Unlike a topic model, it leverages syntactic parsing and argument structure, which is critical in this domain.  Using a parsed corpus of several million news articles over 15 years, we evaluate how well its learned event classes match ones defined by experts in previous work, how well its inferences about countries correspond to real-world conflict, and conduct a qualitative case study illustrating its inferences for the recent history of Israeli-Palestinian relations.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Brandon M. Stewart (Harvard University) and Noah A. Smith (CMU).  Publication (ACL 2013) and more information here: http://brenocon.com/irevents/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Brendan O&#039;Connor (http://brenocon.com/) is a 5th-year Ph.D. candidate in Carnegie Mellon University&#039;s Machine Learning Department.  He is interested in machine learning and natural language processing, especially when informed by or applied to the social sciences.  In the past he has interned in the Facebook Data Science group, and worked on crowdsourcing (Crowdflower / Dolores Labs) and &amp;quot;semantic&amp;quot; search (Powerset).  His undergraduate degree was in Symbolic Systems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/16/2013: Using Semantics to help learn Phonetic Categories ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Stella Frank, University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 16, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Computational models of language acquisition seek to replicate human&lt;br /&gt;
linguistic learning capabilities, such as an infant&#039;s ability to&lt;br /&gt;
identify the relevant sound categories in a language, given similar&lt;br /&gt;
inputs.  In this talk I will present some ongoing work which extends a&lt;br /&gt;
Bayesian model of phonetic categorisation (Feldman et al., 2013).  The&lt;br /&gt;
original model learns a lexicon as well as phonetic categories,&lt;br /&gt;
incorporating the constraint that phonemes appear in word contexts.&lt;br /&gt;
However, it has trouble separating minimal pairs (such as&lt;br /&gt;
&#039;cat&#039;/&#039;caught&#039;/&#039;kite&#039;).  The proposed extension adds situational&lt;br /&gt;
context information, a form of weak semantics or world knowledge,&lt;br /&gt;
to disambiguate potential minimal pairs.  I will present our&lt;br /&gt;
current results and discuss potential next steps.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Stella Frank is currently a postdoc at the University of Edinburgh,&lt;br /&gt;
where she received her PhD in Informatics in 2013.&lt;br /&gt;
Her research interests lie in computational modelling of language&lt;br /&gt;
acquisition using unsupervised Bayesian modelling techniques.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
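As a rough illustration of what a simulation-based confidence interval for the F-measure involves, here is a plain bootstrap sketch in Python. It does not reproduce the specific power-analysis methods or the analytic approximation developed in the talk, and the number of resamples and significance level below are illustrative choices:

```python
import random

def f1(tp, fp, fn):
    """F1 score from confusion counts; 0 when there are no positives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_confidence_interval(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for F1 over paired binary labels/predictions.

    Resamples the evaluation set with replacement, recomputes F1 on each
    resample, and takes the empirical alpha/2 and 1 - alpha/2 quantiles.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        tp = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 1)
        fp = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 1)
        fn = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 0)
        stats.append(f1(tp, fp, fn))
    stats.sort()
    low = stats[int(alpha / 2 * n_boot)]
    high = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return low, high
```

The interval width shrinks as the test set grows, which is the trade-off the annotation-allocation policies in the talk must balance against training-set size.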
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, computational treatment of style has remained largely unexplored. In this talk I will argue that elements of style, such as the redundancy of a text, its level of specificity, or its entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style: text specificity. I will discuss how we successfully developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that, when summarizing, people often edit a specific sentence in the source into a general one for the summary, indicating that specificity is a suitable objective for compression and that summarization naturally leads to the need for compression. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarisation, affect recognition and text quality. She obtained her PhD degree in Computer Science from Columbia University in 2006. She also spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators are recipients of the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012.  She received an NSF CAREER award in 2010. The Penn team co-led by Ani won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/13/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, November 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=753</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=753"/>
		<updated>2013-10-04T03:55:44Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. External speakers typically have slots for one-on-one meetings with Maryland researchers before and after their talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To get on the cl-colloquium@umiacs.umd.edu list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Development of a Term Weighting Formula for Search Result Ranking ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.isical.ac.in/~jia_r/ Jiaul Paik],  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The effectiveness of all well known search engines crucially &lt;br /&gt;
depends on the quality of the underlying term weighting mechanism. &lt;br /&gt;
In this talk, I will first briefly discuss the grand hypotheses &lt;br /&gt;
that form the foundation of effective term weighting, followed by the &lt;br /&gt;
limitations of state-of-the-art methods. I will then describe the &lt;br /&gt;
development of a novel TF-IDF term weighting scheme. Finally, I will &lt;br /&gt;
show the experimental results and compare them with state-of-the-art &lt;br /&gt;
term weighting schemes. The talk will conclude with some potential &lt;br /&gt;
future directions. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Jiaul Paik is a new CLIP postdoc.  He earned his PhD in Computer Science &lt;br /&gt;
from the Indian Statistical Institute, Kolkata, India. He has published a &lt;br /&gt;
number of papers in ACM TOIS, ACM TALIP and ACM SIGIR.  His research mainly &lt;br /&gt;
focuses on challenges in information retrieval. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut-based method for joint compression and summarization. It is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interest is in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Text Analysis and Social Science: Learning to Extract International Relations from the News ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What can text analysis tell us about society?  Enormous corpora of news, social media, and historical documents record events, beliefs, and culture.  Automated text analysis is interesting since it scales to large data sets, and can assist in discovering patterns and themes.  My research develops practical and scientifically rigorous text analysis methods that can help answer research questions in sociolinguistics and political science.&lt;br /&gt;
&lt;br /&gt;
For this talk I&#039;ll focus on our work on events and international politics.  Political scientists are interested in studying international relations through *event data*: time series records of who did what to whom, as described in news articles.  Rule-based information extraction systems have been used for 20 years to study these phenomena.  We develop a dynamic logistic normal statistical model for unsupervised learning of event classes and political dynamics from news text.  It learns what verbs and textual descriptions correspond to different types of diplomatic and military interactions between countries, and simultaneously infers the time-series of interactions between countries.  Unlike a topic model, it leverages syntactic parsing and argument structure, which is critical in this domain.  Using a parsed corpus of several million news articles over 15 years, we evaluate how well its learned event classes match ones defined by experts in previous work, how well its inferences about countries correspond to real-world conflict, and conduct a qualitative case study illustrating its inferences for the recent history of Israeli-Palestinian relations.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Brandon M. Stewart (Harvard University) and Noah A. Smith (CMU).  Publication (ACL 2013) and more information here: http://brenocon.com/irevents/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Brendan O&#039;Connor (http://brenocon.com/) is a 5th-year Ph.D. candidate in Carnegie Mellon University&#039;s Machine Learning Department.  He is interested in machine learning and natural language processing, especially when informed by or applied to the social sciences.  In the past he has interned in the Facebook Data Science group, and worked on crowdsourcing (Crowdflower / Dolores Labs) and &amp;quot;semantic&amp;quot; search (Powerset).  His undergraduate degree was in Symbolic Systems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/16/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Stella Frank, University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 16, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, computational treatment of style has remained largely unexplored. In this talk I will argue that elements of style, such as the redundancy of a text, its level of specificity, or its entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style: text specificity. I will discuss how we successfully developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that, when summarizing, people often edit a specific sentence in the source into a general one for the summary, indicating that specificity is a suitable objective for compression and that summarization naturally leads to the need for compression. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarisation, affect recognition and text quality. She obtained her PhD degree in Computer Science from Columbia University in 2006. She also spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators are recipients of the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012.  She received an NSF CAREER award in 2010. The Penn team co-led by Ani won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/13/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, November 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=752</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=752"/>
		<updated>2013-09-30T18:41:22Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Development of a Term Weighting Formula for Search Result Ranking ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.isical.ac.in/~jia_r/ Jiaul Paik],  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The effectiveness of all well-known search engines crucially depends on the quality of the underlying term weighting mechanism. In this talk, I will first briefly discuss the grand hypotheses that form the foundation for effective term weighting, followed by the limitations of the state-of-the-art methods. I will then describe the development of a novel TF-IDF term weighting scheme. Finally, I will show the experimental results and compare them with the state-of-the-art term weighting schemes. The talk will conclude with some potential future directions.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Jiaul Paik is a new CLIP postdoc. He earned his PhD in Computer Science from the Indian Statistical Institute, Kolkata, India. He has published a number of papers in ACM TOIS, ACM TALIP and ACM SIGIR. His research mainly focuses on challenges in information retrieval.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipeline compression and summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for 3 years before she joined UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interest is in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What can text analysis tell us about society?  Enormous corpora of news, social media, and historical documents record events, beliefs, and culture.  Automated text analysis is interesting since it scales to large data sets, and can assist in discovering patterns and themes.  My research develops practical and scientifically rigorous text analysis methods that can help answer research questions in sociolinguistics and political science.&lt;br /&gt;
&lt;br /&gt;
For this talk I&#039;ll focus on our work on events and international politics.  Political scientists are interested in studying international relations through *event data*: time series records of who did what to whom, as described in news articles.  Rule-based information extraction systems have been used for 20 years to study these phenomena.  We develop a dynamic logistic normal statistical model for unsupervised learning of event classes and political dynamics from news text.  It learns what verbs and textual descriptions correspond to different types of diplomatic and military interactions between countries, and simultaneously infers the time-series of interactions between countries.  Unlike a topic model, it leverages syntactic parsing and argument structure, which is critical in this domain.  Using a parsed corpus of several million news articles over 15 years, we evaluate how well its learned event classes match ones defined by experts in previous work, how well its inferences about countries correspond to real-world conflict, and conduct a qualitative case study illustrating its inferences for the recent history of Israeli-Palestinian relations.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Brandon M. Stewart (Harvard University) and Noah A. Smith (CMU).  Publication (ACL 2013) and more information here: http://brenocon.com/irevents/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Brendan O&#039;Connor (http://brenocon.com/) is a 5th-year Ph.D. candidate in Carnegie Mellon University&#039;s Machine Learning Department.  He is interested in machine learning and natural language processing, especially when informed by or applied to the social sciences.  In the past he has interned in the Facebook Data Science group, and worked on crowdsourcing (Crowdflower / Dolores Labs) and &amp;quot;semantic&amp;quot; search (Powerset).  His undergraduate degree was in Symbolic Systems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/16/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Stella Frank, University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 16, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, computational treatment of style has largely remained unexplored. In this talk I will argue that elements of style, such as the redundancy of a text, its level of specificity, and its entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems.&lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style: text specificity. I will discuss how we successfully developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that when summarizing, people often edit a specific sentence in the source into a more general one for the summary, indicating that specificity is a suitable objective for compression systems and that summarization naturally leads to the need for compression.&lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarization, affect recognition and text quality. She obtained her PhD degree in Computer Science from Columbia University in 2006. She also spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators are recipients of the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012. She received an NSF CAREER award in 2010. The Penn team co-led by Ani won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/13/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, November 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=751</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=751"/>
		<updated>2013-09-30T16:20:46Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Development of a Term Weighting Formula for Search Result Ranking ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.isical.ac.in/~jia_r/ Jiaul Paik],  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The effectiveness of all well-known search engines crucially depends on the quality of the underlying term weighting mechanism. In this talk, I will first briefly discuss the grand hypotheses that form the foundation for effective term weighting, followed by the limitations of the state-of-the-art methods. I will then describe the development of a novel TF-IDF term weighting scheme. Finally, I will show the experimental results and compare them with the state-of-the-art term weighting schemes. The talk will conclude with some potential future directions.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Jiaul Paik is a new CLIP postdoc. He earned his PhD in Computer Science from the Indian Statistical Institute, Kolkata, India. He has published a number of papers in ACM TOIS, ACM TALIP and ACM SIGIR. His research mainly focuses on challenges in information retrieval.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipeline compression and summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for 3 years before she joined UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interest is in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
What can text analysis tell us about society?  Enormous corpora of news, social media, and historical documents record events, beliefs, and culture.  Automated text analysis is interesting since it scales to large data sets, and can assist in discovering patterns and themes.  My research develops practical and scientifically rigorous text analysis methods that can help answer research questions in sociolinguistics and political science.&lt;br /&gt;
&lt;br /&gt;
For this talk I&#039;ll focus on our work on events and international politics.  Political scientists are interested in studying international relations through *event data*: time series records of who did what to whom, as described in news articles.  Rule-based information extraction systems have been used for 20 years to study these phenomena.  We develop a dynamic logistic normal statistical model for unsupervised learning of event classes and political dynamics from news text.  It learns what verbs and textual descriptions correspond to different types of diplomatic and military interactions between countries, and simultaneously infers the time-series of interactions between countries.  Unlike a topic model, it leverages syntactic parsing and argument structure, which is critical in this domain.  Using a parsed corpus of several million news articles over 15 years, we evaluate how well its learned event classes match ones defined by experts in previous work, how well its inferences about countries correspond to real-world conflict, and conduct a qualitative case study illustrating its inferences for the recent history of Israeli-Palestinian relations.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Brandon M. Stewart (Harvard University) and Noah A. Smith (CMU).  Publication (ACL 2013) and more information here: http://brenocon.com/irevents/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Brendan O&#039;Connor (http://brenocon.com/) is a 5th-year Ph.D. candidate in Carnegie Mellon University&#039;s Machine Learning Department.  He is interested in machine learning and natural language processing, especially when informed by or applied to the social sciences.  In the past he has interned in the Facebook Data Science group, and worked on crowdsourcing (Crowdflower / Dolores Labs) and &amp;quot;semantic&amp;quot; search (Powerset).  His undergraduate degree was in Symbolic Systems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/16/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Stella Frank, University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 16, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, the computational treatment of style has remained largely unexplored. In this talk I will argue that elements of style, such as the redundancy of a text, its level of specificity, or its entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style, text specificity. I will discuss how we developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that when summarizing, people often edit a specific sentence in the source into a more general one for the summary, indicating that specificity is a suitable objective for compression systems and that summarization naturally leads to the need for compression. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarisation, affect recognition and text quality. She obtained her PhD in Computer Science from Columbia University in 2006 and spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators received the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012. She received an NSF CAREER award in 2010. The Penn team co-led by Ani won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=750</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=750"/>
		<updated>2013-09-29T23:28:57Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Development of a Term Weighting Formula for Search Result Ranking ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.isical.ac.in/~jia_r/ Jiaul Paik],  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The effectiveness of all well known search engines crucially &lt;br /&gt;
depends on the quality of the underlying term weighting mechanism. &lt;br /&gt;
In this talk, first, I will briefly talk about the grand hypotheses &lt;br /&gt;
which build the foundation for effective term weighting, followed by the &lt;br /&gt;
limitations of the state of the art methods. I will then describe the &lt;br /&gt;
development of a novel TF-IDF term weighting scheme. Finally, I will &lt;br /&gt;
show the experimental results and compare them with the state of the &lt;br /&gt;
art term weighting schemes. The talk will conclude with some potential&lt;br /&gt;
future directions. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Jiaul Paik is a new CLIP postdoc.  He earned his PhD in Computer Science &lt;br /&gt;
from the Indian Statistical Institute, Kolkata, India. He has published a &lt;br /&gt;
number of papers in ACM TOIS, ACM TALIP and ACM SIGIR.  His research mainly &lt;br /&gt;
focuses on challenges in information retrieval. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for 3 years before joining UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interest is in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/16/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Stella Frank, University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 16, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, the computational treatment of style has remained largely unexplored. In this talk I will argue that elements of style, such as the redundancy of a text, its level of specificity, or its entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style, text specificity. I will discuss how we developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that when summarizing, people often edit a specific sentence in the source into a more general one for the summary, indicating that specificity is a suitable objective for compression systems and that summarization naturally leads to the need for compression. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarisation, affect recognition and text quality. She obtained her PhD in Computer Science from Columbia University in 2006 and spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators received the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012. She received an NSF CAREER award in 2010. The Penn team co-led by Ani won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=749</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=749"/>
		<updated>2013-09-26T14:00:45Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for 3 years before joining UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interest is in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/16/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Stella Frank, University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 16, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, the computational treatment of style has remained largely unexplored. In this talk I will argue that elements of style, such as redundancy in text, the level of specificity, or entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style, text specificity. I will discuss how we successfully developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that when summarizing, people often edit a specific sentence in the source into a general one for the summary, indicating that specificity is a suitable objective for compression systems, and that summarization naturally leads to the need for compression. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarization, affect recognition, and text quality. She obtained her PhD in Computer Science from Columbia University in 2006 and spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators received the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012. She received an NSF CAREER award in 2010, and the Penn team she co-led won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=748</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=748"/>
		<updated>2013-09-23T13:13:10Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, a step closer to abstractive summarization. Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates. Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD). She received her B.S. and M.S. degrees from Tsinghua University and her Ph.D. from Purdue University in 2004. She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005. Dr. Liu&#039;s research interests are in speech and natural language processing, and she has published over 100 papers in the field. She received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010. She is currently an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, the computational treatment of style has remained largely unexplored. In this talk I will argue that elements of style, such as redundancy in text, the level of specificity, or entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style, text specificity. I will discuss how we successfully developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that when summarizing, people often edit a specific sentence in the source into a general one for the summary, indicating that specificity is a suitable objective for compression systems, and that summarization naturally leads to the need for compression. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarization, affect recognition, and text quality. She obtained her PhD in Computer Science from Columbia University in 2006 and spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators received the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012. She received an NSF CAREER award in 2010, and the Penn team she co-led won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=747</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=747"/>
		<updated>2013-09-23T13:12:49Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: /* 10/30/2013: Teaching machines to read for fun and profit */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, a step closer to abstractive summarization. Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates. Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
 &lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD). She received her B.S. and M.S. degrees from Tsinghua University and her Ph.D. from Purdue University in 2004. She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005. Dr. Liu&#039;s research interests are in speech and natural language processing, and she has published over 100 papers in the field. She received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010. She is currently an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, the computational treatment of style has remained largely unexplored. In this talk I will argue that elements of style, such as redundancy in text, the level of specificity, or entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style, text specificity. I will discuss how we successfully developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that when summarizing, people often edit a specific sentence in the source into a general one for the summary, indicating that specificity is a suitable objective for compression systems, and that summarization naturally leads to the need for compression. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarization, affect recognition, and text quality. She obtained her PhD in Computer Science from Columbia University in 2006 and spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators received the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012. She received an NSF CAREER award in 2010, and the Penn team she co-led won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013) and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=746</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=746"/>
		<updated>2013-09-19T21:33:56Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to join the cl-colloquium@umiacs.umd.edu mailing list, or have other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization. Using a pipelined compression and summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates. Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
&lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD). She received her B.S. and M.S. degrees from Tsinghua University and her Ph.D. from Purdue University in 2004. She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005. Dr. Liu&#039;s research interests are in speech and natural language processing, and she has published over 100 papers in the field. She received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010. She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them into four baseline allocation policies, which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Computational style analysis, with practical applications to automatic summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Natural language research is often equated with attempts to derive structure and meaning from unstructured data. Given this focus, the computational treatment of style has remained largely unexplored. In this talk I will argue that elements of style, such as the redundancy of a text, its level of specificity, or its entertaining effect, affect the performance of standard systems, and that good approaches to computational style analysis will benefit such systems. &lt;br /&gt;
&lt;br /&gt;
I will first briefly present our findings from a corpus study of local coherence motivating the need for style analysis. Theories of discourse relations and entity coherence fail to explain local coherence for a large portion of newspaper text, with stylistic factors often involved in the cases where standard theories do not apply. &lt;br /&gt;
&lt;br /&gt;
Next I will present our work on developing measures of one particular element of style: text specificity. I will discuss how we developed a classifier for sentence specificity. The classifier allows us to analyze specificity in summaries produced by people and machines, and reveals that machine summaries are overly specific. Furthermore, analysis of sentence compression data shows that when summarizing, people often edit a specific sentence in the source into a general one for the summary, indicating that reducing specificity is a suitable objective for sentence compression systems. &lt;br /&gt;
&lt;br /&gt;
I will conclude with a brief overview of other style-related tasks which affect the performance of summarization systems. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarization, affect recognition, and text quality. She obtained her PhD in Computer Science from Columbia University in 2006 and spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani and her collaborators received the best student paper award at SIGDial in 2010 and the best paper award at EMNLP in 2012. She received an NSF CAREER award in 2010. The Penn team co-led by Ani won the audio-visual emotion recognition challenge (AVEC) for word-level prediction in 2012. Ani was a member of the editorial board of Computational Linguistics (2009--2011) and has served as an area chair/senior program committee member for ACL (2013, 2012, 2010), NAACL (2010, 2007), AAAI (2013), and IJCAI (2011).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=745</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=745"/>
		<updated>2013-09-19T15:48:08Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to join the cl-colloquium@umiacs.umd.edu mailing list, or have other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization. Using a pipelined compression and summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates. Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
&lt;br /&gt;
Finally, I will introduce a graph-cut based method for joint compression and summarization. This is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints in order to generate summaries with better readability. I will present various experimental results to demonstrate the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD). She received her B.S. and M.S. degrees from Tsinghua University and her Ph.D. from Purdue University in 2004. She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005. Dr. Liu&#039;s research interests are in speech and natural language processing, and she has published over 100 papers in the field. She received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010. She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them into four baseline allocation policies, which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=744</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=744"/>
		<updated>2013-09-19T15:29:14Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to join the cl-colloquium@umiacs.umd.edu mailing list, or have other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
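The extractive objective described above — learned bigram weights plus an ILP that maximizes coverage — can be illustrated with a toy sketch. This is not the speakers' system: the names (best_summary, bigrams) and the brute-force search standing in for an ILP solver are my own illustration; a real system would hand the same objective to an ILP solver over realistic data.

```python
import itertools

def bigrams(sentence):
    # Token bigrams of one sentence, as a set so coverage counts each once.
    toks = sentence.split()
    return set(zip(toks, toks[1:]))

def best_summary(sentences, weights, budget):
    """Evaluate the ILP-style objective exhaustively: choose the subset of
    sentences whose covered bigrams have maximum total weight, subject to a
    word-count budget. A bigram is credited once, no matter how many selected
    sentences contain it, which is what distinguishes coverage from a simple
    sum of per-sentence scores."""
    best, best_score = [], -1.0
    for r in range(len(sentences) + 1):
        for subset in itertools.combinations(sentences, r):
            length = sum(len(s.split()) for s in subset)
            if length > budget:
                continue
            covered = set()
            for s in subset:
                covered |= bigrams(s)
            score = sum(weights.get(b, 0.0) for b in covered)
            if score > best_score:
                best, best_score = list(subset), score
    return best, best_score
```

In the talk's setting the weights come from a supervised regression model rather than being given, and the exponential subset search is replaced by an integer linear program with the same objective.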
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization. Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates. Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
&lt;br /&gt;
Finally, I will introduce a graph-cut-based method for joint compression and summarization. It is more efficient than the standard ILP-based joint compressive summarization method and has the flexibility to incorporate grammar constraints, yielding summaries with better readability. I will present experimental results demonstrating the effectiveness of these approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD). She received her B.S. and M.S. degrees from Tsinghua University and her Ph.D. from Purdue University in 2004. She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005. Dr. Liu&#039;s research interests are in speech and natural language processing, and she has published over 100 papers in the field. She received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010. She is currently an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
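The abstract's F-measure and its simulation-based treatment can be sketched in a few lines. This is an illustrative Monte-Carlo estimate of an F1 confidence interval, not the analytic approximation the talk develops; the function names and the independence assumption (each test document yields a true positive, false positive, or false negative with fixed probabilities) are mine.

```python
import random

def f_measure(tp, fp, fn):
    """van Rijsbergen's F1, written directly in terms of contingency counts:
    F1 = 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def simulated_f1_interval(p_tp, p_fp, p_fn, n_test, runs=2000, seed=0):
    """Simulate many test sets of size n_test, recompute F1 on each, and
    report the empirical 95 percent interval of the scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        tp = fp = fn = 0
        for _ in range(n_test):
            u = rng.random()
            if p_tp > u:
                tp += 1
            elif p_tp + p_fp > u:
                fp += 1
            elif p_tp + p_fp + p_fn > u:
                fn += 1
        scores.append(f_measure(tp, fp, fn))
    scores.sort()
    cut = int(0.025 * runs)
    return scores[cut], scores[runs - 1 - cut]
```

The width of such an interval as a function of n_test is what a power analysis trades off against the number of annotations spent on training.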
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=743</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=743"/>
		<updated>2013-09-19T15:29:00Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: /* 12/06/2013: Title TBA */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization. Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates. Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
&lt;br /&gt;
Finally, I will introduce a graph-cut-based method for joint compression and summarization. It is more efficient than the standard ILP-based joint compressive summarization method and has the flexibility to incorporate grammar constraints, yielding summaries with better readability. I will present experimental results demonstrating the effectiveness of these approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD). She received her B.S. and M.S. degrees from Tsinghua University and her Ph.D. from Purdue University in 2004. She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005. Dr. Liu&#039;s research interests are in speech and natural language processing, and she has published over 100 papers in the field. She received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010. She is currently an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=742</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=742"/>
		<updated>2013-09-19T15:28:43Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize the bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization. Using a pipelined compression-and-summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates. Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
&lt;br /&gt;
Finally, I will introduce a graph-cut-based method for joint compression and summarization. It is more efficient than the standard ILP-based joint compressive summarization method and has the flexibility to incorporate grammar constraints, yielding summaries with better readability. I will present experimental results demonstrating the effectiveness of these approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD). She received her B.S. and M.S. degrees from Tsinghua University and her Ph.D. from Purdue University in 2004. She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for three years before joining UTD as an assistant professor in 2005. Dr. Liu&#039;s research interests are in speech and natural language processing, and she has published over 100 papers in the field. She received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010. She is currently an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/04/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ttic.uchicago.edu/~klivescu/ Karen Livescu],  Toyota Technological Institute at Chicago&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=741</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=741"/>
		<updated>2013-09-19T15:14:13Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: /* 12/06/2013: Title TBA */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably with approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipelined compression and summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
&lt;br /&gt;
Finally, I will introduce a graph-cut-based method for joint compression and summarization. It is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints to generate more readable summaries. I will present experimental results demonstrating the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for 3 years before she joined UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interests are in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=740</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=740"/>
		<updated>2013-09-19T15:08:42Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably with approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipelined compression and summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
&lt;br /&gt;
Finally, I will introduce a graph-cut-based method for joint compression and summarization. It is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints to generate more readable summaries. I will present experimental results demonstrating the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for 3 years before she joined UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interests are in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 12/06/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.stonybrook.edu/~ychoi/ Yejin Choi],  Stony Brook University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, December 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; TBA&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=739</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=739"/>
		<updated>2013-09-11T13:37:08Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/4/2013: Recent Advances in Automatic Summarization ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.hlt.utdallas.edu/~yangl/ Yang Liu],  University of Texas, Dallas&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, October 4, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
There has been great progress on automatic summarization over the past two decades, most notably with approaches based on integer linear programming (ILP). In this talk I’ll present some recent work on summarization, focusing first on extractive summarization.  I will describe work with my students on the use of a supervised regression model for bigram weight estimation and the application of an ILP model to maximize bigram coverage in the summaries.&lt;br /&gt;
&lt;br /&gt;
In the second part of the talk, I will discuss new approaches to compressive summarization, moving a step closer to abstractive summarization.  Using a pipelined compression and summarization framework, I will show how to create a summary-guided compression module and use it to generate multiple compression candidates.  Compressed summary sentences are then selected from these candidates by a subsequent ILP system.&lt;br /&gt;
&lt;br /&gt;
Finally, I will introduce a graph-cut-based method for joint compression and summarization. It is more efficient than the standard ILP-based joint compressive summarization method, and has the flexibility to incorporate grammar constraints to generate more readable summaries. I will present experimental results demonstrating the effectiveness of our approaches.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Dr. Yang Liu is currently an Associate Professor in the Computer Science Department at the University of Texas at Dallas (UTD).  She received her B.S. and M.S. degrees from Tsinghua University, and her Ph.D. from Purdue University in 2004.  She was a researcher at the International Computer Science Institute (ICSI) at Berkeley for 3 years before she joined UTD as an assistant professor in 2005.  Dr. Liu&#039;s research interests are in speech and natural language processing.  She has published over 100 papers in this field.  Dr. Liu received the NSF CAREER award in 2009 and the Air Force Young Investigator award in 2010.  She is currently an associate editor of IEEE Transactions on Audio, Speech, and Language Processing; ACM Transactions on Speech and Language Processing; ACM Transactions on Asian Language Information Processing; and Speech Communication.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=738</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=738"/>
		<updated>2013-09-11T01:03:02Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. External speakers typically have slots for one-on-one meetings with Maryland researchers before and after their talks; contact the host if you&#039;d like to meet.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 4172&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=737</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=737"/>
		<updated>2013-09-10T22:09:18Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. External speakers typically have slots for one-on-one meetings with Maryland researchers before and after their talks; contact the host if you&#039;d like to meet.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 11/08/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cis.upenn.edu/~nenkova/ Ani Nenkova],  University of Pennsylvania&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Friday, November 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Room TBA&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=736</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=736"/>
		<updated>2013-09-10T20:53:50Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. External speakers typically have slots for one-on-one meetings with Maryland researchers before and after their talks; contact the host if you&#039;d like to meet.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 25, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=735</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=735"/>
		<updated>2013-09-07T00:37:39Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. External speakers typically have slots for one-on-one meetings with Maryland researchers before and after their talks; contact the host if you&#039;d like to meet.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Teaching machines to read for fun and profit ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=734</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=734"/>
		<updated>2013-09-05T13:38:40Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=733</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=733"/>
		<updated>2013-09-05T13:38:24Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: /* 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=732</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=732"/>
		<updated>2013-09-05T13:38:04Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Towards Minimizing the Annotation Cost of Certified Text Classification ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen&#039;s F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=731</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=731"/>
		<updated>2013-09-05T13:36:04Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/23/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Mossaab Bagdouri,  University of Maryland&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 23, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/30/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Gary Kazantsev,  Bloomberg LP&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=730</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=730"/>
		<updated>2013-09-04T16:47:22Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=729</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=729"/>
		<updated>2013-09-04T16:46:56Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
== 9/25/2013: CLIP Lab Meeting ==&lt;br /&gt;
&lt;br /&gt;
Phillip will set the agenda.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=People&amp;diff=728</id>
		<title>People</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=People&amp;diff=728"/>
		<updated>2013-09-02T11:55:01Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: /* Graduate Students */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Faculty ==&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jbg/  Jordan Boyd-Graber]&#039;&#039;&#039;: Assistant Professor, the iSchool and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~hal/ Hal Daum&amp;amp;eacute; III]&#039;&#039;&#039;: Associate Professor, Computer Science, Linguistics and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://lampsrv02.umiacs.umd.edu/projdb/person.php?id=2 David Doermann]&#039;&#039;&#039;: Senior Research Scientist, UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~bonnie/ Bonnie J. Dorr]&#039;&#039;&#039;: Professor, Computer Science and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ling.umd.edu/~nhf/ Naomi Feldman]&#039;&#039;&#039;: Assistant Professor, Linguistics&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jimmylin/ Jimmy Lin]&#039;&#039;&#039;: Associate Professor, the iSchool and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.glue.umd.edu/~oard/ Douglas W. Oard]&#039;&#039;&#039;: Professor, the iSchool and UMIACS; Affiliate Professor, Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~louiqa/ Louiqa Raschid]&#039;&#039;&#039;: Professor, Smith School of Business and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~resnik/ Philip Resnik]&#039;&#039;&#039;: Professor, Department of Linguistics and UMIACS; Affiliate Professor, Computer Science&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;center&amp;gt;[[Image:people_facpd.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Emeritus Faculty ==&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~mharper/ Mary Harper]&#039;&#039;&#039;: Affiliate Research Professor, Computer Science, Electrical and Computer Engineering, and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jklavans/ Judith Klavans]&#039;&#039;&#039;: Visiting Senior Research Scientist, UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~weinberg/ Amy Weinberg]&#039;&#039;&#039;: Professor, Department of Linguistics and UMIACS&lt;br /&gt;
&lt;br /&gt;
==Researchers and Post-Docs==&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~lijunhui/ Junhui Li]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Jiaul Paik&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Dan Goldwasser&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==Graduate Students==&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mossaab Bagdouri&#039;&#039;&#039;: Ph.D. Student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://sites.google.com/site/snigdhac/ Snigdha Chaturvedi]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~vlad/ Vladimir Eidelman]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ieleta.com/en/ Irene Eleta]&#039;&#039;&#039;: Ph.D. student, iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Peter Enns&#039;&#039;&#039;: Ph.D. Student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ning Gao&#039;&#039;&#039;: Ph.D. student, iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Anderson Garron&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.linkedin.com/pub/milad-gholami/55/35a/33a Milad Gholami]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jonathan Gluck&#039;&#039;&#039;: Ph.D. Student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.utah.edu/~amitg/ Amit Goyal]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~rguerra/ Raul David Guerra]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~alvin/ Alvin Grissom II]&#039;&#039;&#039;: Ph.D. student, the iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~huah/ Hua He]&#039;&#039;&#039;: Ph.D. Student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~hhe/ He He]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~ynhu Yuening Hu]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mohit Iyyer&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jiarong/ Jiarong Jiang]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ling.umd.edu/~yakov Yakov Kronrod]&#039;&#039;&#039;: Ph.D. student, Department of Linguistics&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;John Morgan&#039;&#039;&#039;: Ph.D. Student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~vietan/ Viet-An Nguyen]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~daithang/ Thang Nguyen]&#039;&#039;&#039;: Ph.D. student, the iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://ling.umd.edu/~naho/ Naho Orita]&#039;&#039;&#039;: Ph.D. student, Department of Linguistics&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Rachael Richardson&#039;&#039;&#039;: Ph.D. student, Department of Linguistics&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~sayyadi/ Hassan Sayyadi]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jyothi Vinjumur&#039;&#039;&#039;: Ph.D. student, iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~peratham/ Peratham (Will) Wiriyathammabhum]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ke Wu&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://terpconnect.umd.edu/~tanx/ Tan Xu]&#039;&#039;&#039;: Ph.D. student, iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ke Zhai&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
&amp;lt;center&amp;gt;[[Image:people_students.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CLIP Alumni ==&lt;br /&gt;
&lt;br /&gt;
===Researchers and postdocs===&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jbg/static/home.html Jordan Boyd-Graber]&#039;&#039;&#039; (2010): Assistant Professor, University of Maryland&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.linkedin.com/in/maribromanolsen Mari Broman-Olsen]&#039;&#039;&#039;: Microsoft&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.isi.edu/~chiang/ David Chiang]&#039;&#039;&#039;: Research Assistant Professor, University of Southern California/Information Sciences Institute&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://faculty.qu.edu.qa/telsayed/ Tamer Elsayed]&#039;&#039;&#039;: Qatar University (Qatar)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.isical.ac.in/~utpal/ Utpal Garain]&#039;&#039;&#039;: Associate Professor, Indian Statistical Institute (India)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.sis.pitt.edu/~daqing/ Daqing He]&#039;&#039;&#039;: Associate Professor, University of Pittsburgh&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~hollingk Kristy Hollingshead]&#039;&#039;&#039; (2012): Researcher, Department of Defense&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.pitt.edu/~hwa/ Rebecca Hwa]&#039;&#039;&#039;: Associate Professor, University of Pittsburgh&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Rebecca LaPlante&#039;&#039;&#039;: Independent Consultant on Information Access&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://faculty.washington.edu/levow/ Gina Levow]&#039;&#039;&#039;: Assistant Professor, University of Washington&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Saif Mohammad&#039;&#039;&#039;: Research Officer, National Research Council (Canada)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~tsmoon/ Taesun Moon]&#039;&#039;&#039;: Researcher, IBM TJ Watson Research Center&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://comminfo.rutgers.edu/directory/smuresan/index.html Smaranda Muresan]&#039;&#039;&#039;: Assistant Professor, Rutgers University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~hendra/ Hendra Setiawan]&#039;&#039;&#039;: Researcher, IBM TJ Watson Research Center&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Carolyn Sheffield&#039;&#039;&#039;: Smithsonian Institution&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://dbs.uni-leipzig.de/en/person/andreas_thor Andreas Thor]&#039;&#039;&#039;: University of Leipzig (Germany)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://web.ntnu.edu.tw/~samtseng/ Yuen-Hsien Tseng]&#039;&#039;&#039;: Research Fellow, National Taiwan Normal University (Taiwan)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ldc.usb.ve/~mvidal/ Maria Esther Vidal]&#039;&#039;&#039;: Professor, Universidad Simon Bolivar (Venezuela)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~ewagner/ Earl J. Wagner]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.williamwebber.com/ William Webber]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://research.microsoft.com/en-us/um/people/ryenw/ Ryen White]&#039;&#039;&#039;: Researcher, Microsoft Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://mlg.eng.cam.ac.uk/sinead/ Sinead Williamson]&#039;&#039;&#039; (2011): Postdoc, Carnegie Mellon University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.sis.pitt.edu/~vladimir/ Vladimir Zadorozhny]&#039;&#039;&#039;: Associate Professor, University of Pittsburgh&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://ufal.mff.cuni.cz/~zeman/en/ Daniel Zeman]&#039;&#039;&#039;: Researcher, Charles University (Czech Republic)&lt;br /&gt;
&lt;br /&gt;
===Ph.D. Students===&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.utah.edu/~arvind/ Arvind Agarwal]&#039;&#039;&#039;: Researcher, Xerox Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~nima/ Nima Asadi]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Necip Fazil Ayan&#039;&#039;&#039;: SRI&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Laura Bright&#039;&#039;&#039; (2003): McAfee&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.ntou.edu.tw/CSWebPage/eng/teachers.php Ya-Hui Chang]&#039;&#039;&#039;: Associate Professor, National Taiwan Ocean University, Keelung, Taiwan&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Waiyian Chong&#039;&#039;&#039;: Yuan&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.qcri.qa/kareem-darwish/ Kareem Darwish]&#039;&#039;&#039;: Qatar Computing Research Institute (Qatar) &lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Dina Demner-Fushman&#039;&#039;&#039;: National Library of Medicine&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.gwu.edu/people/faculty/811 Mona Diab]&#039;&#039;&#039;: Associate Professor, George Washington University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.cmu.edu/~cdyer/ Chris Dyer]&#039;&#039;&#039;: Assistant Professor, Carnegie Mellon University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Denis Filimonov&#039;&#039;&#039;: FactSet&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Rebecca Green&#039;&#039;&#039;: Library of Congress&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stephan Greene&#039;&#039;&#039;: CodeRyte Inc.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~amit/ Amit Goyal]&#039;&#039;&#039;: Yahoo! Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Nizar Habash&#039;&#039;&#039;: Research Scientist, Columbia University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~hardisty/ Eric Hardisty]&#039;&#039;&#039;: NSA&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~zqhuang/ Zhongqiang Huang]&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~changhu/ Chang Hu]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jags/ Jagadeesh Jagarlamudi]&#039;&#039;&#039;: Researcher, IBM TJ Watson Research Center&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Darsana P. Josyula&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Maria Katsova&#039;&#039;&#039;: Microsoft Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Okan Kolak&#039;&#039;&#039;: Google&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Govind Kothari&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~abhishek/ Abhishek Kumar]&#039;&#039;&#039;: Researcher, IBM TJ Watson Research Center&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Grecia Lapizco&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ncbi.nlm.nih.gov/CBBresearch/Fellows/AdamLee/ Adam (Woei-Jyh) Lee]&#039;&#039;&#039; (2009): Postdoctoral Fellow, National Institutes of Health --&amp;gt; Smith School of Business, University of Maryland&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.jhu.edu/~alopez/ Adam Lopez]&#039;&#039;&#039;: Research Scientist, JHU / COE&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jun Luo&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.desilinguist.org Nitin Madnani]&#039;&#039;&#039;: Research Scientist, ETS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www1.ccls.columbia.edu/~ymarton/ Yuval Marton]&#039;&#039;&#039;: IBM&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Joseph Naft&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Levon Mkrtchyan&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;J. Scott Olsson&#039;&#039;&#039;: Wyoming Catholic College&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Lisa Pearl&#039;&#039;&#039;: Assistant Professor, UC Irvine&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stacy President Hobson&#039;&#039;&#039;: IBM&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Alejandro Rodriguez&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Asad Sayeed&#039;&#039;&#039;: Postdoc, University of the Saarland, Germany&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Wade Shen&#039;&#039;&#039;: MIT Lincoln Labs&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Matthew Snover&#039;&#039;&#039;: Postdoc, CUNY&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Calandra Tate&#039;&#039;&#039;: West Point&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Scott Thomas&#039;&#039;&#039;: Navy Research Labs&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~fture/ Ferhan Ture]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Clare Voss&#039;&#039;&#039;: Army Research Labs&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Nate Waisbrot&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.acsu.buffalo.edu/~jw254/ Jianqiang Wang]&#039;&#039;&#039;: Assistant Professor, University at Buffalo, State University of New York&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~lidan/ Lidan Wang]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~wsc/ Shanchan Wu]&#039;&#039;&#039; (2012): HP Labs&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.csc.lsu.edu/~wuyj/ Yejun Wu]&#039;&#039;&#039;: Assistant Professor, Louisiana State University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Yao Wu&#039;&#039;&#039; (2009): Microsoft&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;David Zajic&#039;&#039;&#039;: University of Maryland / CASL&lt;br /&gt;
&lt;br /&gt;
===MS Students===&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;David Alexander&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Aitziber Atutxa&#039;&#039;&#039;: Univ of the Basque Country&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Clara Cabezas&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jacob Devlin&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ed Kenschaft&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Grazia Russo-Lassner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Gregory Sanders&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Aga Skotowski&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Michael Subotin&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===Undergraduate Students===&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Olivia Buzek&#039;&#039;&#039;: PhD student, JHU&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Aaron Elkiss&#039;&#039;&#039;: PhD student, University of Michigan --&amp;gt; Univ of Michigan Libraries&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Andrew Fister&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ayelet Goldin&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Gregory Marton&#039;&#039;&#039;: PhD student, MIT --&amp;gt; Google Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ederlyn Lacson&#039;&#039;&#039;: Microsoft&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jesse Metcalf-Burton&#039;&#039;&#039;: PhD from University of Michigan&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Eric Nichols&#039;&#039;&#039;: MS student, Nara Institute of Science and Technology, Japan&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Noah Smith&#039;&#039;&#039;: PhD from JHU --&amp;gt; assistant prof --&amp;gt; associate professor, CMU&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jessica Stevens&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Michael Wasser&#039;&#039;&#039;&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=People&amp;diff=727</id>
		<title>People</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=People&amp;diff=727"/>
		<updated>2013-09-02T11:50:15Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: /* Graduate Students */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Faculty ==&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jbg/  Jordan Boyd-Graber]&#039;&#039;&#039;: Assistant Professor, the iSchool and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~hal/ Hal Daum&amp;amp;eacute; III]&#039;&#039;&#039;: Associate Professor, Computer Science, Linguistics and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://lampsrv02.umiacs.umd.edu/projdb/person.php?id=2 David Doermann]&#039;&#039;&#039;: Senior Research Scientist, UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~bonnie/ Bonnie J. Dorr]&#039;&#039;&#039;: Professor, Computer Science and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ling.umd.edu/~nhf/ Naomi Feldman]&#039;&#039;&#039;: Assistant Professor, Linguistics&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jimmylin/ Jimmy Lin]&#039;&#039;&#039;: Associate Professor, the iSchool and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.glue.umd.edu/~oard/ Douglas W. Oard]&#039;&#039;&#039;: Professor, the iSchool and UMIACS; Affiliate Professor, Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~louiqa/ Louiqa Raschid]&#039;&#039;&#039;: Professor, Smith School of Business and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~resnik/ Philip Resnik]&#039;&#039;&#039;: Professor, Department of Linguistics and UMIACS; Affiliate Professor, Computer Science&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;center&amp;gt;[[Image:people_facpd.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Emeritus Faculty ==&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~mharper/ Mary Harper]&#039;&#039;&#039;: Affiliate Research Professor, Computer Science, Electrical and Computer Engineering, and UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jklavans/ Judith Klavans]&#039;&#039;&#039;: Visiting Senior Research Scientist, UMIACS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~weinberg/ Amy Weinberg]&#039;&#039;&#039;: Professor, Department of Linguistics and UMIACS&lt;br /&gt;
&lt;br /&gt;
==Researchers and Post-Docs==&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~lijunhui/ Junhui Li]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Jiaul Paik&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Dan Goldwasser&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
==Graduate Students==&lt;br /&gt;
&lt;br /&gt;
* Mossaab Bagdouri, Ph.D. Student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://sites.google.com/site/snigdhac/ Snigdha Chaturvedi]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~vlad/ Vladimir Eidelman]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ieleta.com/en/ Irene Eleta]&#039;&#039;&#039;: Ph.D. student, iSchool&lt;br /&gt;
&lt;br /&gt;
* Peter Enns: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ning Gao&#039;&#039;&#039;: Ph.D. student, iSchool&lt;br /&gt;
&lt;br /&gt;
* Anderson Garron: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.linkedin.com/pub/milad-gholami/55/35a/33a Milad Gholami]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* Jonathan Gluck: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.utah.edu/~amitg/ Amit Goyal]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~rguerra/ Raul David Guerra]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~alvin/ Alvin Grissom II]&#039;&#039;&#039;: Ph.D. student, the iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~huah/ Hua He]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~hhe/ He He]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~ynhu Yuening Hu]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mohit Iyyer&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jiarong/ Jiarong Jiang]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ling.umd.edu/~yakov Yakov Kronrod]&#039;&#039;&#039;: Ph.D. student, Department of Linguistics&lt;br /&gt;
&lt;br /&gt;
* John Morgan: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~vietan/ Viet-An Nguyen]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~daithang/ Thang Nguyen]&#039;&#039;&#039;: Ph.D. student, the iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://ling.umd.edu/~naho/ Naho Orita]&#039;&#039;&#039;: Ph.D. student, Department of Linguistics&lt;br /&gt;
&lt;br /&gt;
* Rachael Richardson, Ph.D. student, Department of Linguistics&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~sayyadi/ Hassan Sayyadi]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jyothi Vinjumur&#039;&#039;&#039;: Ph.D. student, iSchool&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~peratham/ Peratham (Will) Wiriyathammabhum]&#039;&#039;&#039;: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* Ke Wu, Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://terpconnect.umd.edu/~tanx/ Tan Xu]&#039;&#039;&#039;: Ph.D. student, iSchool&lt;br /&gt;
&lt;br /&gt;
* Ke Zhai: Ph.D. student, Department of Computer Science&lt;br /&gt;
&lt;br /&gt;
&amp;lt;center&amp;gt;[[Image:people_students.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CLIP Alumni ==&lt;br /&gt;
&lt;br /&gt;
===Researchers and postdocs===&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jbg/static/home.html Jordan Boyd-Graber]&#039;&#039;&#039; (2010): Assistant Professor, University of Maryland&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.linkedin.com/in/maribromanolsen Mari Broman-Olsen]&#039;&#039;&#039;: Microsoft&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.isi.edu/~chiang/ David Chiang]&#039;&#039;&#039;: Research Assistant Professor, University of Southern California/Information Sciences Institute&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://faculty.qu.edu.qa/telsayed/ Tamer Elsayed]&#039;&#039;&#039;: Qatar University (Qatar)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.isical.ac.in/~utpal/ Utpal Garain]&#039;&#039;&#039;: Associate Professor, Indian Statistical Institute (India)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.sis.pitt.edu/~daqing/ Daqing He]&#039;&#039;&#039;: Associate Professor, University of Pittsburgh&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~hollingk Kristy Hollingshead]&#039;&#039;&#039; (2012): Researcher, Department of Defense&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.pitt.edu/~hwa/ Rebecca Hwa]&#039;&#039;&#039;: Associate Professor, University of Pittsburgh&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Rebecca LaPlante&#039;&#039;&#039;: Independent Consultant on Information Access&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://faculty.washington.edu/levow/ Gina Levow]&#039;&#039;&#039;: Assistant Professor, University of Washington&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Saif Mohammad&#039;&#039;&#039;: Research Officer, National Research Council (Canada)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~tsmoon/ Taesun Moon]&#039;&#039;&#039;: Researcher, IBM TJ Watson Research Center&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://comminfo.rutgers.edu/directory/smuresan/index.html Smaranda Muresan]&#039;&#039;&#039;: Assistant Professor, Rutgers University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~hendra/ Hendra Setiawan]&#039;&#039;&#039;: Researcher, IBM TJ Watson Research Center&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Carolyn Sheffield&#039;&#039;&#039;: Smithsonian Institute&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://dbs.uni-leipzig.de/en/person/andreas_thor Andreas Thor]&#039;&#039;&#039;: University of Leipzig (Germany)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://web.ntnu.edu.tw/~samtseng/ Yuen-Hsien Tseng]&#039;&#039;&#039;: Research Fellow, National Taiwan Normal University (Taiwan)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ldc.usb.ve/~mvidal/ Maria Esther Vidal]&#039;&#039;&#039;: Professor, Universidad Simon Bolivar (Venezuela)&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~ewagner/ Earl J. Wagner]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.williamwebber.com/ William Webber]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://research.microsoft.com/en-us/um/people/ryenw/ Ryen White]&#039;&#039;&#039;: Researcher, Microsoft Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://mlg.eng.cam.ac.uk/sinead/ Sinead Williamson]&#039;&#039;&#039; (2011): Postdoc, Carnegie Mellon University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.sis.pitt.edu/~vladimir/ Vladimir Zadorozhny]&#039;&#039;&#039;: Associate Professor, University of Pittsburgh&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://ufal.mff.cuni.cz/~zeman/en/ Daniel Zeman]&#039;&#039;&#039;: Researcher, Charles University (Czech Republic)&lt;br /&gt;
&lt;br /&gt;
===Ph.D. Students===&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.utah.edu/~arvind/ Arvind Agarwal]&#039;&#039;&#039;: Researcher, Xerox Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~nima/ Nima Asadi]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Necip Fazil Ayan&#039;&#039;&#039;: SRI&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Laura Bright&#039;&#039;&#039; (2003): McAfee&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.ntou.edu.tw/CSWebPage/eng/teachers.php Ya-Hui Chang]&#039;&#039;&#039;: Associate Professor, National Taiwan Ocean University, Keelung, Taiwan&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Waiyian Chong&#039;&#039;&#039;: Yuan&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.qcri.qa/kareem-darwish/ Kareem Darwish]&#039;&#039;&#039;: Qatar Computing Research Institute (Qatar) &lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Dina Demner-Fushman&#039;&#039;&#039;: National Library of Medicine&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.gwu.edu/people/faculty/811 Mona Diab]&#039;&#039;&#039;: Associate Professor, George Washington University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.cmu.edu/~cdyer/ Chris Dyer]&#039;&#039;&#039;: Assistant Professor, Carnegie Mellon University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Denis Filimonov&#039;&#039;&#039;: FactSet&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Rebecca Green&#039;&#039;&#039;: Library of Congress&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stephan Greene&#039;&#039;&#039;: CodeRyte Inc.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~amit/ Amit Goyal]&#039;&#039;&#039;: Yahoo! Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Nizar Habash&#039;&#039;&#039;: Research Scientist, Columbia University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~hardisty/ Eric Hardisty]&#039;&#039;&#039;: NSA&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~zqhuang/ Zhongqiang Huang]&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~changhu/ Chang Hu]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~jags/ Jagadeesh Jagarlamudi]&#039;&#039;&#039;: Researcher, IBM TJ Watson Research Center&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Darsana P. Josyula&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Maria Katsova&#039;&#039;&#039;: Microsoft Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Okan Kolak&#039;&#039;&#039;: Google&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Govind Kothari&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.umiacs.umd.edu/~abhishek/ Abhishek Kumar]&#039;&#039;&#039;: Researcher, IBM TJ Watson Research Center&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Grecia Lapizco&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.ncbi.nlm.nih.gov/CBBresearch/Fellows/AdamLee/ Adam (Woei-Jyh) Lee]&#039;&#039;&#039; (2009): Postdoc Fellow, National Institutes of Health --&amp;gt; Smith School of Business, University of Maryland&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.jhu.edu/~alopez/ Adam Lopez]&#039;&#039;&#039;: Research Scientist, JHU / COE&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jun Luo&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.desilinguist.org Nitin Madnani]&#039;&#039;&#039;: Research Scientist, ETS&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www1.ccls.columbia.edu/~ymarton/ Yuval Marton]&#039;&#039;&#039;: IBM&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Joseph Naft&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Levon Mkrtchyan&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;J. Scott Olsson&#039;&#039;&#039;: Wyoming Catholic College&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Lisa Pearl&#039;&#039;&#039;: Assistant Professor, UC Irvine&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Stacy President Hobson&#039;&#039;&#039;: IBM&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Alejandro Rodriguez&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Asad Sayeed&#039;&#039;&#039;: Postdoc, University of the Saarland, Germany&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Wade Shen&#039;&#039;&#039;: MIT Lincoln Labs&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Matthew Snover&#039;&#039;&#039;: Postdoc, CUNY&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Calandra Tate&#039;&#039;&#039;: West Point&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Scott Thomas&#039;&#039;&#039;: Navy Research Labs&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~fture/ Ferhan Ture]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Clare Voss&#039;&#039;&#039;: Army Research Labs&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Nate Waisbrot&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.acsu.buffalo.edu/~jw254/ Jianqiang Wang]&#039;&#039;&#039;: Assistant Professor, University at Buffalo, State University of New York&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~lidan/ Lidan Wang]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.cs.umd.edu/~wsc/ Shanchan Wu]&#039;&#039;&#039; (2012): HP Labs&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[http://www.csc.lsu.edu/~wuyj/ Yejun Wu]&#039;&#039;&#039;: Assistant Professor, Louisiana State University&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Yao Wu&#039;&#039;&#039; (2009): Microsoft&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;David Zajic&#039;&#039;&#039;: University of Maryland / CASL&lt;br /&gt;
&lt;br /&gt;
===MS Students===&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;David Alexander&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Aitziber Atutxa&#039;&#039;&#039;: University of the Basque Country&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Clara Cabezas&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jacob Devlin&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ed Kenschaft&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Grazia Russo-Lassner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Gregory Sanders&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Aga Skotowski&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Michael Subotin&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===Undergraduate Students===&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Olivia Buzek&#039;&#039;&#039;: PhD student, JHU&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Aaron Elkiss&#039;&#039;&#039;: PhD student, University of Michigan --&amp;gt; University of Michigan Libraries&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Andrew Fister&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ayelet Goldin&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Gregory Marton&#039;&#039;&#039;: PhD student, MIT --&amp;gt; Google Research&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Ederlyn Lacson&#039;&#039;&#039;: Microsoft&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jesse Metcalf-Burton&#039;&#039;&#039;: PhD from University of Michigan&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Eric Nichols&#039;&#039;&#039;: MS student, Nara Institute of Science and Technology, Japan&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Noah Smith&#039;&#039;&#039;: PhD from JHU --&amp;gt; assistant professor --&amp;gt; associate professor, CMU&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Jessica Stevens&#039;&#039;&#039;: BBN&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Michael Wasser&#039;&#039;&#039;&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=726</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=726"/>
		<updated>2013-08-31T12:30:25Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.  &amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=725</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=725"/>
		<updated>2013-08-31T12:30:04Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 9/4/2013 and 9/11/2013: N-Minute Madness ==&lt;br /&gt;
&lt;br /&gt;
The people of CLIP talk about what&#039;s going on in N minutes.  &amp;lt;b&amp;gt;Special location note&amp;lt;/b&amp;gt;: on 9/4/2013, we&#039;ll be in AVW 4172.&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://ptl.sys.virginia.edu/ptl/members/matthew-gerber Matthew Gerber],  University of Virginia&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, September 18, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/2/2013: Title TBA ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/miles/ Miles Osborne],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 2, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 10/9/2013: Semantics and Social Science: Learning to Extract International Relations from Political Context ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://brenocon.com/ Brendan O&#039;Connor],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, October 9, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=723</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=723"/>
		<updated>2013-08-27T21:53:44Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
Matthew Gerber&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=722</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=722"/>
		<updated>2013-08-27T21:52:20Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11 AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu mailing list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 9/18/2013: Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets ==&lt;br /&gt;
&lt;br /&gt;
Matthew Gerber&lt;br /&gt;
&lt;br /&gt;
Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker&#039;&#039;&#039;: Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=CLIP_Colloquium_(Spring_2013)&amp;diff=721</id>
		<title>CLIP Colloquium (Spring 2013)</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=CLIP_Colloquium_(Spring_2013)&amp;diff=721"/>
		<updated>2013-08-27T21:50:24Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: Created page with &amp;quot;__NOTOC__  == 01/30/2013: Human Translation and Machine Translation ==  &amp;#039;&amp;#039;&amp;#039;Speaker:&amp;#039;&amp;#039;&amp;#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt; &amp;#039;&amp;#039;&amp;#039;...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced, as people could agree on some topic such as liking the movie The Matrix, but some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share with you our experiences with modeling both explicit and implicit attitude using high and low dimensional feature modeling. This work is the beginning of an interesting exploration into the realm of building computational models of some aspects of the sociopragmatics of human language with the hopes that this research could lead to a better understanding of human interaction.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park, with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low resource languages with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing; she is also the elected Secretary for the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference, which attempts to bring together all aspects of semantic processing under the same umbrella venue.&lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combatting this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of `riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL TIME AND LOCATION!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors do address a distinctive set of topics from their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, while the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by Governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.      &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Corpora and Statistical Analysis of Non-Linguistic Symbol Systems ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Richard Sproat, Google New York&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 27, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We report on the creation and analysis of a set of corpora of non-linguistic symbol systems. &lt;br /&gt;
The resource, the first of its kind, consists of data from seven systems, both ancient and modern, &lt;br /&gt;
with four further systems under development, and several others planned. The systems represent &lt;br /&gt;
a range of types, including heraldic systems, formal systems, and systems that are mostly or purely &lt;br /&gt;
decorative. We also compare these systems statistically with a large set of linguistic systems, which &lt;br /&gt;
also range over both time and type.&lt;br /&gt;
&lt;br /&gt;
We show that none of the measures proposed in published work by Rao and colleagues (Rao et al., 2009a; Rao, 2010) &lt;br /&gt;
or Lee and colleagues (Lee et al., 2010a) works. In particular, Rao’s entropic measures are evidently useless when &lt;br /&gt;
one considers a wider range of examples of real non-linguistic symbol systems. And Lee’s measures, with the cutoff &lt;br /&gt;
values they propose, misclassify nearly all of our non-linguistic systems. However, we also show that one of Lee’s &lt;br /&gt;
measures, with different cutoff values, as well as another measure we develop here, do seem useful. We further &lt;br /&gt;
demonstrate that they are useful largely because they are both highly correlated with a rather trivial feature: &lt;br /&gt;
mean text length. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Sproat received his Ph.D. in Linguistics from the Massachusetts&lt;br /&gt;
Institute of Technology in 1985. He has worked at AT&amp;amp;T Bell Labs, at&lt;br /&gt;
Lucent&#039;s Bell Labs and at AT&amp;amp;T Labs -- Research, before joining the faculty of&lt;br /&gt;
the University of Illinois. From there he moved to the Center for Spoken&lt;br /&gt;
Language Understanding at the Oregon Health &amp;amp; Science University. In the Fall of&lt;br /&gt;
2012 he moved to Google, New York as a Research Scientist.&lt;br /&gt;
&lt;br /&gt;
Sproat has worked in numerous areas relating to language and computational&lt;br /&gt;
linguistics, including syntax, morphology, computational morphology,&lt;br /&gt;
articulatory and acoustic phonetics, text processing, text-to-speech synthesis,&lt;br /&gt;
and text-to-scene conversion. Some of his recent work includes multilingual&lt;br /&gt;
named entity transliteration, the effects of script layout on readers&#039;&lt;br /&gt;
phonological awareness, and tools for automated assessment of child language. At&lt;br /&gt;
Google he works on multilingual text normalization and finite-state methods for&lt;br /&gt;
language processing. He also has a long-standing interest in writing systems and&lt;br /&gt;
symbol systems more generally.&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We (implicitly) corrupt existing data as a means to generate additional, infinitely many, training samples from a slightly different data distribution -- this is computationally tractable, because the corruption can be marginalized out in closed form. Our framework leads to machine learning algorithms that are fast, generalize well and naturally scale to very large data sets. We showcase this technology as regularization for general risk minimization and for marginalized deep learning for document representations. We provide experimental results on part of speech tagging as well as document and image classification. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high dimensional data analysis, metric learning, machine learned web-search ranking, transfer- and multi-task learning as well as biomedical applications.&lt;br /&gt;
&lt;br /&gt;
== 04/17/2013: Recursive Deep Learning in Natural Language Processing and Computer Vision ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.socher.org/ Richard Socher],  Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 17, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hierarchical and recursive structure is commonly found in different&lt;br /&gt;
modalities, including natural language sentences and scene images. I&lt;br /&gt;
will introduce several recursive deep learning models that, unlike &lt;br /&gt;
standard deep learning methods, can learn compositional meaning vector &lt;br /&gt;
representations for phrases or images.&lt;br /&gt;
&lt;br /&gt;
These recursive neural network based models obtain state-of-the-art&lt;br /&gt;
performance on a variety of syntactic and semantic language tasks&lt;br /&gt;
such as parsing, sentiment analysis, paraphrase detection and relation&lt;br /&gt;
classification for extracting knowledge from the web. Because often no&lt;br /&gt;
language-specific assumptions are made, the same architectures can be&lt;br /&gt;
used for visual scene understanding and object classification from 3d &lt;br /&gt;
images.&lt;br /&gt;
&lt;br /&gt;
Besides the good performance, the models capture interesting phenomena&lt;br /&gt;
in language such as compositionality. For instance the models learn&lt;br /&gt;
that “not good” has worse sentiment than “good” or that high level&lt;br /&gt;
negation can change the meaning of longer phrases with many positive &lt;br /&gt;
words. Furthermore, unlike most machine learning approaches that rely on &lt;br /&gt;
human designed feature sets, features are learned as part of the model.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Socher is a PhD student at Stanford working with Chris Manning&lt;br /&gt;
and Andrew Ng. His research interests are machine learning for NLP and&lt;br /&gt;
vision. He is interested in developing new models that learn useful &lt;br /&gt;
features, capture compositional and hierarchical structure in multiple&lt;br /&gt;
modalities and perform well across multiple tasks. He was awarded the &lt;br /&gt;
2011 Yahoo! Key Scientific Challenges Award, the Distinguished &lt;br /&gt;
Application Paper Award at ICML 2011 and a Microsoft Research PhD &lt;br /&gt;
Fellowship in 2012.&lt;br /&gt;
&lt;br /&gt;
== 05/01/2013: Probabilistic Soft Logic, Stephen Bach ==&lt;br /&gt;
&lt;br /&gt;
In this talk, we will give an overview of probabilistic soft logic (PSL), a tool being developed in the LINQS group at UMD for modeling, learning, and inference in structured and multi-relational domains. We&#039;ll describe the basic syntax and semantics of the language and then describe the underlying mathematical framework upon which efficient inference and learning are built. We refer to the underlying mathematical model as a hinge-loss Markov random field (HL-MRF). HL-MRFs have a number of nice properties, including the fact that most probable explanation (MPE) inference corresponds to a convex optimization problem. We present recent results showing that, using state-of-the-art optimization techniques, we can perform inference on problems with tens of thousands of random variables in seconds, and problems with hundreds of thousands of random variables in minutes. We are currently working on several approaches for distributed inference in PSL, which promise even greater scalability. We will conclude by discussing applications of PSL to problems such as group identification in social media, activity recognition in videos, image reconstruction, knowledge graph identification, schema mapping, drug target prediction, and others as time permits.&lt;br /&gt;
&lt;br /&gt;
== 05/08/2013: The Foreseer: Integrative Retrieval and Mining of Information in Online Communities ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www-personal.umich.edu/~qmei/ Qiaozhu Mei], University of Michigan&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, May 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the growth of online communities, the Web has evolved from networks of shared documents into networks of knowledge-sharing groups and individuals. A vast amount of heterogeneous yet interrelated information is being generated, making existing information analysis techniques inadequate. Current data mining tools often neglect the actual context, creators, and consumers of information. Foreseer is a user-centric framework for the next generation of information retrieval and mining for online communities. It represents a new paradigm of information analysis through the integration of the four “C’s”: content, context, crowd, and cloud. &lt;br /&gt;
&lt;br /&gt;
In this talk, we will introduce our recent effort of integrative analysis and mining of information in online communities. We will highlight the real world problems in online communities to which the Foreseer techniques have been successfully applied. These topics include the identification of information needs from social media, the prediction of the adoption of hashtags in microblogging communities, and the prediction of social lending behaviors in microfinance communities.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Qiaozhu Mei is an assistant professor at the School of Information, the University of Michigan. He is widely interested in information retrieval, text mining, natural language processing and their applications in web search, social computing, and health informatics. He has served on the program committees of almost all major conferences in these areas. He is also a recipient of the NSF CAREER Award, two runner-up best student paper awards at KDD, and a SIGKDD dissertation award.&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=720</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=720"/>
		<updated>2013-08-27T21:48:44Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: /* Previous Talks */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
To join the cl-colloquium@umiacs.umd.edu list, or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis. For example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
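The transparency described above can be illustrated with a small sketch in the spirit of topic-model-based recommendation (the topic names, matrices, and preference values below are invented for illustration, not taken from the talk):

```python
import numpy as np

topics = ["neural nets", "genomics", "databases"]
# Topic proportions (theta) for three documents; each row sums to 1.
theta = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.2, 0.6]])
user = np.array([0.5, 0.1, 0.4])   # interpretable per-topic preferences

scores = theta @ user              # predicted interest in each document
ranking = np.argsort(-scores)
print(ranking)                     # documents, best match first

# Transparency in action: the user raises her "genomics" preference and
# the recommendations respond directly -- no black box in between.
user[1] = 0.9
print(np.argsort(-(theta @ user)))
```

Because both the document and user representations live in the same interpretable topic space, adjusting one preference weight visibly reorders the recommendations.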
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced, as people could agree on some topic such as liking the movie The Matrix, but some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share with you our experiences with modeling both explicit and implicit attitude using high- and low-dimensional feature modeling. This work is the beginning of an interesting exploration into the realm of building computational models of some aspects of the sociopragmatics of human language, with the hope that this research could lead to a better understanding of human interaction. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park, with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low resource languages with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing; she is also the elected Secretary for the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference that attempts to bring together all aspects of semantic processing under the same umbrella venue. &lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of &#039;riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
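The riffled-independence factorization can be made concrete with a four-item toy example (the factor values below are invented for illustration): rather than storing all 24 ranking probabilities, the distribution is a product of two within-subset ranking factors and an interleaving distribution.

```python
import itertools

# Riffle-independent distribution over rankings of {a, b} and {x, y}.
p_A = {("a", "b"): 0.7, ("b", "a"): 0.3}      # relative order within A
p_B = {("x", "y"): 0.6, ("y", "x"): 0.4}      # relative order within B
# Interleaving: which of the 4 positions the items of A occupy.
interleavings = list(itertools.combinations(range(4), 2))   # 6 choices
m = {pos: 1.0 / 6 for pos in interleavings}   # uniform interleaving

def riffle_prob(ranking):
    """P(ranking) = p_A(order of A) * p_B(order of B) * m(positions of A)."""
    order_A = tuple(i for i in ranking if i in ("a", "b"))
    order_B = tuple(i for i in ranking if i in ("x", "y"))
    pos_A = tuple(k for k, i in enumerate(ranking) if i in ("a", "b"))
    return p_A[order_A] * p_B[order_B] * m[pos_A]

total = sum(riffle_prob(r)
            for r in itertools.permutations(("a", "b", "x", "y")))
print(round(total, 6))   # 1.0: the factored form is a normalized distribution
```

Here 2 + 2 + 6 = 10 parameters replace 24; for n items split into subsets the savings against n! become dramatic, which is what makes inference and learning tractable.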
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL TIME AND LOCATION!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors address a set of topics distinct from that of their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, while the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by Governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.      &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Corpora and Statistical Analysis of Non-Linguistic Symbol Systems ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Richard Sproat, Google New York&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 27, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We report on the creation and analysis of a set of corpora of non-linguistic symbol systems. &lt;br /&gt;
The resource, the first of its kind, consists of data from seven systems, both ancient and modern, &lt;br /&gt;
with four further systems under development, and several others planned. The systems represent &lt;br /&gt;
a range of types, including heraldic systems, formal systems, and systems that are mostly or purely &lt;br /&gt;
decorative. We also compare these systems statistically with a large set of linguistic systems, which &lt;br /&gt;
also range over both time and type.&lt;br /&gt;
&lt;br /&gt;
We show that none of the measures proposed in published work by Rao and colleagues (Rao et al., 2009a; Rao, 2010) &lt;br /&gt;
or Lee and colleagues (Lee et al., 2010a) works. In particular, Rao’s entropic measures are evidently useless when &lt;br /&gt;
one considers a wider range of examples of real non-linguistic symbol systems. And Lee’s measures, with the cutoff &lt;br /&gt;
values they propose, misclassify nearly all of our non-linguistic systems. However, we also show that one of Lee’s &lt;br /&gt;
measures, with different cutoff values, as well as another measure we develop here, do seem useful. We further &lt;br /&gt;
demonstrate that they are useful largely because they are both highly correlated with a rather trivial feature: &lt;br /&gt;
mean text length. &lt;br /&gt;
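The entropic measure at issue is, in essence, the conditional entropy of adjacent sign pairs. A minimal version (a generic sketch, not the authors' code) looks like this:

```python
import math
from collections import Counter

def bigram_conditional_entropy(seq):
    """Estimate H(X2 | X1) in bits from adjacent-pair counts in a sequence."""
    pairs = Counter(zip(seq, seq[1:]))    # counts of adjacent sign pairs
    firsts = Counter(seq[:-1])            # counts of pair-initial signs
    n = sum(pairs.values())
    h = 0.0
    for (a, b), c in pairs.items():
        p_ab = c / n                      # joint P(a, b)
        p_b_given_a = c / firsts[a]       # conditional P(b given a)
        h -= p_ab * math.log2(p_b_given_a)
    return h

# A rigidly ordered sign sequence has zero conditional entropy:
print(bigram_conditional_entropy(list("abcabcabcabc")))   # 0.0
```

Sproat's point is that scores like this, computed on short texts from real non-linguistic systems, fail to separate linguistic from non-linguistic writing, in part because they correlate strongly with mean text length.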
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Sproat received his Ph.D. in Linguistics from the Massachusetts&lt;br /&gt;
Institute of Technology in 1985. He has worked at AT&amp;amp;T Bell Labs, at&lt;br /&gt;
Lucent&#039;s Bell Labs and at AT&amp;amp;T Labs -- Research, before joining the faculty of&lt;br /&gt;
the University of Illinois. From there he moved to the Center for Spoken&lt;br /&gt;
Language Understanding at the Oregon Health &amp;amp; Science University. In the Fall of&lt;br /&gt;
2012 he moved to Google, New York as a Research Scientist.&lt;br /&gt;
&lt;br /&gt;
Sproat has worked in numerous areas relating to language and computational&lt;br /&gt;
linguistics, including syntax, morphology, computational morphology,&lt;br /&gt;
articulatory and acoustic phonetics, text processing, text-to-speech synthesis,&lt;br /&gt;
and text-to-scene conversion. Some of his recent work includes multilingual&lt;br /&gt;
named entity transliteration, the effects of script layout on readers&#039;&lt;br /&gt;
phonological awareness, and tools for automated assessment of child language. At&lt;br /&gt;
Google he works on multilingual text normalization and finite-state methods for&lt;br /&gt;
language processing. He also has a long-standing interest in writing systems and&lt;br /&gt;
symbol systems more generally.&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We (implicitly) corrupt existing data as a means to generate additional, infinitely many, training samples from a slightly different data distribution -- this is computationally tractable, because the corruption can be marginalized out in closed form. Our framework leads to machine learning algorithms that are fast, generalize well and naturally scale to very large data sets. We showcase this technology as regularization for general risk minimization and for marginalized deep learning for document representations. We provide experimental results on part-of-speech tagging as well as document and image classification. &lt;br /&gt;
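For squared loss with blankout (dropout-style) corruption, the marginalization the abstract alludes to has a simple closed form; the sketch below (toy data, invented for illustration) shows it reduces to a data-dependent ridge-style problem rather than requiring any sampled corrupted copies:

```python
import numpy as np

def mcf_least_squares(X, y, q=0.3):
    """Least-squares weights under marginalized blankout corruption.

    Blankout zeroes each feature independently with probability q. Instead
    of sampling corrupted copies, minimize the expected squared loss, using
    E[x~] = (1-q) x and Var[x~_d] = q (1-q) x_d^2:
        E[(w.x~ - y)^2] = (w.E[x~] - y)^2 + sum_d w_d^2 Var[x~_d]
    The variance term acts as a data-dependent ridge penalty.
    """
    Xbar = (1.0 - q) * X                         # expected corrupted design
    v = (q * (1.0 - q) * X ** 2).sum(axis=0)     # summed per-feature variances
    return np.linalg.solve(Xbar.T @ Xbar + np.diag(v), Xbar.T @ y)

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, 2.0])
print(mcf_least_squares(X, y, q=0.0))   # q=0 recovers ordinary least squares
```

Because the expectation is exact, training cost is independent of how many corrupted samples the model "sees" -- the infinitely many implicit samples come for free.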
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high dimensional data analysis, metric learning, machine-learned web-search ranking, transfer- and multi-task learning as well as biomedical applications.&lt;br /&gt;
&lt;br /&gt;
== 04/17/2013: Recursive Deep Learning in Natural Language Processing and Computer Vision ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.socher.org/ Richard Socher],  Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 17, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hierarchical and recursive structure is commonly found in different&lt;br /&gt;
modalities, including natural language sentences and scene images. I&lt;br /&gt;
will introduce several recursive deep learning models that, unlike &lt;br /&gt;
standard deep learning methods, can learn compositional meaning vector &lt;br /&gt;
representations for phrases or images.&lt;br /&gt;
&lt;br /&gt;
These recursive neural network based models obtain state-of-the-art&lt;br /&gt;
performance on a variety of syntactic and semantic language tasks&lt;br /&gt;
such as parsing, sentiment analysis, paraphrase detection and relation&lt;br /&gt;
classification for extracting knowledge from the web. Because often no&lt;br /&gt;
language-specific assumptions are made, the same architectures can be&lt;br /&gt;
used for visual scene understanding and object classification from 3D &lt;br /&gt;
images.&lt;br /&gt;
&lt;br /&gt;
Besides the good performance, the models capture interesting phenomena&lt;br /&gt;
in language such as compositionality. For instance, the models learn&lt;br /&gt;
that “not good” has worse sentiment than “good” or that high-level&lt;br /&gt;
negation can change the meaning of longer phrases with many positive &lt;br /&gt;
words. Furthermore, unlike most machine learning approaches that rely on &lt;br /&gt;
human-designed feature sets, features are learned as part of the model.&lt;br /&gt;
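The core compositional step behind these models can be sketched in a few lines (random weights for illustration; a trained model would learn W and the word vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                          # meaning-vector dimension
W = rng.normal(scale=0.1, size=(d, 2 * d))     # shared composition matrix
vocab = {w: rng.normal(size=d) for w in ("not", "good")}

def compose(left, right):
    """Parent vector for a binary parse-tree node: tanh(W [left; right])."""
    return np.tanh(W @ np.concatenate([left, right]))

# "not good" gets a vector in the same d-dimensional space as single words,
# so one classifier (e.g., for sentiment) applies to phrases of any length;
# applied recursively up a parse tree, this covers whole sentences.
phrase = compose(vocab["not"], vocab["good"])
print(phrase.shape)
```

Because the same matrix W is applied at every node, gradients from a task such as sentiment classification shape both the composition function and the word vectors themselves.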
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Socher is a PhD student at Stanford working with Chris Manning&lt;br /&gt;
and Andrew Ng. His research interests are machine learning for NLP and&lt;br /&gt;
vision. He is interested in developing new models that learn useful &lt;br /&gt;
features, capture compositional and hierarchical structure in multiple&lt;br /&gt;
modalities and perform well across multiple tasks. He was awarded the &lt;br /&gt;
2011 Yahoo! Key Scientific Challenges Award, the Distinguished &lt;br /&gt;
Application Paper Award at ICML 2011 and a Microsoft Research PhD &lt;br /&gt;
Fellowship in 2012.&lt;br /&gt;
&lt;br /&gt;
== 05/01/2013: Probabilistic Soft Logic, Stephen Bach ==&lt;br /&gt;
&lt;br /&gt;
In this talk, we will give an overview of probabilistic soft logic (PSL), a tool being developed in the LINQS group at UMD for modeling, learning, and inference in structured and multi-relational domains. We&#039;ll describe the basic syntax and semantics for the language and then describe the underlying mathematical framework upon which efficient inference and learning are built. We refer to the underlying mathematical model as a hinge-loss Markov random field (HL-MRF). HL-MRFs have a number of nice properties, including the fact that most probable explanation (MPE) inference corresponds to a convex optimization problem. We present recent results showing that, using state-of-the-art optimization techniques, we can perform inference on problems with tens of thousands of random variables in seconds, and problems with hundreds of thousands of random variables in minutes. We are currently working on several approaches for distributed inference in PSL, which promise even greater scalability. We will conclude by discussing applications of PSL to problems such as group identification in social media, activity recognition in videos, image reconstruction, knowledge graph identification, schema mapping, drug target prediction, and others as time permits.&lt;br /&gt;
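A toy illustration of why MPE inference is convex here (a sketch of the optimization problem only, not of the PSL software; the rule and weights are invented):

```python
# One unobserved truth value x in [0, 1] for votes(bob):
#   rule "votes(alice) -> votes(bob)", weight 2, alice observed true:
#       potential 2 * max(0, 1.0 - x)^2   (squared distance to satisfaction
#                                          under Lukasiewicz implication)
#   prior "not votes(bob)", weight 1:
#       potential 1 * max(0, x)^2
# Each potential is a convex hinge, so projected gradient descent on their
# weighted sum finds the global MPE state.

def mpe(steps=5000, lr=0.01):
    x = 0.5
    for _ in range(steps):
        grad = 2 * 2 * max(0.0, 1.0 - x) * (-1.0) + 1 * 2 * max(0.0, x)
        x = min(1.0, max(0.0, x - lr * grad))    # project back onto [0, 1]
    return x

print(round(mpe(), 3))   # 0.667: bob probably votes, discounted by the prior
```

Real HL-MRFs have thousands of such hinge potentials over many variables, but the objective stays convex, which is what makes the second-scale inference results possible.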
&lt;br /&gt;
== 05/08/2013: The Foreseer: Integrative Retrieval and Mining of Information in Online Communities ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www-personal.umich.edu/~qmei/ Qiaozhu Mei], University of Michigan&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, May 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the growth of online communities, the Web has evolved from networks of shared documents into networks of knowledge-sharing groups and individuals. A vast amount of heterogeneous yet interrelated information is being generated, making existing information analysis techniques inadequate. Current data mining tools often neglect the actual context, creators, and consumers of information. Foreseer is a user-centric framework for the next generation of information retrieval and mining for online communities. It represents a new paradigm of information analysis through the integration of the four “C’s”: content, context, crowd, and cloud. &lt;br /&gt;
&lt;br /&gt;
In this talk, we will introduce our recent effort of integrative analysis and mining of information in online communities. We will highlight the real world problems in online communities to which the Foreseer techniques have been successfully applied. These topics include the identification of information needs from social media, the prediction of the adoption of hashtags in microblogging communities, and the prediction of social lending behaviors in microfinance communities.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Qiaozhu Mei is an assistant professor at the School of Information, the University of Michigan. He is widely interested in information retrieval, text mining, natural language processing and their applications in web search, social computing, and health informatics. He has served on the program committees of almost all major conferences in these areas. He is also a recipient of the NSF CAREER Award, two runner-up best student paper awards at KDD, and a SIGKDD dissertation award.&lt;br /&gt;
&lt;br /&gt;
== To be rescheduled: Matthew Gerber ==&lt;br /&gt;
&lt;br /&gt;
Title:  Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets&lt;br /&gt;
&lt;br /&gt;
Abstract:  Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
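One minimal way to combine the two ingredients described in the abstract -- per-cell crime history and tweet topic proportions -- is a logistic model over spatial grid cells. The sketch below uses synthetic data (all feature names and numbers are invented; the actual model is considerably richer):

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells, n_topics = 200, 5
# Per-grid-cell features: historical crime density plus the mean topic
# proportions (e.g., from LDA) of GPS-tagged tweets posted in the cell.
density = rng.random(n_cells)
tweet_topics = rng.dirichlet(np.ones(n_topics), n_cells)
X = np.column_stack([density, tweet_topics])

# Synthetic ground truth: density and topic 0 raise the crime probability.
logit = 3.0 * density + 2.0 * tweet_topics[:, 0] - 2.0
ylab = (1.0 / (1.0 + np.exp(-logit)) > rng.random(n_cells)).astype(float)

# Fit logistic regression by gradient descent; the fitted per-cell
# probabilities then rank cells by predicted next-period crime risk.
w = np.zeros(X.shape[1])
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * (X.T @ (p - ylab)) / n_cells
print(w[0])   # historical density should receive a positive learned weight
```

The topic features are what let tweet content, not just tweet location and timing, inform the prediction.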
&lt;br /&gt;
Bio:  Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Spring 2013)|Spring 2013]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=695</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=695"/>
		<updated>2013-04-29T16:22:16Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high-quality, publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced, as people could agree on some topic, such as liking the movie The Matrix, but some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share with you our experiences with modeling both explicit and implicit attitudes using high- and low-dimensional feature modeling. This work is the beginning of an interesting exploration into the realm of building computational models of some aspects of the sociopragmatics of human language, with the hope that this research could lead to a better understanding of human interaction.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low-resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low-resource languages with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low-resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing, and she is also the elected Secretary for the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference, which attempts to bring together all aspects of semantic processing under the same umbrella venue.&lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of `riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL TIME AND LOCATION!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors do address a set of topics distinct from those of their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, and the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by Governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Corpora and Statistical Analysis of Non-Linguistic Symbol Systems ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Richard Sproat, Google New York&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 27, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We report on the creation and analysis of a set of corpora of non-linguistic symbol systems. &lt;br /&gt;
The resource, the first of its kind, consists of data from seven systems, both ancient and modern, &lt;br /&gt;
with four further systems under development, and several others planned. The systems represent &lt;br /&gt;
a range of types, including heraldic systems, formal systems, and systems that are mostly or purely &lt;br /&gt;
decorative. We also compare these systems statistically with a large set of linguistic systems, which &lt;br /&gt;
also range over both time and type.&lt;br /&gt;
&lt;br /&gt;
We show that none of the measures proposed in published work by Rao and colleagues (Rao et al., 2009a; Rao, 2010) &lt;br /&gt;
or Lee and colleagues (Lee et al., 2010a) works. In particular, Rao’s entropic measures are evidently useless when &lt;br /&gt;
one considers a wider range of examples of real non-linguistic symbol systems. And Lee’s measures, with the cutoff &lt;br /&gt;
values they propose, misclassify nearly all of our non-linguistic systems. However, we also show that one of Lee’s &lt;br /&gt;
measures, with different cutoff values, as well as another measure we develop here, do seem useful. We further &lt;br /&gt;
demonstrate that they are useful largely because they are both highly correlated with a rather trivial feature: &lt;br /&gt;
mean text length. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Sproat received his Ph.D. in Linguistics from the Massachusetts&lt;br /&gt;
Institute of Technology in 1985. He has worked at AT&amp;amp;T Bell Labs, at&lt;br /&gt;
Lucent&#039;s Bell Labs and at AT&amp;amp;T Labs -- Research, before joining the faculty of&lt;br /&gt;
the University of Illinois. From there he moved to the Center for Spoken&lt;br /&gt;
Language Understanding at the Oregon Health &amp;amp; Science University. In the Fall of&lt;br /&gt;
2012 he moved to Google, New York as a Research Scientist.&lt;br /&gt;
&lt;br /&gt;
Sproat has worked in numerous areas relating to language and computational&lt;br /&gt;
linguistics, including syntax, morphology, computational morphology,&lt;br /&gt;
articulatory and acoustic phonetics, text processing, text-to-speech synthesis,&lt;br /&gt;
and text-to-scene conversion. Some of his recent work includes multilingual&lt;br /&gt;
named entity transliteration, the effects of script layout on readers&#039;&lt;br /&gt;
phonological awareness, and tools for automated assessment of child language. At&lt;br /&gt;
Google he works on multilingual text normalization and finite-state methods for&lt;br /&gt;
language processing. He also has a long-standing interest in writing systems and&lt;br /&gt;
symbol systems more generally.&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We (implicitly) corrupt existing data as a means to generate additional, infinitely many, training samples from a slightly different data distribution -- this is computationally tractable, because the corruption can be marginalized out in closed form. Our framework leads to machine learning algorithms that are fast, generalize well and naturally scale to very large data sets. We showcase this technology as regularization for general risk minimization and for marginalized deep learning for document representations. We provide experimental results on part of speech tagging as well as document and image classification. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer- and multi-task learning as well as biomedical applications.&lt;br /&gt;
&lt;br /&gt;
== 04/17/2013: Recursive Deep Learning in Natural Language Processing and Computer Vision ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.socher.org/ Richard Socher],  Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 17, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hierarchical and recursive structure is commonly found in different&lt;br /&gt;
modalities, including natural language sentences and scene images. I&lt;br /&gt;
will introduce several recursive deep learning models that, unlike &lt;br /&gt;
standard deep learning methods, can learn compositional meaning vector &lt;br /&gt;
representations for phrases or images.&lt;br /&gt;
&lt;br /&gt;
These recursive neural network based models obtain state-of-the-art&lt;br /&gt;
performance on a variety of syntactic and semantic language tasks&lt;br /&gt;
such as parsing, sentiment analysis, paraphrase detection and relation&lt;br /&gt;
classification for extracting knowledge from the web. Because often no&lt;br /&gt;
language-specific assumptions are made, the same architectures can be&lt;br /&gt;
used for visual scene understanding and object classification from 3D &lt;br /&gt;
images.&lt;br /&gt;
&lt;br /&gt;
Besides the good performance, the models capture interesting phenomena&lt;br /&gt;
in language such as compositionality. For instance the models learn&lt;br /&gt;
that “not good” has worse sentiment than “good” or that high level&lt;br /&gt;
negation can change the meaning of longer phrases with many positive &lt;br /&gt;
words. Furthermore, unlike most machine learning approaches that rely on &lt;br /&gt;
human designed feature sets, features are learned as part of the model.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Socher is a PhD student at Stanford working with Chris Manning&lt;br /&gt;
and Andrew Ng. His research interests are machine learning for NLP and&lt;br /&gt;
vision. He is interested in developing new models that learn useful &lt;br /&gt;
features, capture compositional and hierarchical structure in multiple&lt;br /&gt;
modalities and perform well across multiple tasks. He was awarded the &lt;br /&gt;
2011 Yahoo! Key Scientific Challenges Award, the Distinguished &lt;br /&gt;
Application Paper Award at ICML 2011 and a Microsoft Research PhD &lt;br /&gt;
Fellowship in 2012.&lt;br /&gt;
&lt;br /&gt;
== 05/01/2013: Stephen Bach ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 05/08/2013: The Foreseer: Integrative Retrieval and Mining of Information in Online Communities ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www-personal.umich.edu/~qmei/ Qiaozhu Mei], University of Michigan&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, May 8, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the growth of online communities, the Web has evolved from networks of shared documents into networks of knowledge-sharing groups and individuals. A vast amount of heterogeneous yet interrelated information is being generated, making existing information analysis techniques inadequate. Current data mining tools often neglect the actual context, creators, and consumers of information. Foreseer is a user-centric framework for the next generation of information retrieval and mining for online communities. It represents a new paradigm of information analysis through the integration of the four “C’s”: content, context, crowd, and cloud. &lt;br /&gt;
&lt;br /&gt;
In this talk, we will introduce our recent effort of integrative analysis and mining of information in online communities. We will highlight the real world problems in online communities to which the Foreseer techniques have been successfully applied. These topics include the identification of information needs from social media, the prediction of the adoption of hashtags in microblogging communities, and the prediction of social lending behaviors in microfinance communities.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Qiaozhu Mei is an assistant professor at the School of Information, the University of Michigan. He is widely interested in information retrieval, text mining, natural language processing and their applications in web search, social computing, and health informatics. He has served in the program committee of almost all major conferences in these areas. He is also a recipient of the NSF CAREER Award, two runner-up best student paper awards at KDD, and a SIGKDD dissertation award.&lt;br /&gt;
&lt;br /&gt;
== To be rescheduled: Matthew Gerber ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Title:&#039;&#039;&#039; Spatio-Temporal Crime Prediction using GPS- and Time-Tagged Tweets&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Abstract:&#039;&#039;&#039; Recent research has shown that social media messages (e.g., tweets) can be used to predict various large-scale events like elections (Bermingham and Smeaton, 2011), infectious disease outbreaks (St. Louis and Zorlu, 2012), and even national revolutions (Howard et al., 2011). The essential hypothesis is that the timing, location, and content of these messages are informative with regard to such future events. For many years, the Predictive Technology Laboratory at the University of Virginia has been constructing statistical prediction models of criminal incidents (e.g., robberies and assaults), and we have recently found preliminary evidence of Twitter’s predictive power in this domain (Wang, Brown, and Gerber, 2012). In my talk, I will present an overview of our crime prediction research with a specific focus on current Twitter-based approaches. I will discuss (1) how precise locations and times of tweets have been integrated into the crime prediction model, and (2) how the textual content of tweets has been integrated into the model via latent Dirichlet allocation. I will present current results of our research in this area and discuss future areas of investigation.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Bio:&#039;&#039;&#039; Matthew Gerber joined the University of Virginia faculty in 2011 and is currently a Research Assistant Professor in the Department of Systems and Information Engineering. Prior to joining the University of Virginia, Matthew was a Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University and a Visiting Instructor in the School of Computing and Information Systems at Grand Valley State University. In 2010, he received (jointly with Joyce Chai) the ACL Best Long Paper Award for his work on recovering null-instantiated arguments for semantic role labeling. His current research focuses on the semantic analysis of natural language text and its application to various prediction and informatics problems.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=687</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=687"/>
		<updated>2013-04-02T17:28:37Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as a postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced, as people could agree on some topic, such as liking the movie The Matrix, but some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share with you our experiences with modeling both explicit and implicit attitudes using high- and low-dimensional feature modeling. This work is the beginning of an interesting exploration into the realm of building computational models of some aspects of the sociopragmatics of human language, with the hope that this research could lead to a better understanding of human interaction.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park, with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low-resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction, and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low-resource languages, with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low-resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing and is also the elected Secretary of the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference, which attempts to bring together all aspects of semantic processing under the same umbrella venue.&lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of &#039;riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL TIME AND LOCATION!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors do address a distinctive set of topics from their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, while the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by Governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.      &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Corpora and Statistical Analysis of Non-Linguistic Symbol Systems ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Richard Sproat, Google New York&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 27, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We report on the creation and analysis of a set of corpora of non-linguistic symbol systems. &lt;br /&gt;
The resource, the first of its kind, consists of data from seven systems, both ancient and modern, &lt;br /&gt;
with four further systems under development, and several others planned. The systems represent &lt;br /&gt;
a range of types, including heraldic systems, formal systems, and systems that are mostly or purely &lt;br /&gt;
decorative. We also compare these systems statistically with a large set of linguistic systems, which &lt;br /&gt;
also range over both time and type.&lt;br /&gt;
&lt;br /&gt;
We show that none of the measures proposed in published work by Rao and colleagues (Rao et al., 2009a; Rao, 2010) &lt;br /&gt;
or Lee and colleagues (Lee et al., 2010a) works. In particular, Rao’s entropic measures are evidently useless when &lt;br /&gt;
one considers a wider range of examples of real non-linguistic symbol systems. And Lee’s measures, with the cutoff &lt;br /&gt;
values they propose, misclassify nearly all of our non-linguistic systems. However, we also show that one of Lee’s &lt;br /&gt;
measures, with different cutoff values, as well as another measure we develop here, do seem useful. We further &lt;br /&gt;
demonstrate that they are useful largely because they are both highly correlated with a rather trivial feature: &lt;br /&gt;
mean text length. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Sproat received his Ph.D. in Linguistics from the Massachusetts&lt;br /&gt;
Institute of Technology in 1985. He has worked at AT&amp;amp;T Bell Labs, at&lt;br /&gt;
Lucent&#039;s Bell Labs and at AT&amp;amp;T Labs -- Research, before joining the faculty of&lt;br /&gt;
the University of Illinois. From there he moved to the Center for Spoken&lt;br /&gt;
Language Understanding at the Oregon Health &amp;amp; Science University. In the Fall of&lt;br /&gt;
2012 he moved to Google, New York as a Research Scientist.&lt;br /&gt;
&lt;br /&gt;
Sproat has worked in numerous areas relating to language and computational&lt;br /&gt;
linguistics, including syntax, morphology, computational morphology,&lt;br /&gt;
articulatory and acoustic phonetics, text processing, text-to-speech synthesis,&lt;br /&gt;
and text-to-scene conversion. Some of his recent work includes multilingual&lt;br /&gt;
named entity transliteration, the effects of script layout on readers&#039;&lt;br /&gt;
phonological awareness, and tools for automated assessment of child language. At&lt;br /&gt;
Google he works on multilingual text normalization and finite-state methods for&lt;br /&gt;
language processing. He also has a long-standing interest in writing systems and&lt;br /&gt;
symbol systems more generally.&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We (implicitly) corrupt existing data as a means to generate additional, infinitely many, training samples from a slightly different data distribution -- this is computationally tractable, because the corruption can be marginalized out in closed form. Our framework leads to machine learning algorithms that are fast, generalize well and naturally scale to very large data sets. We showcase this technology as regularization for general risk minimization and for marginalized deep learning for document representations. We provide experimental results on part-of-speech tagging as well as document and image classification. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer- and multi-task learning, as well as biomedical applications.&lt;br /&gt;
&lt;br /&gt;
== 04/17/2013: Recursive Deep Learning in Natural Language Processing and Computer Vision ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.socher.org/ Richard Socher],  Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 17, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hierarchical and recursive structure is commonly found in different&lt;br /&gt;
modalities, including natural language sentences and scene images. I&lt;br /&gt;
will introduce several recursive deep learning models that, unlike &lt;br /&gt;
standard deep learning methods, can learn compositional meaning vector &lt;br /&gt;
representations for phrases or images.&lt;br /&gt;
&lt;br /&gt;
These recursive neural network based models obtain state-of-the-art&lt;br /&gt;
performance on a variety of syntactic and semantic language tasks&lt;br /&gt;
such as parsing, sentiment analysis, paraphrase detection and relation&lt;br /&gt;
classification for extracting knowledge from the web. Because often no&lt;br /&gt;
language-specific assumptions are made, the same architectures can be&lt;br /&gt;
used for visual scene understanding and object classification from 3d &lt;br /&gt;
images.&lt;br /&gt;
&lt;br /&gt;
Besides the good performance, the models capture interesting phenomena&lt;br /&gt;
in language such as compositionality. For instance the models learn&lt;br /&gt;
that “not good” has worse sentiment than “good” or that high level&lt;br /&gt;
negation can change the meaning of longer phrases with many positive &lt;br /&gt;
words. Furthermore, unlike most machine learning approaches that rely on &lt;br /&gt;
human designed feature sets, features are learned as part of the model.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Socher is a PhD student at Stanford working with Chris Manning&lt;br /&gt;
and Andrew Ng. His research interests are machine learning for NLP and&lt;br /&gt;
vision. He is interested in developing new models that learn useful &lt;br /&gt;
features, capture compositional and hierarchical structure in multiple&lt;br /&gt;
modalities and perform well across multiple tasks. He was awarded the &lt;br /&gt;
2011 Yahoo! Key Scientific Challenges Award, the Distinguished &lt;br /&gt;
Application Paper Award at ICML 2011 and a Microsoft Research PhD &lt;br /&gt;
Fellowship in 2012.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 04/24/2013: Matthew Gerber ==&lt;br /&gt;
&lt;br /&gt;
TBA&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=686</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=686"/>
		<updated>2013-04-02T17:27:12Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced, as people could agree on some topic, such as liking the movie The Matrix, while some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share with you our experiences with modeling both explicit and implicit attitudes using high- and low-dimensional feature modeling. This work is the beginning of an interesting exploration into the realm of building computational models of some aspects of the sociopragmatics of human language, with the hope that this research could lead to a better understanding of human interaction.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park, with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low-resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction, and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low-resource languages, with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low-resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing and is also the elected Secretary of the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference, which attempts to bring together all aspects of semantic processing under the same umbrella venue.&lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of &#039;riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL TIME AND LOCATION!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors do address a distinctive set of topics from their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, while the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by Governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.      &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Corpora and Statistical Analysis of Non-Linguistic Symbol Systems ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Richard Sproat, Google New York&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 27, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We report on the creation and analysis of a set of corpora of non-linguistic symbol systems. &lt;br /&gt;
The resource, the first of its kind, consists of data from seven systems, both ancient and modern, &lt;br /&gt;
with four further systems under development, and several others planned. The systems represent &lt;br /&gt;
a range of types, including heraldic systems, formal systems, and systems that are mostly or purely &lt;br /&gt;
decorative. We also compare these systems statistically with a large set of linguistic systems, which &lt;br /&gt;
also range over both time and type.&lt;br /&gt;
&lt;br /&gt;
We show that none of the measures proposed in published work by Rao and colleagues (Rao et al., 2009a; Rao, 2010) &lt;br /&gt;
or Lee and colleagues (Lee et al., 2010a) works. In particular, Rao’s entropic measures are evidently useless when &lt;br /&gt;
one considers a wider range of examples of real non-linguistic symbol systems. And Lee’s measures, with the cutoff &lt;br /&gt;
values they propose, misclassify nearly all of our non-linguistic systems. However, we also show that one of Lee’s &lt;br /&gt;
measures, with different cutoff values, as well as another measure we develop here, do seem useful. We further &lt;br /&gt;
demonstrate that they are useful largely because they are both highly correlated with a rather trivial feature: &lt;br /&gt;
mean text length. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Sproat received his Ph.D. in Linguistics from the Massachusetts&lt;br /&gt;
Institute of Technology in 1985. He has worked at AT&amp;amp;T Bell Labs, at&lt;br /&gt;
Lucent&#039;s Bell Labs and at AT&amp;amp;T Labs -- Research, before joining the faculty of&lt;br /&gt;
the University of Illinois. From there he moved to the Center for Spoken&lt;br /&gt;
Language Understanding at the Oregon Health &amp;amp; Science University. In the Fall of&lt;br /&gt;
2012 he moved to Google, New York as a Research Scientist.&lt;br /&gt;
&lt;br /&gt;
Sproat has worked in numerous areas relating to language and computational&lt;br /&gt;
linguistics, including syntax, morphology, computational morphology,&lt;br /&gt;
articulatory and acoustic phonetics, text processing, text-to-speech synthesis,&lt;br /&gt;
and text-to-scene conversion. Some of his recent work includes multilingual&lt;br /&gt;
named entity transliteration, the effects of script layout on readers&#039;&lt;br /&gt;
phonological awareness, and tools for automated assessment of child language. At&lt;br /&gt;
Google he works on multilingual text normalization and finite-state methods for&lt;br /&gt;
language processing. He also has a long-standing interest in writing systems and&lt;br /&gt;
symbol systems more generally.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast, effective and naturally scale to very large data sets. We showcase this technology in two settings: 1. to learn text document representations from unlabeled data and 2. to perform supervised learning with closed form gradient updates for empirical risk minimization. &lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders) our approach shrinks the training time from several days to a few minutes. &lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out-of-the-box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable at test time (under distribution drift) and only adds a few lines of code to typical risk minimization. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer- and multi-task learning as well as biomedical applications. &lt;br /&gt;
&lt;br /&gt;
== 04/17/2013: Recursive Deep Learning in Natural Language Processing and Computer Vision ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.socher.org/ Richard Socher],  Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 17, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hierarchical and recursive structure is commonly found in different&lt;br /&gt;
modalities, including natural language sentences and scene images. I&lt;br /&gt;
will introduce several recursive deep learning models that, unlike &lt;br /&gt;
standard deep learning methods, can learn compositional meaning vector &lt;br /&gt;
representations for phrases or images.&lt;br /&gt;
&lt;br /&gt;
These recursive neural network based models obtain state-of-the-art&lt;br /&gt;
performance on a variety of syntactic and semantic language tasks&lt;br /&gt;
such as parsing, sentiment analysis, paraphrase detection and relation&lt;br /&gt;
classification for extracting knowledge from the web. Because often no&lt;br /&gt;
language-specific assumptions are made, the same architectures can be&lt;br /&gt;
used for visual scene understanding and object classification from 3D &lt;br /&gt;
images.&lt;br /&gt;
&lt;br /&gt;
Besides the good performance, the models capture interesting phenomena&lt;br /&gt;
in language such as compositionality. For instance, the models learn&lt;br /&gt;
that “not good” has worse sentiment than “good” or that high level&lt;br /&gt;
negation can change the meaning of longer phrases with many positive &lt;br /&gt;
words. Furthermore, unlike most machine learning approaches that rely on &lt;br /&gt;
human designed feature sets, features are learned as part of the model.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Socher is a PhD student at Stanford working with Chris Manning&lt;br /&gt;
and Andrew Ng. His research interests are machine learning for NLP and&lt;br /&gt;
vision. He is interested in developing new models that learn useful &lt;br /&gt;
features, capture compositional and hierarchical structure in multiple&lt;br /&gt;
modalities and perform well across multiple tasks. He was awarded the &lt;br /&gt;
2011 Yahoo! Key Scientific Challenges Award, the Distinguished &lt;br /&gt;
Application Paper Award at ICML 2011 and a Microsoft Research PhD &lt;br /&gt;
Fellowship in 2012.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 04/24/2013: Matthew Gerber ==&lt;br /&gt;
&lt;br /&gt;
TBA&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=683</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=683"/>
		<updated>2013-03-16T13:10:51Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis. For example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced as people could agree on some topic such as liking the movie the matrix, but some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments exploiting  interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share with you our experiences with modeling both explicit and implicit attitude using high and low dimensional feature modeling. This work is the beginning of an interesting exploration into the realm of building computational  models of some aspects of the sociopragmatics of human language with the hopes that this research could lead to a  better understanding of human interaction. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low-resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low-resource languages with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low-resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing, and she is also the elected Secretary for the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference, which attempts to bring together all aspects of semantic processing under the same umbrella venue. &lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combatting this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
A typical machine learning approach for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of `riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL TIME AND LOCATION!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors do address a distinctive set of topics from their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, while the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by Governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.      &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Corpora and Statistical Analysis of Non-Linguistic Symbol Systems ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; Richard Sproat, Google New York&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 27, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We report on the creation and analysis of a set of corpora of non-linguistic symbol systems. &lt;br /&gt;
The resource, the first of its kind, consists of data from seven systems, both ancient and modern, &lt;br /&gt;
with four further systems under development, and several others planned. The systems represent &lt;br /&gt;
a range of types, including heraldic systems, formal systems, and systems that are mostly or purely &lt;br /&gt;
decorative. We also compare these systems statistically with a large set of linguistic systems, which &lt;br /&gt;
also range over both time and type.&lt;br /&gt;
&lt;br /&gt;
We show that none of the measures proposed in published work by Rao and colleagues (Rao et al., 2009a; Rao, 2010) &lt;br /&gt;
or Lee and colleagues (Lee et al., 2010a) works. In particular, Rao’s entropic measures are evidently useless when &lt;br /&gt;
one considers a wider range of examples of real non-linguistic symbol systems. And Lee’s measures, with the cutoff &lt;br /&gt;
values they propose, misclassify nearly all of our non-linguistic systems. However, we also show that one of Lee’s &lt;br /&gt;
measures, with different cutoff values, as well as another measure we develop here, do seem useful. We further &lt;br /&gt;
demonstrate that they are useful largely because they are both highly correlated with a rather trivial feature: &lt;br /&gt;
mean text length. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Richard Sproat received his Ph.D. in Linguistics from the Massachusetts&lt;br /&gt;
Institute of Technology in 1985. He has worked at AT&amp;amp;T Bell Labs, at&lt;br /&gt;
Lucent&#039;s Bell Labs and at AT&amp;amp;T Labs -- Research, before joining the faculty of&lt;br /&gt;
the University of Illinois. From there he moved to the Center for Spoken&lt;br /&gt;
Language Understanding at the Oregon Health &amp;amp; Science University. In the Fall of&lt;br /&gt;
2012 he moved to Google, New York as a Research Scientist.&lt;br /&gt;
&lt;br /&gt;
Sproat has worked in numerous areas relating to language and computational&lt;br /&gt;
linguistics, including syntax, morphology, computational morphology,&lt;br /&gt;
articulatory and acoustic phonetics, text processing, text-to-speech synthesis,&lt;br /&gt;
and text-to-scene conversion. Some of his recent work includes multilingual&lt;br /&gt;
named entity transliteration, the effects of script layout on readers&#039;&lt;br /&gt;
phonological awareness, and tools for automated assessment of child language. At&lt;br /&gt;
Google he works on multilingual text normalization and finite-state methods for&lt;br /&gt;
language processing. He also has a long-standing interest in writing systems and&lt;br /&gt;
symbol systems more generally.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast, effective and naturally scale to very large data sets. We showcase this technology in two settings: 1. to learn text document representations from unlabeled data and 2. to perform supervised learning with closed form gradient updates for empirical risk minimization. &lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders) our approach shrinks the training time from several days to a few minutes. &lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out-of-the-box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable at test time (with respect to distribution drift), and adds only a few lines of code to typical risk minimization. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer and multi-task learning, as well as biomedical applications. &lt;br /&gt;
&lt;br /&gt;
== 04/24/2013: Matthew Gerber ==&lt;br /&gt;
&lt;br /&gt;
TBA&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=681</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=681"/>
		<updated>2013-02-24T20:36:38Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced: people might agree on some topic, such as liking the movie The Matrix, while some within that group disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share our experiences with modeling both explicit and implicit attitudes using high- and low-dimensional feature modeling. This work is the beginning of an interesting exploration into building computational models of some aspects of the sociopragmatics of human language, with the hope that this research could lead to a better understanding of human interaction. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park, with Philip Resnik in 2003, and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low-resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low-resource languages with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low-resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing; she is also the elected Secretary for the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference, which attempts to bring together all aspects of semantic processing under the same umbrella venue. &lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
A typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of &#039;riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL TIME AND LOCATION!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors address a set of topics distinct from those of their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, and the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by Governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Richard Sproat ==&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast, effective and naturally scale to very large data sets. We showcase this technology in two settings: 1. to learn text document representations from unlabeled data and 2. to perform supervised learning with closed form gradient updates for empirical risk minimization. &lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g., tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders), our approach shrinks the training time from several days to a few minutes. &lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out-of-the-box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable at test time (with respect to distribution drift), and adds only a few lines of code to typical risk minimization. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer and multi-task learning, as well as biomedical applications. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=680</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=680"/>
		<updated>2013-02-24T18:56:54Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced: people might agree on some topic, such as liking the movie The Matrix, while some within that group disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share our experiences with modeling both explicit and implicit attitudes using high- and low-dimensional feature modeling. This work is the beginning of an interesting exploration into building computational models of some aspects of the sociopragmatics of human language, with the hope that this research could lead to a better understanding of human interaction. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park, with Philip Resnik in 2003, and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low-resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low-resource languages with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low-resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing; she is also the elected Secretary for the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference, which attempts to bring together all aspects of semantic processing under the same umbrella venue. &lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  In this talk, I will discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
A typical machine learning approach to dealing with &lt;br /&gt;
combinatorial complexity is to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of `riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle-independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: Building Scholarly Methodologies with Large-Scale Topic Analysis ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.princeton.edu/~mimno/ David Mimno], Princeton University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 27, 2013, 9:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; Hornbake (South Wing) Room 2119&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;NOTE SPECIAL DATE AND TIME!!!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In the last ten years we have seen the creation of massive digital text collections, from Twitter feeds to million-book libraries, all in dozens of languages. At the same time, researchers have developed text mining methods that go beyond simple word frequency analysis to uncover thematic patterns. When we combine big data with powerful algorithms, we enable analysts in many different fields to enhance qualitative perspectives with quantitative measurements. But these methods are only useful if we can apply them at massive scale and distinguish consistent patterns from random variations. In this talk I will describe my work building reliable topic modeling methodologies for humanists, social scientists and science policy officers.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; David Mimno is a postdoctoral researcher in the Computer Science department at Princeton University. He received his PhD from the University of Massachusetts, Amherst. Before graduate school, he served as Head Programmer at the Perseus Project, a digital library for cultural heritage materials, at Tufts University. He is supported by a CRA Computing Innovation Fellowship.&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors do address a set of topics distinct from those of their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, and the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Richard Sproat ==&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast, effective and naturally scale to very large data sets. We showcase this technology in two settings: (1) learning text document representations from unlabeled data, and (2) performing supervised learning with closed-form gradient updates for empirical risk minimization.&lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders), our approach shrinks the training time from several days to a few minutes.&lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out of the box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable at test time (under distribution drift), and adds only a few lines of code to typical risk minimization.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award, and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in machine learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer and multi-task learning, as well as biomedical applications.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=679</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=679"/>
		<updated>2013-02-24T18:50:49Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced: people could agree on some topic, such as liking the movie The Matrix, but some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share our experiences with modeling both explicit and implicit attitude using high- and low-dimensional feature modeling. This work is the beginning of an interesting exploration into building computational models of some aspects of the sociopragmatics of human language, in the hope that this research could lead to a better understanding of human interaction.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low-resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low-resource languages, with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low-resource languages. Her research has been published in over 90 papers in internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing and as the elected Secretary of the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. In 2012, Mona co-founded the yearly *SEM conference, which brings together all aspects of semantic processing under the same umbrella venue.&lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  In this talk, I will discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
A typical machine learning approach to dealing with &lt;br /&gt;
combinatorial complexity is to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of `riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle-independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: David Mimno ==&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Is Any Politics Local?  An Automated Analysis of Mayoral and Gubernatorial Addresses ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://explore.georgetown.edu/people/dh335/ Dan Hopkins],  Georgetown University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, March 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Dubbed &amp;quot;laboratories of democracy,&amp;quot; America&#039;s states and its large cities face a wide variety of public policy challenges.  But in a period of expanding federal authority and increased long-distance communication, the extent to which U.S. states and large cities pursue varying policy agendas is at once important and unknown.  This paper draws on techniques from automated content analysis to measure the major topics in more than 500 &amp;quot;State of the State&amp;quot; and &amp;quot;State of the City&amp;quot; addresses given by American executive officials since 2000.  Drawing on the Correlated Topic Model (Blei and Lafferty 2006) and other approaches to topic modeling, it demonstrates that big-city mayors do address a set of topics distinct from those of their counterparts in state capitols, but one that is surprisingly consistent across cities.  Knowing a mayor&#039;s political party provides little leverage on the topics he or she is likely to highlight, and the same is true for objective indicators such as economic conditions or the city&#039;s crime rate.  At the state level, partisanship proves more predictive of the topics addressed by governors, but there, too, institutional responsibilities constrain leaders to emphasize a broad and similar set of issues.  American political institutions inscribe a substantial role for geographic and institutional differences, but the policy agendas of America&#039;s states and largest cities are homogeneous and overlapping.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Daniel J. Hopkins is an Assistant Professor of Government at Georgetown University whose research focuses on American politics, with a special emphasis on political behavior, urban and local politics, racial and ethnic politics, and statistical methods.  Specifically, his research has addressed issues including the role of rhetoric and of local contexts in shaping political behavior.  It has also involved the development and application of automated techniques for analyzing political rhetoric.  Professor Hopkins&#039; work has appeared in a variety of scholarly and popular outlets, including the American Political Science Review, the American Journal of Political Science, the Journal of Politics, and The Washington Post.  Professor Hopkins received his Ph.D. from Harvard University in 2007. &lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Richard Sproat ==&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast, effective and naturally scale to very large data sets. We showcase this technology in two settings: (1) learning text document representations from unlabeled data, and (2) performing supervised learning with closed-form gradient updates for empirical risk minimization.&lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders), our approach shrinks the training time from several days to a few minutes.&lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out of the box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable at test time (under distribution drift), and adds only a few lines of code to typical risk minimization.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award, and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in machine learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer and multi-task learning, as well as biomedical applications.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=678</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=678"/>
		<updated>2013-02-09T04:36:11Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Computational Modeling of Sociopragmatic Language Use in Arabic and English Social Media ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Social media language is a treasure trove for mining and understanding human interactions. In discussion fora, people naturally form groups and subgroups aligning along points of consensus and contention. These subgroup formations are quite nuanced: people may agree on some topic, such as liking the movie The Matrix, but some within that group might disagree on rating the acting skills of Keanu Reeves. Languages manifest these alignments by exploiting interesting sociolinguistic devices in different ways. In this talk, I will present our work on subgroup modeling and detection in both Arabic and English social media language. I will share with you our experiences with modeling both explicit and implicit attitude using high and low dimensional feature modeling. This work is the beginning of an interesting exploration into the realm of building computational models of some aspects of the sociopragmatics of human language, with the hope that this research could lead to a better understanding of human interaction. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Mona Diab is an Associate Professor of Computer Science at the George Washington University. She is also a cofounder of the CADIM (Columbia Arabic Dialect Modeling) group at Columbia University. Mona earned her PhD in Computational Linguistics from the University of Maryland, College Park, with Philip Resnik in 2003 and then did her postdoctoral training with Daniel Jurafsky at Stanford University, where she was part of the NLP group. From 2005 until 2012, before joining GWU in January 2013, Mona held the position of Research Scientist/Principal Investigator at the Columbia University Center for Computational Learning Systems (CCLS). Mona&#039;s research interests span computational lexical semantics, multilingual processing (with a special interest in Arabic and low resource languages), unsupervised learning for NLP, computational sociopragmatic modeling, information extraction and machine translation. Over the past 9 years, Mona has developed significant expertise in modeling low resource languages with a focus on Arabic dialect processing. She is especially interested in ways to leverage existing rich resources to inform algorithms for processing low resource languages. Her research has been published in over 90 papers in various internationally recognized scientific venues. Mona serves as the current elected President of the ACL SIG on Semitic Language Processing; she is also the elected Secretary for the ACL SIG on issues in the Lexicon (SIGLEX). She also serves on the NAACL board as an elected member. Mona recently (2012) co-founded the yearly *SEM conference, which attempts to bring together all aspects of semantic processing under the same umbrella venue. &lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of &#039;riffled independence&#039;. I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle-independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: David Mimno ==&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Dan Hopkins ==&lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Richard Sproat ==&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast, effective and naturally scale to very large data sets. We showcase this technology in two settings: 1. to learn text document representations from unlabeled data and 2. to perform supervised learning with closed form gradient updates for empirical risk minimization. &lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders) our approach shrinks the training time from several days to a few minutes. &lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out-of-the-box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable during test-time (towards distribution drift) and only adds a few lines of code to typical risk minimization.  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high dimensional data analysis, metric learning, machine learned web-search ranking, transfer- and multi-task learning, as well as biomedical applications. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=677</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=677"/>
		<updated>2013-02-05T15:10:54Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to be added to the cl-colloquium@umiacs.umd.edu list, or have other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cs.cmu.edu/~chongw/ Chong Wang],  Carnegie Mellon University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 6, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Mona Diab ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of &#039;riffled independence&#039;. I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle-independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types, and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: David Mimno ==&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Dan Hopkins ==&lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Richard Sproat ==&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast, effective and naturally scale to very large data sets. We showcase this technology in two settings: 1. to learn text document representations from unlabeled data and 2. to perform supervised learning with closed form gradient updates for empirical risk minimization. &lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders) our approach shrinks the training time from several days to a few minutes. &lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out-of-the-box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable during test-time (towards distribution drift) and only adds a few lines of code to typical risk minimization.  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high dimensional data analysis, metric learning, machine learned web-search ranking, transfer- and multi-task learning, as well as biomedical applications. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=674</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=674"/>
		<updated>2013-02-05T14:29:51Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to be added to the cl-colloquium@umiacs.umd.edu list, or have other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: Chong Wang: A New Recommender System for Large-scale Document Exploration ==&lt;br /&gt;
&lt;br /&gt;
How can we help people quickly navigate the vast amount of data&lt;br /&gt;
and acquire useful knowledge from it? Recommender systems provide&lt;br /&gt;
a promising solution to this problem. They narrow down the search&lt;br /&gt;
space by providing a few recommendations that are tailored to&lt;br /&gt;
users&#039; personal preferences. However, these systems usually work&lt;br /&gt;
like a black box, limiting further opportunities to provide more&lt;br /&gt;
exploratory experiences to their users.&lt;br /&gt;
&lt;br /&gt;
In this talk, I will describe how we build a new recommender&lt;br /&gt;
system for document exploration. Specifically, I will talk about two&lt;br /&gt;
building blocks of the system in detail. The first is about a new&lt;br /&gt;
probabilistic model for document recommendation that is both&lt;br /&gt;
predictive and interpretable. It not only gives better predictive&lt;br /&gt;
performance, but also provides better transparency than&lt;br /&gt;
traditional approaches. This transparency creates many new&lt;br /&gt;
opportunities for exploratory analysis---for example, a user can&lt;br /&gt;
manually adjust her preferences and the system responds to this&lt;br /&gt;
by changing its recommendations. Second, building a recommender&lt;br /&gt;
system like this requires learning the probabilistic model from&lt;br /&gt;
large-scale empirical data. I will describe a scalable approach&lt;br /&gt;
for learning a wide class of probabilistic models that include&lt;br /&gt;
our recommendation model as a special case.&lt;br /&gt;
&lt;br /&gt;
Chong is a Project Scientist in Eric Xing&#039;s group, Machine Learning Department, Carnegie Mellon University.  His PhD advisor was David M. Blei from Princeton University.&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Mona Diab ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 02/14/2013: Efficient Probabilistic Models for Rankings and Orderings ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://stanford.edu/~jhuang11/ Jon Huang], Stanford University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Thursday, February 14, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; TBA&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The need to reason probabilistically with rankings and orderings arises &lt;br /&gt;
in a number of real world problems.  Probability distributions over &lt;br /&gt;
rankings and orderings arise naturally, for example, in preference data, &lt;br /&gt;
and political election data, as well as a number of less obvious &lt;br /&gt;
settings such as topic analysis and neurodegenerative disease &lt;br /&gt;
progression modeling. Representing distributions over the space of all &lt;br /&gt;
rankings is challenging, however, due to the factorial number of ways to &lt;br /&gt;
rank a collection of items.  The focus of my talk is to discuss methods &lt;br /&gt;
for combating this factorial explosion in probabilistic representation &lt;br /&gt;
and inference.&lt;br /&gt;
&lt;br /&gt;
Ordinarily, a typical machine learning method for dealing with &lt;br /&gt;
combinatorial complexity might be to exploit conditional independence &lt;br /&gt;
relations in order to decompose a distribution into compact factors of a &lt;br /&gt;
graphical model.  For ranked data, however, a far more natural and &lt;br /&gt;
useful probabilistic relation is that of &#039;riffled independence&#039;.  I will &lt;br /&gt;
introduce the concept of riffled independence and discuss how these &lt;br /&gt;
riffle-independent relations can be used to decompose a distribution &lt;br /&gt;
over rankings into a product of compactly represented factors.  These &lt;br /&gt;
so-called hierarchical riffle-independent distributions are particularly &lt;br /&gt;
amenable to efficient inference and learning algorithms and in many &lt;br /&gt;
cases lead to intuitively interpretable probabilistic models. To &lt;br /&gt;
illustrate the power of exploiting riffled independence, I will discuss &lt;br /&gt;
a few applications, including Irish political election analysis, &lt;br /&gt;
visualizing Japanese preferences for sushi types and modeling the &lt;br /&gt;
progression of Alzheimer&#039;s disease, showing results on real datasets in &lt;br /&gt;
each problem.&lt;br /&gt;
&lt;br /&gt;
This is joint work with Carlos Guestrin (University of Washington), &lt;br /&gt;
Ashish Kapoor (Microsoft Research) and Daniel Alexander (University &lt;br /&gt;
College London).&lt;br /&gt;
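The riffle decomposition described above is easy to sketch in code. The following toy example is our own illustration (the item sets, function names, and the uniform interleaving — just one possible choice of interleaving distribution — are all assumptions, not material from the talk). It draws a full ranking by ranking two item subsets independently and then riffling them together:

```python
import random

def sample_riffle_independent(rank_a_sampler, rank_b_sampler, interleave_sampler):
    """Draw a full ranking: rank each group independently, then riffle."""
    ra = rank_a_sampler()                          # ranking of group-A items
    rb = rank_b_sampler()                          # ranking of group-B items
    slots = interleave_sampler(len(ra), len(rb))   # True = next slot goes to A
    ranking, ia, ib = [], 0, 0
    for take_a in slots:
        if take_a:
            ranking.append(ra[ia]); ia += 1
        else:
            ranking.append(rb[ib]); ib += 1
    return ranking

def uniform_interleaving(n_a, n_b):
    """One simple interleaving distribution: all interleavings equally likely."""
    slots = [True] * n_a + [False] * n_b
    random.shuffle(slots)
    return slots

random.seed(0)
fruits = ["apple", "banana"]
veggies = ["carrot", "daikon", "endive"]
sample = sample_riffle_independent(
    lambda: random.sample(fruits, len(fruits)),    # uniform within-group ranking
    lambda: random.sample(veggies, len(veggies)),
    uniform_interleaving,
)
print(sample)  # one ranking of all five items
```

The compactness gain is visible even at this toy scale: a general distribution over all five items needs 5! = 120 probabilities, while this riffle-independent factorization needs only the 2! = 2 and 3! = 6 within-group orderings plus C(5,2) = 10 interleavings.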
&lt;br /&gt;
== 02/27/2013: David Mimno ==&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Dan Hopkins ==&lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Richard Sproat ==&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast and effective and that scale naturally to very large data sets. We showcase this technology in two settings: (1) to learn text document representations from unlabeled data and (2) to perform supervised learning with closed-form gradient updates for empirical risk minimization. &lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders), our approach shrinks the training time from several days to a few minutes. &lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out-of-the-box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p-norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable at test time (toward distribution drift), and adds only a few lines of code to typical risk minimization.  &lt;br /&gt;
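To make the closed-form marginalization concrete, here is a small sketch for one special case: quadratic loss under blankout ("dropout") corruption, where the expected loss has a simple analytic form. This is our own illustration of the general idea, not code from the talk; all function and variable names are assumptions.

```python
import numpy as np

# Blankout corruption with rate q: each feature is zeroed with probability q
# and scaled by 1/(1-q) otherwise, so E[x~_d] = x_d and
# Var[x~_d] = (q/(1-q)) * x_d^2.

def expected_quadratic_loss(w, X, y, q):
    """Closed-form E over corruption of sum_n (w . x~_n - y_n)^2."""
    resid = X @ w - y
    var_term = (q / (1.0 - q)) * np.sum((X ** 2) @ (w ** 2))
    return np.sum(resid ** 2) + var_term

def monte_carlo_loss(w, X, y, q, n_samples=5000, seed=1):
    """Estimate the same expectation by explicitly corrupting copies."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        mask = rng.random(X.shape) >= q      # keep each feature with prob 1-q
        Xc = X * mask / (1.0 - q)            # unbiased blankout corruption
        total += np.sum((Xc @ w - y) ** 2)
    return total / n_samples

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 5))
w = rng.standard_normal(5)
y = X @ w + 0.1 * rng.standard_normal(10)

closed = expected_quadratic_loss(w, X, y, q=0.3)
mc = monte_carlo_loss(w, X, y, q=0.3)
print(closed, mc)  # the two estimates should agree closely
```

The closed form replaces an average over infinitely many corrupted copies with a single extra variance term, which is what makes the gradient updates cheap.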
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award, and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer- and multi-task learning, as well as biomedical applications. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
	<entry>
		<id>https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=671</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://wiki.umiacs.umd.edu/clip/index.php?title=Events&amp;diff=671"/>
		<updated>2013-01-20T20:26:37Z</updated>

		<summary type="html">&lt;p&gt;Jimmylin: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;[[Image:colloq.jpg|center|504px|x]]&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The CLIP Colloquium is a weekly speaker series organized and hosted by CLIP Lab. The talks are open to everyone. Most talks are held at 11AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after the talks; contact the host if you&#039;d like to have a meeting.&lt;br /&gt;
&lt;br /&gt;
If you would like to get on the cl-colloquium@umiacs.umd.edu list or for other questions about the colloquium series, e-mail [mailto:jimmylin@umd.edu Jimmy Lin], the current organizer.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{#widget:Google Calendar&lt;br /&gt;
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com&lt;br /&gt;
|color=B1440E&lt;br /&gt;
|title=Upcoming Talks&lt;br /&gt;
|view=AGENDA&lt;br /&gt;
|height=300&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== 01/30/2013: Human Translation and Machine Translation ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://homepages.inf.ed.ac.uk/pkoehn/ Philipp Koehn],  University of Edinburgh&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, January 30, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Despite all the recent successes of machine translation, when it&lt;br /&gt;
comes to high quality publishable translation, human translators&lt;br /&gt;
are still unchallenged. Since we can&#039;t beat them, can we help&lt;br /&gt;
them to become more productive? I will talk about some recent&lt;br /&gt;
work on developing assistance tools for human translators.&lt;br /&gt;
You can also check out a prototype [http://www.caitra.org/ here]&lt;br /&gt;
and learn about our ongoing European projects [http://www.casmacat.eu/ CASMACAT]&lt;br /&gt;
and [http://www.matecat.com/ MATECAT].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Philipp Koehn is Professor of Machine Translation at the&lt;br /&gt;
School of Informatics at the University of Edinburgh, Scotland.&lt;br /&gt;
He received his PhD at the University of Southern California&lt;br /&gt;
and spent a year as postdoctoral researcher at MIT.&lt;br /&gt;
He is well-known in the field of statistical machine translation&lt;br /&gt;
for the leading open source toolkit Moses, the organization&lt;br /&gt;
of the annual Workshop on Statistical Machine Translation&lt;br /&gt;
and its evaluation campaign as well as the Machine Translation&lt;br /&gt;
Marathon. He is founding president of the ACL SIG MT and&lt;br /&gt;
currently serves as vice president-elect of the ACL SIG DAT.&lt;br /&gt;
He has published over 80 papers and the textbook in the&lt;br /&gt;
field. He manages a number of EU and DARPA funded&lt;br /&gt;
research projects aimed at morpho-syntactic models, machine&lt;br /&gt;
learning methods and computer assisted translation tools.&lt;br /&gt;
&lt;br /&gt;
== 02/06/2013: Chong Wang ==&lt;br /&gt;
&lt;br /&gt;
== 02/13/2013: Mona Diab ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www1.ccls.columbia.edu/~mdiab/ Mona Diab], Columbia University&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, February 13, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 02/27/2013: David Mimno ==&lt;br /&gt;
&lt;br /&gt;
== 03/13/2013: Dan Hopkins ==&lt;br /&gt;
&lt;br /&gt;
== 03/27/2013: Richard Sproat ==&lt;br /&gt;
&lt;br /&gt;
== 04/10/2013: Learning with Marginalized Corrupted Features ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Speaker:&#039;&#039;&#039; [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger],  Washington University in St. Louis&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; Wednesday, April 10, 2013, 11:00 AM&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Venue:&#039;&#039;&#039; AVW 3258&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast and effective and that scale naturally to very large data sets. We showcase this technology in two settings: (1) to learn text document representations from unlabeled data and (2) to perform supervised learning with closed-form gradient updates for empirical risk minimization. &lt;br /&gt;
&lt;br /&gt;
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders), our approach shrinks the training time from several days to a few minutes. &lt;br /&gt;
&lt;br /&gt;
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out-of-the-box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p-norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable at test time (toward distribution drift), and adds only a few lines of code to typical risk minimization.  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;About the Speaker:&#039;&#039;&#039; Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science &amp;amp; Engineering at Washington University in St. Louis. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR and AISTATS. In 2011 he was awarded the AAAI senior program chair award, and in 2012 he received the NSF CAREER award. Kilian Weinberger&#039;s research is in Machine Learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer- and multi-task learning, as well as biomedical applications. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Previous Talks ==&lt;br /&gt;
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]&lt;br /&gt;
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]&lt;/div&gt;</summary>
		<author><name>Jimmylin</name></author>
	</entry>
</feed>