<center>[[Image:colloq.jpg|center|504px|x]]</center>
  
== CLIP Colloquium ==
  
The CLIP Colloquium is a weekly speaker series organized and hosted by the CLIP Lab. The talks are open to everyone. Most talks are held on Wednesdays at 11 AM in AV Williams 3258 unless otherwise noted. Typically, external speakers have slots for one-on-one meetings with Maryland researchers before and after their talks; contact the host if you would like to arrange a meeting.

To get on the clip-talks@umiacs.umd.edu list, or for other questions about the colloquium series, e-mail [mailto:oard@umiacs.umd.edu Doug Oard], the current organizer.

For up-to-date information, see the [https://talks.cs.umd.edu/lists/7 UMD CS Talks page]. (You can also subscribe to the calendar there.)

{{#widget:Google Calendar
|id=lqah25nfftkqi2msv25trab8pk@group.calendar.google.com
|color=B1440E
|title=Upcoming Talks
|view=AGENDA
|height=300
}}

__NOTOC__

=== Colloquium Recordings ===
* [[Colloqium Recording (Fall 2020)|Fall 2020]]
* [[Colloqium Recording (Spring 2021)|Spring 2021]]
 
== 10/23/2012: Bootstrapping via Graph Propagation ==
 
  
'''Speaker:''' [http://www.cs.sfu.ca/~anoop/ Anoop Sarkar], Simon Fraser University <br/>
'''Time:''' Tuesday, October 23, 2012, 2:00 PM<br/>
'''Venue:''' AVW 4172<br/>

'''[http://www.cs.sfu.ca/~anoop/papers/pdf/yprop-slides-umd10232012.pdf Slides]'''

'''Note special time and place!!!'''
  
In natural language processing, the bootstrapping algorithm introduced by David Yarowsky some 15 years ago is a discriminative unsupervised learning algorithm that uses some seed rules to bootstrap a classifier (this is the ordinary sense of bootstrapping, which is distinct from the bootstrap in statistics). The Yarowsky algorithm works remarkably well on a wide variety of NLP classification tasks, such as distinguishing between word senses and deciding whether a noun phrase is an organization, location, or person.

Extending previous attempts at providing an objective-function optimization view of Yarowsky's algorithm, we show that bootstrapping a classifier from a small set of seed rules can be viewed as the propagation of labels between examples via features shared between them. This work introduces a novel variant of the Yarowsky algorithm based on this view: a bootstrapping learning method that uses a graph propagation algorithm with a well-defined per-iteration objective function incorporating the cautious behaviour of the original Yarowsky algorithm.
 
 
 
The experimental results show that our proposed bootstrapping algorithm achieves state-of-the-art performance or better on several different natural language data sets, outperforming other unsupervised methods such as the EM algorithm. We show that cautious learning is an important, though still poorly understood, principle in unsupervised learning, and that the Yarowsky algorithm can outperform or match co-training without any reliance on multiple views.
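
As a rough illustration of the bootstrapping setting described above, here is a minimal, self-contained Python sketch of Yarowsky-style cautious self-training over bag-of-feature examples: a decision list is grown from a couple of seed rules, and only a few high-precision rules are admitted per iteration. The toy data, thresholds, and function names are invented for illustration; the graph-propagation variant and the per-iteration objective from the talk are not reproduced here.

<syntaxhighlight lang="python">
# Illustrative sketch only: Yarowsky-style cautious bootstrapping from seed rules.
# Examples are bags of features; a decision list of (feature -> label) rules is
# grown a few rules at a time ("cautiousness").
from collections import defaultdict

def yarowsky_bootstrap(examples, seed_rules, labels=("A", "B"),
                       threshold=0.75, rules_per_iter=2, max_iter=10):
    rules = dict(seed_rules)                       # feature -> label
    for _ in range(max_iter):
        # 1. Label every example covered by at least one current rule.
        labeled = {}
        for i, feats in enumerate(examples):
            votes = [rules[f] for f in feats if f in rules]
            if votes:
                labeled[i] = max(set(votes), key=votes.count)
        # 2. Score candidate rules by precision on the current labeling.
        counts = defaultdict(lambda: defaultdict(int))
        for i, y in labeled.items():
            for f in examples[i]:
                counts[f][y] += 1
        candidates = []
        for f, by_label in counts.items():
            if f in rules:
                continue
            best = max(labels, key=lambda lab: by_label[lab])
            precision = by_label[best] / sum(by_label.values())
            if precision >= threshold:
                candidates.append((precision, sum(by_label.values()), f, best))
        if not candidates:
            break
        # 3. Cautious step: admit only the few highest-precision new rules.
        candidates.sort(reverse=True)
        for _, _, f, y in candidates[:rules_per_iter]:
            rules[f] = y
    return rules

# Toy usage: word-sense-style disambiguation bootstrapped from two seed rules.
examples = [{"plant", "manufacturing"}, {"plant", "leaf"},
            {"manufacturing", "factory"}, {"leaf", "green"},
            {"factory", "workers"}, {"green", "garden"}]
print(yarowsky_bootstrap(examples, {"manufacturing": "A", "leaf": "B"}))
</syntaxhighlight>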
 
 
 
'''About the Speaker:''' Anoop Sarkar is an Associate Professor at Simon Fraser University in British Columbia, Canada, where he co-directs the [http://natlang.cs.sfu.ca Natural Language Laboratory]. He received his Ph.D. from the Department of Computer and Information Sciences at the University of Pennsylvania under Prof. Aravind Joshi for his work on semi-supervised statistical parsing using tree-adjoining grammars.

His research is focused on statistical parsing and machine translation (exploiting syntax or morphology, semi-supervised learning, and domain adaptation). His interests also include formal language theory and stochastic grammars, in particular tree automata and tree-adjoining grammars.
 
 
 
== 10/24/2012: Recent Advances in Open Information Extraction ==
 
 
 
'''Speaker:''' [http://homes.cs.washington.edu/~mausam/ Mausam], University of Washington<br/>
'''Time:''' Wednesday, October 24, 2012, 11:00 AM<br/>
'''Venue:''' AVW 3258<br/>
 
 
 
[http://openie.cs.washington.edu Open Information Extraction] is an attractive paradigm for extracting large amounts of relational facts from natural language text in a domain-independent manner. In this talk I describe our recent progress using this model, including our latest open extractors, ReVerb and OLLIE, which substantially improve on the previous state of the art. I will end with our ongoing work that uses open extractions for various end tasks, including multi-document summarization and unsupervised event extraction.
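
For readers new to the paradigm: Open IE systems emit (argument, relation phrase, argument) triples without a predefined relation inventory. The toy matcher below, written against hand-tagged tokens, only illustrates the shape of that output; it is emphatically not ReVerb or OLLIE, which rely on much richer syntactic and lexical constraints plus confidence estimation, and every name in it is made up.

<syntaxhighlight lang="python">
# Toy Open-IE-style extractor over POS-tagged tokens (illustration only).
def extract_triples(tagged):
    """Return (arg1, relation phrase, arg2) for NOUN VERB(+ADP) NOUN patterns."""
    triples = []
    for i, (_, tag) in enumerate(tagged):
        if tag != "VERB":
            continue
        j = i
        # Grow the relation phrase over consecutive verbs and adpositions.
        while j + 1 < len(tagged) and tagged[j + 1][1] in ("VERB", "ADP"):
            j += 1
        if (i > 0 and tagged[i - 1][1] == "NOUN"
                and j + 1 < len(tagged) and tagged[j + 1][1] == "NOUN"):
            relation = " ".join(w for w, _ in tagged[i:j + 1])
            triples.append((tagged[i - 1][0], relation, tagged[j + 1][0]))
    return triples

print(extract_triples([("ReVerb", "NOUN"), ("extracts", "VERB"),
                       ("triples", "NOUN")]))
# [('ReVerb', 'extracts', 'triples')]
print(extract_triples([("OLLIE", "NOUN"), ("builds", "VERB"),
                       ("on", "ADP"), ("ReVerb", "NOUN")]))
# [('OLLIE', 'builds on', 'ReVerb')]
</syntaxhighlight>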
 
 
 
'''About the Speaker:''' Mausam is a Research Assistant Professor at the Turing Center in the Department of Computer Science at the University of Washington, Seattle. His research interests span various sub-fields of artificial intelligence, including sequential decision making under uncertainty, large-scale natural language processing, and AI applications to crowd-sourcing. Mausam obtained a PhD from the University of Washington in 2007 and a Bachelor of Technology from IIT Delhi in 2001.
 
 
 
== 10/31/2012: Learning with Marginalized Corrupted Features ==
 
 
 
'''Speaker:''' [http://www.cse.wustl.edu/~kilian/ Kilian Weinberger], Washington University in St. Louis<br/>
'''Time:''' Wednesday, October 31, 2012, 11:00 AM<br/>
'''Venue:''' AVW 3258<br/>
 
 
 
If infinite amounts of labeled data are provided, many machine learning algorithms become perfect. With finite amounts of data, regularization or priors have to be used to introduce bias into a classifier. We propose a third option: learning with marginalized corrupted features. We corrupt existing data as a means to generate infinitely many additional training samples from a slightly different data distribution -- explicitly in a way that the corruption can be marginalized out in closed form. This leads to machine learning algorithms that are fast, effective, and scale naturally to very large data sets. We showcase this technology in two settings: (1) learning text document representations from unlabeled data, and (2) performing supervised learning with closed-form gradient updates for empirical risk minimization.
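
To make "the corruption can be marginalized out in closed form" concrete, the sketch below (toy data, made-up variable names) compares a Monte Carlo estimate of the expected quadratic loss under unbiased blank-out (dropout) corruption with its closed-form expectation; the framework presented in the talk covers a much wider range of losses and corrupting distributions.

<syntaxhighlight lang="python">
# Expected quadratic loss under unbiased blank-out corruption, computed two ways:
# by averaging over explicitly corrupted copies, and in closed form.
import numpy as np

rng = np.random.default_rng(0)
d, p = 20, 0.3                       # feature dimension, corruption probability
x = rng.normal(size=d)               # one training example
y = 1.0                              # its regression target
w = rng.normal(size=d) * 0.1         # some weight vector

# Monte Carlo: average the loss over many corrupted copies of x.
n = 200_000
keep = rng.random((n, d)) > p                 # keep each feature with prob. 1-p
x_tilde = keep * x / (1.0 - p)                # rescale so that E[x_tilde] = x
mc_loss = np.mean((x_tilde @ w - y) ** 2)

# Closed form: E[(w.x_tilde - y)^2] = (w.x - y)^2 + sum_i w_i^2 x_i^2 p/(1-p).
closed_loss = (w @ x - y) ** 2 + np.sum(w ** 2 * x ** 2) * p / (1.0 - p)

print(mc_loss, closed_loss)   # the two values agree up to Monte Carlo noise
</syntaxhighlight>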
 
 
 
Text documents (and often images) are traditionally expressed as bag-of-words feature vectors (e.g. as tf-idf). By training linear denoisers that recover unlabeled data from partial corruption, we can learn new data-specific representations. With these, we can match the world-record accuracy on the Amazon transfer learning benchmark with a simple linear classifier. In comparison with the record holder (stacked denoising autoencoders) our approach shrinks the training time from several days to a few minutes.
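
The linear denoisers mentioned above can be solved for in closed form once the corruption is marginalized out. Here is a compact numpy sketch of a one-layer marginalized denoising transform in that spirit (following the published closed form for marginalized denoising autoencoders); it is a simplified illustration rather than the exact system evaluated in the talk, and all variable names are ours.

<syntaxhighlight lang="python">
# Closed-form one-layer marginalized linear denoiser (mDA-style sketch).
# X holds one example per row (n x d); p is the per-feature blank-out probability.
import numpy as np

def marginalized_denoiser(X, p=0.5, eps=1e-5):
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))]).T        # (d+1) x n, with a bias row
    q = np.full(d + 1, 1.0 - p)
    q[-1] = 1.0                                   # the bias is never corrupted
    S = Xb @ Xb.T                                 # scatter matrix, (d+1) x (d+1)
    Q = S * np.outer(q, q)                        # expected corrupted scatter
    np.fill_diagonal(Q, q * np.diag(S))           # diagonal uses q_i, not q_i^2
    P = S[:d, :] * q                              # expected clean-corrupted cross term
    W = np.linalg.solve(Q + eps * np.eye(d + 1), P.T).T   # W = P Q^{-1}
    return np.tanh(W @ Xb).T                      # new n x d representation

# Toy usage on random non-negative "tf-idf-like" vectors.
X = np.abs(np.random.default_rng(0).normal(size=(100, 50)))
print(marginalized_denoiser(X, p=0.5).shape)      # (100, 50)
</syntaxhighlight>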
 
 
 
Finally, we present a variety of loss functions and corrupting distributions, which can be applied out-of-the-box with empirical risk minimization. We show that our formulation leads to significant improvements in document classification tasks over the typically used l_p norm regularization. The new learning framework is extremely versatile, generalizes better, is more stable at test time (under distribution drift), and adds only a few lines of code to typical risk minimization.
 
 
 
'''About the Speaker:''' Kilian Q. Weinberger is an Assistant Professor in the Department of Computer Science & Engineering at Washington University in St. Louis. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul. Prior to this, he obtained his undergraduate degree in Mathematics and Computer Science at the University of Oxford. During his career he has won several best paper awards at ICML, CVPR, and AISTATS. In 2011 he was awarded the AAAI senior program chair award, and in 2012 he received the NSF CAREER award. Kilian Weinberger's research is in machine learning and its applications. In particular, he focuses on high-dimensional data analysis, metric learning, machine-learned web-search ranking, transfer and multi-task learning, as well as biomedical applications.
 
 
 
== 11/07/2012: Using Syntactic Head Information in Hierarchical Phrase-Based Translation ==
 
 
 
'''Speaker:''' Junhui Li<br/>
'''Time:''' Wednesday, November 7, 2012, 11:00 AM<br/>
'''Venue:''' AVW 3258<br/>
 
 
 
The traditional hierarchical phrase-based (HPB) model is prone to overgeneration due to a lack of linguistic knowledge: the grammar may suggest more derivations than appropriate, many of which lead to ungrammatical translations. On the other hand, the limitations of glue grammar rules in the HPB model may prevent systems from considering some reasonable derivations. This talk presents a simple but effective translation model, called the Head-Driven HPB (HD-HPB) model, which incorporates head information in translation rules to better capture syntax-driven information in a derivation. In addition, unlike the original glue rules, the HD-HPB model allows improved reordering between any two neighboring non-terminals, exploring a larger reordering search space. In experiments, we examined different head label sets for refining the non-terminal X, including part-of-speech (POS) tags, coarse POS tags, and dependency labels.
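
As a purely illustrative sketch (a hypothetical data representation, not the authors' implementation), the snippet below contrasts a generic hierarchical rule that uses the undifferentiated non-terminal X with a head-driven version in which each non-terminal is refined by the head label of the phrase it covers.

<syntaxhighlight lang="python">
# Hypothetical representation of HPB vs. head-driven HPB rules (illustration only).
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    lhs: str      # non-terminal being rewritten
    src: tuple    # source side: terminals and indexed non-terminal slots
    tgt: tuple    # target side, with the same slot indices for reordering

# Generic HPB rule:  X -> < X[1] de X[2] , X[2] of X[1] >
hpb_rule = Rule("X", ("X[1]", "de", "X[2]"), ("X[2]", "of", "X[1]"))

def refine(rule, head_labels):
    """Replace each generic X slot with a head-label-refined non-terminal,
    e.g. a slot covering a noun-headed phrase becomes NN[...]."""
    def relabel(sym):
        return head_labels[sym[2]] + sym[1:] if sym.startswith("X[") else sym
    return Rule(head_labels.get("lhs", rule.lhs),
                tuple(relabel(s) for s in rule.src),
                tuple(relabel(s) for s in rule.tgt))

# Suppose both slots cover noun-headed phrases:
print(refine(hpb_rule, {"1": "NN", "2": "NN", "lhs": "NN"}))
# Rule(lhs='NN', src=('NN[1]', 'de', 'NN[2]'), tgt=('NN[2]', 'of', 'NN[1]'))
</syntaxhighlight>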
 
 
 
'''About the Speaker:''' Junhui Li joined the CLIP Lab as a post-doc researcher in August 2012. He was previously a post-doc researcher in the Centre for Next Generation Localisation (CNGL) at Dublin City University from February 2011 to July 2012. Before that, he was a student in the NLP Lab at Soochow University, China.
 
 
 
== Previous Talks ==
* [https://talks.cs.umd.edu/lists/7?range=past Past talks, 2013 - present]
* [[CLIP Colloquium (Fall 2012)|Fall 2012]]
* [[CLIP Colloquium (Spring 2012)|Spring 2012]]
* [[CLIP Colloquium (Fall 2011)|Fall 2011]]
* [[CLIP Colloquium (Spring 2011)|Spring 2011]]
* [[CLIP Colloquium (Fall 2010)|Fall 2010]]
 

== CLIP NEWS ==
* News about CLIP researchers on the [http://www.umiacs.umd.edu/about-us/news UMIACS website]
* Please follow us on Twitter: [https://twitter.com/umdclip?lang=en @umdclip]