ACL-SIGLEX 2005 Workshop on Deep Lexical Acquisition

ACL 2005 post-conference workshop

30 June, 2005

Ann Arbor, USA

Workshop Description

In natural language processing (NLP), there is a pressing need to develop deep lexical resources (e.g. lexicons for linguistically-precise grammars, template sets for information extraction systems, ontologies for word sense disambiguation). Such resources are critical for enhancing the performance of systems and for improving their portability between domains. For example, to perform reliably, an information extraction system needs access to high-quality lexicons or templates specific to the task at hand.

Most deep lexical resources have been developed manually by lexicographers. Manual work is costly, and the resulting resources have limited coverage and require labour-intensive porting to new tasks. Automatic lexical acquisition is a more promising and cost-effective approach, and is increasingly viable given recent advances in NLP, machine learning technology and corpus availability.

While advances have recently been made in some areas of automatic deep lexical acquisition, a number of important challenges need addressing before benefits can be reaped in practical language engineering:

  • Acquisition of deep lexical information from corpora

    While corpus data has been successfully applied in learning certain types of deep lexical information (e.g. semantic relations, subcategorization, selectional preferences), there remains a broad range of lexical relations to which corpus-based techniques have yet to be applied.

  • Accurate, large-scale, portable acquisition techniques

    One of the biggest current research challenges is how to improve the accuracy of existing acquisition techniques further while also improving their scalability and robustness.

  • Use of deep lexical acquisition in recognised applications

    Although lexical acquisition has the potential to boost performance in many NLP application tasks, this has yet to be demonstrated for many important applications.

  • Multilingual deep lexical acquisition

    For theoretical and practical reasons it is important to test whether techniques developed for one language (typically English) can be used to benefit research on other languages.

Target Audience

The workshop is aimed at anyone interested in automatically acquired deep lexical information, e.g. in the areas of computational grammars, computational lexicography, machine translation, information retrieval, question-answering, and text mining.

Areas of Interest

  • Automatic acquisition of deep lexical information:
    • subcategorization
    • diathesis alternations
    • selectional preferences
    • lexical / semantic classes
    • qualia structure
    • lexical ontologies
    • semantic roles
    • word senses
  • Methods for supervised, unsupervised and weakly supervised deep lexical acquisition (machine learning, statistical, example- or rule-based, hybrid etc.)
  • Large-scale, cross-domain, domain-specific and portable deep lexical acquisition
  • Extending and refining existing lexical resources with automatically acquired information
  • Evaluation of deep lexical acquisition
  • Application of deep lexical acquisition to NLP applications (e.g. machine translation, information extraction, language generation, question-answering)
  • Multilingual deep lexical acquisition

Important Dates

Workshop date: 30 June, 2005

Workshop Programme

08:55-09:00 Opening Remarks
09:00-09:30 Data Homogeneity and Semantic Role Tagging in Chinese
  Oi Yee Kwong and Benjamin K. Tsou
09:30-10:00 Verb Subcategorization Kernels for Automatic Semantic Labeling
  Alessandro Moschitti and Roberto Basili
10:00-10:30 Identifying Concept Attributes Using a Classifier
  Massimo Poesio and Abdulrahman Almuhareb
10:30-11:00 Coffee Break
11:00-11:30 Automatically Learning Qualia Structures from the Web
  Philipp Cimiano and Johanna Wenderoth
11:30-12:00 Automatically Distinguishing Literal and Figurative Usages of Highly Polysemous Verbs
  Afsaneh Fazly, Ryan North and Suzanne Stevenson
12:00-12:30 Automatic Extraction of Idioms using Graph Analysis and Asymmetric Lexicosyntactic Patterns
  Dominic Widdows and Beate Dorow
12:30-14:00 Lunch
14:00-14:30 Frame Semantic Enhancement of Lexical-Semantic Resources
  Rebecca Green and Bonnie J. Dorr
14:30-15:30 It might be deep enough, but is it broad enough? Diversity in the lexicon
  Invited Speaker - Chris Brew, Ohio State University

This workshop is about deep lexical resources and their acquisition. In an ideal world, what would such resources be like? In other words, what aspects of the syntax, semantics, phonetics, pragmatics and sociolinguistics of language do we want to see encoded in the lexicon? Without disrespect to subcategorization, diathesis alternations, selectional preferences, lexical / semantic classes, qualia structure, lexical ontologies, semantic roles and word senses, all of which are explicit topics of the workshop, what else should we be thinking about? Topics addressed will include Greek babies, German verbs, English swearwords and spoken language morphology, all from a computational perspective.

15:30-16:00 Coffee Break
16:00-16:30 Bootstrapping Deep Lexical Resources: Resources for Courses
  Timothy Baldwin
16:30-17:00 Morphology vs. Syntax in Adjective Class Acquisition
  Gemma Boleda, Toni Badia and Sabine Schulte im Walde
17:00-17:30 Automatic Acquisition of Bilingual Rules for Extraction of Bilingual Word Pairs from Parallel Corpora
  Hiroshi Echizen-ya, Kenji Araki and Yoshio Momouchi
17:30-18:00 Approximate Searching for Distributional Similarity
  James Gorman and James R. Curran
18:00-18:05 Closing Remarks

Organising Committee

Timothy Baldwin
University of Melbourne, Australia

Anna Korhonen
University of Cambridge, UK

Aline Villavicencio
University of Essex, UK

Programme Committee

Collin Baker (University of California Berkeley, USA)
Roberto Basili (University of Rome Tor Vergata, Italy)
Francis Bond (NTT, Japan)
Chris Brew (Ohio State University, USA)
Ted Briscoe (University of Cambridge, UK)
John Carroll (University of Sussex, UK)
Stephen Clark (University of Oxford, UK)
Sonja Eisenbeiss (University of Essex, UK)
Christiane Fellbaum (Princeton University, USA)
Frederick Fouvry (University of Saarland, Germany)
Sadao Kurohashi (University of Tokyo, Japan)
Diana McCarthy (University of Sussex, UK)
Rada Mihalcea (University of North Texas, USA)
Tom O'Hara (University of Maryland, Baltimore County, USA)
Martha Palmer (University of Pennsylvania, USA)
Massimo Poesio (University of Essex, UK)
Philip Resnik (University of Maryland, USA)
Patrick Saint-Dizier (IRIT-CNRS, France)
Sabine Schulte im Walde (University of Saarland, Germany)
Mark Steedman (University of Edinburgh, Scotland, UK)
Mark Stevenson (University of Sheffield, UK)
Suzanne Stevenson (University of Toronto, Canada)
Dominic Widdows (MAYA Design, Inc., USA)
Yorick Wilks (University of Sheffield, UK)
Dekai Wu (Hong Kong University of Science and Technology, Hong Kong)
