Apologies for multiple postings
CALL FOR PARTICIPATION
IEEE Workshop on Knowledge Acquisition from Distributed, Autonomous,
Semantically Heterogeneous Data and Knowledge Sources
Half-day workshop held from 1:15 to 6pm on November 27, 2005,
Houston, Texas, USA
In conjunction with The Fifth IEEE International Conference on Data
Mining, Houston, Texas, USA, November 27-30, 2005
INVITED SPEAKER: Dr. Bertram Ludaescher
"Scientific Data Integration: From the Big Picture to some Gory Details"
ABSTRACT. Many scientific disciplines, ranging from nuclear physics,
computational chemistry, geoinformatics, bioinformatics, and ecoinformatics
to astronomy and cosmology, are highly dependent on effective
ways to manage and integrate scientific data. In this talk, I will focus
on the scientific data integration challenges from two large-scale
projects, the Geosciences Network (GEON), which is building
"cyberinfrastructure" and tools for the geosciences community, and the
Science Environment for Ecological Knowledge (SEEK), which has a similar
mission to enable data integration and analysis for the ecological
sciences. Looking at the big picture, it turns out that data integration
is only one aspect of a set of larger scientific data management and
analysis challenges. Technologies in support of design and execution of
scientific workflows, including knowledge-based approaches, are needed
to address these larger issues. While interest in scientific workflows is
gaining momentum, many of the gory details still require considerable
attention and research effort. In the second part of this talk, I will
drill down into some of these issues, such as the use of knowledge
representation techniques to support data integration and scientific
workflow design and their relation to current data integration approaches
studied by the database community.
ABOUT THE SPEAKER. Dr. Ludaescher is an Associate Professor in the
Department of Computer Science at UC Davis, faculty member of the UC
Genome Center, and Fellow of the San Diego Supercomputer Center, UC San
Diego. His primary research interests are in scientific data management,
in particular scientific data integration, scientific workflows,
and knowledge-based extensions thereof. Until his move to Davis, he was a
member of the NIH-funded Biomedical Informatics Research Network
Coordination Center (BIRN-CC) at UC San Diego, focusing on database
mediation and knowledge representation issues. He is actively involved in
several large-scale, collaborative scientific data management projects,
i.e., the DOE Scientific Data Management Center (SciDAC/SDM), the NSF/ITR
Science Environment for Ecological Knowledge (SEEK), and NSF/ITR
Geosciences Network (GEON). Dr. Ludaescher received his MS in Computer
Science from the Technical University of Karlsruhe in 1992 and his PhD in
Computer Science from the University of Freiburg in 1998 (both in
Germany). From 1998 to 2004 he worked as a researcher at the San Diego
Supercomputer Center, at the end as a lab director for Knowledge-Based
ACCEPTED PAPERS
* Supporting Query-driven Mining over Autonomous Data Sources
* Combining Document Clusters Generated from Syntactic and Semantic
Feature Sets using Tree Combination Methods
Mahmood Hossain, Susan Bridges, Yong Wang and Julia Hodges
* Automatically Extracting Subsequent Response Pages from Web Search
Dheerendranath Mundluru, Zonghuan Wu, Vijay Raghavan, Weiyi Meng
and Hongkun Zhao
* Collaborative Package-Based Ontology Building and Usage
Jie Bao and Vasant Honavar
* OntoQA: Metric-Based Ontology Quality Analysis
Samir Tartir, I. Budak Arpinar, Michael Moore, Amit P. Sheth and
* A Heuristic Query Optimization for Distributed Inference on Life-
Takahiro Kosaka, Susumu Date, Hideo Matsuda and Shinji Shimojo
TOPICS OF INTEREST
Topics of interest include, but are not restricted to:
* Challenges presented by emerging data-rich application domains
such as bioinformatics, health informatics, security informatics,
social informatics, and environmental informatics.
* Knowledge discovery from distributed data (assuming different
types of data fragmentation, e.g., horizontal or vertical data
fragmentation; different hypothesis classes, e.g., naïve Bayes,
decision tree; different performance criteria, e.g., accuracy versus
complexity versus reliability of the model generated, etc.).
* Making semantically heterogeneous data sources self-describing
(e.g., by explicitly associating ontologies with data sources and
mappings between them) in order to help collaborative science.
* Representation, manipulation, and reasoning with ontologies and
mappings between ontologies.
* Learning ontologies from data (e.g., attribute value taxonomies).
* Learning mappings between semantically heterogeneous data source
schemas and between their associated ontologies.
* Knowledge discovery in the presence of ontologies (e.g., attribute
value taxonomies) and partially specified data (data described at
different levels of abstraction within an ontology).
* Online query relaxation when an initial query posed to the data
sources fails (i.e., returns no tuples), or equivalently, query-
driven mining of the individual sources that will result in knowledge
that can be used for query relaxation.
WORKSHOP ORGANIZERS
Doina Caragea, [hidden email]
Iowa State University
Vasant Honavar, [hidden email]
Iowa State University
Ion Muslea, [hidden email]
Language Weaver, Inc.
Raghu Ramakrishnan, [hidden email]
University of Wisconsin-Madison
PROGRAM COMMITTEE
Naoki Abe, IBM
Liviu Badea, ICI, Romania
Doina Caragea, Iowa State Univ.
Marie desJardins, UMBC
C. Lee Giles, Penn State Univ.
Vasant Honavar, Iowa State Univ.
Hillol Kargupta, UMBC
Sally McClean, U. of Ulster, UK
Bamshad Mobasher, DePaul U.
Ion Muslea, Language Weaver, Inc.
C. David Page, Univ. of Wisconsin
Alexandrin Popescul, Ask Jeeves
Raghu Ramakrishnan, Univ. of Wisconsin
Steffen Staab, Univ. of Koblenz
For more information, please visit the workshop page at:
We look forward to meeting you in Houston!