Research Unit: University of Bologna

Department of Electronics, Computer sciences and Systems

 

Research Program of the Unit (model B)

Research Program Coordinator of the Unit

Prof. Ciaccia Paolo

Department of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS
Faculty of ENGINEERING
University of BOLOGNA

Viale Risorgimento, 2 - 40136 Bologna, Italy
Tel: +39 051 2093070
Fax: +39 051 2093540

E-mail: pciaccia@deis.unibo.it
Home page: http://www-db.deis.unibo.it/~pciaccia

Participants in this Research Unit
 
Participant          Department                                           Qualification
CIACCIA PAOLO        Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Full Professor
RIZZI STEFANO        Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Full Professor
GOLFARELLI MATTEO    Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Associate Professor
PATELLA MARCO        Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Researcher
PENZO WILMA          Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Researcher
BARTOLINI ILARIA     Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Researcher
BALDACCI LORENZO     Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Ph.D. Student
LINARI ALESSANDRO    Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Ph.D. Student
MATTEO DI CARLO      Dep. of ELECTRONICS, COMPUTER SCIENCES AND SYSTEMS   Contract

Specific Title of the Research Program of this Unit

Distributed query processing based on domain ontologies and source profiles

 
Description of the Research Program of this Unit

The topics the Bologna Unit will deal with are part of research Themes 1 and 3, and can be summarized as follows:
Theme 1:
- “Content summaries” creation for information sources (Content summaries)
Theme 3:
- Distributed query execution in WISDOM (Execution)
- Usage and navigation, based on ontologies, of the query results (Navigation)

In particular, the “Content summaries” task aims to deliver a statistical characterization (“profile”) of each data source, so that its relevance to a given query can be accurately evaluated and, consequently, the most relevant data sources can be selected. The “Execution” task deals with the distributed execution of queries on different data sources and with their coordination/synchronization, so as to determine the most relevant results with a minimal amount of resources. Finally, the “Navigation” task, which takes place after query processing, aims at defining mechanisms for exploiting the results through a synthetic and flexible representation that relies on the multiple abstraction levels available in a specific domain ontology.
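As an illustration of how a content summary could drive source selection, the following minimal Python sketch ranks sources by an estimated relevance. It assumes, purely for illustration, that a summary is a map from ontology concepts to estimated object counts; the names `estimate_relevance` and `select_sources`, the source names, and the counts are hypothetical and are not part of the project specification.

```python
from typing import Dict, List, Tuple

# Hypothetical content summary: ontology concept -> estimated number of objects.
Summary = Dict[str, int]


def estimate_relevance(summary: Summary, query_concepts: List[str]) -> float:
    """Estimated fraction of a source's objects that match the query concepts."""
    total = sum(summary.values())
    if total == 0:
        return 0.0
    matching = sum(summary.get(c, 0) for c in query_concepts)
    return matching / total


def select_sources(summaries: Dict[str, Summary],
                   query_concepts: List[str],
                   k: int) -> List[Tuple[str, float]]:
    """Return at most k sources with non-zero estimated relevance, best first."""
    scored = [(name, estimate_relevance(s, query_concepts))
              for name, s in summaries.items()]
    # Sort by decreasing relevance; break ties by source name for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [pair for pair in scored[:k] if pair[1] > 0]


# Illustrative data only (assumed, not taken from the project).
summaries = {
    "A": {"hotel": 80, "museum": 20},
    "B": {"hotel": 10, "restaurant": 90},
    "C": {"museum": 50},
}
selected = select_sources(summaries, ["hotel"], k=2)
```

In this sketch source C is never selected for a query on "hotel", because its summary reports no matching objects.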
Research on such topics, given the project structure, will be organized as follows:

PHASE 1
The first phase will be devoted to accurately defining the requirements for the 3 research topics; we will then critically review the related literature in order to identify the limits of the available solutions with respect to our goals. In detail:
(Content summaries) The analysis phase will survey state-of-the-art techniques for building profiles, in order to determine how they can be extended to the case where a data source is described by a domain ontology. In particular, we will define the requirements that must be satisfied by the content summaries so as to ensure that they can be effectively exploited to determine the relevance of a data source in answering a query.
(Execution) A thorough analysis of the different distributed query processing techniques will be carried out, so as to highlight their limits with respect to the WISDOM architecture (recall that in WISDOM a data source is externally perceived only through its domain ontology). In particular, the different aspects that may influence the relevance of a result will be analyzed to see to what extent they are affected by the WISDOM architecture.
(Navigation) We will analyze how query results can be processed in order to be returned to the user in a compact and easy-to-use form. Then, we will evaluate how the navigation and aggregation techniques used in business intelligence and data mining can be combined to ensure maximum flexibility in choosing the level of granularity at which data are presented. Furthermore, we will study how the paradigms devised for visually querying databases can be extended to queries that involve ontologies.
Finally, we will work, together with the other Units, on the definition of the methodological and functional architecture for the whole project (deliverable D0.R1).

PRODUCTS
The expected deliverables in this phase are technical reports (R). The number after the letter D represents the theme (0 for products common to all the themes).
D0.R1 Report on the methodological and functional architecture (in collaboration with Modena e Reggio Emilia - MO, Roma - RM, Trento - TN)
D1.R1 Review of the languages and emerging standards for ontologies (in collaboration with MO, RM, TN)
D3.R1 Review of the query languages and of the rewriting techniques based on ontologies (in collaboration with MO, TN)
D3.R2 Review of query processing techniques in heterogeneous environments

 

PHASE 2
During the second phase we will work on solutions for the 3 topics handled by the Unit:
(Content summaries) We will define the mechanisms for adding numeric information to the domain ontologies. The basic idea is to extend the existing techniques for “probing” the data sources by taking into account ontological information and the constraints derived from it. The extension will follow economy principles such as: 1) requiring as few “probes” as possible, and 2) returning the most significant quantitative information given a fixed amount of memory for storing content summaries. In line with the goals of Theme 1, we will specify how content summaries are updated when a new data source is added and when the corresponding domain ontology is extended.
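The two economy principles can be illustrated with a minimal Python sketch. It assumes, as a simplification not stated in the project, that a probe returns the number of objects a source holds for a concept, and that a sub-concept can never have more objects than its parent, so an empty concept makes probing its sub-concepts unnecessary. The function `build_summary`, the concept hierarchy, and the counts are all hypothetical.

```python
from typing import Callable, Dict, List


def build_summary(root: str,
                  children: Dict[str, List[str]],
                  probe: Callable[[str], int],
                  budget: int) -> Dict[str, int]:
    """Probe a source top-down along the concept hierarchy.

    Principle 1: if a concept has no objects, its sub-concepts are skipped,
    since (by assumption) they cannot have any objects either.
    Principle 2: only the `budget` largest counts are kept in the summary.
    """
    counts: Dict[str, int] = {}
    stack = [root]
    while stack:
        concept = stack.pop()
        n = probe(concept)
        counts[concept] = n
        if n > 0:
            stack.extend(children.get(concept, []))
    # Keep the largest counts; break ties by name for determinism.
    kept = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))[:budget]
    return dict(kept)


# Illustrative hierarchy and source contents (assumed).
children = {
    "Place": ["Hotel", "Museum"],
    "Hotel": ["Hostel"],
    "Museum": ["ArtMuseum", "ScienceMuseum"],
}
counts_at_source = {"Place": 120, "Hotel": 100, "Hostel": 30}
probed: List[str] = []


def probe(concept: str) -> int:
    """Stand-in for a real probe query; records which concepts were probed."""
    probed.append(concept)
    return counts_at_source.get(concept, 0)


summary = build_summary("Place", children, probe, budget=3)
```

Here the source holds no museums, so the two museum sub-concepts are never probed: four probes are issued instead of six.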
(Execution) The aim of this phase is to define a set of techniques for executing distributed queries that, within the limits imposed by the WISDOM architecture, return the most relevant results while minimizing the resources used. Since the relevance of a given object depends on several factors and on their relationships, the techniques will be general enough to work correctly and efficiently even when the combination criterion changes. For this criterion, which may initially be implemented as a weighted sum of the different factors, we will also consider the more general and expressive “qualitative” case, that is, one not based solely on numerical techniques.
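The difference between the two kinds of combination criterion can be sketched in Python as follows. The weighted sum follows the text directly; for the qualitative case the sketch uses Pareto dominance (a "skyline" of non-dominated objects), which is only one possible qualitative criterion and is an assumption made here for illustration. The factor names, weights, and scores are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: factor name -> score in [0, 1], higher is better.
Factors = Dict[str, float]


def weighted_score(obj: Factors, weights: Dict[str, float]) -> float:
    """Numerical criterion: weighted sum of the relevance factors."""
    return sum(w * obj.get(f, 0.0) for f, w in weights.items())


def top_k(objects: Dict[str, Factors], weights: Dict[str, float], k: int) -> List[str]:
    """The k objects with the highest weighted score, best first."""
    ranked = sorted(objects,
                    key=lambda o: (-weighted_score(objects[o], weights), o))
    return ranked[:k]


def dominates(a: Factors, b: Factors, factors: List[str]) -> bool:
    """a dominates b if it is no worse on every factor and better on at least one."""
    return (all(a[f] >= b[f] for f in factors)
            and any(a[f] > b[f] for f in factors))


def skyline(objects: Dict[str, Factors], factors: List[str]) -> List[str]:
    """Qualitative criterion (assumed): objects not dominated by any other object."""
    return sorted(
        o for o in objects
        if not any(dominates(objects[p], objects[o], factors)
                   for p in objects if p != o)
    )


# Illustrative data only.
objects = {
    "o1": {"text": 0.9, "quality": 0.2},
    "o2": {"text": 0.5, "quality": 0.5},
    "o3": {"text": 0.4, "quality": 0.4},
    "o4": {"text": 0.1, "quality": 0.9},
}
best_weighted = top_k(objects, {"text": 0.7, "quality": 0.3}, k=2)
best_qualitative = skyline(objects, ["text", "quality"])
```

With these weights o4 ranks last under the weighted sum, yet it belongs to the skyline because no other object beats it on quality; a qualitative criterion does not require committing to specific weights.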
(Navigation) As concerns the exploitation of query results, we will identify the techniques needed to precisely define the desired granularity level. In particular, we will define compact, semantically rich representations for information available at different abstraction levels, and we will identify the operators needed for interactive navigation across these levels according to the domain ontology.
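One possible navigation operator is a roll-up that groups results by their ancestor at a chosen abstraction level, in the style of business intelligence tools. The Python sketch below assumes a tree-shaped concept hierarchy and represents each result simply by its concept; the function names and the example ontology are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def ancestor_at_level(concept: str,
                      parent: Dict[str, Optional[str]],
                      level: int) -> str:
    """Ancestor of `concept` at depth `level` (root is level 0).

    If the concept is shallower than `level`, the concept itself is returned.
    """
    path = [concept]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    path.reverse()  # root first
    return path[min(level, len(path) - 1)]


def roll_up(results: List[str],
            parent: Dict[str, Optional[str]],
            level: int) -> Dict[str, int]:
    """Count query results grouped by their ancestor at the given level."""
    return dict(Counter(ancestor_at_level(c, parent, level) for c in results))


# Illustrative hierarchy (assumed): each concept maps to its parent.
parent = {
    "Place": None,
    "Accommodation": "Place",
    "Hotel": "Accommodation",
    "Hostel": "Accommodation",
    "Museum": "Place",
}
results = ["Hotel", "Hotel", "Hostel", "Museum"]
coarse = roll_up(results, parent, level=1)
fine = roll_up(results, parent, level=2)
```

Moving from level 1 to level 2 corresponds to a drill-down: the same four results are shown with a finer granularity.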
Finally, we will work, in collaboration with the other Units, on the definition of the interfaces of the different components of the integrated prototype (deliverable D0.R2).

PRODUCTS
D0.R2 Specification of the component interfaces of the integrated prototype (in collaboration with MO, RM, TN)
D1.R2 Definition of the language for the specification of domain ontologies (in collaboration with MO, TN)
D1.R3 Definition of the techniques for the creation of content summaries
D3.R3 Definition of the query language and of the ontology-based query rewriting techniques (in collaboration with MO, TN)
D3.R4 Definition of query execution techniques in the WISDOM environment

 

PHASE 3
During the third phase we will develop 3 prototypes and, jointly with the other Units, we will collaborate on the integration of the prototypes developed in the project.
The first prototype, starting from a pre-existing domain ontology, will implement “probing” techniques for the corresponding data sources and it will define algorithms for building the content summaries starting from the results obtained.
The second prototype (joint work with the MO Unit) will accept and analyze user queries. It will also determine the sets of relevant sources for the query at hand.
The third prototype will implement the query execution techniques defined during phase 2. Further, it will include an interface for an ontology-based interactive navigation at different abstraction levels.
An extensive experimental activity will be carried out in order to assess the performance of the prototypes.

PRODUCTS
Deliverables expected for this phase are software prototypes (P).
D0.P1 Integrated system prototype (in collaboration with MO, RM, TN)
D1.P2 Prototype for the creation of content summaries
D3.P1 Prototype for query specification (in collaboration with MO)
D3.P2 Prototype of the query execution engine