Tuesday, July 22
Location: Gloria Ballroom

*Part I*
10:50 a.m. - 12:30 p.m.

- Introduction - Giuseppe Fiameni (10')
- Big Data Analysis: Trends & Challenges - Sonia Bergamaschi (30')
- Real-Time Big Data Analytics: Applications and Challenges - Nader Mohamed (20')
- Using Big Data to Support Automatic Word Sense Disambiguation - Giovanni Simonini (20')
- From News to Facts: An Hadoop-based Social Graphs Analysis - Piera Laura Puglisi (20')

*Part II*
1:30 p.m. - 3:35 p.m.

- Personalized Management of Semantic, Dynamic Data in Pervasive Systems: Context-Addict Revisited - Emanuele Panigati (20')
- ART Lab Infrastructure for Semantic Big Data Processing - Maria Teresa Pazienza (20')
- A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce - Fabio Fumarola (20')

- Ophidia: A Full Software Stack for Scientific Data Analytics - Sandro Fiore (20')
- Toward a Big Data Exploration Framework for Astronomical Archives - Ugo Becciani (20')

University of Milan-Bicocca has an open position for a postdoctoral researcher in the context of the EU FP7 COMSODE Project. The researcher will address research topics in the field of knowledge extraction and semantic matching methods for multi-lingual open data management.
Title of the position: “Knowledge extraction and information matching for profiling and linking multi-lingual open data sets.”
COMSODE is an SME-driven RTD project aimed at progressing the capabilities in the field of Open Data re-use. Our concept is an answer to barriers still present in this young area: data published by various Open Data catalogues are poorly integrated; quality assessment and cleansing are seldom addressed. Costs of Open Data consumption are high and Open Data usage is still poor. COMSODE addresses these challenges by developing components for the quality assessment and improvement of open data, possibly published as linked open data.
The postdoctoral researcher will support the research team in the field of linked open data, for example by investigating methods for the semantic profiling of open data sets and the annotation and linking of the data schemas.
Data-driven methods for schema extraction from weakly structured sources (tabular data and RDF) and for the alignment of Linked Open schemas will be investigated in order to help users of the COMSODE platform to (i) publish data profiles that express the content of the data (often available in CSV format), so as to make the data searchable from inside the COMSODE platform, and (ii) semantically annotate the schema and link the schema elements to existing data sets, so as to make the data interlinked inside and outside the COMSODE platform.
Research can leverage previous work carried out by the research team at Unimib, some of it in collaboration with other international research groups, in the fields of ontology matching for linked open data, facet extraction, cross-language link discovery, and schema integration at large scale.
The research will be carried out in a highly collaborative environment: the candidate is expected to collaborate with members of our team (faculty members, postdocs and PhD students), other COMSODE partners, and researchers of other institutions. The candidate is expected to publish his/her results by co-authoring papers on topics such as schema annotation, data profiling, and data linking in high-profile conferences and journals in the area of data management and semantic web.
Required/desired qualifications:
Requirements for applying to the position are:
- A PhD in one of these fields: Computer Science, Computer Engineering, Information Systems Management, Computational Linguistics, or related topics
- Good communication and writing skills in English
Other desired (but not mandatory) skills are:
- Excellent programming skills
- Familiarity with Linked Data and Semantic Web languages (RDF, OWL2)
- Familiarity with probabilistic models or other matching methods 
- Python and shell scripting
- Previous experience with cloud infrastructure such as Amazon EC2 and EBS, Heroku, etc.
Note that knowledge of Italian is NOT required for this position.
Salary, start date and duration:
The salary is 32,600 euro gross per year (approximately 2,000 euro net per month). The position will start on September 1, 2014 and is offered for a period of 1 year, with a possible extension subject to funding availability.
Milan is the main industrial, commercial and financial centre of Italy and a lively leading global city. University of Milan-Bicocca is a young but well-established university with a strong asset in research (it was ranked 21st in the Times Higher Education list of the world's young universities). The research will be carried out at the ITIS Lab of the Department of Computer Science, Systems and Communication, which is involved in several European and national projects and conducts theoretical and applied research in several fields such as Semantic Web, Information Systems and Service Science. UniMiB offers facilities for postdoctoral researchers.
As for every other equivalent position in Italy, candidates have to apply to a public competition following the official instructions. Please note that the application has to arrive at Unimib and be registered before 7th July 2014.
For any information about this position, please contact Andrea Maurino (maurino{at} or Matteo Palmonari (palmonari{at}

Sonia Bergamaschi and Giovanni Simonini presented a paper at SEBD 2014:


Towards Declarative Imperative Data-parallel Systems (Discussion Paper).
Matteo Interlandi, Giovanni Simonini and Sonia Bergamaschi.

The design of computerized systems

In-depth meeting on Ministerial Decree (DM) 37/2008

Event poster

Our world is in the midst of a revolution in information technology, at the centre of which is Big Data. We generate, collect, process, and store data at phenomenally high rates. Communication, entertainment, financial and health services, social networking and mobile services are just a few examples of how our day-to-day interactions have transformed into data exchanges. Science has entered a data-centric “fourth paradigm” which complements theory, experimentation and simulation with analytics on massive scientific data, enabling discoveries from Big Data based on empirical principles; the fourth paradigm constitutes a fundamental change in the scientific methods as we know them. Big Data is a resource that can be turned into economic, cultural and scientific value. At the 2014 Research Day, top international experts in managing Big Data will explore, in a series of talks, both the opportunities and the risks that the Big Data revolution brings along, and EPFL researchers will offer a glimpse of EPFL research highlights on Big Data.

Anastasia Ailamaki and Karl Aberer, Chairs

Poster of the event

Smart Data enabling Personalized Digital Health (June 4, 2014, 4:00 p.m., Room FA2C)



The proliferation of smartphones and sensors, the continuous monitoring of physiology and environment (personal health signals), notifications from public health sources (public health signals), and more digital access to clinical data are resulting in massive multisensory and multimodal observational data. The technology has significant potential to improve health and well-being through early detection, better diagnosis, effective prevention and treatment of disease, and improved quality of life. However, to make this personalized digital medicine a reality, it is crucial to derive actionable insights from data including heterogeneous and fine-grained observations.


At Kno.e.sis, we have collaborations with clinicians in a growing number of specializations (Cardiovascular, Pulmonology, Gastroenterology) to study personalized health decision making that involves the use of real-world patient data, deep background knowledge and well-targeted clinical applications. For example:


  • For a patient discharged from hospital with Acute Decompensated Heart Failure, can we compute post-discharge risk factors to reduce 30-day readmissions?
  • For children with Asthma, can we predict an impending attack to enable actions that prevent the attack, reducing the need for post-attack symptomatic relief?
  • For Parkinson’s Disease, can we characterize the progression to adjust medication and therapeutic changes?


The above provides the context for a research agenda around what I call Smart Data, which (a) provides value from harnessing the challenges posed by the volume, velocity, variety and veracity of Big Data, in turn providing actionable information and improving decision making, and/or (b) is focused on the actionable value achieved by human involvement in the data creation, processing and consumption phases for improving the human experience. In describing the Smart Data approach to the above health applications, I will cover the following technical capabilities that add semantics to enhance or complement traditional NLP- and ML-centric solutions:


  • Semantic Sensor Web - including semantic computation infrastructure, the ability to semi-automatically create domain-specific background knowledge (ontology) from unstructured data (e.g., EMR), and automatic semantic annotation of multimodal and multisensory data
  • Semantic perception - convert low-level signals into higher-level abstractions using the IntellegO framework, which utilizes domain knowledge and hybrid abductive/deductive reasoning
  • Intelligence at Edge - perform scalable and efficient semantic computation on resource-constrained devices

Transforming Big Data into Smart Data (June 5, 2014, 11:00 a.m., Room FA2A):


Deriving Value via harnessing Volume, Variety, and Velocity using semantic techniques and technologies




Big Data has captured a lot of interest in industry, with anticipation of better decisions, efficient organizations, and many new jobs. Much of the emphasis is on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and on technologies that handle volume, including storage and computational techniques to support analysis (Hadoop, NoSQL, MapReduce, etc.). However, the most important feature of Big Data, the raison d'etre, is none of these four Vs but value. In this talk, I will put forward the concept of Smart Data, which is realized by extracting value from a variety of data, and show how Smart Data for the growing variety (e.g., social, sensor/IoT, health care) of Big Data enables a much larger class of applications that can benefit not just large companies but each individual. This requires organized ways to harness and overcome the four V-challenges. In particular, we will need to utilize metadata, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP.


For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration.  Lastly, for Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships and uses them to better understand new cues in the data that capture rapidly evolving events and situations.  


Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response and smart city. I will present examples from a couple of these. 

Speaker Bio: Amit P. Sheth is an educator, researcher, and entrepreneur. He is the LexisNexis Eminent Scholar and founder/executive director of the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis). Kno.e.sis conducts research in social/sensor/semantic data and Web 3.0 with real-world applications and multidisciplinary solutions for translational research, healthcare and life sciences, cognitive science, material sciences, etc. Kno.e.sis' activities have resulted in Wright State University being recognized as a top organization in the world on World Wide Web in research impact. Prof. Sheth is one of the top authors in Computer Science, World Wide Web and databases (cf: Microsoft Academic Search; Google H-index=85). His research has led to several commercial products, many real-world applications, and two earlier companies with two more in early stages. One of these was Taalee/Voquette/Semagix, which was likely the first company (founded in 1999) that developed Semantic Web enabled search and analysis, and semantic application development platforms. He is founding EIC of IJSWIS and co-EIC of IJSWIS and DAPD.

Website of the conference: IC3K 2014


Big data is a popular term for describing the exponential growth, availability and use of information, both structured and unstructured. Much has been written on the big data trend and its potential for innovation and growth of enterprises. The advice of IDC (one of the premier advisory firms specializing in information technology) for organizations and IT leaders is to focus on the ever-increasing volume, variety and velocity of information that forms big data.
In most cases, such a huge volume of data comes from multiple sources and across heterogeneous systems; thus, data have to be linked, matched, cleansed and transformed. Moreover, it is necessary to determine how disparate data relate to common definitions and how to systematically integrate structured and unstructured data assets to produce useful, high-quality and up-to-date information.
The research area of Data Integration, active since the 1990s, provided good techniques for addressing the above issues in a unifying framework, Relational Databases (RDB), with reference to a less complex scenario (smaller volume, variety and velocity). Moreover, simpler forms of integration among different databases can be efficiently resolved by the Data Federation technologies used for DBMSs today.
Adopting RDB as a general framework for big data integration and solving the issues above, namely volume, variety, variability and velocity, by using more powerful RDBMS technologies enhanced with data integration techniques is a possible choice. On the other hand, new emerging technologies have come into play: NoSQL systems and technologies, data warehouse appliance platforms provided by the major software players, data governance platforms, etc.
In this talk, Prof. Sonia Bergamaschi will provide an overview of this exciting field that will become more and more important.

Friday, 11 April 2014 14:06

Presentation at WEBIST 2014 Conference

Laura Po attended the 10th International Conference on Web Information Systems and Technologies WEBIST 2014.
The conference was held in Barcelona, Spain from the 3rd to the 5th of April.
Laura was the session chair and the presenting author for the paper "Comparing Topic Models for a Movie Recommendation System", written in collaboration with Sonia Bergamaschi and Serena Sorrentino. The presentation is now available online at slideshare.
