Our world is in the midst of a revolution in information technology, at the centre of which is Big Data. We generate, collect, process, and store data at phenomenally high rates. Communication, entertainment, financial and health services, social networking and mobile services are just a few examples of how our day-to-day interactions have transformed into data exchanges. Science has entered a data-centric “fourth paradigm” which complements theory, experimentation and simulation with analytics on massive scientific data, enabling discoveries from Big Data based on empirical principles; the fourth paradigm constitutes a fundamental change in the scientific methods as we know them. Big Data is a resource that can be turned into economic, cultural and scientific value. At the 2014 Research Day international top experts in managing Big Data will explore in a series of talks both opportunities and risks that the Big Data revolution brings along, and EPFL researchers will provide an opportunity to have a glimpse of EPFL research highlights on Big Data.

Anastasia Ailamaki and Karl Aberer, Chairs

Poster of the event

Smart Data enabling Personalized Digital Health (4 June 2014, 4:00 PM, Room FA2C)



The proliferation of smartphones and sensors, the continuous monitoring of physiology and environment (personal health signals), notifications from public health sources (public health signals), and greater digital access to clinical data are resulting in massive multisensory and multimodal observational data. The technology has significant potential to improve health and well-being through early detection, better diagnosis, effective prevention and treatment of disease, and improved quality of life. However, to make this personalized digital medicine a reality, it is crucial to derive actionable insights from data that includes heterogeneous and fine-grained observations.


At Kno.e.sis, we have collaborations with clinicians in a growing number of specializations (Cardiovascular, Pulmonology, Gastroenterology) to study personalized health decision making that involves the use of real-world patient data, deep background knowledge and well-targeted clinical applications. For example:


  • For a patient discharged from hospital with Acute Decompensated Heart Failure, can we compute a post-discharge risk factor to reduce 30-day readmissions?
  • For children with Asthma, can we predict an impending attack to enable actions that prevent the attack, reducing the need for post-attack symptomatic relief?
  • For Parkinson’s Disease, can we characterize the progression to adjust medication and therapeutic changes?


The above provides the context for a research agenda around what I call Smart Data, which (a) provides value from harnessing the challenges posed by the volume, velocity, variety and veracity of Big Data, in turn providing actionable information and improving decision making, and/or (b) is focused on the actionable value achieved by human involvement in the data creation, processing and consumption phases for improving the human experience. In describing the Smart Data approach to the above health applications, I will cover the following technical capabilities that add semantics to enhance or complement traditional NLP- and ML-centric solutions:


  • Semantic Sensor Web - including semantic computation infrastructure, the ability to semi-automatically create domain-specific background knowledge (ontology) from unstructured data (e.g., EMR), and automatic semantic annotation of multimodal and multisensory data
  • Semantic perception – convert low level signals into higher level abstractions using IntellegO framework that utilizes domain knowledge and hybrid abductive/deductive reasoning
  • Intelligence at Edge - perform scalable and efficient semantic computation on resource constrained devices

Transforming Big Data into Smart Data (5 June 2014, 11:00 AM, Room FA2A):


Deriving Value via harnessing Volume, Variety, and Velocity using semantic techniques and technologies




Big Data has captured a lot of interest in industry, with anticipation of better decisions, efficient organizations, and many new jobs. Much of the emphasis is on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and on technologies that handle volume, including storage and computational techniques to support analysis (Hadoop, NoSQL, MapReduce, etc.). However, the most important feature of Big Data, the raison d'etre, is none of these four Vs, but value. In this talk, I will put forward the concept of Smart Data, which is realized by extracting value from a variety of data, and show how Smart Data for a growing variety of Big Data (e.g., social, sensor/IoT, health care) enables a much larger class of applications that can benefit not just large companies but each individual. This requires organized ways to harness and overcome the four V-challenges. In particular, we will need to utilize metadata, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP.


For harnessing Volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies to support semantic interoperability and integration. Lastly, for Velocity, I will discuss somewhat more recent work on Continuous Semantics, which dynamically creates models of new objects, concepts, and relationships and uses them to better understand new cues in the data that capture rapidly evolving events and situations.


Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response and smart city. I will present examples from a couple of these. 

Speaker Bio: Amit P. Sheth is an educator, researcher, and entrepreneur. He is the LexisNexis Eminent Scholar and founder/executive director of the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis). Kno.e.sis conducts research in social/sensor/semantic data and Web 3.0 with real-world applications and multidisciplinary solutions for translational research, healthcare and life sciences, cognitive science, material sciences, etc. Kno.e.sis' activities have resulted in Wright State University being recognized as one of the top organizations in the world for research impact on the World Wide Web. Prof. Sheth is one of the top authors in Computer Science, World Wide Web and databases (cf: Microsoft Academic Search; Google H-index=85). His research has led to several commercial products, many real-world applications, and two earlier companies, with two more in early stages. One of these was Taalee/Voquette/Semagix, which was likely the first company (founded in 1999) that developed Semantic Web enabled search and analysis, and semantic application development platforms. He is founding EIC of IJSWIS and co-EIC of IJSWIS and DAPD.

Website of the conference: IC3K 2014


Big data is a popular term for describing the exponential growth, availability and use of information, both structured and unstructured. Much has been written on the big data trend and its potential for innovation and growth of enterprises. The advice of IDC (one of the premier advisory firms specializing in information technology) for organizations and IT leaders is to focus on the ever-increasing volume, variety and velocity of information that forms big data.
In most cases, such a huge volume of data comes from multiple sources and across heterogeneous systems; thus, data have to be linked, matched, cleansed and transformed. Moreover, it is necessary to determine how disparate data relate to common definitions and how to systematically integrate structured and unstructured data assets to produce useful, high-quality and up-to-date information.
The research area of Data Integration, active since the 1990s, has provided good techniques for facing the above issues in a unifying framework, Relational Databases (RDB), with reference to a less complex scenario (smaller volume, variety and velocity). Moreover, simpler forms of integration among different databases can be efficiently resolved by the Data Federation technologies used for DBMSs today.
Adopting RDB as a general framework for big data integration and solving the issues above, namely volume, variety, variability and velocity, by using more powerful RDBMS technologies enhanced with data integration techniques is a possible choice. On the other hand, new emerging technologies have come into play: NoSQL systems and technologies, data warehouse appliance platforms provided by the major software players, data governance platforms, etc.
In this talk, Prof. Sonia Bergamaschi will provide an overview of this exciting field, which will become more and more important.

Friday, 11 April 2014 14:06

Presentation at WEBIST 2014 Conference


Laura Po attended the 10th International Conference on Web Information Systems and Technologies WEBIST 2014.
The conference was held in Barcelona, Spain, from the 3rd to the 5th of April.
Laura was the session chair and the presenting author for the paper "Comparing Topic Models for a Movie Recommendation System", written in collaboration with Sonia Bergamaschi and Serena Sorrentino. The presentation is now available online on SlideShare.

The weekly supplement “Eventi” (Year 7, Issue 13, Monday 24 March 2014), published by Il Sole 24 ORE, covers the research activity of the DBGroup:

More information from big data

From pure research to technology transfer: new solutions for managing big data


Read the article.

The DBGROUP contributed to the whitepaper “UNLEASHING THE POTENTIAL OF BIG DATA”, based on the 2013 World Summit on Big Data and Organization Design, initiated by the Organizational Design Community (ODC) and co-sponsored by IBM.

Sonia Bergamaschi is the organizer of the Special Session on Big Data Principles, Architectures & Applications at the International Conference on High Performance Computing & Simulation 2014. The conference will be held in Bologna from 21 to 25 July. Submissions are open until 31 March.

For more information:

Thesis on Big Data in the USA: Paolo Malavolta and Emanuele Charambalis, students in the master's degree program in Computer Engineering, are carrying out their master's theses on Big Data under the joint supervision of Professor Sonia Bergamaschi and Prof. H.V. Jagadish (Bernard A. Galler Collegiate Professor of Electrical Engineering and Computer Science at the University of Michigan), at the “Enzo Ferrari” Department of Engineering of the University of Modena and Reggio Emilia and at the University of Michigan.
We thank the sponsors, who contributed 2000 euros each: Eng. Ragni, president of the Associazione Specialisti di Sistemi Informativi di Bologna (assi-bo), and Eng. Orsini, president of the UNIMORE spin-off DATARIVER.

Copyright © 2019 DataBase Group. For suggestions, write to the Webmaster.