Hospitals and life-science institutes produce a tremendous amount of data on
a daily basis during the healthcare process and ordinary scientific
activity. Such data are highly valuable as they can be used to improve the
process of care delivery and prevention and can also play a pivotal role in
prospective clinical research. However, clinical, biological and imaging
data are usually gathered through diverse data collection channels and
procedures with varying degrees of reliability and trustworthiness. As
a consequence, the collected data are usually scattered over heterogeneous
data sources and suffer from quality problems that hamper their use for
analysis purposes.

Classical data quality issues can be observed, including missing or
erroneous data, as well as more complex problems, for example those
arising from secondary use of data in contexts other than the ones they
were collected for. Additionally, the distribution of data can evolve
over time, creating “data-glitches” that can cause interpretation errors
of high severity.

