Mini Course MAKE-Health:
Machine Learning & Knowledge Extraction
for Health Informatics

“It is remarkable that a science which began with the consideration of games of
chance should have become the most important object of human knowledge”
Pierre Simon de Laplace, 1812.

Summer Term 2017
Venue: Università di Verona > Dipartimento di Informatica

Week 15, April 10-14, 2017
Lecture Room: Strada Le Grazie 15 – 37134 VERONA

Short Description: The mini-course MAKE-Health follows a research-based teaching (RBT) approach and discusses experimental methods for combining human intelligence with machine learning to extract and discover knowledge from health data. For practical applications we focus on Python, which is currently one of the most widely used languages for machine learning and knowledge extraction.

Motto: Science is to test crazy ideas – Engineering is to put these ideas into Business.


Machine Learning & Health Informatics is a Growing Challenge:

Machine learning (ML) is the fastest growing field in computer science (Jordan & Mitchell, 2015. Machine learning: Trends, perspectives, and prospects. Science, 349, (6245), 255-260), and it is well accepted that health informatics is amongst the greatest challenges (LeCun, Bengio, & Hinton, 2015. Deep learning. Nature, 521, (7553), 436-444).
Future medicine will be a data science, and privacy-aware machine (un-)learning is no longer a nice-to-have but a must.

Internationally outstanding universities count on the combination of machine learning and health informatics and expand these fields, for example: Carnegie Mellon University, Harvard, Stanford – just to name a few!

Machine Learning & Health Informatics pose enormous Business Opportunities:

McKinsey: An executive’s guide to machine learning
NY Times: The Race Is On to Control Artificial Intelligence, and Tech’s Future
Economist: Million-dollar babies

Machine Learning & Health Informatics provide Employability for Graduates:

“Fei-Fei Li, a Stanford University professor who is an expert in computer vision, said one of her Ph.D. candidates had an offer for a job paying more than $1 million a year, and that was only one of four from big and small companies.”

Machine Learning & Health Informatics has Market Opportunity for Spin-offs:

“By 2020, the market for machine learning applications will reach $40 billion, IDC, a market research firm, estimates.
By 2018, IDC predicts, at least 50 percent of developers will include A.I. features in what they create.”


The goal of ML is to develop algorithms which can learn and improve over time and can be used for predictions. In automatic Machine Learning (aML), great advances have been made, e.g., in speech recognition, recommender systems, or autonomous vehicles. Automatic approaches, e.g. deep learning, greatly benefit from big data with many training sets. In the health domain, however, we are sometimes confronted with a small number of data sets or with rare events, where aML approaches suffer from insufficient training samples.

Here interactive Machine Learning (iML) may be of help, having its roots in Reinforcement Learning (RL), Preference Learning (PL) and Active Learning (AL). The term iML can be defined as algorithms that can interact with agents and can optimize their learning behaviour through these interactions, where the agents can also be human. This human-in-the-loop can be beneficial in solving computationally hard problems, e.g., subspace clustering, protein folding, or k-anonymization, where human expertise can help to reduce an exponential search space through heuristic selection of samples. Therefore, what would otherwise be an NP-hard problem can be reduced greatly in complexity through the input and the assistance of a human agent involved in the learning phase.

However, although humans are excellent at pattern recognition in dimensions of ≤3, most biomedical data sets are in dimensions much higher than 3, making manual data analysis very hard. Successful application of machine learning in health informatics requires considering the whole pipeline from data preprocessing to data visualization. Consequently, this course fosters the HCI-KDD approach, which encompasses a synergistic combination of methods from two areas to unravel such challenges: Human-Computer Interaction (HCI) and Knowledge Discovery/Data Mining (KDD), with the goal of supporting human intelligence with machine learning.
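
To make the idea of a learner that improves by querying an agent more concrete, here is a minimal sketch of active learning, one of the roots of iML named above. Everything in it is an illustrative assumption and not course material: the one-dimensional threshold task, the `oracle` function that stands in for a human expert, and the names `active_threshold_learning` and `true_threshold`. The learner asks for a label only where it is most uncertain, so it needs far fewer labels than labelling the whole pool would.

```python
def active_threshold_learning(pool, oracle, max_queries):
    """Learn a 1-D decision threshold by asking an oracle for labels.

    pool: sorted list of unlabeled values.
    oracle: callable returning 0 or 1 for a value (stands in for a human expert).
    Assumes labels are 0 below an unknown threshold and 1 at or above it.
    Returns (estimated_threshold, number_of_queries_used).
    """
    lo, hi = 0, len(pool) - 1
    queries = 0
    # Invariant: every point left of lo is labelled 0, every point right of hi is 1.
    while lo <= hi and queries < max_queries:
        mid = (lo + hi) // 2          # most uncertain point: middle of the open range
        label = oracle(pool[mid])     # one interaction with the (human) agent
        queries += 1
        if label == 1:
            hi = mid - 1
        else:
            lo = mid + 1
    if lo > hi:
        # Converged: the boundary lies between pool[lo - 1] and pool[lo].
        left = pool[lo - 1] if lo > 0 else pool[0]
        right = pool[lo] if lo < len(pool) else pool[-1]
        return (left + right) / 2, queries
    # Query budget exhausted: return the middle of the remaining uncertain range.
    return pool[(lo + hi) // 2], queries


# Hypothetical setup: 1000 unlabeled points and a simulated expert.
pool = [i / 1000 for i in range(1000)]
true_threshold = 0.3705


def oracle(x):
    return int(x >= true_threshold)


estimate, used = active_threshold_learning(pool, oracle, max_queries=20)
print(f"estimated threshold {estimate:.4f} after {used} label queries")
```

In this toy setting the number of label queries grows only logarithmically with the pool size. Real iML problems are much less well behaved, so the sketch only illustrates the principle of guiding the search through targeted interaction.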

Course Content:

For the successful application of ML in health informatics, a comprehensive understanding of the whole HCI-KDD pipeline, ranging from the physical data ecosystem to the understanding of the end-user in the problem domain, is necessary. In the medical world the inclusion of privacy, data protection, safety and security is mandatory. This three-module (lucky Chinese number three) course provides an introduction to some selected topics of machine learning and knowledge extraction (MAKE) for health informatics.

Module 01 – Introduction: Machine Learning meets health informatics – challenges and future directions

In the first module we get only a rough overview of the differences between automatic machine learning and interactive machine learning, and we discuss a few future challenges as a teaser.

Topic 01: The HCI-KDD approach: Towards an integrative MAKE-pipeline
Topic 02: Understanding Intelligence
Topic 03: The complexity of the application area Health
Topic 04: Probabilistic Information & Gaussian Processes
Topic 05: Automatic Machine Learning (aML)
Topic 06: Interactive Machine Learning (iML)
Topic 07: Active Representation Learning
Topic 08: Multi-Task Learning
Topic 09: Generalization and Transfer Learning

Lecture slides 2×2 (7,051 kB): 01-DAY-MAKE-Challenges-HOLZINGER-Verona-2017-2×2

Here are some prereading/postreading and video recommendations:

Module 02 – Health Data Jungle: Selected Topics on Fundamentals of Data and Information Entropy

Topic 01 Data – The underlying physics of data
Topic 02 Data – Biomedical data sources – taxonomy of data
Topic 03 Data – Integration, Mapping and Fusion of data
Topic 04 Information  – Bayes and Laplace probabilistic information p(x)
Topic 05 Information Theory – Information Entropy
Topic 06 Information Cross-Entropy and Kullback-Leibler Divergence

Lecture Slides 2×2 (8,520 kB) 02-DAY-MAKE-Data-HOLZINGER-Verona-2017-2×2

Keywords: data, information, probability, entropy, cross-entropy, Kullback-Leibler divergence
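
The three information measures in the keywords above can be sketched in a few lines of Python. This is a minimal illustration for discrete distributions, measured in bits. The function names and the two example distributions `p` and `q` are assumptions chosen for illustration and do not come from the lecture slides.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in bits of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(p || q) = H(p, q) - H(p) in bits."""
    return cross_entropy(p, q) - entropy(p)


# Hypothetical example distributions over three outcomes.
p = [0.5, 0.25, 0.25]
q = [1 / 3, 1 / 3, 1 / 3]

print("H(p)       =", entropy(p))
print("H(p, q)    =", cross_entropy(p, q))
print("KL(p || q) =", kl_divergence(p, q))
print("KL(q || p) =", kl_divergence(q, p))
```

The last two lines show that the divergence is not symmetric, which is why it is called a divergence and not a distance.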

Learning Goals:
At the end of this module the students
1) are aware of the problems of health data and understand the importance of data integration in the life sciences.
2) understand the concept of probabilistic information with a focus on the problem of estimating the parameters of a Gaussian distribution (maximum likelihood problem).
3) recognize the usefulness of relative entropy, also called Kullback–Leibler divergence, which is very important, particularly for sparse variational methods between stochastic processes.
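
Learning goal 2 can be illustrated with a short sketch of maximum likelihood estimation for a one-dimensional Gaussian. The data are synthetic and generated only for this example, and the names `gaussian_mle` and `log_likelihood` are assumptions. The estimates are the sample mean and the sample variance with divisor n, which is the maximiser of the Gaussian log-likelihood.

```python
import math
import random


def gaussian_mle(samples):
    """Maximum likelihood estimates (mean, variance) of a 1-D Gaussian."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n  # MLE divides by n, not n - 1
    return mu, var


def log_likelihood(samples, mu, var):
    """Log-likelihood of the samples under a Gaussian N(mu, var)."""
    n = len(samples)
    squared_error = sum((x - mu) ** 2 for x in samples)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)


# Synthetic data: 5000 draws from N(5, 2^2), fixed seed for repeatability.
rng = random.Random(42)
data = [rng.gauss(5.0, 2.0) for _ in range(5000)]

mu_hat, var_hat = gaussian_mle(data)
print(f"estimated mean {mu_hat:.3f}, estimated variance {var_hat:.3f}")
print("log-likelihood at the estimate:", log_likelihood(data, mu_hat, var_hat))
```

Evaluating `log_likelihood` at slightly shifted parameters gives a lower value, which is a quick way to check that the closed-form estimate really sits at the maximum.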

Here are some prereading/postreading recommendations (alphabetically sorted according to author):

Additional reading to foster a deeper understanding of information theory related to the life sciences:

  • Manca, Vincenzo (2013). Infobiotics: Information in Biotic Systems. Heidelberg: Springer. (This book is a fascinating journey through the world of discrete biomathematics and a continuation of the 1944 Paper by Erwin Schrödinger: What Is Life? The Physical Aspect of the Living Cell, Dublin, Dublin Institute for Advanced Studies at Trinity College)

Module 03 – Probabilistic Graphical Models Part 1: From Knowledge Representation to Graph Model Learning

To prepare for the second tutorial on probabilistic programming, this module starts with reasoning under uncertainty, provides some basics on graphical models, and goes towards graph model learning. It then moves on to methods for sampling from probability distributions based on Markov chains, known as Markov Chain Monte Carlo (MCMC). MCMC is very important: it may resemble how our brain works, it allows computing hierarchical models with a large number of unknown parameters, and it also works well for rare event sampling, which is often needed in the health informatics domain. One such MCMC method is the Metropolis-Hastings algorithm, which obtains a sequence of random samples from high-dimensional probability distributions, with which we are often challenged in the health domain. The algorithm is counted among the top 10 most important algorithms and is named after Nicholas Metropolis (1915-1999) and Wilfred K. Hastings (1930-2016): the former introduced it in 1953 and the latter generalized it in 1970 (remember: generalization is a grand goal in science).
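
The sampling idea described above can be sketched as a random-walk sampler in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the course: the proposal is a symmetric Gaussian step, so the Hastings correction factor equals 1 and the code reduces to the original 1953 Metropolis rule. The one-dimensional target N(3, 1) and the names `metropolis_hastings`, `log_target`, `step` and `burn_in` are chosen only for this example.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalised log density.

    log_target: callable returning the log of the target density up to a constant.
    Returns the list of visited states (a rejected proposal repeats the old state).
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)    # symmetric proposal around x
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def log_target(x):
    """Unnormalised log density of the illustrative target N(3, 1)."""
    return -0.5 * (x - 3.0) ** 2


chain = metropolis_hastings(log_target, x0=0.0, n_samples=20000)
burn_in = 2000                                  # discard the start-up phase
kept = chain[burn_in:]
mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / len(kept)
print(f"sample mean {mean:.2f}, sample variance {var:.2f}")
```

Only the ratio of target densities enters the acceptance step, which is why the normalising constant of the target is never needed. For an asymmetric proposal the ratio of proposal densities would have to be added to the acceptance probability, which is the generalisation attributed to Hastings.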

Topic 01 Reasoning/Decision Making under uncertainty
Topic 02 Graphs > Networks
Topic 03 Examples of Knowledge Representation in Network Medicine
Topic 04 Graphical Models and Decision Making
Topic 05 Bayes’ Nets
Topic 06 Graphical Model Learning
Topic 07 Probabilistic Programming
Topic 08 Markov Chain Monte Carlo (MCMC)
Topic 09 Example: Metropolis Hastings Algorithm
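
As a small companion to the topics on Bayes' nets and decision making under uncertainty, the following sketch performs exact inference by enumeration in a three-node network. The network structure (risk factor R, disease D, test result T, connected as R -> D -> T), all probabilities in the tables, and the function names are hypothetical and chosen only for illustration. They are not clinical values.

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative numbers only).
P_R = {1: 0.3, 0: 0.7}                      # P(R)
P_D_given_R = {1: {1: 0.10, 0: 0.90},       # P(D | R = 1)
               0: {1: 0.02, 0: 0.98}}       # P(D | R = 0)
P_T_given_D = {1: {1: 0.95, 0: 0.05},       # P(T | D = 1)
               0: {1: 0.05, 0: 0.95}}       # P(T | D = 0)


def joint(r, d, t):
    """Joint probability factorised along the graph R -> D -> T."""
    return P_R[r] * P_D_given_R[r][d] * P_T_given_D[d][t]


def posterior(query, evidence):
    """P(query | evidence) by brute-force enumeration of all assignments.

    query, evidence: dicts mapping the variable names 'R', 'D', 'T' to 0 or 1.
    """
    numerator = denominator = 0.0
    for r, d, t in product((0, 1), repeat=3):
        world = {"R": r, "D": d, "T": t}
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(r, d, t)
            denominator += p
            if all(world[k] == v for k, v in query.items()):
                numerator += p
    return numerator / denominator


print("P(D=1)       =", posterior({"D": 1}, {}))
print("P(D=1 | T=1) =", posterior({"D": 1}, {"T": 1}))
print("P(R=1 | T=1) =", posterior({"R": 1}, {"T": 1}))
```

Enumeration is exponential in the number of variables, so it only works for toy networks like this one. That limitation is one motivation for the sampling methods covered under Topic 08 and Topic 09.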

Lecture Slides 2×2 (7,467 kB) 03-DAY-MAKE-Graphs-HOLZINGER-Verona-2017-2×2

For the exercises please refer to the main course pages:

Keywords in this lecture: Reasoning under uncertainty, graph extraction, network medicine, metrics and measures, point-cloud data sets, graphical model learning, MCMC, Metropolis-Hastings Algorithm

Reading List (in alphabetical order):

  • Bishop, Christopher M (2007) Pattern Recognition and Machine Learning. Heidelberg: Springer [Chapter 8: Graphical Models]
  • Chenney, S. & Forsyth, D. A. 2000. Sampling plausible solutions to multi-body constraint problems. Proceedings of the 27th annual conference on Computer graphics and interactive techniques. ACM. 219-228, doi:10.1145/344779.344882.
  • Ghahramani, Z. 2015. Probabilistic machine learning and artificial intelligence. Nature, 521, (7553), 452-459, doi:10.1038/nature14541
  • Gordon, A. D., Henzinger, T. A., Nori, A. V. & Rajamani, S. K. Probabilistic programming. Proceedings of the on Future of Software Engineering, 2014. ACM, 167-181, doi:10.1145/2593882.2593900
  • Koller, Daphne & Friedman, Nir (2009) Probabilistic Graphical Models: Principles and Techniques. Cambridge (MA): MIT Press.
  • Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. 1953. Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics, 21, (6), 1087-1092, doi:10.1063/1.1699114. (34,123 citations as of 21.03.2017)
  • Wainwright, Martin J. & Jordan, Michael I. (2008) Graphical Models, Exponential Families, and Variational Inference. Foundations and Trends in Machine Learning, Vol.1, 1-2, 1-305, doi: 10.1561/2200000001
  • Wood, F., Van De Meent, J.-W. & Mansinghka, V. A New Approach to Probabilistic Programming Inference. AISTATS, 2014. 1024-1032.

A hot topic in ML is graph bandits:

Very recommendable:

Murphy, K. P. 2012. Machine learning: a probabilistic perspective, MIT press. Chapter 26 (pp. 907) – Graphical model structure

Short Bio of Lecturer:

Andreas HOLZINGER is head of the Holzinger Group, HCI-KDD, Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, and Assoc. Prof. (Univ.-Doz.) at the Faculty of Computer Science and Biomedical Engineering, Graz University of Technology, Institute of Information Systems and Computer Media. His research interests are in supporting human intelligence with machine learning to help solve complex problems in biomedical informatics and the life sciences. Andreas obtained a Ph.D. in Cognitive Science from Graz University in 1998 and his Habilitation (second Ph.D.) in Computer Science from Graz University of Technology in 2003. Andreas was Visiting Professor in Berlin, Innsbruck, London (2 times), and Aachen. He was program co-chair of the 14th IEEE International Conference on Machine Learning and Applications of the Association for Machine Learning and Applications (AMLA), and is Associate Editor of the Springer journal Knowledge and Information Systems (KAIS), Springer Brain Informatics (BRIN), and BMC Medical Informatics and Decision Making (MIDM), as well as founder and leader of the international expert network HCI-KDD. Andreas is a member of the IFIP WG 12.9 Computational Intelligence and co-chair of the Cross-Disciplinary IFIP CD-ARES 2016 conference, organizing a special session on privacy aware machine learning (PAML) for health data science. Since 2003 he has participated in leading positions in 30+ multi-national R&D projects, budget 4+ MEUR, 7800+ citations, h-index = 39, g-index = 166.

Video for Students:

Group Homepage:

Personal Homepage:

Additional study material and reading can be found here: