Current as of: May 29, 2020 – 16:30 CET
“It is remarkable that a science which began with the consideration of games of chance
should have become the most important object of human knowledge”
Pierre Simon de Laplace, 1812.
According to the current coronavirus regulations this course will be held online;
to enroll in this course please send – by March 24, 2020, 17:00 at the latest – an e-mail to:
andreas.holzinger AT tuwien.ac.at
Please put “LV 185.A83 Class of 2020 enrollment” into the subject line to bypass the spam filter
>> Link to TISS
>> Course Syllabus Class of 2020 2020-Syllabus-185A83-TU-Wien (pdf, 80 kB)
Health is developing into a data-driven science. Health AI works on the effective use of machine learning methods for medical decision making. This graduate course follows a research-based teaching approach. The topics include methodologies for combining human intelligence with machine intelligence for medical decision support. The European General Data Protection Regulation explicitly contains a legal “right to explanation”, and the EU Parliament recently approved a resolution on “explainable AI” as part of the European digitalization initiative. This calls for novel human-AI interfaces that enable a medical expert to retrace, replicate and understand machine results. Consequently, the central focus of the class of 2020 is on making machine decisions transparent, re-traceable and interpretable for a medical expert. One decisive requirement for successful AI applications in the future will be to enable a human expert to understand the context and to explore the underlying explanatory factors of why a certain machine decision has been reached. This is desirable in many domains, but mandatory in the medical domain. Additionally, explainable AI should enable a health expert to ask counterfactual (“what if?”) questions in human-AI dialogue systems for insight and sensemaking.
Students please watch this: https://www.youtube.com/watch?v=UuiV0icAlRs
For German readers:
Andreas Holzinger (2018). Explainable AI (ex-AI). Informatik-Spektrum, 41, (2), 138-143, doi:10.1007/s00287-018-1102-5
Andreas Holzinger & Heimo Müller (2020). Verbinden von Natürlicher und Künstlicher Intelligenz: eine experimentelle Testumgebung für Explainable AI (xAI). HMD Praxis der Wirtschaftsinformatik, 57, (1), 33-45, doi:10.1365/s40702-020-00586-y
For English readers:
Andreas Holzinger, Andre Carrington & Heimo Müller 2020. Measuring the Quality of Explanations: The System Causability Scale (SCS). Comparing Human and Machine Explanations. KI – Künstliche Intelligenz (German Journal of Artificial intelligence), Special Issue on Interactive Machine Learning, Edited by Kristian Kersting, TU Darmstadt, 34, (2), doi:10.1007/s13218-020-00636-z
Andreas Holzinger, Georg Langs, Helmut Denk, Kurt Zatloukal & Heimo Mueller 2019. Causability and Explainability of Artificial Intelligence in Medicine. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9, (4), doi:10.1002/widm.1312
For practical applications we focus on Python – to date the most widely used language for machine learning. Tutorial: Python-Tutorial-for-Students-Machine-Learning-course (pdf, 2,279 kB – reference as: Marcus D. Bloice & Andreas Holzinger 2016. A Tutorial on Machine Learning and Data Science Tools with Python. In: Lecture Notes in Artificial Intelligence LNAI 9605. Springer, pp. 437-483, doi:10.1007/978-3-319-50478-0_22)
Machine learning is a highly practical field; consequently this class is a VU (lecture with exercises): there will be a written exam at the end of the course, and during the course the students will solve related assignments. ECTS breakdown: 75 hours in total – 15 hours lecture, 15 hours preparation for the lecture and practicals, 30 hours assignments, and 15 hours preparation for the 1-hour written exam.
Lecture 01 – Week 12 (2020)
Introduction: From health informatics to ethically responsible medical AI
Lecture Outline: In the first lecture you get a quick introduction to the application area of health informatics, why this application area is complex, and why probabilistic learning can help. We start with a clarification of the differences between AI/ML/DL (see also here), then get an overview of the differences between automatic machine learning and interactive machine learning, and discuss a few future challenges of the HCAI approach to ensure ethically responsible AI/ML. This shall emphasize the integrative ML approach, where at first we learn from prior data, then extract knowledge in order to generalize and to detect certain patterns in the data, and use these to make predictions and help to make decisions under uncertainty. The grand goal for future medical AI is re-traceability, interpretability and sense-making.
Lecture Keywords: HCI-KDD approach, integrative AI/ML, complexity, automatic ML, interactive ML, explainable AI
Topic 01: The HCAI approach: integrative machine learning
Topic 02: Application Area Health: On the complexity of health informatics
Topic 03: Probabilistic learning on the example of Gaussian processes
Topic 04: Automatic Machine Learning (aML)
Topic 05: Interactive Machine Learning (iML)
Topic 06: “Explainable AI”
Conclusion and Future Outlook
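Topic 03 can be made concrete with a few lines of code. The following is a minimal sketch of Gaussian-process regression with an RBF kernel – plain NumPy written for this page, not material from the lecture slides: at a training input the posterior collapses onto the observation, far away it reverts to the prior.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential (RBF) covariance between two sets of 1-D inputs.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    # Standard GP regression equations: posterior mean and covariance
    # of f(x_test) given (nearly) noise-free observations at x_train.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    cov = rbf_kernel(x_test, x_test) - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

x_train = np.array([-2.0, 0.0, 1.5])
y_train = np.sin(x_train)
mean, cov = gp_posterior(x_train, y_train, np.array([0.0, 3.0]))
# At the training input x=0 the posterior mean reproduces sin(0)=0 and the
# posterior variance is tiny; at the distant x=3 it stays close to the prior.
```

The posterior variance is exactly the quantity that lets a GP say “I don't know” far from the data – a useful property for medical decision support.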
Course slides full size (pdf – 6,304 kB) – 01-185A83-HOLZINGER-health-AI-class-2020-intro
Course slides 2 x 2 (pdf – 11,416 kB) – 01-185A83-HOLZINGER-health-AI-class-2020-intro-4×4
Youtube Video recording (mp4 – 300,224 kB, 1:42:06) https://www.youtube.com/watch?v=yL8UfyzkOgM
Learning Goals: At the end of the first lecture the students …
+ become aware of some problems of the application domain of medicine and health
+ have an overview of current trends, challenges, hot topics and future aspects of AI/ML for health informatics
+ know the differences, advantages and disadvantages of automatic ML and interactive ML
+ get an understanding of the importance of re-traceability, transparency, explainability and causality
+ gain awareness for the importance of ethical, legal, and social responsibility in health AI
Reading for Students: (some prereading/postreading and video recommendations):
- Holzinger, A. 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Springer Brain Informatics, 1-13. doi:10.1007/s40708-016-0042-6
- Dossier: HOLZINGER (2016) Dossier interactive Machine Learning Health Informatics
- Watch the video of Andreas Holzinger: https://youtu.be/lc2hvuh0FwQ
- Watch the video of Google DeepMindHealth: https://youtu.be/teZ2m5oTKwM
- “Medicine is so complex, the challenges are so great … we need everything that we can bring to make our diagnostics more precise, more accurate and our therapeutics more focused on that patient.” Sir Malcolm GRANT, NHS England, in: Machine learning: ROYAL SOCIETY conference report (part of the conference series “Breakthrough science and technologies: Transforming our future with machine learning”), https://royalsociety.org/topics-policy/projects/machine-learning
Watch the videos: https://www.youtube.com/playlist?list=PLg7f-TkW11iX3JlGjgbM2s8E1jKSXUTsG
Lecture 02 – Week 13 (2020)
From data for machine learning to probabilistic information, entropy and knowledge:
On data quality, data integration, data augmentation and information theory
Lecture Outline: The importance of the quality of the overall machine learning ecosystem is often underestimated. In order to carry out successful machine learning, we need not only appropriate algorithms, but above all top-quality – and relevant – data, and appropriate domain knowledge! You will always get a result; the crucial question is whether and to what extent the results are relevant to support medical decision making under uncertainty. In the second lecture we get an overview of three essential topics: data, information and knowledge. We will see that the big challenges in AI/machine learning lie in these areas. Data quality is extremely important. Data integration is the grand challenge in medical AI. Context understanding is the far-off goal of future artificial intelligence. In this course we follow the definition of the American Medical Informatics Association (AMIA): Biomedical informatics (BMI) is the interdisciplinary field that studies and pursues the effective use of biomedical data, information, and knowledge for scientific problem solving and decision making, motivated by efforts to improve human health. Medicine is ongoing decision making under uncertainty, and our quest is to provide relevant information for making better decisions.
Lecture Keywords: data, information, probability, entropy, cross-entropy, Kullback-Leibler divergence, knowledge, ontology, classification, terminology
Topic 00 Reflection (quiz about the last lecture)
Topic 01 Data – The underlying physics of data
Topic 02 Data – Biomedical data sources – taxonomy of data
Topic 03 Data – Integration, Mapping and Fusion of data, digression on medical communication and data augmentation
Topic 04 Information – Theory and Entropy
Topic 05 Knowledge Representation – Ontologies – Medical Classifications
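The information-theoretic quantities of Topic 04 – entropy, cross-entropy and the Kullback-Leibler divergence – can be sketched in a few lines of plain Python (a minimal illustration of the definitions, not code from the lecture):

```python
from math import log2

def entropy(p):
    # Shannon entropy H(p) in bits; zero-probability outcomes contribute 0.
    return -sum(pi * log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    # Relative entropy D_KL(p || q); requires q_i > 0 wherever p_i > 0.
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cross_entropy(p, q):
    # H(p, q) = H(p) + D_KL(p || q)
    return entropy(p) + kl_divergence(p, q)

uniform = [0.25] * 4
skewed = [0.7, 0.1, 0.1, 0.1]
print(entropy(uniform))                 # 2.0 bits, maximal for 4 outcomes
print(kl_divergence(skewed, uniform))   # ≈ 0.64 bits
```

Note that D_KL is not symmetric and is zero only when both distributions coincide – which is exactly what makes it usable as a (directed) measure of how far a model distribution is from the data distribution.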
Course slides full size (pdf – 11,434 kB) – 02-185A83-HOLZINGER-health-AI-class-2020-ah
Course slides 2 x 2 (pdf – 15,490 kB) – 02-185A83-HOLZINGER-health-AI-class-2020-ah-2×2
Youtube Video recording (mp4 – 440,934 kB, 2:33:38) https://www.youtube.com/watch?v=EpM8ffwdgW0
Learning Goals: At the end of this lecture the students
+ are aware of the problems of health data and understand the importance of data integration in the life sciences
+ have a good feeling for biomedical data sources, where the data comes from and how to deal with it
+ recognize the usefulness of relative entropy (Kullback–Leibler divergence), which is particularly important for sparse variational methods between stochastic processes
+ have insight into the problems of knowledge representation, and an overview of the usefulness and limitations of ontologies, terminologies and medical classifications
Reading for Students: (some prereading/postreading recommendations):
- Banerjee, O., El Ghaoui, L. & D’aspremont, A. 2008. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. The Journal of Machine Learning Research, 9, 485-516, https://www.jmlr.org/papers/v9/banerjee08a.html
- De Boer, P.-T., Kroese, D. P., Mannor, S. & Rubinstein, R. Y. 2005. A tutorial on the cross-entropy method. Annals of operations research, 134, (1), 19-67. doi:10.1007/s10479-005-5724-z
- Galas, D. J., Dewey, T. G., Kunert-Graf, J. & Sakhanenko, N. A. 2017. Expansion of the Kullback-Leibler Divergence, and a new class of information metrics. arXiv:1702.00033.
- Holzinger, A., Dehmer, M. & Jurisica, I. (2014). Knowledge Discovery and interactive Data Mining in Bioinformatics – State-of-the-Art, future challenges and research directions. BMC Bioinformatics, 15, (S6), I1. doi:10.1186/1471-2105-15-S6-I1
- Holzinger, A., Hörtenhuber, M., Mayer, C., Bachler, M., Wassertheurer, S., Pinho, A. & Koslicki, D. (2014). On Entropy-Based Data Mining. In: Holzinger, A. & Jurisica, I. (eds.) Interactive Knowledge Discovery and Data Mining in Biomedical Informatics, Lecture Notes in Computer Science, LNCS 8401. Berlin Heidelberg: Springer, pp. 209-226. doi:10.1007/978-3-662-43968-5_12
Online available: https://pure.tugraz.at/portal/files/3108669/HOLZINGER_Entropy_based_data_mining.pdf
- Loshchilov, Ilya, Schoenauer, Marc & Sebag, Michele (2013). KL-based Control of the Learning Schedule for Surrogate Black-Box Optimization. arXiv:1308.2655.
- Matthews, A., Hensman, J., Turner, R. E. & Ghahramani, Z. On sparse variational methods and the Kullback-Leibler divergence between stochastic processes. Proceedings of the Nineteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2016. JMLR, 231-239 https://www.jmlr.org/proceedings/papers/v51/matthews16.html
Additional Reading: (to foster a deeper understanding of information theory related to the life sciences):
- Manca, Vincenzo (2013). Infobiotics: Information in Biotic Systems. Heidelberg: Springer. (This book is a fascinating journey through the world of discrete biomathematics and a continuation of Erwin Schrödinger’s 1944 book “What Is Life? The Physical Aspect of the Living Cell”, based on lectures at the Dublin Institute for Advanced Studies at Trinity College, Dublin)
Lecture 03 (Tutorial) – Week 14 (2020)
Tutorial T1 & Assignment A1 (Tutor: Marcus BLOICE): Data Augmentation
All material can be found on our GitHub page:
If you have any technical questions please open an issue on the repository itself, via:
Introduction to Python can be found here:
Python-Tutorial-for-Students-Machine-Learning-course (pdf, 2,279 kB)
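The tutorial itself builds on established tooling (see Marcus Bloice’s Augmentor project). Purely to illustrate what augmentation operations do, here is a toy sketch with an “image” represented as a nested Python list – the helper names are made up for this example and are not the tutorial’s API:

```python
def hflip(img):
    # Horizontal flip: reverse every pixel row.
    return [row[::-1] for row in img]

def rotate90(img):
    # Rotate a 2-D image 90 degrees clockwise.
    return [list(row) for row in zip(*img[::-1])]

img = [[1, 2],
       [3, 4]]
print(hflip(img))     # [[2, 1], [4, 3]]
print(rotate90(img))  # [[3, 1], [4, 2]]
# Augmentation libraries chain many such operations into a pipeline and
# apply each one with a user-chosen probability, generating new, slightly
# varied training samples from a limited set of labeled medical images.
```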
Lecture 04 – Week 17 (2020)
From Decision Making under Uncertainty to Probabilistic Graphical Models
Lecture Outline: In order to be well prepared for the second tutorial on probabilistic programming, this module provides some basics on graphical models and moves towards methods for Monte Carlo sampling from probability distributions based on Markov chains (MCMC). This is not only very important, it is awesome, as it is similar to how our brain may work. It allows for computing hierarchical models with a large number of unknown parameters and also works well for rare-event sampling, which is often the case in the health informatics domain. So, we start with reasoning under uncertainty, provide some basics on graphical models, and move towards graph model learning. One particular MCMC method is the so-called Metropolis-Hastings algorithm, which obtains a sequence of random samples from high-dimensional probability distributions – with which we are often challenged in the health domain. The algorithm is among the top 10 most important algorithms and is named after Nicholas METROPOLIS (1915-1999) and Wilfred K. HASTINGS (1930-2016); the former introduced it in 1953 and the latter generalized it in 1970 (remember: generalization is a grand goal in science).
Lecture Keywords: Reasoning under uncertainty, graph extraction, network medicine, metrics and measures, point-cloud data sets, graphical model learning, MCMC, Metropolis-Hastings Algorithm
Topic 00 Reflection from last lecture
Topic 01 Decision Making under uncertainty
Topic 02 Some basics of Graphs/Networks
Topic 03 Bayesian Networks (BN) – digression Markov Processes in machine learning
Topic 04 Markov Chain Monte Carlo (MCMC) – digression graphical models and decision making
Topic 05 Metropolis Hastings Algorithm (MH)
Topic 06 Probabilistic Programming (PP) – digression on concept learning
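The Metropolis-Hastings algorithm discussed above can be sketched in plain Python. The random-walk sampler below targets a standard normal distribution for illustration; the step size and sample count are arbitrary choices for this sketch, not values from the lecture:

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=42):
    # Random-walk Metropolis-Hastings: propose x' ~ N(x, step^2) and
    # accept with probability min(1, target(x') / target(x)).
    # Only an *unnormalized* log-density is needed - the normalization
    # constant cancels in the acceptance ratio.
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_alpha = log_target(proposal) - log_target(x)
        if math.log(rng.random()) < log_alpha:
            x = proposal          # accept the move
        samples.append(x)         # on rejection the current state repeats

    return samples

def log_std_normal(x):
    # Unnormalized log-density of N(0, 1).
    return -0.5 * x * x

samples = metropolis_hastings(log_std_normal, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
# mean ≈ 0 and var ≈ 1, as expected for the standard normal target.
```

The same loop works unchanged for high-dimensional, unnormalized posteriors – which is why MCMC underlies the probabilistic programming tools of the next tutorial.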
- Lecture slides full size (9,299 KB): 04-185A83-HOLZINGER-health-AI-graph-machine-learning-class-2020
- Lecture slides 2 x 2 (9,290 KB): 04-185A83-HOLZINGER-health-AI-graph-machine-learning-class-2020-2×2
- Youtube Video recording mp4 (356,588 kB, 02:12:40): https://www.youtube.com/watch?v=CvdzLuLMlrE
Learning Goals: At the end of this lecture the students
+ are aware of reasoning and decision making under uncertainty
+ have an idea of graphical models
+ understand the advantages of probabilistic programming
Reading for Students:
- Bishop, Christopher M (2006) Pattern Recognition and Machine Learning. Heidelberg: Springer [Chapter 8: Graphical Models]
- Chenney, S. & Forsyth, D. A. 2000. Sampling plausible solutions to multi-body constraint problems. Proceedings of the 27th annual conference on Computer graphics and interactive techniques. ACM. 219-228, doi:10.1145/344779.344882.
- Ghahramani, Z. 2015. Probabilistic machine learning and artificial intelligence. Nature, 521, (7553), 452-459, doi:10.1038/nature14541
- Gordon, A. D., Henzinger, T. A., Nori, A. V. & Rajamani, S. K. Probabilistic programming. Proceedings of the on Future of Software Engineering, 2014. ACM, 167-181, doi:10.1145/2593882.2593900
- KOLLER, Daphne & FRIEDMAN, Nir (2009) Probabilistic graphical models: principles and techniques. Cambridge (MA): MIT press.
- Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. 1953. Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics, 21, (6), 1087-1092, doi:10.1063/1.1699114. (34,123 citations as of 21.03.2017)
- Wainwright, Martin J. & Jordan, Michael I. (2008) Graphical Models, Exponential Families, and Variational Inference. Foundations and Trends in Machine Learning, Vol.1, 1-2, 1-305, doi: 10.1561/2200000001 [Link to pdf]
- Wood, F., Van De Meent, J.-W. & Mansinghka, V. A New Approach to Probabilistic Programming Inference. AISTATS, 2014. 1024-1032.
A hot topic in ML is graph bandits:
Lecture 05 – Week 18 (2020)
Tutorial T2 – Probabilistic Programming with Python (Tutor: Florian ENDEL) and second assignment
In this tutorial, we will explore probabilistic programming with the Python framework PyMC3: “Probabilistic programming allows for automatic Bayesian inference on user-defined probabilistic models” (Salvatier et al. 2016).
We will start with a brief repetition of the previous lecture by discussing Bayes’ theorem, Bayesian models and Bayesian parameter estimation using Markov Chain Monte Carlo (MCMC) sampling. Next, we will dive deeper into the capabilities, workflow and specific use of PyMC3. Language primitives, stochastic variables and the intuitive syntax for defining complex models and networks will be explored. Increasingly complex examples – e.g. a simple statistical test, linear (LM) and generalized linear (GLM) models, as well as multilevel modelling – will highlight the applicability of the Bayesian methodology as well as the potential and simplicity of probabilistic programming with PyMC3. An exercise based on real-world research will demonstrate the advantage of multilevel modelling and probabilistic programming.
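Before reaching for MCMC, the Bayesian parameter estimation in this tutorial can be seen in a conjugate special case, where the posterior is available in closed form. A sketch in plain Python – the trial counts are only illustrative, roughly the figures reported in the Rosa et al. study below:

```python
def beta_binomial_update(alpha, beta, successes, failures):
    # Conjugate Bayesian update: a Beta(alpha, beta) prior combined with
    # binomial data yields a Beta(alpha + successes, beta + failures)
    # posterior - no sampling required.
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    # Posterior mean of the success probability.
    return alpha / (alpha + beta)

# Uniform Beta(1, 1) prior; illustrative counts in the spirit of the
# therapeutic-touch experiment: 123 correct identifications in 280 trials.
a, b = beta_binomial_update(1, 1, 123, 280 - 123)
print(beta_mean(a, b))  # ≈ 0.44, i.e. below the 0.5 chance level
```

PyMC3 recovers the same posterior by MCMC sampling and, unlike this conjugate shortcut, generalizes to multilevel and other non-conjugate models – which is exactly why probabilistic programming is worth the machinery.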
1) Please watch the tutorial video online here
2) The Exercise instruction 2020
3) The Exercise data 2020
 John Salvatier, Thomas V. Wiecki & Christopher Fonnesbeck 2016. Probabilistic programming in Python using PyMC3. PeerJ Computer Science, 2, e55, doi:10.7717/peerj-cs.55
 Linda Rosa, Emily Rosa, Larry Sarner & Stephen Barrett 1998. A Close Look at Therapeutic Touch. JAMA, 279, (13), 1005-1010, doi:10.1001/jama.279.13.1005
The 2019 material is still available here:
Introduction to PyMC3: https://florian.endel.at/Presentation/PyMC3Intro/
Assignment Instruction: Exercise-Therapeutic-Touch-LV185A83-2018
The 2019 class will again cover Multilevel Modelling (adapted from Chris Fonnesbeck):
Please refer to our Github pages: https://github.com/human-centered-ai-lab/cla-185A83-machine-learning-health-class-2019
Lecture slides 2017: full size (815 kB) 2017-04-04 Probabilistic Programming – Endel
Examples 2017: https://github.com/FlorianEndel/Probabilistic-Programming-Tutorial
 A. Pfeffer, Practical probabilistic programming. Shelter Island, NY: Manning Publications, Co, 2016.
 C. Davidson-Pilon, Bayesian methods for hackers: probabilistic programming and Bayesian inference. New York: Addison-Wesley, 2016.
 J. K. Kruschke, Doing Bayesian data analysis: a tutorial with R, JAGS, and Stan, Edition 2. Boston: Academic Press, 2015.
Lecture 06 – Week 19 (2020)
Selected Methods of explainable AI (xAI)
Lecture Outline: Medical action is permanent decision making under uncertainty within limited time (“5 minutes”). The problem with the most successful AI/ML methods (e.g. deep learning; see the differences between AI-ML-DL here) is that they are often considered to be “black boxes”, which is not quite true. However, even if we understand the underlying mathematical and theoretical principles, it is difficult to re-enact them and to answer the question of why a certain machine decision has been reached. A serious general drawback is that such models have no explicit declarative knowledge representation and hence have difficulty in generating the required explanatory structures – the context – which considerably limits the achievement of their full potential. Interestingly, the “old” symbolic and logic-based AI approaches did have such explanatory structures, at least for a very narrow domain space. One future goal is implicit knowledge elicitation through efficient human-AI interfaces.
Lecture Keywords: medical decision making, transparency, re-traceability, re-enaction, reproducibility, explainability, interpretability
Topic 01 Explainability, Interpretability, Causability, Students read [1, 2]
Topic 02 Is xAI new? The history of DSS is the history of AI – explainable AI is actually the oldest field of artificial intelligence
Topic 03 Examples for ante-hoc models (explainable models, glass-boxes, interpretable machine learning)
Topic 04 Examples for post-hoc models (making the “black-box” model interpretable)
Topic 04a LIME, 04b BETA, 04c LRP, 04d Taylor, 04e Prediction difference analysis, 04f TCAV
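The idea behind LIME (Topic 04a) can be conveyed without the library: perturb the instance, query the black box, and fit a proximity-weighted linear surrogate whose coefficients act as the local explanation. A toy NumPy sketch of that idea, not the original implementation:

```python
import numpy as np

def local_surrogate(f, x0, n_samples=500, sigma=0.5, seed=0):
    # LIME-style explanation: sample perturbations around x0, query the
    # black-box model f, weight samples by proximity to x0, and fit a
    # weighted linear model whose coefficients are local importances.
    rng = np.random.default_rng(seed)
    X = x0 + rng.normal(0.0, sigma, size=(n_samples, x0.size))
    y = np.array([f(x) for x in X])
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * sigma ** 2))
    A = np.hstack([np.ones((n_samples, 1)), X])   # intercept + features
    Aw = A * w[:, None]                           # weighted least squares
    coef = np.linalg.solve(A.T @ Aw, A.T @ (w * y))
    return coef[1:]                               # drop the intercept

# Sanity check: for a black box that is itself linear, the local
# surrogate recovers the true coefficients.
f = lambda x: 3.0 * x[0] - 2.0 * x[1]
print(local_surrogate(f, np.array([1.0, 1.0])).round(3))  # [ 3. -2.]
```

For a genuinely nonlinear model the recovered coefficients are only locally valid – which is precisely the point of LIME: faithful around the instance, not globally.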
- Lecture slides full size (7.023 kB): 06-185A83-HOLZINGER-health-AI-explainability-class-2020
- Lecture slides 2 x 2 (9.413 kB): 06-185A83-HOLZINGER-health-AI-explainability-class-2020-2×2
- Youtube Video recording mp4 (303,104 kB, 01:54:44): https://www.youtube.com/watch?v=UayEs-JFoQ4
Learning Goals: At the end of this lecture the students …
+ know the roots of explainable AI, causality and causability and how to measure the quality of explanations
+ see the importance of future human-AI interfaces for medical experts
+ have an overview of post-hoc and ante-hoc methods of explainable AI
+ see how important ground truth is in the medical domain and how explainability and causability must be mapped for a mutual understanding
+ know a selection of some of the most relevant methods of explainable AI
For more details please go to the course page (taking place each semester at Graz University of Technology):
 Andreas Holzinger, Georg Langs, Helmut Denk, Kurt Zatloukal & Heimo Mueller 2019. Causability and Explainability of Artificial Intelligence in Medicine. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9, (4), doi:10.1002/widm.1312.
Online available: https://onlinelibrary.wiley.com/doi/full/10.1002/widm.1312
 Andreas Holzinger, Andre Carrington & Heimo Müller 2020. Measuring the Quality of Explanations: The System Causability Scale (SCS). Comparing Human and Machine Explanations. KI – Künstliche Intelligenz (German Journal of Artificial intelligence), Special Issue on Interactive Machine Learning, Edited by Kristian Kersting, TU Darmstadt, 34, (2), doi:10.1007/s13218-020-00636-z.
Online available: https://link.springer.com/article/10.1007/s13218-020-00636-z
Third Assignment (Tutor: Anna SARANTI): Layer-Wise Relevance Propagation (LRP)
Please read the task description:
Assignment-3-Saranti-Machine-Learning-for-Health-Informatics-Class-of-2020 (pdf, 100kB)
and watch the instructional video here:
All material can be found on our GitHub page:
If you have any technical questions please open an issue on the repository itself, via:
Introduction to Python can be found here:
Python-Tutorial-for-Students-Machine-Learning-course (pdf, 2,279 kB)
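The epsilon rule at the heart of the LRP assignment can be illustrated on a tiny two-layer ReLU network. The weights below are made up for this sketch, and the assignment’s actual network and framework will differ; the point is the backward relevance flow and its conservation property:

```python
import numpy as np

def lrp_epsilon(a, W1, W2, eps=1e-6):
    # Epsilon-rule LRP for the two-layer ReLU network f(a) = W2 @ relu(W1 @ a).
    # Relevance is propagated backwards in proportion to each neuron's
    # contribution z_jk to the next layer's pre-activations; eps stabilizes
    # the division when contributions nearly cancel.
    h = np.maximum(0.0, W1 @ a)      # hidden activations
    out = W2 @ h                     # network output = total relevance

    # Output layer -> hidden layer
    z2 = W2 * h                      # contributions, shape (outputs, hidden)
    R_h = (z2 / (z2.sum(axis=1, keepdims=True) + eps)).T @ out

    # Hidden layer -> input layer
    z1 = W1 * a                      # contributions, shape (hidden, inputs)
    R_in = (z1 / (z1.sum(axis=1, keepdims=True) + eps)).T @ R_h
    return R_in, out

W1 = np.array([[1.0, -1.0],
               [0.5,  0.5]])
W2 = np.array([[1.0, 2.0]])
R, out = lrp_epsilon(np.array([2.0, 1.0]), W1, W2)
# Conservation: the input relevances sum (up to eps) to the output score,
# so R answers "how much did each input pixel/feature contribute?".
```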
Final Lecture tba
The grading consists of three independent parts:
I) Final Exam (written test quiz, 30%) – see sample exam here
II) Presentations of the assignments (orally, 10%) – will be held online
III) Grading of the assignments (coding, 20 % each, 60 % total)
Note: The course will be adapted to the students as the course progresses. Each lecture is preceded by a quiz about the last lecture. The slides will be put online AFTER each lecture – and only those are binding for the final exam. Note that the slides presented and the slides shown on the Web can differ for didactic purposes.
Short Bio of Lecturer:
Andreas HOLZINGER promotes a synergistic approach to Human-Centred AI (HCAI) and has done pioneering work in interactive machine learning (iML) with the human-in-the-loop. He promotes an integrated machine learning approach with the goal of augmenting human intelligence with artificial intelligence to help solve problems in health informatics.
Due to rising ethical, social and legal issues governed by the European Union, future AI-supported systems must be made transparent and re-traceable, thus human-interpretable. Andreas’ aim is to explain why a machine decision has been reached, paving the way towards explainable AI and causability, ultimately fostering ethically responsible machine learning, trust and acceptance of AI.
Andreas obtained a Ph.D. in Cognitive Science from Graz University in 1998 and his Habilitation (second Ph.D.) in Computer Science from Graz University of Technology in 2003. Andreas was Visiting Professor for Machine Learning & Knowledge Extraction in Verona, at RWTH Aachen, University College London and Middlesex University London. Since 2016 Andreas has been Visiting Professor for Machine Learning in Health Informatics at the Faculty of Informatics of Vienna University of Technology. Currently, Andreas is Visiting Professor for explainable AI at the Alberta Machine Intelligence Institute, University of Alberta, Canada.
Group Homepage: https://explainable-ai.org
Personal Homepage: https://www.aholzinger.at
Youtube Introduction Video: https://youtu.be/lc2hvuh0FwQ
Conference Homepage: https://cd-make.net
Short Bio of Tutors:
Marcus BLOICE is finishing his PhD this year on the application of deep learning to medical images. Currently, he is working on the Augmentor project and the Digital Pathology project, and is involved in the FeatureCloud project. He has a background in computer science from the University of Sunderland (UK). He is a Python programmer with experience in popular machine learning pipelines, and also has experience in machine learning on large medical images.
Florian ENDEL started working as a database developer in the general field of healthcare research in 2007 – after gathering first experience as a high school teacher for two years and working as a freelance web designer. A specific highlight is the development and supervision, since 2008, of “GAP-DRG”, a database holding massive amounts of reimbursement data from the Austrian social insurance system. Since then, he has been part of several national and international research projects handling, among other things, data management, data governance, statistical analytics and secure computing infrastructure. He is currently participating in the EU FP7 project CEPHOS-LINK and the FFG K-Projekt DEXHELPP, and is still finishing his master’s thesis.
Anna SARANTI is just finishing her Master’s studies with a thesis on Applying Probabilistic Graphical Models and Deep Reinforcement Learning in a Learning-Aware Application, supervised by Andreas Holzinger and Martin Ebner at Graz University of Technology. Anna is currently working as a machine learning engineer in Vienna.
Additional pointers and reading suggestions can be found at the
Learning Machine Learning page
Excellent resources for exercises
Github repository by Alberto Blanco Garcés https://github.com/alberduris
Related Books in Machine Learning:
- MITCHELL, Tom M., 1997. Machine learning, New York: McGraw Hill. (Book Webpages)
Undoubtedly, this is the classic source from the pioneer of ML for getting a perfect first contact with the fascinating field of ML, for undergraduate and graduate students, and for developers and researchers. No previous background in artificial intelligence or statistics is required.
- FLACH, Peter, 2012. Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge: Cambridge University Press. (Book Webpages)
Introductory for advanced undergraduate or graduate students, at the same time aiming at interested academics and professionals with a background in neighbouring disciplines. It includes necessary mathematical details, but emphasizes the how-to.
- MURPHY, Kevin, 2012. Machine learning: a probabilistic perspective. Cambridge (MA): MIT Press. (Book Webpages)
This book focuses on probability, which can be applied to any problem involving uncertainty – which is very much the case in medical informatics! The book is suitable for advanced undergraduate or graduate students and requires some mathematical background.
- BISHOP, Christopher M., 2006. Pattern Recognition and Machine Learning. New York: Springer-Verlag. (Book Webpages)
This is a classic work and is aimed at advanced students and PhD students, researchers and practitioners, not assuming much mathematical knowledge.
- HASTIE, Trevor, TIBSHIRANI, Robert, FRIEDMAN, Jerome, 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer-Verlag (Book Webpages)
This is the classic groundwork from supervised to unsupervised learning, with many applications in medicine, biology, finance, and marketing. For advanced undergraduates and graduates with some mathematical interest.
To get an understanding of the complexity of the health informatics domain:
- Andreas HOLZINGER, 2014. Biomedical Informatics: Discovering Knowledge in Big Data.
New York: Springer. (Book Webpage)
This is a student textbook for undergraduates and graduate students in health informatics, biomedical engineering, telematics or software engineering with an interest in knowledge discovery. The book fosters an integrated approach: in the health sciences, a comprehensive and overarching overview of the data science ecosystem and knowledge discovery pipeline is essential.
- Gregory A PETSKO & Dagmar RINGE, 2009. Protein Structure and Function (Primers in Biology). Oxford: Oxford University Press (Book Webpage)
This is a comprehensive introduction to the building blocks of life, a beautiful book without ballast. It starts with the consideration of the link between protein sequence and structure, and continues to explore the structural basis of protein functions and how these functions are controlled.
- Ingvar EIDHAMMER, Inge JONASSEN, William R TAYLOR, 2004. Protein Bioinformatics: An Algorithmic Approach to Sequence and Structure Analysis. Chichester: Wiley.
Bioinformatics is the study of biological information and biological systems – such as the relationships between the sequence, structure and function of genes and proteins. The subject has seen tremendous development in recent years, and there is an ever-increasing need for a good understanding of quantitative methods in the study of proteins. This book takes the novel approach of covering both the sequence and structure analysis of proteins from an algorithmic perspective.
Amongst the many tools (we will concentrate on Python), some useful and popular ones include:
- WEKA. Since 1993, the Waikato Environment for Knowledge Analysis is a very popular open source tool. In 2005 Weka received the SIGKDD Data Mining and Knowledge Discovery Service Award: it is easy to learn and easy to use [WEKA]
- Mathematica. Since 1988 a commercial symbolic mathematical computation system, easy to use [Mathematica]
- MATLAB. Short for MATrix LABoratory, it is a commercial numerical computing environment since 1984, coming with a proprietary programming language by MathWorks, very popular at Universities where it is licensed, awkward for daily practice [Matlab]
- R. Coming from the statistics community it is a very powerful tool implementing the S programming language, used by data scientists and analysts. [The R-Project]
- Python. Currently maybe the most popular scientific language for ML [Python Software Foundation]
An excellent source for learning numerics and science with Python is: https://www.scipy-lectures.org/
- Julia. Since 2012, a rising scientific language for technical computing with better performance than Python. IJulia, a collaboration between the Jupyter and Julia projects, provides a powerful browser-based graphical notebook interface to Julia. [julialang.org]
Please have a look at: What tools do people generally use to solve problems?
Recommendable reading on tools include:
- Wes McKINNEY (2012) Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. Beijing et al.: O’Reilly.
This is a practical introduction from the author of the Pandas library. [Google-Books]
- Ivo BALBAERT (2015) Getting Started with Julia Programming. Birmingham: Packt Publishing.
A good start for the Julia language and more focused on scientific computing projects, it is assumed that you already know about a high-level dynamic language such as Python. [Google-Books]
International Courses on Machine Learning:
- Carnegie Mellon University > Machine Learning Course 10-701 2015
by Eric XING (expertise) and Ziv-Bar JOSEPH (expertise)
- Carnegie Mellon University > Machine Learning Course 10-701/15-781 2011
by Tom MITCHELL (expertise)
- Carnegie Mellon University > Machine Learning Course 10-601 2015
by Maria-Florina BALCAN (expertise) and Tom MITCHELL (expertise)
- Carnegie Mellon University > Machine Learning Course 10-701 2013
by Alex SMOLA (expertise)
- Carnegie Mellon University > Machine Learning Course 10-601b 2015
by Seyoung KIM, https://www.cs.cmu.edu/~10601b/
- Cornell University > Machine Learning CS 4780/5780 2014
by Thorsten JOACHIMS (expertise)
- Cornell University > General Machine Learning, Knowledge Extraction
and Data Science courses
- Oxford > Department of Computer Science > Machine Learning: 2014-2015
by Nando de FREITAS (expertise)
Conferences on Machine Learning with a special focus on health application
A) Students with a GENERAL interest in machine learning should definitely browse these sources:
- TALKING MACHINES – Human conversation about machine learning by Katherine GORMAN and Ryan P. ADAMS (expertise)
excellent audio material – 24 episodes in 2015 and three new episodes in season two 2016 (as of 14.02.2016)
- This Week in Machine Learning and Artificial Intelligence Podcast
- Data Skeptic – Data science, statistics, machine learning, artificial intelligence, and scientific skepticism
- VIDEOLECTURES.NET machine learning talks (3,580 items up to 31.01.2017). ML is grouped into subtopics
and displayed as a map – highly recommended
- TUTORIALS ON TOPICS IN MACHINE LEARNING by Bob Fisher from the University of Edinburgh, UK
B) Students with a SPECIFIC interest in interactive machine learning should have a look at: