LV 706.315 Selected Topics on interactive Knowledge Discovery:
interactive Machine Learning (iML) – Winter term 2015 (again in WS 2016)
Motivation for this lecture:
Whilst automated Machine Learning (aML) approaches (e.g. the “Google car”) work well in many domains, particularly with big data sets, in complex domains with small training data sets the application of aML carries the danger of modelling artefacts. An example of a complex domain is biomedicine, where experts are often confronted with high-dimensional, probabilistic, incomplete and often small data sets, which makes the application of aML approaches difficult. In such situations it can be advantageous to integrate human domain knowledge and expertise into the machine learning loop. The foundations of iML approaches can be found in reinforcement learning, preference learning and active learning.
Definition of interactive Machine Learning (iML):
We define iML approaches as algorithms that can interact with both computational and human agents *) and can optimize their learning behaviour through this interaction.
*) In Active Learning such agents are called “oracles” (see e.g.: Settles, B. 2011. From theories to queries: Active learning in practice. In: Guyon, I., Cawley, G., Dror, G., Lemaire, V. & Statnikov, A. (eds.) Active Learning and Experimental Design Workshop 2010. Sardinia: JMLR Proceedings. 1-18).
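The role of the oracle in the definition above can be illustrated with a minimal sketch. It assumes a deliberately simple, noise-free setting that is not taken from the lecture: a 1-D threshold classifier learned from a pool of unlabelled points, where the learner chooses which point to send to the oracle. The names `learn_threshold`, `oracle` and `budget` are hypothetical, and binary search is just one possible query strategy for this toy case.

```python
from typing import Callable, List, Tuple


def learn_threshold(
    pool: List[float],
    oracle: Callable[[float], int],
    budget: int,
) -> Tuple[int, int]:
    """Pool-based active learning of a 1-D threshold (illustrative sketch).

    Assumption: labels are noise-free and monotone, i.e. every point below
    some unknown threshold is labelled 0 and every point above it is 1.
    `oracle` may be a human expert or a program; here it is any callable
    that returns 0 or 1 for a queried point.

    Returns (index of the first positive point in the sorted pool,
    number of oracle queries used). If the budget runs out early, the
    index is only a lower bound on the true position.
    """
    points = sorted(pool)
    lo, hi = 0, len(points) - 1  # indices of the still-uncertain region
    queries = 0
    while lo <= hi and queries < budget:
        mid = (lo + hi) // 2          # the most informative point to ask about
        label = oracle(points[mid])   # interaction with the (human) agent
        queries += 1
        if label == 1:
            hi = mid - 1              # everything right of mid is positive
        else:
            lo = mid + 1              # everything left of mid is negative
    return lo, queries
```

With a pool of 100 points the learner in this sketch needs only a handful of labels instead of all 100, because each answer from the oracle halves the region of uncertainty. A real human oracle would replace the callable by a prompt to the expert, and noisy answers would require a more robust query strategy.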
Goal of this lecture:
This graduate course follows a research-based teaching (RBT) approach. It provides a broad overview of models and discusses methods for combining human intelligence with machine intelligence to solve computational problems. The application focus is on the biomedical domain. The cross-sectional topic is evaluation, because an all-in-one method suitable for every purpose (in German: eierlegende Wollmilchsau) would be nice but is not achievable. Consequently, an evaluation framework is of extreme importance for driving progress in any machine learning approach. For practical applications we use the Julia language, alongside R and Python.
Background:
A classic challenge in machine learning is the development of a model that relates observed data X to categorical variables Y ∈ {x1, x2, x3, … } in order to infer higher-level information from the data. This problem is widely applicable, as X may represent any data and Y any information. In domains dealing with uncertainties (such as the biomedical domain), we seek applications where Y includes unknowns. Machine learning solutions to this problem must involve human expertise during the design phase to provide relevant training data sets. Unfortunately, to date such experts are not part of the machine learning algorithms themselves. On the contrary, the grand goal of artificial intelligence is to exclude the human from the loop and hence make learning fully automatic (see “Google car”). The objective of interactive machine learning methods is to develop algorithms which can interact with both computational agents and human agents, towards hybrid multi-agent systems.
General information:
Machine learning is a large subfield of computer science that evolved from artificial intelligence (AI) and is tightly connected with data mining and knowledge discovery. The grand goal is to design and develop algorithms which can learn from data. Consequently, machine learning systems learn and improve with experience and time and can be used to predict outcomes of questions based on previous knowledge. Learning intelligent behaviour from noisy examples is a grand and exciting challenge in AI. This is highly relevant for many applications in biomedical informatics generally, and for stratified and personalized medicine in particular.
Target Group:
Research students of Computer Science who are interested in knowledge discovery/data mining by following the idea of iML-approaches, i.e. human-in-the-loop learning systems. This is a cross-disciplinary computer science topic and highly relevant for the application in complex domains, such as biomedicine.
Keywords:
Interactive Machine Learning, Reinforcement Learning, Preference Learning, Active Learning, Computational Intelligence
Some Quick Explanations:
Active Learning (AL) := selecting training samples so as to minimize loss in future cases; a learner must take actions to gain information, and has to decide which actions will provide the information that will optimally minimize future loss. The basic idea goes back to Fedorov, V. (1972). Theory of optimal experiments. New York: Academic Press. According to Sanjoy Dasgupta the frontier of active learning is mostly unexplored, and except for a few specific cases, we do not have a clear sense of how much active learning can reduce label complexity: whether by just a constant factor, or polynomially, or exponentially. The fundamental statistical and algorithmic challenges involved, along with the huge potential for practical applications, make AL a very important area for future research.
Interactive Machine Learning (iML) := machine learning algorithms which can interact with agents (some of them human) and can optimize their learning behaviour through this interaction. Holzinger, A. 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics (BRIN), 3, (2), 119-131.
Preference learning (PL) := concerns problems in learning to rank, i.e. learning a predictive preference model from observed preference information, e.g. with label ranking, instance ranking, or object ranking. Fürnkranz, J., Hüllermeier, E., Cheng, W. & Park, S.-H. 2012. Preference-based reinforcement learning: a formal framework and a policy iteration algorithm. Machine Learning, 89, (1-2), 123-156.
Reinforcement Learning (RL) := the study of how an agent may learn from a series of reinforcements (success/rewards or failure/punishments). A must read is Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285.
Multi-Agent Systems (MAS) := collections of several independent agents, which can also be a mixture of computer agents and human agents. An excellent pointer to the latter is: Jennings, N. R., Moreau, L., Nicholson, D., Ramchurn, S. D., Roberts, S., Rodden, T. & Rogers, A. 2014. On human-agent collectives. Communications of the ACM, 80-88.
Transfer Learning (TL) := The ability of an algorithm to recognize and apply knowledge and skills learned in previous tasks to
novel tasks or new domains, which share some commonality. Central question: Given a target task, how do we identify the
commonality between the task and previous tasks, and transfer the knowledge from the previous tasks to the target one?
Pan, S. J. & Yang, Q. 2010. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22, (10), 1345-1359, doi:10.1109/tkde.2009.191.
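To make one of the explanations above more concrete, the following is a minimal sketch of preference learning as object ranking: a linear utility function is fitted from pairwise preferences of the form "A is preferred to B", which is the kind of feedback a human agent can often give more easily than absolute scores. It uses a simple perceptron-style update, not the method of the cited paper, and all names (`utility`, `fit_pairwise_ranker`, `rank`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Preference = Tuple[Vector, Vector]  # (preferred object, other object)


def utility(w: Vector, x: Vector) -> float:
    """Linear utility of object x under weights w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_pairwise_ranker(
    preferences: List[Preference],
    n_features: int,
    epochs: int = 100,
    lr: float = 0.1,
) -> List[float]:
    """Learn weights so that preferred objects get a higher utility.

    Perceptron-style sketch: whenever a pair is ordered wrongly (or tied),
    move the weights towards the difference (preferred - other).
    Stops early once every observed preference is satisfied.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        mistakes = 0
        for preferred, other in preferences:
            if utility(w, preferred) <= utility(w, other):
                w = [wi + lr * (p - o) for wi, p, o in zip(w, preferred, other)]
                mistakes += 1
        if mistakes == 0:
            break
    return w


def rank(w: Vector, objects: List[Vector]) -> List[Vector]:
    """Order objects from most to least preferred under the learned utility."""
    return sorted(objects, key=lambda x: utility(w, x), reverse=True)
```

A call such as `fit_pairwise_ranker([((1, 0), (0, 0)), ((0, 0), (0, 1))], n_features=2)` returns weights that `rank` can then apply to unseen objects. The perceptron update only settles on a consistent ranking if the preferences can be explained by some linear utility; contradictory human feedback would need a probabilistic model instead.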
Pointers:
A) My students with a GENERAL interest in machine learning should definitely browse these sources:
1) TALKING MACHINES – Human conversation about machine learning by Katherine GORMAN and Ryan P. ADAMS
excellent audio material – 24 episodes up to 22.11.2015
2) VIDEOLECTURES.NET Machine learning talks (3,106 items up to 4.7.2015)
3) TUTORIALS ON TOPICS IN MACHINE LEARNING by Bob Fisher from the University of Edinburgh, UK
B) My students with a PARTICULAR interest in interactive machine learning should browse these sources:
1) Theory, Methods and Applications of Active Learning, by Robert D. NOWAK, University of Wisconsin, MLSS 2009
2) Active Learning Tutorial by Sanjoy DASGUPTA & John LANGFORD, ICML 2009
3) Nonparametric Active Learning, Robert D. NOWAK, NIPS Workshops 2013
Reading List:
1) Articles
Cohn, D. A., Ghahramani, Z. & Jordan, M. I. (1996). Active learning with statistical models. Journal of Artificial Intelligence Research, 4, 129-145.
Dasgupta, S. (2011). Two faces of active learning. Theoretical computer science, 412, (19), 1767-1781.
Holzinger, A. (2016). Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Springer Brain Informatics (BRIN), 3, (2), 119-131.
Holzinger, A. (2016). Interactive Machine Learning (iML). Informatik Spektrum, 39, (1), in print.
Holzinger, A. (2014). Trends in Interactive Knowledge Discovery for Personalized Medicine: Cognitive Science meets Machine Learning. Intelligent Informatics Bulletin, 15(1): 6-14.
Holzinger, A. (2014). Extravaganza Tutorial on Hot Ideas for Interactive Knowledge Discovery and Data Mining in Biomedical Informatics. In: Slezak, D., Tan, A.-H., Peters, J. F. & Schwabe, L. (eds.) Brain Informatics and Health, BIH 2014, Lecture Notes in Artificial Intelligence, LNAI 8609. Heidelberg, Berlin: Springer, pp. 502-515.
Porter, R., Hush, D., Harvey, N. & Theiler, J. (2010). Toward interactive search in remote sensing imagery. Proceedings of SPIE Cyber Security, Situation Management, and Impact Assessment II. International Society for Optics and Photonics. 77090V.
2) Books
Links:
Association for the Advancement of Artificial Intelligence AAAI > AI Magazine
Time Line of relevant events for interactive Machine Learning (iML):
1950 Reinforcement Learning: Alan Turing (1912-1954) discusses RL within his paper on “Computing Machinery and Intelligence” in Oxford MIND, Volume 59, Issue 236, October 1950, pp. 433-460 doi:10.1093/mind/LIX.236.433
2000 Utility Theory:
Glossary (incomplete)
Dimension = n attributes which jointly describe a property.
Features = any measurements, attributes or traits representing the data. Features are key for learning and understanding.
Reals = numbers expressible as finite/infinite decimals
Regression = predicting the value of a random variable y from a measurement x.
Reinforcement learning = adaptive control, i.e. to learn how to (re-)act in a given environment, given delayed or nondeterministic rewards. Human learning is mostly reinforcement learning.
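The glossary entry on reinforcement learning can be illustrated with a minimal tabular Q-learning sketch. The environment is an assumption made for illustration only: a chain of five states in which the single reward is delayed until the agent reaches the rightmost state. To keep the sketch deterministic, transitions are visited in a fixed sweep instead of through random exploration, and rewards are not noisy. All names (`step`, `q_learning`, `greedy_policy`) are hypothetical.

```python
from typing import List, Tuple

N_STATES = 5
GOAL = N_STATES - 1      # rightmost state is terminal
ACTIONS = (0, 1)         # 0 = move left, 1 = move right


def step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy environment: reward 1.0 only when the goal is reached (delayed)."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def q_learning(
    sweeps: int = 500, alpha: float = 0.5, gamma: float = 0.9
) -> List[List[float]]:
    """Tabular Q-learning update applied to every (state, action) pair.

    Update rule: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)),
    with no bootstrapping from the terminal state.
    """
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s in range(GOAL):            # non-terminal states only
            for a in ACTIONS:
                nxt, reward, done = step(s, a)
                target = reward if done else reward + gamma * max(q[nxt])
                q[s][a] += alpha * (target - q[s][a])
    return q


def greedy_policy(q: List[List[float]]) -> List[int]:
    """Best action per non-terminal state under the learned Q-values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]
```

After training, `greedy_policy(q_learning())` selects "move right" in every state, although only the last step is ever rewarded: the discount factor `gamma` propagates the delayed reward back along the chain. An agent acting in a real environment would replace the fixed sweep by an exploration strategy such as epsilon-greedy.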
Historic People (incomplete)
Bayes, Thomas (1702-1761) gave a straightforward definition of probability [Wikipedia]
Laplace, Pierre-Simon, Marquis de (1749-1827) developed the Bayesian interpretation of probability [Wikipedia]
Price, Richard (1723-1791) edited and commented on the work of Thomas Bayes in 1763 [Wikipedia]
Tukey, John Wilder (1915-2000) suggested in 1962, together with Frederick Mosteller, the name “data analysis” for computational statistical sciences, which much later became known as data science [Wikipedia]
Antonyms (incomplete)
big data sets < > small data sets
correlation < > causality
discriminative < > generative
Frequentist < > Bayesian
low dimensional < > high dimensional
underfitting < > overfitting
parametric < > non-parametric
supervised < > unsupervised