Introduction: Holzinger, A., Goebel, R., Palade, V. & Ferri, M. 2017. Towards Integrative Machine Learning and Knowledge Extraction. Towards Integrative Machine Learning and Knowledge Extraction: Springer Lecture Notes in Artificial Intelligence LNAI 10344. Cham: Springer International, pp. 1-12, doi:10.1007/978-3-319-69775-8_1.

Springer Lecture Notes in Artificial Intelligence LNAI 10344

Editors: Andreas HOLZINGER, Randy GOEBEL, Massimo FERRI, Vasile PALADE

Towards Integrative Machine Learning and Knowledge Extraction

The BIRS Workshop 15w2181 in Banff was dedicated to stimulating a cross-domain integrative machine learning approach and an appraisal of “hot topics” towards tackling the grand challenge of reaching a level of useful and usable computational intelligence. This encompasses learning from prior data, extracting and discovering knowledge, generalizing the results, fighting the curse of dimensionality, and ultimately disentangling the underlying explanatory factors in complex data, i.e. making sense of data within the context of an application domain.

The workshop particularly tried to contribute advancements in promising novel areas, for example at the intersection of machine learning (ML) and topological data analysis (TDA). Overlapping areas at such intersections are most often key to stimulating new insights and further advances. This is particularly true for the extremely broad field of machine learning: successful machine learning needs a concerted effort that fosters integrative research between experts from diverse disciplines, from data science to data visualization. Tackling complex research undertakings needs both disciplinary excellence and cross-disciplinary networking without boundaries, and a cross-domain integration of experts, following the HCI-KDD approach of combining the best of two worlds (see image below).

There are many challenges and open problems from various domains of our daily life (autonomous driving, smart factories, recommender systems, natural language understanding etc.). A good example of a complex application domain is health, and modern health sciences are turning into a data intensive science. However, sometimes the problem is not big data but complex data: heterogeneous, high-dimensional, probabilistic, incomplete, uncertain and noisy data. The effective and efficient use of ML algorithms for solving complex problems in health informatics is a commandment of our time; it may support evidence-based decision-making and eventually help to realize the grand goals of personalized medicine: modelling the complexity of patients to tailor medical decisions, health practices and therapies to the individual patient.

Effective and efficient application of machine learning in an application domain needs a concerted effort of seven research areas, as outlined in the image below:

[Figure: Integrative interactive machine learning for health informatics, with a human in the loop]

This Volume of LNAI fosters cross-combination of the 7 areas (see image above) with the goal of realizing integrated machine learning solutions at the clinical and scientific workplace of biomedicine and health. This needs a concerted effort in bringing together experts with diverse backgrounds and complementary competencies, but with common interests and a shared central vision: to extract and discover knowledge from data.



Papers are sought from the 7 topical areas (image above), but always with the big picture in mind and with a justification of how your view may contribute to an overall solution. Each paper shall have a special structure (see below).

Papers which deal with fundamental research questions and theoretical aspects in machine learning are very welcome.


Quality needs time, and we want to ensure the highest possible quality to provide a clear benefit to our potential readers, so there is no fixed deadline. However, the Volume is now in production and is expected to appear in summer 2017.

After being invited to submit, please prepare your paper following the Springer llncs2e style (llncs.cls, splncs.bst).
There is no definite page limit, but the usual chapters are between 10 and 20 pages. In any case, please produce an even number of pages to ensure smooth page breaks, e.g. … 8, 10, 12, 14, 16, 18, 20, 22, … pages.

NOTE: This State-of-the-art volume shall bring exclusive benefits for the readers and shall be of archival value on the desks and benches of scientists, industrial practitioners, teachers and students alike (it is not necessary to produce rocket-science papers or Nobel-prize papers, which only a few people on this planet understand). Moreover, it shall be useful for fostering joint projects at national, European, and international level.


For this purpose each chapter is required to follow a specific structure:

    Introduction and motivation: a very short and concise explanation of why and how this chapter is important and for whom;
    Definition of terms: the terms used shall be defined first, so that a common understanding is guaranteed;
    Main part: this may be divided into traditional subchapters accordingly;
    Known obstacles: here you should highlight potential known obstacles so that others can avoid these errors in advance;
    Future research: this shall outline future research avenues, hot topics and research challenges of further interest.

You can refer to a successful sample Volume and look there for sample papers:
Holzinger, A. & Jurisica, I. (2014). Knowledge Discovery and Data Mining in Biomedical Informatics: State-of-the-Art and Future Challenges, LNCS 8401, Berlin Heidelberg: Springer.

and as a template you can use the following sample papers:

and here the general Springer LNCS information page


Please submit your paper directly to


Your paper will be assigned to two to three reviewers, so that you will receive valuable, useful feedback on how to further improve your paper. You will be notified in due course to prepare the final version.
The review template can be found here (scroll down to the middle of the page)

To ensure the highest possible standards of Springer, each paper will be reviewed by at least two members of the HCI-KDD international expert network
using this REVIEW-TEMPLATE-XXXX (word doc, 140kB).


To fully understand the intention of this Volume you can read the draft editorial here (comments are very welcome!):
Holzinger (2016) DRAFT-Editorial-Integrative-Machine-Learning-for-Health  (pdf, 564kB)

Please revise your paper according to the reviewer requests and send the following three items
directly to
1) Your paper as pdf (please ensure an even number of pages, e.g. … 14, 16, 18, 20, 22 … pages)
2) Your source files (LaTeX (preferred – pack all source files in one single zip-folder) – or MS Word)
3) The signed letter of consent as pdf scan –
please download the form here: LNCS-Springer-Letter-of-Consent-LNCS-SOTA-MLHealth   (pdf, 68kB)


Your files will be carefully checked and sent into production. Authors will be contacted directly by the Springer production team for checking the page proofs. The Volume is targeted to be finalized and printed in October 2016 (quality needs time). As a token of gratitude you will receive one copy of the printed volume fresh from the printing press 🙂

Thank you for your kind interest!

Some background information:

Health systems worldwide are challenged by big and complex sets of heterogeneous, high-dimensional data and increasing amounts of unstructured information. Because biomedicine, health and the life sciences are turning into a data intensive science, machine learning can help to enable more evidence-based decision-making and support the realization of the grand goals of personalized medicine [Holzinger, A. 2014. Trends in Interactive Knowledge Discovery for Personalized Medicine: Cognitive Science meets Machine Learning. IEEE Intelligent Informatics Bulletin, 15, (1), 6-14].

A grand goal of future medicine is modelling the complexity of patients to tailor medical decisions, health practices and therapies to the individual patient. This trend towards personalized medicine produces unprecedented amounts of data. Although human experts are excellent at pattern recognition in dimensions lower than three, most biomedical data is in arbitrarily high dimensions, much higher than three. This makes manual analysis difficult and often practically impossible. Consequently, experts in daily biomedical routine are less and less able to deal with the complexity of such data. Moreover, they are, understandably, not interested in struggling with the complexity of their data sets. Rather, the experts need insight into the data, so that they can gain knowledge to support their direct workflows and find answers to their questions and hypotheses. Therefore, it is necessary to provide efficient, usable and useful computational methods, algorithms and tools to discover knowledge and to interactively make sense of such high-dimensional data.
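One reason why high-dimensional data resists manual analysis can be illustrated with a small experiment. The following is only a minimal sketch under stated assumptions: the points are synthetic and drawn uniformly from the unit hypercube (they are not biomedical data), and the function name relative_contrast as well as the sample size and seed are hypothetical choices made for illustration. The sketch measures how much the nearest and the farthest point differ in distance from a query point; as the dimension grows, this contrast tends to shrink, so "near" and "far" become hard to tell apart.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for distances from one query point.

    Assumption for illustration: all points are drawn uniformly at random
    from the unit hypercube [0, 1]^dim (synthetic data, not patient data).
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        # Euclidean distance between the query and the sampled point.
        distances.append(math.dist(query, point))
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    for dim in (2, 3, 10, 100, 1000):
        contrast = relative_contrast(dim)
        print(f"dim={dim:5d}  relative contrast={contrast:.3f}")
```

In low dimensions the contrast is typically large, while in very high dimensions it typically falls well below one. This is one facet of the curse of dimensionality mentioned above; real biomedical data may of course behave differently from this uniform toy model.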

The classic goal of machine learning is to develop algorithms which can learn and improve over time and can be used for predictions. In automatic machine learning (aML) great advances have been made, for example, in speech recognition, recommender systems, or autonomous vehicles (“Google Car”). Such automatic approaches greatly benefit from big data with many training sets available. However, in the health domain, particularly in the clinical domain, sometimes we are confronted with a small number of data sets or even rare events, where aML approaches suffer from insufficient training sets. Here, interactive machine learning (iML) may be of help. However, the term iML is not yet well established, so we define it as “algorithms that can interact with agents and can optimize their learning behavior through these interactions, where the agents can also be human.” This “human-in-the-loop” can be beneficial in solving computationally hard problems. Relevant examples include subspace clustering, protein folding, or k-anonymization of health data. In such problems human expertise may help to reduce an exponential search space through heuristic selection of samples. Therefore, what would otherwise be an NP-hard problem can become far more tractable in practice through the input and the assistance of a human agent involved in the learning phase.
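The idea of shrinking an exponential search space through an agent's heuristic selection can be sketched with a toy problem. This is only an illustration under stated assumptions, not the method of any chapter in the Volume: the task is a small subset-sum instance (find a subset of numbers whose sum is closest to a target), and the function simulated_expert is a hypothetical stand-in for a human agent who discards candidates that are obviously too large. All names and values are invented for the example.

```python
from itertools import combinations


def best_subset(values, target):
    """Brute force: score every non-empty subset by |sum - target|.

    Returns (best_subset, number_of_subsets_scored).
    """
    best, best_gap, scored = None, float("inf"), 0
    for size in range(1, len(values) + 1):
        for subset in combinations(values, size):
            scored += 1
            gap = abs(sum(subset) - target)
            if gap < best_gap:
                best, best_gap = subset, gap
    return best, scored


def simulated_expert(values, target):
    """Hypothetical stand-in for a human agent (an assumption, not a real API).

    The "expert" heuristically shortlists candidates by discarding every
    value that is larger than the target.
    """
    return [v for v in values if v <= target]


def interactive_best_subset(values, target, agent):
    """Ask the agent for a shortlist first, then search only within it."""
    shortlist = agent(values, target)
    return best_subset(shortlist, target)


# Toy instance, invented for illustration.
VALUES = [3, 34, 4, 12, 5, 2, 40, 51, 60, 77, 81, 95]
TARGET = 9

if __name__ == "__main__":
    full, full_count = best_subset(VALUES, TARGET)
    guided, guided_count = interactive_best_subset(VALUES, TARGET, simulated_expert)
    print(f"without agent: {full} after scoring {full_count} subsets")
    print(f"with agent:    {guided} after scoring {guided_count} subsets")
```

With 12 candidates the unguided search scores 2^12 - 1 = 4095 subsets, whereas the shortlist of 4 candidates leaves only 2^4 - 1 = 15. Note that the worst-case complexity of the problem itself does not change, and a poor shortlist could discard part of the optimal solution; the gain is in practical search effort and depends on the quality of the agent's heuristic.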
