Holzinger, A. & Jurisica, I. 2014. Interactive Knowledge Discovery and Data Mining in Biomedical Informatics: State-of-the-Art and Future Challenges, Springer LNCS 8401, doi:10.1007/978-3-662-43968-5, [Springer Link], Contents in [DBLP], [Amazon].
One of the grand challenges in our digital world is posed by large, complex and often weakly structured data sets, together with massive amounts of unstructured information. This “big data” challenge is most evident in biomedical informatics. A synergistic combination of the methodologies and approaches of two fields offers ideal conditions for unraveling these problems: Human–Computer Interaction (HCI) and Knowledge Discovery/Data Mining (KDD), with the goal of supporting human capabilities with machine learning.
This state-of-the-art survey is an output of the HCI-KDD expert network and features 19 carefully selected and reviewed papers related to seven hot and promising research areas: Area 1: Data Integration, Data Pre-processing and Data Mapping; Area 2: Data Mining Algorithms; Area 3: Graph-based Data Mining; Area 4: Entropy-Based Data Mining; Area 5: Topological Data Mining; Area 6: Data Visualization; and Area 7: Privacy, Data Protection, Safety and Security.
Some background information:
Health systems worldwide are challenged by large sets of heterogeneous, high-dimensional, complex data and by increasing amounts of unstructured information. Because biomedicine, health and the life sciences are turning into data-intensive sciences, machine learning can support more evidence-based decision-making and help to realize the grand goals of personalized medicine [Holzinger, A. 2014. Trends in Interactive Knowledge Discovery for Personalized Medicine: Cognitive Science meets Machine Learning. IEEE Intelligent Informatics Bulletin, 15, (1), 6-14].
A grand goal of future medicine is to model the complexity of patients in order to tailor medical decisions, health practices and therapies to the individual patient. This trend towards personalized medicine produces unprecedented amounts of data. Although human experts are excellent at pattern recognition in fewer than three dimensions, most biomedical data lie in arbitrarily high dimensions, much higher than three. This makes manual analysis difficult and often practically impossible. Consequently, experts in daily biomedical routine are less and less able to deal with the complexity of such data. Moreover, they are, understandably, not interested in struggling with the complexity of their data sets. Rather, the experts need insight into the data, so that they can gain knowledge to support their immediate workflows and find answers to their questions and hypotheses. It is therefore necessary to provide efficient, usable and useful computational methods, algorithms and tools to discover knowledge and to interactively make sense of such high-dimensional data.