AI will change Radiology – NOT replace Radiologists

After the rather shocking statement by Geoffrey HINTON at the Machine Learning and Market for Intelligence Conference in Toronto, where he recommended that hospitals should stop training radiologists because deep learning will replace them, Thomas H. DAVENPORT and Keith J. DREYER published on March 27, 2018 a really nice article, “AI will change radiology, but it won’t replace radiologists” (see [1]). It supports our human-in-the-loop approach: AI/machine learning will certainly change workflows, but we envision that the expert will be augmented by new technologies. Routine (boring) tasks will be taken over by automatic algorithms, which frees up expert time to spend on challenging (cool) tasks and more research – and there are plenty of problems where we need human intelligence!

[1] https://hbr.org/2018/03/ai-will-change-radiology-but-it-wont-replace-radiologists

 

 

Human-in-the-loop AI


This is really very interesting. In the recent TWiML & AI (This Week in Machine Learning and Artificial Intelligence) podcast of April 5, 2018, Robert MUNRO (a graduate of Stanford University and a recognized expert in combining human and machine intelligence) reports on the newly branded company Figure Eight [1], formerly known as CrowdFlower. Their Human-in-the-Loop AI platform supports data science and machine learning teams working on various topics, including autonomous vehicles, consumer product identification, natural language processing, search relevance, intelligent chatbots and, most recently, disaster response and epidemiology. This is further evidence of the enormous importance and potential usefulness of the human-in-the-loop interactive machine learning (iML) approach! Listen to this awesome discussion, led excellently by Sam CHARRINGTON:

https://twimlai.com/twiml-talk-125-human-loop-ai-emergency-response-robert-munro/

This discussion fits well with the previous discussion with Jeff DEAN (head of the Google Brain team), who emphasized the importance of health and the limits of automatic approaches, including deep learning. Listen directly at:

https://twimlai.com/twiml-talk-124-systems-software-machine-learning-scale-jeff-dean/

[1] https://www.figure-eight.com/resources/human-in-the-loop

 

A good proof of the importance of the HCI-KDD approach, worth 2.1 billion USD!

Our strategic aim is to find solutions for data-intensive problems by combining two areas that offer ideal preconditions for understanding intelligence and for bringing business value to AI: Human-Computer Interaction (HCI) and Knowledge Discovery (KDD). HCI deals with questions of human intelligence, whereas KDD deals with questions of artificial intelligence, in particular with the development of scalable algorithms for finding previously unknown relationships in data, and thus centers on automatic computational methods. A proverb attributed, perhaps incorrectly, to Albert Einstein illustrates this perfectly: “Computers are incredibly fast, accurate, but stupid. Humans are incredibly slow, inaccurate, but brilliant. Together they may be powerful beyond imagination” [1].

An article published on February 18, 2018 by David Shaywitz [2] in Forbes reports on the recent purchase of the oncology data company Flatiron Health for the enormous sum of 2.1 billion USD (remember: DeepMind was purchased by Google for a mere 400 million GBP 😉).

This supports a few hypotheses of which I try to convince my students all the time (but they won’t believe me unless Google is doing it 😉):

a) those who can turn raw health data into insights and understandable knowledge can produce value
b) data – and particularly big data – is useless for the decision maker; what they need is reliable, valuable and trustworthy information
c) for the complexity of sensemaking from health data we (still) need a human-in-the-loop: humans (still) exceed machine performance in understanding the context and explaining the underlying explanatory factors of the data
d) consequently this is a good example for the business value of our HCI-KDD approach: Let the computer find in arbitrarily high-dimensional spaces what no human is able to do – but let the human do what no computer is able to do: BOTH working together are powerful beyond imagination!

Flatiron Health [3] is a company which specializes in health data curation, supported by technology of course, but mostly done manually by human experts in the Mechanical Turk style. Remark: The name Mechanical Turk has historic origins, as it was inspired by a supposedly automatic 18th-century chess-playing machine by Wolfgang von Kempelen, which beat, among others, Benjamin Franklin at chess and was acclaimed as “AI”. However, it was later revealed that it was neither a machine nor an automatic device: in fact, a human chess master was hidden in a secret space under the chessboard and controlled the movements of a humanoid dummy. Similarly, services which help to solve problems via human intelligence are called “Mechanical Turk online services”.

[1] Holzinger, A. 2013. Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together? In: Cuzzocrea, Alfredo, Kittl, Christian, Simos, Dimitris E., Weippl, Edgar & Xu, Lida (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127. Heidelberg, Berlin, New York: Springer, pp. 319-328, doi:10.1007/978-3-642-40511-2_22

[2] https://www.forbes.com/sites/davidshaywitz/2018/02/18/the-deeply-human-core-of-roches-2-1b-tech-acquisition-and-why-they-did-it/#6242fdbc29c2

[3] https://flatiron.com

On-Device Machine Intelligence

One very interesting approach to federated machine learning is presented by Sujith Ravi from Google: Machine learning models (e.g. CNNs) are successfully used for the design of intelligent systems capable of visual recognition, speech and language understanding. Most of these run in a cloud, where it is often unpredictable where they are physically running. A huge problem so far is that typical machine learning models are awkward to use on mobile devices due to both computational and memory constraints. While these devices could make use of models running on high-performance data centers with CPUs or GPUs, this is not feasible for many applications and scenarios where inference needs to be performed directly “on” device. This requires re-thinking existing machine learning algorithms and coming up with new models that are directly optimized for on-device machine intelligence rather than doing post-hoc model compression. Sujith Ravi introduces a novel “projection-based” machine learning system for training compact neural networks. The approach uses a joint optimization framework to simultaneously train a “full” deep network and a lightweight “projection” network. Unlike the full deep network, the projection network uses random projection operations that are efficient to compute and operates in bit space, yielding a low memory footprint. The system is trained end-to-end using backpropagation. Ravi shows that the approach is flexible and easily extensible to other machine learning paradigms; for example, it can learn graph-based projection models using label propagation. The trained “projection” models are then directly used for inference. Please watch the original video.
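To make the phrase “random projection operations … in bit space” more concrete, here is a minimal NumPy sketch. It only illustrates the general idea of mapping a float vector to a short bit vector via fixed random hyperplanes; it is not Ravi's system, it omits the joint training with the full network, and the function names and the dimensions (512 inputs, 64 bits) are assumptions chosen for illustration.

```python
import numpy as np

def make_projection(input_dim, num_bits, seed=0):
    """Draw a fixed matrix of random hyperplanes (hypothetical helper; not trained)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((input_dim, num_bits))

def project_to_bits(x, planes):
    """Map float vectors to bit vectors: one bit per hyperplane (sign of the dot product)."""
    return (x @ planes > 0).astype(np.uint8)

planes = make_projection(input_dim=512, num_bits=64)
x = np.random.default_rng(1).standard_normal((4, 512))  # four example inputs

bits = project_to_bits(x, planes)     # shape (4, 64), entries 0 or 1
packed = np.packbits(bits, axis=1)    # 64 bits stored in 8 bytes per example
```

Similar inputs tend to receive similar bit patterns, so a small model operating on the packed bits needs far less memory than one operating on the original float vectors.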

 

Prefetching – Predicting what will be most likely needed next

A very interesting paper on prefetching has just been published. Prefetching is a nice application for machine learning: predicting which information will most likely be needed next, so that it can be prepared in advance:

Milad Hashemi, Kevin Swersky, Jamie A Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis & Parthasarathy Ranganathan 2018. Learning Memory Access Patterns. arXiv preprint arXiv:1803.02329.

Prefetching is the process of predicting, based on past history, future memory accesses that will miss in the on-chip cache and access memory. Each of these memory addresses is generated by a memory instruction (a load/store). Memory instructions are a subset of all instructions that interact with the addressable memory of the computer system.
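As a toy illustration of “predicting future memory accesses based on past history”, the following sketch counts which address delta tends to follow which and predicts the most frequent one. This is a simple table-based stand-in, not the neural approach of the cited paper; the class name `DeltaPrefetcher` and the example trace are assumptions made for illustration.

```python
from collections import Counter, defaultdict

class DeltaPrefetcher:
    """Toy predictor (hypothetical): remembers which delta followed each delta."""

    def __init__(self):
        self.table = defaultdict(Counter)  # previous delta -> counts of next delta
        self.prev_addr = None
        self.prev_delta = None

    def access(self, addr):
        """Record one memory access; return the predicted next address or None."""
        if self.prev_addr is not None:
            delta = addr - self.prev_addr
            if self.prev_delta is not None:
                self.table[self.prev_delta][delta] += 1
            self.prev_delta = delta
        self.prev_addr = addr

        history = self.table.get(self.prev_delta)
        if not history:
            return None  # nothing learned yet for this situation
        best_delta, _ = history.most_common(1)[0]
        return addr + best_delta

# A strided access pattern, e.g. walking an array in 64-byte steps.
trace = [0, 64, 128, 192, 256]
prefetcher = DeltaPrefetcher()
predictions = [prefetcher.access(a) for a in trace]
```

After two observed deltas the predictor has seen the stride once and starts proposing the next address, which a real prefetcher would then load into the cache ahead of time.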

 

There is a nice article in the MIT Technology Review by Will Knight, published on March 8, 2018, on how computers could, similarly to humans, improve their behaviour with age – a very nice read:

https://www.technologyreview.com/s/610453/your-next-computer-could-improve-with-age/?set=

Python in Machine Learning still No. 1 and increasing

There is of course no such thing as a ‘best language for machine learning’ – but as a matter of fact Python is still No. 1 and increasing:
Image Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/

We use Python in all our courses because it is an “industry standard” and widely available. I would love to use e.g. Julia, which can be much faster, but it remains rather academic and needs a lot of additional effort. It is not astonishing that Python is the most popular tool worldwide for machine learning and artificial intelligence, as many libraries and frameworks are available, including TensorFlow, Pandas, NumPy, PyBrain, scikit-learn, SimpleAI, EasyAI, etc.

Consequently, we teach Python in our courses; have a look at:

Marcus D. Bloice & Andreas Holzinger 2016. A Tutorial on Machine Learning and Data Science Tools with Python. In: Holzinger, Andreas (ed.) Machine Learning for Health Informatics, Lecture Notes in Artificial Intelligence LNAI 9605. Heidelberg: Springer, pp. 437-483, doi:10.1007/978-3-319-50478-0_22.

iML with the human-in-the-loop mentioned among 10 coolest applications of machine learning

Within the “Two Minute Papers” series, Károly Zsolnai-Fehér from the Institute of Computer Graphics and Algorithms at the Vienna University of Technology mentions our human-in-the-loop paper among “10 even cooler Deep Learning Applications”:

Seid Muhie Yimam, Chris Biemann, Ljiljana Majnaric, Šefket Šabanović & Andreas Holzinger 2016. An adaptive annotation approach for biomedical entity and relation recognition. Springer/Nature: Brain Informatics, 3, (3), 157-168, doi:10.1007/s40708-016-0036-4

Watch the video here (iML is mentioned from approx. 1:20):

Here is the list of all 10 papers discussed in this two-minute video:

1. Geolocation – https://arxiv.org/abs/1602.05314
2. Super-resolution – https://arxiv.org/pdf/1511.04491v1.pdf
3. Neural Network visualizer – https://experiments.mostafa.io/public/…
4. Recurrent neural network for sentence completion:
5. Human-in-the-loop and Doctor-in-the-loop: https://link.springer.com/article/10.1007/s40708-016-0036-4
6. Emoji suggestions for images – https://emojini.curalate.com/
7. MNIST handwritten numbers in HD – https://blog.otoro.net/2016/04/01/generating-large-images-from-latent-vectors
8. Deep Learning solution to the Netflix prize – https://karthkk.wordpress.com/2016/03/22/deep-learning-solution-for-netflix-prize/
9. Curating works of art –
10. More robust neural networks against adversarial examples – https://cs231n.stanford.edu/reports201…
The Keras library: https://keras.io/

A) The basic principle of the iML human-in-the-loop approach:

Andreas Holzinger 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics, 3, (2), 119-131, doi:10.1007/s40708-016-0042-6

B) The entry in the GI Lexikon:
https://gi.de/informatiklexikon/interactive-machine-learning-iml

C) The experimental proof-of-concept:

Andreas Holzinger, Markus Plass, Katharina Holzinger, Gloria Cerasela Crisan, Camelia-M. Pintea & Vasile Palade 2017. A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop. arXiv:1708.01104.

D) Outline and Survey of application possibilities:

Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis & Douglas B. Kell 2017. What do we need to build explainable AI systems for the medical domain? arXiv:1712.09923.

Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs & Kurt Zatloukal 2017. Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology. arXiv:1712.06657.

 

NIPS-2017 Best paper “Explainability was one of the major reasons the paper was given the award”

Congratulations to Arthur GRETTON from the Gatsby Computational Neuroscience Unit at University College London and his team. Their paper titled “A Linear-Time Kernel Goodness-of-Fit Test”, authored by Wittawat JITKRITTUM, Wenkai XU, Zoltan SZABO, Kenji FUKUMIZU and Arthur GRETTON, won the prestigious NIPS 2017 best paper award. In the interview by Sam Charrington from TWiML&AI, the authors of the NIPS 2017 best paper said at 14:10 in the following video that “… explainability was one of the reasons that the paper was given the award …”, listen here:

Here is the original talk:

Algorithms session, live from NIPS 2017 (posted by Neural Information Processing Systems on Tuesday, December 5, 2017). Presentations:

• A Linear-Time Kernel Goodness-of-Fit Test
• Generalization Properties of Learning with Random Features
• Communication-Efficient Distributed Learning of Discrete Distributions
• Optimistic posterior sampling for reinforcement learning: worst-case regret bounds
• Regret Analysis for Continuous Dueling Bandit
• Minimal Exploration in Structured Stochastic Bandits
• Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe
• Diving into the shallows: a computational perspective on large-scale shallow learning
• Monte-Carlo Tree Search by Best Arm Identification
• A framework for Multi-A(rmed)/B(andit) Testing with Online FDR Control
• Parameter-Free Online Learning via Model Selection
• Bregman Divergence for Stochastic Variance Reduction: Saddle-Point and Adversarial Prediction
• Gaussian Quadrature for Kernel Features
• Learning Linear Dynamical Systems via Spectral Filtering

 

https://papers.nips.cc/paper/6630-a-linear-time-kernel-goodness-of-fit-test

In their paper the authors propose a novel adaptive test of goodness-of-fit, with computational cost linear in the number of samples. They learn the test features which best indicate the differences between the observed samples and a reference model by minimizing the false negative rate. These features are constructed via Stein’s method, which means that it is not necessary to compute the normalising constant of the model. They further analyse the asymptotic Bahadur efficiency of the new test, and prove that under a mean-shift alternative, the test always has greater relative efficiency than a previous linear-time kernel test, regardless of the choice of parameters for that particular test. In experiments, the performance of their method exceeds that of the earlier linear-time test, and matches or exceeds the power of a quadratic-time kernel test. In high dimensions and where model structure may be exploited, this new goodness-of-fit test performs far better than a quadratic-time two-sample test based on the Maximum Mean Discrepancy, with samples drawn from the model.
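The following NumPy sketch illustrates how a Stein-based statistic can be computed in time linear in the number of samples, using only the score function of the model (the gradient of its log density), so no normalising constant is needed. It is a simplified illustration under stated assumptions and not the authors' implementation: the test locations are fixed by hand instead of being learned, a Gaussian kernel with bandwidth 1 is assumed, and only the statistic is computed, not the test threshold. The function name `fssd_statistic` and all parameter names are hypothetical.

```python
import numpy as np

def fssd_statistic(x, test_locations, score, bandwidth=1.0):
    """Unbiased estimate of a Stein discrepancy at J fixed test locations.

    x: (n, d) samples; test_locations: (J, d); score: maps (n, d) -> (n, d),
    the gradient of the model's log density (assumed to be known).
    """
    n, d = x.shape
    num_loc = test_locations.shape[0]
    diff = x[:, None, :] - test_locations[None, :, :]             # (n, J, d)
    k = np.exp(-np.sum(diff ** 2, axis=2) / (2 * bandwidth ** 2))  # Gaussian kernel, (n, J)
    grad_k = -diff / bandwidth ** 2 * k[:, :, None]                # gradient of k w.r.t. x
    xi = score(x)[:, None, :] * k[:, :, None] + grad_k             # Stein features, (n, J, d)
    tau = xi.reshape(n, -1) / np.sqrt(d * num_loc)                 # one feature vector per sample
    total = tau.sum(axis=0)
    # Sum over all pairs i != j of tau_i . tau_j, computed in O(n) without forming pairs.
    return (total @ total - np.sum(tau ** 2)) / (n * (n - 1))

rng = np.random.default_rng(0)
locations = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # hand-picked, not learned

def standard_normal_score(x):
    return -x  # score of the reference model N(0, I)

stat_null = fssd_statistic(rng.standard_normal((2000, 2)), locations, standard_normal_score)
stat_shift = fssd_statistic(rng.standard_normal((2000, 2)) + 1.0, locations, standard_normal_score)
```

With samples drawn from the reference model the statistic stays close to zero, whereas the mean-shifted samples give a clearly larger value; a full test would compare the statistic against a threshold derived from its null distribution.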

The original paper can be downloaded via the NIPS pages:
https://nips.cc/Conferences/2017/Schedule?showEvent=8823

The paper is also available at arXiv:

Jitkrittum, W., Xu, W., Szabo, Z., Fukumizu, K. & Gretton, A. 2017. A Linear-Time Kernel Goodness-of-Fit Test. arXiv preprint arXiv:1705.07673.

 

People and Artificial Intelligence Research (PAIR) Initiative

We are experiencing enormous advances in AI and ML, with impressive, daily visible improvements in technical performance, particularly in speech recognition, deep learning from images, autonomous driving, etc.

It is really great that the Google Brain team led by Jeff Dean and the Google initiative People and Artificial Intelligence Research (PAIR) support people-centric AI systems. They are interested in augmenting human interaction with machine intelligence and foster a humanistic approach to artificial intelligence, towards making people and AI partnerships productive, enjoyable and fair.

See: https://ai.google/pair

This perfectly supports our HCI-KDD approach [1] generally, and specifically our interactive Machine Learning (iML) approach with a human in the loop [2]. The basic idea of augmenting human intelligence with artificial intelligence can foster trust [6], causal reasoning, explainability and re-traceability [5], which is of utmost importance in the medical domain [3], [4].

[1] Andreas Holzinger 2013. Human–Computer Interaction and Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together? In: Cuzzocrea, Alfredo, Kittl, Christian, Simos, Dimitris E., Weippl, Edgar & Xu, Lida (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127. Heidelberg, Berlin, New York: Springer, pp. 319-328, doi:10.1007/978-3-642-40511-2_22.

[2] Andreas Holzinger 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics, 3, (2), 119-131, doi:10.1007/s40708-016-0042-6.

[3] Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis & Douglas B. Kell 2017. What do we need to build explainable AI systems for the medical domain? arXiv:1712.09923.

[4] Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs & Kurt Zatloukal 2017. Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology. arXiv:1712.06657.

[5] Andreas Holzinger, Markus Plass, Katharina Holzinger, Gloria Cerasela Crisan, Camelia-M. Pintea & Vasile Palade 2017. A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop. arXiv:1708.01104.

[6] Katharina Holzinger, Klaus Mak, Peter Kieseberg & Andreas Holzinger 2018. Can we trust Machine Learning Results? Artificial Intelligence in Safety-Critical decision Support. ERCIM News, 112, (1), 42-43.

 

What is the difference between AI/ML/DL?