explainable AI comparing human intelligence with artificial intelligence

exAI Survey Version 1.0 as of December 19, 2018, 13:00 CET

Welcome to the exAI Survey!

This experiment has been completed. Thank you for your participation. Please proceed to our new experiment and challenges with our KANDINSKY-Patterns: https://human-centered.ai/project/kandinsky-patterns


If you have any questions please contact:
Principal Investigator Andreas Holzinger: andreas.holzinger AT medunigraz.at
Cognitive Scientist: michael.kickmeier AT phsg.ch
Digital Pathology Expert: heimo.mueller AT medunigraz.at

Some background information for the interested participant:

With this online experiment we examine the differences between human explanations and machine explanations, in order to gain insight into explainable AI in general and into how to build explanation interfaces in particular. Ultimately, we expect this to provide insights into the principles of transfer learning.

If you want to know what explainable AI means in principle in the context of health informatics, watch this YouTube video.

Technically, for the purpose of this experiment, we train a multi-layer perceptron (MLP *). This is a class of feed-forward artificial neural networks and a classic example of a “Deep Learning” approach. For the differences between Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) see this explanation.
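To make the term concrete, the following is a minimal sketch of the forward pass of a multi-layer perceptron in plain Python. It is not the network used in the experiment: the layer sizes, the ReLU and softmax functions, the random (untrained) weights and all function names are assumptions chosen for illustration only.

```python
import math
import random


def relu(values):
    # Element-wise rectified linear unit.
    return [max(0.0, v) for v in values]


def softmax(values):
    # Numerically stable softmax: turns scores into class probabilities.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, bias):
    # One fully connected layer; `weights` holds one row per output unit.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def init_mlp(layer_sizes, seed=0):
    # Random, untrained parameters (illustrative assumption).
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def mlp_forward(layers, x):
    # Feed-forward: each hidden layer feeds the next, no cycles.
    hidden = x
    for weights, bias in layers[:-1]:
        hidden = relu(dense(hidden, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(hidden, weights, bias))


# Hypothetical example: 6 image features, two hidden layers, 3 classes.
layers = init_mlp([6, 8, 4, 3])
features = [0.2, 0.9, 0.1, 0.5, 0.0, 1.0]
probs = mlp_forward(layers, features)
```

Here `probs` holds one probability per class. In a real setting the weights would be learned from labelled training images rather than drawn at random.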

We interpret the results of this deep neural network with one specific method of explainable AI: “Local Interpretable Model-Agnostic Explanations” (LIME) [1], developed by Marco Tulio RIBEIRO, Sameer SINGH and Carlos GUESTRIN. Note: Do not confuse LIME with LimeSurvey [2], which we use for carrying out this experiment.
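The core idea behind LIME can be sketched as follows: perturb the instance to be explained, query the black-box model on the perturbed samples, weight each sample by its proximity to the original, and fit a simple linear model whose coefficients serve as the explanation. The code below is a hand-written illustration of that idea in plain Python, not the API of the LIME library and not our experimental setup. The `black_box` function, the three binary features, the kernel width and the ridge term are all hypothetical.

```python
import math
import random


def black_box(z):
    # Hypothetical classifier score standing in for a trained model.
    # Assumed feature order: [is_red, is_large, is_circle].
    return 1.0 / (1.0 + math.exp(-(3.0 * z[0] + 0.1 * z[1] - 2.0 * z[2])))


def solve(a, b):
    # Gaussian elimination with partial pivoting for a * x = b.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def explain_locally(predict, instance, n_samples=500,
                    kernel_width=0.75, ridge=1e-3, seed=0):
    rng = random.Random(seed)
    d = len(instance)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # 1 keeps a feature, 0 switches it off in the perturbed sample.
        mask = [rng.randint(0, 1) for _ in range(d)]
        z = [v * k for v, k in zip(instance, mask)]
        dist = math.sqrt(sum(1 - k for k in mask) / d)
        # Samples closer to the original instance get a larger weight.
        weights.append(math.exp(-(dist ** 2) / kernel_width ** 2))
        rows.append(mask + [1])  # last column is the intercept
        targets.append(predict(z))
    # Weighted ridge regression via the normal equations.
    p = d + 1
    a = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
          + (ridge if i == j else 0.0) for j in range(p)] for i in range(p)]
    b = [sum(w * r[i] * t for r, t, w in zip(rows, targets, weights))
         for i in range(p)]
    return solve(a, b)[:d]  # drop the intercept


# Explain the prediction for an instance with all three features present.
coefficients = explain_locally(black_box, [1, 1, 1])
```

In this toy setting the coefficient for `is_red` comes out positive, the one for `is_circle` negative, and the one for `is_large` close to zero, mirroring how the hypothetical `black_box` was constructed.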

Technically, our experimental images are sent through and classified by our multi-layer perceptron. LIME then determines why a classification was made in this way: it outputs a rule that contains certain conditions, and these conditions refer to properties of the image. This rule can then be transformed into short quantified, human-understandable sentences [3] in German or English natural language.
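As a rough illustration of the last step, the sketch below turns a rule, given as a list of image properties with the proportion of objects that satisfy each, into one short quantified English sentence. The rule format, the quantifier thresholds and the sentence template are assumptions made for this example. They are not taken from [3] or from our implementation.

```python
def quantifier(proportion):
    # Map a proportion in [0, 1] to a linguistic quantifier.
    # The thresholds are illustrative assumptions.
    if proportion >= 0.95:
        return "almost all"
    if proportion >= 0.7:
        return "most"
    if proportion >= 0.4:
        return "about half of the"
    if proportion >= 0.1:
        return "few"
    return "almost no"


def rule_to_sentence(label, conditions):
    # conditions: list of (property, proportion of objects having it).
    parts = [f"{quantifier(share)} objects are {prop}"
             for prop, share in conditions]
    if not parts:
        return f"The image was classified as '{label}'."
    if len(parts) == 1:
        reason = parts[0]
    else:
        reason = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"The image was classified as '{label}' because {reason}."


# Hypothetical rule with two conditions.
sentence = rule_to_sentence("class A", [("red", 0.8), ("large", 0.45)])
print(sentence)
# The image was classified as 'class A' because most objects are red
# and about half of the objects are large.  (printed on one line)
```

A German template would follow the same pattern with translated quantifiers and word order.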

Now we compare the differences between human explanations and machine explanations. The experiment consists of several runs. It starts with simple geometrical objects, extending experiments done by cognitive scientists in the 1960s. Later we will extend the experiment with histopathological images from our Digital Pathology Project. This is relevant because explainability is of fundamental and increasing importance in the medical domain [4].

Our experiments have much potential for gaining insight into transfer learning, which is a core topic in fundamental machine learning research. Basically, transfer learning is about reusing what was learned on one task for another task and avoiding catastrophic forgetting: an algorithm trained on one specific task is most often unable to solve another, different task. Interestingly, humans are very good at transferring their learned knowledge to solve a previously unknown, new task.

How is the human-machine comparison carried out? We follow and extend previous work by Stephen K. REED [5], [6].

In the explanations we look for differences in the following details:

1) The language used and/or the linguistic structure of the explanations;

2) The structure of the explanation as a whole, the explanation approach, and/or the structure of the explanation in detail;

3) The elements of the explanation in the sense of its smallest units (the “features”, so to speak), for example the position, color, type and size of objects.

Footnotes:

*) A multi-layer perceptron is interesting for us for a number of reasons. As a mathematical function it is simple. It maps input variables to output variables, where the function is composed of even simpler functions (“luckily the world is compositional” **). Learning a relevant representation of the underlying data is a key element of deep learning. Historically, these approaches go back to the early theories of biological learning [7], and became famous in the 1960s [8]. Under the popular term “Deep Learning” these networks are experiencing a rebirth and an unnecessary hype nowadays [9].

**) According to Yann LeCun: “… it’s probably due to the fact that the world is essentially compositional. That means that our perceptual world is compositional in the sense that you know that images are made of edges or oriented contours …”; refer to the YouTube lecture at 21’31 ff. here: https://www.youtube.com/watch?v=U2mhZ9E8Fk8

***) Explainable AI raises several very old and important questions about fundamental cognitive issues. For example, Jean PIAGET & Bärbel INHELDER (1969, p. 87, original in French [10]) distinguish between two aspects of cognition: a) the figurative aspect (approximate copies of objects), and b) the operative aspect (modifying/transforming an object). An important question still today is to what extent images preserve the detail of objects and what kind of operations can be applied to images. This is important in our digital pathology research project when we study how our pathologists work: a) how we can imitate a pathologist with machine learning, and b) what machine learning can find beyond what a pathologist is able to do. These questions will ultimately become important for medical decision support in the future.

References:

[1] Marco Tulio Ribeiro, Sameer Singh & Carlos Guestrin 2016. Why should I trust you?: Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016, San Francisco (CA), ACM, 1135-1144, doi:10.1145/2939672.2939778.

[2] Carsten Schmitz 2012. LimeSurvey: An open source survey tool. Available online: https://www.limesurvey.org [Last accessed on 19.12.2018, 14:20 CET].

[3] Miroslav Hudec, Erika Bednárová & Andreas Holzinger 2018. Augmenting Statistical Data Dissemination by Short Quantified Sentences of Natural Language. Journal of Official Statistics (JOS), 34, (4), 981, doi:10.2478/jos-2018-0048.

[4] Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs & Kurt Zatloukal 2017. Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology. arXiv:1712.06657.

[5] Stephen K. Reed 1972. Pattern recognition and categorization. Cognitive Psychology, 3, (3), Elsevier, 382-407.

[6] Stephen K. Reed 1974. Structural descriptions and the limitations of visual images. Memory & Cognition, 2, (2), 329-336.

[7] W.S. McCulloch & W. Pitts 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, (4), 115-133, doi:10.1007/BF02478259.

[8] Frank Rosenblatt 1958. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65, (6), 386-408, doi:10.1037/h0042519.

[9] Yann LeCun, Yoshua Bengio & Geoffrey Hinton 2015. Deep learning. Nature, 521, (7553), 436-444, doi:10.1038/nature14539. https://www.nature.com/articles/nature14539

[10] Jean Piaget, Paul Fraisse, Éliane Vurpillot & Robert Francès (eds.) 1963. Traité de psychologie expérimentale: VI. La perception, Paris: Presses Universitaires de France.
