Open and Collaborative Digital Pathology using Cytomine
In this talk, Raphaël Marée will present the past, present, and future of Cytomine.
Cytomine is open-source software that has been continuously developed since 2010. It is based on modern web and distributed software development methodologies and on machine learning, including deep learning. It provides remote and collaborative features so that users can readily and securely share their large-scale imaging data worldwide. It relies on data models that make it easy to organize and semantically annotate imaging datasets in a standardized way (e.g. to build pathology atlases for training courses or ground-truth datasets for machine learning). It efficiently supports digital slides produced by most scanner vendors. It provides mechanisms to proofread and share image quantifications produced by machine/deep learning-based algorithms. Cytomine can be used free of charge and is distributed under a permissive license. It has been installed at various institutes worldwide and is used by thousands of users in research and educational settings.
Recent research and developments will be presented such as our new web user interfaces and new modules for multimodal and multispectral data (Proteomics Clin Appl, 2019), object recognition in histology and cytology using deep transfer learning (CVMI 2018), user behavior analytics in educational settings (ECDP 2018), as well as our new reproducible architecture to benchmark bioimage analysis workflows.
Raphaël Marée received his PhD degree in computer science in 2005 from the University of Liège, Belgium, where he now works at the Montefiore EE&CS Institute (http://www.montefiore.ulg.ac.be/~maree/). In 2010 he initiated the CYTOMINE research project (http://uliege.cytomine.org/), and since 2017 he has also been co-founder of the not-for-profit Cytomine cooperative (http://cytomine.coop). His research interests are in the broad area of machine learning, computer vision techniques, and web-based software development, with a specific focus on their applications to big imaging data such as in digital pathology and life science research, while following open science principles.
 Raphaël Marée, Loïc Rollus, Benjamin Stévens, Renaud Hoyoux, Gilles Louppe, Rémy Vandaele, Jean-Michel Begon, Philipp Kainz, Pierre Geurts & Louis Wehenkel 2016. Collaborative analysis of multi-gigapixel imaging data using Cytomine. Bioinformatics, 32, (9), 1395-1401, doi:10.1093/bioinformatics/btw013.
Google Scholar Profile of Raphael Maree:
Homepage of Raphael Maree:
Yoshua BENGIO from the Canadian Institute for Advanced Research (CIFAR) emphasized during his workshop talk entitled “towards disentangling underlying explanatory factors” (cool title) at ICML 2018 in Stockholm that the key to success in AI/machine learning is to understand the explanatory/causal factors and mechanisms. This means generalizing beyond independent and identically distributed (i.i.d.) data; current machine learning theories depend strongly on this i.i.d. assumption, but real-world applications (we see this in the medical domain!) often require learning and generalizing in areas simply not seen during training. Interestingly, humans are able to protect themselves in such situations, even ones they have never encountered before. See Yoshua BENGIO’s awesome talk here:
and here a longer talk (1:17:04) at Microsoft Research Redmond on January 22, 2018 – awesome – enjoy the talk, I warmly recommend it to all my students!
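To make the i.i.d. point a bit more concrete, here is a minimal sketch (my own toy illustration, not taken from the talk). It assumes a hypothetical "true" mechanism y = x², fits a straight line on inputs drawn from [0, 1], and then compares the error on fresh data from the same range with the error on data from the unseen range [2, 3]. All names and ranges are assumptions chosen for illustration only.

```python
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


def mse(model, xs, ys):
    """Mean squared error of the fitted line on the given data."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sample(rng, low, high, n):
    """Draw n inputs uniformly from [low, high]; targets follow y = x^2."""
    xs = [rng.uniform(low, high) for _ in range(n)]
    ys = [x * x for x in xs]  # hypothetical underlying mechanism
    return xs, ys


rng = random.Random(0)
train_x, train_y = sample(rng, 0.0, 1.0, 200)   # training distribution
iid_x, iid_y = sample(rng, 0.0, 1.0, 200)       # same distribution as training
shift_x, shift_y = sample(rng, 2.0, 3.0, 200)   # region never seen in training

model = fit_line(train_x, train_y)
iid_error = mse(model, iid_x, iid_y)
shifted_error = mse(model, shift_x, shift_y)

print(f"error on i.i.d. test data:  {iid_error:.4f}")
print(f"error on shifted test data: {shifted_error:.4f}")
```

The line looks like a decent model as long as the test data resemble the training data, but it has not captured the underlying mechanism, so the error grows sharply outside the training range.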
Prof. Dr. Klaus-Robert MÜLLER from the TU Berlin was our keynote speaker on Tuesday, August 28, 2018 during our CD-MAKE conference at the University of Hamburg, see:
Klaus-Robert emphasized in his talk the “right of explanation” under the new European Union General Data Protection Regulation, but also showed some difficulties, challenges and future research directions in the area that is now called explainable AI. Here you find his presentation slides, with kind permission from Klaus-Robert MÜLLER:
Here are some snapshots from the keynote:
Thanks to Klaus-Robert for his presentation!
The IEEE DISA 2018 World Symposium on Digital Intelligence for Systems and Machines was organized by the TU Kosice:
Here you can download my keynote presentation (see title and abstract below)
a) 4 Slides per page (pdf, 5,280 kB):
b) 1 slide per page (pdf, 8,198 kB):
c) and here the link to the paper (IEEE Xplore)
From Machine Learning to Explainable AI
d) and here the link to the video recording
Title: Explainable AI: Augmenting Human Intelligence with Artificial Intelligence and v.v
Abstract: Explainable AI is not a new field. Rather, the problem of explainability is as old as AI itself. While rule‐based approaches of early AI are comprehensible “glass‐box” approaches at least in narrow domains, their weakness was in dealing with uncertainties of the real world. The introduction of probabilistic learning methods has made AI increasingly successful. Meanwhile deep learning approaches even exceed human performance in particular tasks. However, such approaches are becoming increasingly opaque, and even if we understand the underlying mathematical principles of such models they still lack explicit declarative knowledge. For example, words are mapped to high‐dimensional vectors, making them unintelligible to humans. What we need in the future are context‐adaptive procedures, i.e. systems that construct contextual explanatory models for classes of real‐world phenomena.
Maybe one step is linking probabilistic learning methods with large knowledge representations (ontologies), thus making it possible to understand how a machine decision has been reached, making results re‐traceable, explainable and comprehensible on demand ‐ the goal of explainable AI.
Machine learning is the fastest growing field in computer science, and Health Informatics is amongst the greatest application challenges, providing benefits in improved medical diagnoses, disease analyses, and pharmaceutical development – towards future precision medicine.
Talk announcement: Friday, 12th May, 2017, 10:00, Seminarraum 137, Parterre, Inffeldgasse 16c
Integrated interactomes and pathways in precision medicine
by Igor Jurisica, University of Toronto and Princess Margaret Cancer Centre, Toronto
Abstract: Fathoming cancer and other complex disease development processes requires systematically integrating diverse types of information, including multiple high-throughput datasets and diverse annotations. This comprehensive and integrative analysis will lead to data-driven precision medicine, and in turn will help us to develop new hypotheses and answer complex questions such as what factors cause disease; which patients are at high risk; whether patients will respond to a given treatment; how to rationally select a combination therapy for an individual patient, etc.
Thousands of potentially important proteins remain poorly characterized. Computational biology methods, including machine learning, knowledge extraction, data mining and visualization, can help to fill this gap with accurate predictions, making disease modeling more comprehensive. Intertwining computational prediction and modeling with biological experiments will lead to more useful findings faster and more economically.
Short Bio: Igor Jurisica is Tier I Canada Research Chair in Integrative Cancer Informatics, Senior Scientist at Princess Margaret Cancer Centre, Professor at University of Toronto and Visiting Scientist at IBM CAS. He is also an Adjunct Professor at the School of Computing, Pathology and Molecular Medicine at Queen’s University, Computer Science at York University, scientist at the Institute of Neuroimmunology, Slovak Academy of Sciences and an Honorary Professor at Shanghai Jiao Tong University in China. Since 2015, he has also served as Chief Scientist at the Creative Destruction Lab, Rotman School of Management. Igor has published extensively on data mining, visualization and cancer informatics, including multiple papers in Science, Nature, Nature Medicine, Nature Methods, Journal of Clinical Oncology, and received over 9,960 citations since 2012. He has been included in Thomson Reuters 2016, 2015 & 2014 list of Highly Cited Researchers, and The World’s Most Influential Scientific Minds: 2015 & 2014 Reports.
Jurisica Lab, IBM Life Sciences Discovery Center:
Canada Tier I Research Chair: http://www.chairs-chaires.gc.ca/chairholders-titulaires/profile-eng.aspx?profileId=2347
On Nutrigenomics : http://www.uhn.ca/corporate/News/Pages/Igor_Jurisica_talks_nutrigenomics.aspx
Nutrigenomics tries to define the causal relationship between specific nutrients or nutrient regimes (diets) and human health. The underlying idea is personalized nutrition based on the *omics background, which may help to foster personal dietary recommendations. Ultimately, nutrigenomics will allow effective dietary-intervention strategies to recover normal homeostasis and to prevent diet-related diseases, see: Muller, M. & Kersten, S. 2003. Nutrigenomics: goals and strategies. Nature Reviews Genetics, 4, (4), 315-322.
The Machine Learning Guide by Tyler RENELLE (TensorFlow, O-C-Devel) is highly recommended to my students! This series aims to teach the high-level fundamentals of machine learning with a focus on algorithms and some underlying mathematics, which is really great.
Data Skeptic is a weekly podcast that is skeptical of and with data. They explain methods and algorithms that power our world in an accessible manner through short mini-episode discussions and longer interviews with experts in the field, see:
Machine Learning for Health Informatics
Machine learning is a large and rapidly developing subfield of computer science that evolved from artificial intelligence (AI) and is tightly connected with data mining and knowledge discovery. The ultimate goal of machine learning is to design and develop algorithms which can learn from data. Consequently, machine learning systems learn and improve with experience over time, and their trained models can be used to predict outcomes for new questions based on previously acquired knowledge. In fact, the process of learning intelligent behaviour from noisy examples is one of the major questions in the field. The ability to learn from noisy, high-dimensional data is highly relevant for many applications in the health informatics domain. This is due to the inherent nature of biomedical data, and health will increasingly be the focus of machine learning research in the near future.
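As a small illustration of "learning from noisy examples", here is a minimal sketch using a nearest-centroid classifier in plain Python. Everything in it is an assumption made for illustration: the two synthetic classes ("class_a", "class_b"), their centers, and the Gaussian noise level are invented, and real biomedical data are of course far higher-dimensional and messier.

```python
import random


def noisy_examples(rng, center, label, n, noise=1.0):
    """Generate n 2-D points scattered around a center with Gaussian noise."""
    return [
        ((rng.gauss(center[0], noise), rng.gauss(center[1], noise)), label)
        for _ in range(n)
    ]


def train_centroids(examples):
    """'Learning' step: average the noisy points of each class."""
    sums = {}
    for (x, y), label in examples:
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    return {label: (sx / c, sy / c) for label, (sx, sy, c) in sums.items()}


def predict(centroids, point):
    """Assign a point to the class with the closest learned centroid."""
    def squared_distance(label):
        cx, cy = centroids[label]
        return (point[0] - cx) ** 2 + (point[1] - cy) ** 2
    return min(centroids, key=squared_distance)


rng = random.Random(42)

# Hypothetical synthetic data: two classes with overlapping noise.
train = (noisy_examples(rng, (0.0, 0.0), "class_a", 200)
         + noisy_examples(rng, (3.0, 3.0), "class_b", 200))
test = (noisy_examples(rng, (0.0, 0.0), "class_a", 200)
        + noisy_examples(rng, (3.0, 3.0), "class_b", 200))

centroids = train_centroids(train)
correct = sum(predict(centroids, point) == label for point, label in test)
accuracy = correct / len(test)

print(f"accuracy on previously unseen examples: {accuracy:.2f}")
```

Although no single training example is reliable on its own, averaging many of them yields a model that predicts well on examples it has never seen, provided they come from the same distribution as the training data.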