Marta Milo and Neil Lawrence in Reggio di Calabria at CD-MAKE 2017
CD-MAKE 2017, held in the context of the ARES conference series, was a great success in beautiful Reggio di Calabria.
https://bmcmedinformdecismak.biomedcentral.com/articles/collections/odds
Note: Excellent submissions to the IFIP Cross-Domain Conference on Machine Learning and Knowledge Extraction (CD-MAKE) (submissions due May 15, 2017) that are relevant to the topics described below will be invited to expand their work into this thematic series:
The use of open data for discovery science has gained much attention recently as its full potential is unfolding and being explored in projects spanning all areas of healthcare research. A plethora of data sets are now available thanks to drives to make data universally accessible and usable for discovery science. However, with these advances come inherent challenges with the processing and management of ever expanding data sources. The computational and informatics tools and methods currently used in most investigational settings are often labor intensive and rely upon technologies that have not been designed to scale and support reasoning across multi-dimensional data resources. In addition, there are many challenges associated with the storage and responsible use of open data, particularly medical data, such as privacy, data protection, safety, information security and fair use of the data. There are therefore significant demands from the research community for the development of data management and analytic tools supporting heterogeneous analytic workflows and open data sources. Effective anonymisation tools are also of paramount importance to protect data security whilst preserving the usability of the data.
The purpose of this thematic series is to bring together articles reporting advances in the use of open data including the following:
Submission is open to everyone, and all submitted manuscripts will be peer-reviewed through the standard BMC Medical Informatics and Decision Making review process. Manuscripts should be formatted according to the submission guidelines and submitted via the online submission system. Please indicate clearly in the covering letter that the manuscript is to be considered for the ‘Open data for discovery science’ collection. The deadline for submissions will be 31 July 2017.
For further information, please email the editors of the thematic series:
Andreas HOLZINGER a.holzinger@human-centered.ai,
Philip PAYNE prpayne@wustl.edu, or the BMC in-house editor
Emma COOKSON at emma.cookson@biomedcentral.com
Link to the IFIP Cross-Domain Conference on Machine Learning and Knowledge Extraction (CD-MAKE):
https://cd-make.net
Machine learning is the fastest growing field in computer science, and Health Informatics is amongst the greatest application challenges, providing benefits in improved medical diagnoses, disease analyses, and pharmaceutical development – towards future precision medicine.
Talk announcement: Friday, 12th May, 2017, 10:00, Seminarraum 137, Parterre, Inffeldgasse 16c
by Igor Jurisica, University of Toronto and Princess Margaret Cancer Center Toronto
Abstract: Fathoming cancer and other complex disease development processes requires systematically integrating diverse types of information, including multiple high-throughput datasets and diverse annotations. This comprehensive and integrative analysis will lead to data-driven precision medicine, and in turn will help us to develop new hypotheses and answer complex questions such as what factors cause disease; which patients are at high risk; will patients respond to a given treatment; how to rationally select a combination therapy for an individual patient, etc.
Thousands of potentially important proteins remain poorly characterized. Computational biology methods, including machine learning, knowledge extraction, data mining and visualization, can help to fill this gap with accurate predictions, making disease modeling more comprehensive. Intertwining computational prediction and modeling with biological experiments will lead to more useful findings faster and more economically.
Short Bio: Igor Jurisica is Tier I Canada Research Chair in Integrative Cancer Informatics, Senior Scientist at Princess Margaret Cancer Centre, Professor at University of Toronto and Visiting Scientist at IBM CAS. He is also an Adjunct Professor at the School of Computing, Pathology and Molecular Medicine at Queen’s University, Computer Science at York University, scientist at the Institute of Neuroimmunology, Slovak Academy of Sciences and an Honorary Professor at Shanghai Jiao Tong University in China. Since 2015, he has also served as Chief Scientist at the Creative Destruction Lab, Rotman School of Management. Igor has published extensively on data mining, visualization and cancer informatics, including multiple papers in Science, Nature, Nature Medicine, Nature Methods, Journal of Clinical Oncology, and received over 9,960 citations since 2012. He has been included in Thomson Reuters 2016, 2015 & 2014 list of Highly Cited Researchers, and The World’s Most Influential Scientific Minds: 2015 & 2014 Reports.
Jurisica Lab, IBM Life Sciences Discovery Center:
Canada Tier I Research Chair: https://www.chairs-chaires.gc.ca/chairholders-titulaires/profile-eng.aspx?profileId=2347
On Nutrigenomics [1]: https://www.uhn.ca/corporate/News/Pages/Igor_Jurisica_talks_nutrigenomics.aspx
[1] Nutrigenomics tries to define the causality or relationship between specific nutrients or nutrient regimes (diets) and human health. The underlying idea is personalized nutrition based on the *omics background, which may help to foster personal dietary recommendations. Ultimately, nutrigenomics will allow effective dietary-intervention strategies to recover normal homeostasis and to prevent diet-related diseases, see: Muller, M. & Kersten, S. 2003. Nutrigenomics: goals and strategies. Nature Reviews Genetics, 4, (4), 315-322.
https://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=61244&copyownerid=17803
Call for Papers: submissions due May, 15, 2017
International IFIP Cross Domain Conference for Machine Learning & Knowledge Extraction CD-MAKE
in Reggio di Calabria (Italy) August 29 – September 1, 2017
CD stands for Cross-Domain and means the integration and appraisal of different fields and application domains (e.g. Health, Industry 4.0, etc.) to provide an atmosphere that fosters different perspectives and opinions. The conference is dedicated to offering an international platform for novel ideas and a fresh look at the methodologies for putting crazy ideas into business for the benefit of humans. Serendipity is a desired effect and should cross-fertilize methodologies and the transfer of algorithmic developments.
MAKE stands for MAchine Learning & Knowledge Extraction.
CD-MAKE is a joint effort of IFIP TC 5, IFIP WG 8.4, IFIP WG 8.9 and IFIP WG 12.9 and is held in conjunction with the International Conference on Availability, Reliability and Security (ARES).
Keynote Speakers are Neil D. LAWRENCE (Amazon) and Marta MILO (University of Sheffield).
IFIP is the International Federation for Information Processing, the leading multi-national, non-governmental, apolitical organization in Information & Communications Technologies and Computer Sciences. It is recognized by the United Nations and was established in 1960 under the auspices of UNESCO as an outcome of the first World Computer Congress, held in Paris in 1959.
Papers are sought from the following seven topical areas (see image below). Papers which deal with fundamental questions and theoretical aspects in machine learning are very welcome.
❶ Data science (data fusion, preprocessing, data mapping, knowledge representation),
❷ Machine learning (both automatic ML and interactive ML with the human-in-the-loop),
❸ Graphs/network science (i.e. graph-based data mining),
❹ Topological data analysis (i.e. topology data mining),
❺ Time/entropy (i.e. entropy-based data mining),
❻ Data visualization (i.e. visual analytics), and last but not least
❼ Privacy, data protection, safety and security (i.e. privacy aware machine learning).
Proposals for Workshops, Special Sessions, Tutorials: April, 19, 2017
Submission Deadline: May, 15, 2017
Author Notification: June, 14, 2017
Camera Ready Deadline: July, 07, 2017
Special Session on September, 1, 2017, organized by Andreas HOLZINGER, Peter KIESEBERG, Edgar WEIPPL and A Min TJOA in the context of the 12th International Conference on Availability, Reliability and Security (ARES and CD-ARES), Reggio di Calabria, Italy, August 29 – September, 2, 2017
supported by the International Federation of Information Processing IFIP > TC5 and WG 8.4 and WG 8.9
https://cd-ares-conference.eu
https://www.ares-conference.eu
Keynote Talk by Neil D. LAWRENCE, University of Sheffield and Amazon
With the new European data protection and privacy regulations coming into effect on January, 1, 2018, issues that have so far been nice to have are becoming a must have. Privacy aware machine learning will be one of the most important fields for the European research community and the IT business in particular. Most affected is the whole area of biology, medicine and health, particularly driven by the fact that the health sciences are becoming more and more data intensive.
This special session will bring together scientists with diverse background, interested in both the underlying theoretical principles as well as the application of such methods for practical use in the biomedical, life sciences and health care domain. The cross-domain integration and appraisal of different fields will provide an atmosphere to foster different perspectives and opinions; it will offer a platform for novel crazy ideas and a fresh look on the methodologies to put these ideas into business.
All papers will be peer-reviewed by three members of the international PAML committee. The paper acceptance rate of the last session was 35 %. Accepted papers will be published in a Springer Lecture Notes in Computer Science (LNCS) volume, and excellent contributions will be invited to be extended for a special issue of a journal (planned: Springer MACH and/or BMC MIDM).
Research topics covered by this special session include but are not limited to the following topics:
– Production of Open Data Sets
– Synthetic data sets for learning algorithm testing
– Privacy preserving machine learning, data mining and knowledge discovery
– Data leak detection
– Data citation
– Differential privacy
– Anonymization and pseudonymization
– Securing expert-in-the-loop machine learning systems
– Evaluation and benchmarking
This picture was taken by our local host, Francesco Buccafurri, on January, 3, 2017: from the conference venue you have a direct view of the Etna volcano:
We are organizing a special session on Privacy Aware Machine Learning for Health Data Science at the 11th international Conference on Availability, Reliability and Security (ARES and CD-ARES), Salzburg, Austria, August 29 – September, 2, 2016
supported by the International Federation of Information Processing IFIP > TC5 and WG 8.4 and WG 8.9
https://cd-ares-conference.eu
https://www.ares-conference.eu
Keynote Talk by Bernhard SCHÖLKOPF, Max Planck Institute for Intelligent Systems, Empirical Inference Department
Machine learning is the fastest growing field in computer science [Jordan, M. I. & Mitchell, T. M. 2015. Machine learning: Trends, perspectives, and prospects. Science, 349, (6245), 255-260], and it is well accepted that health informatics is amongst the greatest challenges [LeCun, Y., Bengio, Y. & Hinton, G. 2015. Deep learning. Nature, 521, (7553), 436-444], e.g. large-scale aggregate analyses of anonymized data can yield valuable insights addressing public health challenges and provide new avenues for scientific discovery [Horvitz, E. & Mulligan, D. 2015. Data, privacy, and the greater good. Science, 349, (6245), 253-255]. Privacy is becoming a major concern for machine learning tasks, which often operate on personal and sensitive data. Consequently, privacy, data protection, safety, information security and fair use of data are of utmost importance for health data science.
The amount of patient-related data produced in today’s clinical setting poses many challenges with respect to collection, storage and responsible use. For example, in research and public health care analysis, data must be anonymized before transfer, for which the k-anonymity measure was introduced and successively enhanced by further criteria. As achieving optimal k-anonymity is an NP-hard problem, which cannot be solved by automatic machine learning (aML) approaches, we must often make use of approximation and heuristics. As data security is not guaranteed given a certain k-anonymity degree, additional measures have been introduced in order to refine results (l-diversity, t-closeness, delta-presence). This motivates methods, methodologies and algorithmic machine learning approaches to tackle the problem. As the resulting data set will be a tradeoff between utility, usability and individual privacy and security, we need to optimize those measures to individual (subjective) standards. Moreover, the efficacy of an algorithm strongly depends on the background knowledge of a potential attacker as well as the underlying problem domain. One possible solution is to make use of interactive machine learning (iML) approaches and put a human-in-the-loop where the central question remains open: “could human intelligence lead to general heuristics we can use to improve heuristics?”
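To make the two measures concrete, the following is a minimal Python sketch of how the k-anonymity degree and a simple (distinct-value) l-diversity degree of an already generalised table could be checked. The records, the column names and the helper functions are hypothetical and only serve as illustration; they are not taken from any particular anonymisation tool.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group rows that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return groups


def k_anonymity(rows, quasi_identifiers):
    """Return k: the size of the smallest equivalence class."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return min(len(g) for g in groups.values())


def l_diversity(rows, quasi_identifiers, sensitive):
    """Return l: the smallest number of distinct sensitive values in any class."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return min(len({r[sensitive] for r in g}) for g in groups.values())


# Hypothetical records whose quasi-identifiers were already generalised.
records = [
    {"age": "30-39", "zip": "80**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "80**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "80**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "81**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "81**", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ["age", "zip"]

k = k_anonymity(records, QUASI_IDENTIFIERS)
l = l_diversity(records, QUASI_IDENTIFIERS, "diagnosis")
print(f"k = {k}, l = {l}")
```

In this toy table every combination of quasi-identifiers occurs at least twice, yet everyone in the second group shares the same diagnosis, so an attacker who can place a person in that group learns the sensitive value. This is the kind of gap that l-diversity and the related criteria are meant to expose.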
Research topics covered by this special session include but are not limited to the following topics:
– Production of Open Data Sets
– Synthetic data sets for learning algorithm testing
– Privacy preserving machine learning, data mining and knowledge discovery
– Data leak detection
– Data citation
– Differential privacy
– Anonymization and pseudonymization
– Securing expert-in-the-loop machine learning systems
– Evaluation and benchmarking
This special session will bring together scientists with diverse background, interested in both the underlying theoretical principles as well as the application of such methods for practical use in the biomedical, life sciences and health care domain. The cross-domain integration and appraisal of different fields will provide an atmosphere to foster different perspectives and opinions; it will offer a platform for novel crazy ideas and a fresh look on the methodologies to put these ideas into business.
Accepted Papers will be published in a Springer Lecture Notes in Computer Science LNCS Volume.
Schedule:
I) Deadline for submissions: April, 30, 2016
Paper submission via:
https://cd-ares-conference.eu/?page_id=43
II) Camera Ready deadline: July, 4, 2016
III) Special Session: August, 30, 2016
> Conference Venue
> Travel Information Salzburg
> Lonely Planet Salzburg
The International Scientific Committee, consisting of experts from the international expert network HCI-KDD dealing with area (7), privacy, data protection, safety and security, and additionally invited international experts, will ensure the highest possible scientific quality. Each paper will be reviewed by at least three reviewers (the paper acceptance rate of the last special session was 35 %).
Date: Tuesday, 26th January 2016, Start: 10:00, End: 17:00; Venue: Graz University of Technology,
Institute of Computer Graphics and Knowledge Visualization CGV, hosted by Prof. Tobias SCHRECK
Address: Inffeldgasse 16c, A-8010 Graz <maps and directions>
Machine learning is the fastest growing field in computer science [Jordan, M. I. & Mitchell, T. M. 2015. Machine learning: Trends, perspectives, and prospects. Science, 349, (6245), 255-260], and it is well accepted that health informatics is amongst the greatest challenges [LeCun, Y., Bengio, Y. & Hinton, G. 2015. Deep learning. Nature, 521, (7553), 436-444].
Successful Machine Learning for Health Informatics requires a comprehensive understanding of the data ecosystem and a multi-disciplinary skill-set from seven specializations: 1) data science, 2) algorithms, 3) network science, 4) graphs/topology, 5) time/entropy, 6) data visualization and visual analytics, and 7) privacy, data protection, safety and security – as supported by the international expert network HCI-KDD.
Program see: https://human-centered.ai/machine-learning-for-biomedicine-tugraz/
Machine learning is a large and rapidly developing subfield of computer science that evolved from artificial intelligence (AI) and is tightly connected with data mining and knowledge discovery. The ultimate goal of machine learning is to design and develop algorithms which can learn from data. Consequently, machine learning systems learn and improve with experience over time and their trained models can be used to predict outcomes of questions based on previously seen knowledge. In fact, the process of learning intelligent behaviour from noisy examples is one of the major questions in the field. The ability to learn from noisy, high dimensional data is highly relevant for many applications in the health informatics domain. This is due to the inherent nature of biomedical data, and health will increasingly be the focus of machine learning research in the near future.
https://human-centered.ai/machine-learning-for-health-informatics/
Title: Using Deep Learning for Discovering Knowledge from Images: Pitfalls and Best Practices
Lecturer: Marcus BLOICE <expertise>
Abstract: Neural networks have been shown to be adept at image analysis and image classification. Deep layered neural networks especially so. However, deep learning requires two things in order to work proficiently: large amounts of data and lots of processing power. In this talk both aspects are covered, allowing you to maximise the potential of deep learning. Firstly, we will learn how the computational power of GPUs can be used to speed up learning by orders of magnitude, making it possible to learn from very large datasets on commodity hardware. Thanks to software such as Theano, Caffe, and Pylearn2, the GPU can be leveraged without needing to be an expert in parallel programming. This talk will discuss how. Secondly, data preprocessing, data augmentation, and artificial data generation are discussed. These methods allow you to ensure you are making the most of the data you possess, by expanding your dataset and preparing your data properly before analysis. This means discussing best practices in data preparation, using methods such as histogram equalisation, contrast stretching or normalisation, and discussing artificial data generation in detail. The tools you require to do so are described, using multi-platform software that is freely available. Finally, the talk will touch on hyper-parameters and the best practices and pitfalls of hyper-parameter choice when training deep neural networks.
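As a small illustration of two of the preprocessing methods named in the abstract, the sketch below applies contrast stretching and histogram equalisation to a tiny greyscale "image" given as a flat list of 8-bit intensities. It is plain Python written for readability under these assumptions, not the material used in the talk; in practice an image library would normally be used, and the function names and sample values here are hypothetical.

```python
def contrast_stretch(pixels, lo=0, hi=255):
    """Linearly rescale intensities so the minimum maps to lo and the maximum to hi."""
    p_min, p_max = min(pixels), max(pixels)
    if p_max == p_min:
        return [lo] * len(pixels)  # constant image: nothing to stretch
    scale = (hi - lo) / (p_max - p_min)
    return [round(lo + (p - p_min) * scale) for p in pixels]


def equalise_histogram(pixels, levels=256):
    """Map each intensity through the normalised cumulative histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of intensities.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        return list(pixels)  # constant image: leave unchanged
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1)) for p in pixels]


# Hypothetical low-contrast image: all values lie between 100 and 140.
image = [100, 100, 110, 120, 120, 120, 130, 140]
stretched = contrast_stretch(image)
equalised = equalise_histogram(image)
print(stretched)
print(equalised)
```

Both transforms spread the narrow input range over the full 0 to 255 scale. Contrast stretching does so linearly, while equalisation additionally takes into account how often each intensity occurs.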
Title: Pitfalls for applying Machine Learning in HCI-KDD: Things to be aware of and how to avoid them
Lecturer: Christof STOCKER <expertise>
Abstract: When dealing with big and unstructured data sets, we often try to be creative and to experiment with a number of different approaches for the purpose of knowledge discovery. This can lead to new insights and even spark novel ideas. However, ignorant application of algorithms to unknown data is dangerous and can lead to false conclusions – with high statistical significance. In finite data sets, structure can emerge from sheer randomness. Furthermore, hidden variables can lead to significant correlations that in turn might result in wrong conclusions. Beyond this, data science as a discipline has developed into a complex area in which mistakes can occur with ease and even lead experienced scientists astray. In this talk we will investigate these pitfalls together on simple examples and discuss how we can address these concerns with manageable effort.
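The point that structure can emerge from sheer randomness can be tried out with a short simulation. The sketch below is an illustration under stated assumptions, not an example from the talk: it draws a purely random target and 1000 purely random features for 20 samples and counts how many features nevertheless correlate "significantly" with the target. The sample sizes, the seed and the threshold (|r| > 0.444, roughly the two-sided 5 % critical value for 20 samples) are choices made for this example only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples, n_features = 20, 1000

# Target and features are independent noise: no real relationship exists.
target = [rng.gauss(0, 1) for _ in range(n_samples)]
features = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_features)]

correlations = [pearson(f, target) for f in features]
threshold = 0.444  # approx. critical |r| for p < 0.05 (two-sided) with 20 samples
hits = [r for r in correlations if abs(r) > threshold]
strongest = max(correlations, key=abs)

print(f"{len(hits)} of {n_features} random features pass the threshold")
print(f"strongest correlation found: {strongest:.2f}")
```

With a 5 % threshold, about 5 % of unrelated features are expected to pass by chance alone, so testing many candidates without correcting for multiple comparisons will reliably produce apparent findings.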
Title: Topological analysis of text data.
Lecturer: Hubert WAGNER <expertise>
Abstract: In this talk an ongoing effort will be described to apply persistent homology in the area of text data mining. Persistent homology is the main tool of topological data analysis. In essence, it allows one to robustly describe the shape of a data set and to compare the shapes of different data sets.
First, persistent homology will be explained, emphasizing its intuitive side.
Then, it will be demonstrated how persistent homology can be applied in the context of analyzing sets of text documents. Using the vector space model interpretation, each document becomes a point in a high-dimensional space, and it is intuitive to ask about the shape of such a point cloud. It will be discussed how this information can be used for knowledge discovery. Finally, an algorithmic aspect is emphasized, which is crucial if industrial applications are to be tackled.
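The simplest case of this idea can be sketched in a few lines of Python. The example below is a minimal illustration and not the speaker's method: it turns four hypothetical documents into term-count vectors and computes only 0-dimensional persistence (connected components) of the resulting point cloud with a union-find structure. Cosine dissimilarity is an assumption chosen for this sketch, and higher-dimensional features such as loops would require a dedicated library.

```python
import math
from collections import Counter
from itertools import combinations


def vectorise(doc, vocabulary):
    """Vector space model: one term count per vocabulary word."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocabulary]


def cosine_distance(u, v):
    """1 - cosine similarity (assumes non-empty documents)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def zero_dim_persistence(points, dist):
    """Death scales of connected components as the distance threshold grows.

    Every point is born as its own component at scale 0; whenever an edge
    joins two different components, one of them dies at that edge length.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(d)
    return deaths  # one component never dies and is not listed


# Hypothetical mini-corpus with two obvious topics.
docs = [
    "tumour gene expression",
    "gene expression tumour profile",
    "privacy data protection",
    "data privacy protection law",
]
vocabulary = sorted({word for d in docs for word in d.lower().split()})
points = [vectorise(d, vocabulary) for d in docs]
deaths = zero_dim_persistence(points, cosine_distance)
print([round(d, 3) for d in deaths])
```

Two components die at a small scale, when the documents within each topic merge, and one dies only at a much larger scale, when the two topics are finally joined. That long-lived component is the kind of robust shape information the abstract refers to.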
Biography: Hubert Wagner is a computer scientist, currently working as a Postdoc at the Institute of Science and Technology Austria (IST-Austria) at the Edelsbrunner Group. Having worked as a software engineer, he moved towards science and obtained a PhD degree in 2014 from the Jagiellonian University in Krakow, Poland. Hubert is interested in the application of computational geometry and topology and related algorithmic questions. He is convinced that tools such as persistent homology may offer novel and robust solutions to many problems he encountered as an engineer, including e.g. problems in text mining. This line of his research was supported by a Google Research Grant from 2011 to 2012 (with Prof. Marian Mrozek and Dr. Paweł Dłotko) and is now continued within the Topological Complex Systems (TOPOSYS) grant. Efficient algorithms and their implementations are an important part of his work.
More Information: https://publist.ist.ac.at/ist/people/180-Hubert_Wagner/works