Plenary Sessions

Keynotes

Materials Science Data Management Initiatives at NIST

Dr. Robert J. Hanisch, NIST

Abstract

NIST's Office of Data and Informatics (ODI, http://www.nist.gov/mml/odi/) is a premier, pioneering resource for researchers and institutions in the biological, chemical, and materials sciences. The ODI supports national needs such as the Materials Genome Initiative (MGI) and biological and chemical data integration, as well as the modernization of current NIST reference data services for use in state-of-the-art computing paradigms and the development of next-generation NIST reference data services. The ODI also supports the Material Measurement Laboratory (MML, http://www.nist.gov/mml/organization.cfm), the major operating unit within NIST focused on measurement research, standards, and data in the chemical, biological, and materials sciences.

The keynote will show how, as a service-oriented organization, the ODI adds value to data activities (including Big Data and data.gov) by providing guidance, assistance, and resources for optimizing the discoverability, usability, and interoperability of data products in ways that support NIST scientists and stakeholders. In addition, by fostering collaboration and coordination among MML domain experts and other data specialists at NIST, the ODI supports MML research programs where advanced manipulation, visualization, and analysis of large data sets are needed to advance knowledge. The keynote will include examples showing how the joint work of ODI and MML in data analysis and management is organized within NIST research programs.

The keynote will also show progress in a number of national and international initiatives supported by ODI in the area of materials science:

  • A shared data repository for collaborations related to the Materials Genome Initiative, deployed on the DSpace data management platform.
  • The US National Data Services Consortium and the Materials Data Facility MGI related pilot project (I. Foster, U. Chicago, PI).
  • The Research Data Alliance Materials Data, Infrastructure, and Interoperability Interest Group, from which a Working Group is being formed to develop an international materials science data resource registry.
  • The Materials Data Curation System, a tool that supports the detailed annotation of datasets, which ODI is developing with the NIST Information Technology Laboratory.

Dr. Robert J. Hanisch Bio

Dr. Robert J. Hanisch has been the director of NIST's Office of Data and Informatics (ODI) within the Material Measurement Laboratory since July 2014. The role of ODI is to provide data management support and guidance for the one thousand researchers in the Laboratory, to manage NIST’s Standard Reference Data program, and to be a resource for NIST staff in the areas of informatics and analytics. Dr. Hanisch was previously a Senior Scientist at the Space Telescope Science Institute (STScI), Baltimore, Maryland, and the Director of the U.S. Virtual Astronomical Observatory, a program funded by the National Science Foundation and the National Aeronautics and Space Administration. Dr. Hanisch led many efforts in the astronomy community in the area of information systems and services, focusing particularly on efforts to improve the accessibility and interoperability of data archives and catalogs. He was the first chair of the International Virtual Observatory Alliance Executive Committee (2002-2003). From 2000 to 2002 he served as Chief Information Officer at STScI, overseeing all computing, networking, and information services for the Institute. Prior to that he had oversight responsibility for the Hubble Space Telescope Data Archive and led the effort to establish the Multimission Archive at Space Telescope—MAST—as the optical/UV archive center for NASA astrophysics missions. He completed his Ph.D. in Astronomy in 1981 at the University of Maryland, College Park.


The Big Mechanism Program: Changing How Science Is Done

Dr. Andrey Rzhetsky, University of Chicago

Abstract

DARPA is funding the Big Mechanism program (http://www.darpa.mil/program/big-mechanism) to study large, explanatory models of complicated systems in which interactions have important causal effects. The program’s aim is to develop technology that reads research abstracts and papers, extracts pieces of causal mechanisms, assembles these pieces into more complete causal models, and reasons over these models to produce explanations. The program’s domain is cancer biology, with an emphasis on signaling pathways; this is just one example of causal, explanatory models.

The program is currently organized into three consortia, all of which have different views of causal models, different reading technologies, and different use cases. The largest consortium, called FRIES, includes groups at CMU, SRI, University of Arizona, Oregon Health & Science University, and others. This consortium’s main focus is to explain signaling pathway behaviors. For instance, why is the expression of a gene ephemeral? Technologically, FRIES focuses on information extraction over deep reading, simulation, and even FPGA acceleration of systems biology simulators.

The second consortium (“UChicago”), in which the author of this keynote acts as the PI, is composed of researchers at the University of Chicago and the United Kingdom’s National Centre for Text Mining at the University of Manchester, along with participants from Brunel University in London, all of whom collaborate on developing robotic platforms for experiment design and analysis.

The third consortium, called CURE, consists of two groups from Harvard Medical School, IHMC in Florida, and SIFT. Their focus is on deep reading, fine-grained modeling, and simulation of cell signaling’s underlying biochemistry.

This talk will provide an overview of the objectives and results related mostly to the work of the second consortium, including issues surrounding the integration of machine reading of the cancer literature with: (1) probabilistic reasoning, using custom-designed ontologies, across cancer claims culled from the literature; (2) computational modeling of cancer mechanisms and pathways to automatically predict therapeutic clues; (3) automated hypothesis generation to strategically extend this knowledge; and (4) a ‘Robot Scientist’ that performs experiments to test hypotheses probabilistically and then feeds those results back to the system.

A comparison with other cognitive technologies presently used in cancer biology (e.g., those provided by IBM) will also be provided. How this research is changing science will also be discussed.

Dr. Andrey Rzhetsky Bio

Andrey Rzhetsky is the Edna K. Papazian Professor of Medicine and Human Genetics at the University of Chicago. He is also a Pritzker Scholar and a Senior Fellow of both the Computation Institute and the Institute for Genomics and Systems Biology at the University of Chicago. He graduated from Novosibirsk State University with a specialization in mathematical biology, and received his Ph.D. in the same area from the Institute of Cytology and Genetics of the USSR Academy of Sciences.

His research is focused on computational analysis of complex human phenotypes in the context of changes and perturbations of underlying molecular networks. The input data for these studies is supplied by large-scale mining of free text, computation over clinical records, and high-throughput systems biology experiments. In 2011, Rzhetsky was awarded a $13.7 million grant from the National Institutes of Health titled “Conte Center for Computational Systems Genomics of Neuropsychiatric Phenotypes.” This five-year project involves investigators from seven institutions across the United States and Israel who will study new computational methodology for the analysis of multiple complex mental health disorders, such as autism and schizophrenia. Another large project Rzhetsky is leading is the DARPA-funded Big Mechanism “UChicago” consortium ($4.5 million).

Overview of the European Strategy in Research Infrastructures

Dr. Dimitrios Tzovaras, CERTH/ITI

Abstract

The European Strategy Forum on Research Infrastructures (ESFRI) has recently presented its updated 2016 Roadmap, which demonstrates the dynamism of the European scientific community and the commitment of Member States to develop new research infrastructures (RI) at the European level. Horizon 2020, the biggest EU Research and Innovation Programme, has as its main objective to ensure the implementation and operation of the ESFRI and other world-class research infrastructures, including the development of regional partner facilities; integration of and access to national research infrastructures; and the development, deployment, and operation of e-infrastructures.

The keynote will start with an overview of the European strategy for RI, with a special emphasis on e-Infrastructures, as defined in the ESFRI Strategy report published in March 2016. An analysis will follow of the impact of research infrastructures on structuring the European Research Area and the global research scene, and of their overall contribution to European competitiveness. The ESFRI Projects and ESFRI Landmarks will then be analyzed, focusing on data-intensive RI and e-Infrastructure projects. A Landscape Analysis will also be presented that provides the current context, in each domain, of the operational national and international research infrastructures open to European scientists and technology developers through peer review of competitive science proposals. The e-infrastructures landscape, transversal to all domains, will also be elaborated as approached by the e-Infrastructure Reflection Group (e-IRG). Finally, the major pan-European horizontal data-intensive e-infrastructure initiatives under FP7 and H2020 will be briefly introduced, with some examples of the services they provide.

The keynote will also focus on recent initiatives and activities supporting the e-infrastructure activities in Horizon 2020:

  • The European Open Science Cloud initiative activities, towards facilitating integration in the area of European e-Infrastructures and connected services between the member states, at the European level, and internationally.
  • Activities of the e-Infrastructure Reflection Group (e-IRG), focusing on presenting the e-IRG Roadmap 2016, to be published in summer 2016, which will give guidance and recommendations for policy and technical discussions on the main European Open Science Cloud topics.
  • Activities involving EU-Russian Federation cooperation in the area of RI in FP7 and H2020. The major Russian Federation research infrastructure initiatives will be presented, along with examples of the services they provide. Opportunities for enhancing collaboration with the EU in the area of data-intensive e-Infrastructures will be examined and analyzed.

Dr. Dimitrios Tzovaras Bio

Dr. Dimitrios Tzovaras is a Senior Researcher Grade A’ (Professor) and Director at CERTH/ITI (the Information Technologies Institute of the Centre for Research and Technology Hellas). He received the Diploma in Electrical Engineering and the Ph.D. in 2D and 3D Image Compression from the Aristotle University of Thessaloniki, Greece, in 1992 and 1997, respectively. Prior to his current position, he was a Senior Researcher in the Information Processing Laboratory at the Electrical and Computer Engineering Department of the Aristotle University of Thessaloniki. His main research interests include network and visual analytics for network security, computer security, data fusion, biometric security, virtual reality, machine learning, and artificial intelligence. He is the author or co-author of over 110 articles in refereed journals and over 300 papers in international conferences.

Since 2004, he has been an Associate Editor of the following international journals: Journal of Applied Signal Processing (JASP) and Journal on Advances in Multimedia of EURASIP. Additionally, he has been an Associate Editor (since 2009) and a Senior Associate Editor (since 2012) of the IEEE Signal Processing Letters journal, and since mid-2012 he has also been an Associate Editor of the IEEE Transactions on Image Processing journal. Over the same period, Dr. Tzovaras has acted as an ad hoc reviewer for a large number of international journals and magazines from publishers such as IEEE, ACM, Elsevier, and EURASIP, as well as for international scientific conferences (ICIP, EUSIPCO, CVPR, etc.).

Since 1992, Dr. Tzovaras has been involved in more than 100 European projects, funded by the EC and the Greek Ministry of Research and Technology. Within these research projects, he has acted as the Scientific Responsible of the research group of CERTH/ITI, and in many of them also as the Coordinator and/or the Technical/Scientific Manager (coordinator or technical manager in 19 projects: 7 H2020, 1 FP7 ICT IP, 7 FP7 ICT STREP, 3 FP6 IST STREP, and 1 nationally funded project).

Invited Talks

Text Mining bridging the gap between knowledge and text

Dr. Sophia Ananiadou, NaCTeM, University of Manchester

Abstract

Text mining plays a key role in automatic semantic metadata extraction, driving the extraction of structured information from unstructured documents in the form of named entities or fine-grained and often complex relations between them (events). Event extraction techniques are used in biomedicine to extract structured representations for applications such as semantic search and pathway construction. It is increasingly important not only to extract events but also to capture their contextual interpretation, such as certainty, negation, and source. The talk will describe current methods for adaptive and configurable event extraction in different domains and applications: (a) biomedicine, for pathway construction; and (b) history and newswire, for semantic search. It will also discuss how we address the incompatibilities that arise between different subject domains, input/output formats, languages, and semantic representations. Finally, it will explain how such issues can be alleviated by Argo, NaCTeM's text mining infrastructure, which fosters interoperability of resources by giving its users access to an environment that supports the integration of diverse corpora, terminologies, and tools into unified solutions.

Dr. Sophia Ananiadou Bio

Sophia Ananiadou received her PhD in Natural Language Processing (NLP) from the University of Manchester. She is Professor of Computer Science in the School of Computer Science, University of Manchester, and has led the National Centre for Text Mining (NaCTeM) since 2004. Her main areas of research are semantic text mining and semantic search techniques for applications in domains such as medicine, systems biology, public health, chemistry, biodiversity, and history of medicine. She is also involved in developing large-scale resources (terminological resources, BioLexicon) and interoperable text mining platforms (OpenMinted). Her current and recent projects include semantic search for Europe PubMedCentral; supporting evidence-based systematic reviews in collaboration with NICE; supporting the development of biomarker tests in the Manchester Molecular Pathology Innovation Centre (MMPathIC); mining time-sensitive information from historical medical documents; and extracting complex claims for the development of networks, hypothesis generation, and experimental testing in Cancer Biology (Big Mechanism). She has received the IBM UIMA innovation award three times (2006-2008) in recognition of her contribution towards the development of interoperable platforms for text mining and was also awarded the Daiwa Adrian prize (2004) for her collaboration efforts in biomedical text mining with Japan. She has authored over 300 publications and her h-index is 44.


Important dates

Conference

  • Paper submission: 05.06.2016
  • Tutorial application submission: 31.05.2016
  • Notification of acceptance: 11.07.2016
  • Camera-ready papers: 05.08.2016

PhD Workshop

  • Paper submission: 10.06.2016
  • Notification of acceptance: 30.06.2016
  • Camera-ready papers: 05.08.2016

Satellite Events

  • Satellite Event submission: 01.04.2016
  • Notification of acceptance: 15.04.2016