Welcome to the Spring 2015 UK-QSAR Newsletter!
The Spring UK-QSAR meeting will be hosted by Lhasa at their offices in Leeds on 21 May. The meeting will have a strong toxicity modelling theme with seven excellent speakers, posters and a panel discussion. Lhasa will also host an evening meal the night before. UK-QSAR meetings are often oversubscribed so do register early to avoid disappointment. If you have registered already but can no longer attend please let us know so we can release your place. You can find the agenda online and some suggested pre-meeting reading material is available should you wish to explore some of the upcoming topics in more detail ahead of the meeting. Many thanks to Lhasa for hosting and sponsoring the meeting. You can find more information on Lhasa below.
In this edition we include an article from George Papadatos of the EBI at Hinxton about the use of the freely available KNIME data pipelining tool which is now being widely used throughout the academic and industrial communities to assist data processing/handling. You’ll also find the regular articles on Jobs and Upcoming Meetings.
Finally, another date for your diaries. Our Autumn Meeting will be hosted by Cresset at Duxford near Cambridge, on 6th October 2015.
As ever, please send any feedback or suggestions you have for future newsletters to Susan Boyd at firstname.lastname@example.org.
George Papadatos, ChEMBL group, EMBL-EBI, European Bioinformatics Institute, Hinxton, UK
Workflow systems have become an important everyday tool that allows both experimental and computational scientists to deal with the current data explosion. Such specialised workflow systems include commercial implementations, such as BioVia’s Pipeline Pilot, as well as freely available or Open Source ones, such as Taverna, Kepler, Galaxy, Orange and KNIME. The versatility of these tools makes them ideal for the inherently decentralised world of contemporary life science research, including drug discovery, systems biology and the various “-informatics” and “-omics” disciplines.
The main advantage of KNIME (KoNstanz Information MinEr) is that it provides a user-friendly and intuitive graphical user interface, which enables the user to generate and store complex workflows for data mining, analytics and decision making, with little or even no need for computer programming skills. Instead, the user can opt to select standardised nodes from a node repository and connect them together, in a process also known as “visual programming”. The application domains vary widely from standard data manipulation to sophisticated text, image and graph mining, along with machine learning, computational chemistry and chemoinformatics. In the context of industrial drug discovery, this makes KNIME an attractive platform for computational chemists, who use it to glue together diverse and disparate types of available resources, such as databases, flat files, web services, third party software applications and legacy command line tools, and deploy prototype workflows and tools to medicinal chemists. These end users can then easily modify, review and run these workflows on their own, thus making KNIME a “common ground” between the two disciplines; a currently well-established paradigm in several pharmaceutical companies.
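As a loose analogy only (this is plain Python, not KNIME's API), the idea of selecting standardised nodes and connecting them together can be sketched as a chain of small functions, each taking a table and returning a new one. All names here (`row_filter`, `column_adder`, `run_workflow`) and the sample data are hypothetical and purely illustrative.

```python
from functools import reduce


def row_filter(predicate):
    """Return a node that keeps only the rows for which predicate(row) is true."""
    def node(table):
        return [row for row in table if predicate(row)]
    return node


def column_adder(name, func):
    """Return a node that adds a computed column to a copy of every row."""
    def node(table):
        return [{**row, name: func(row)} for row in table]
    return node


def run_workflow(nodes, table):
    """Pass the table through each node in order, like following the connections."""
    return reduce(lambda data, node: node(data), nodes, table)


# Made-up input table: compound identifiers with illustrative molecular weights.
compounds = [
    {"id": "cpd-1", "mol_weight": 180.2},
    {"id": "cpd-2", "mol_weight": 612.7},
    {"id": "cpd-3", "mol_weight": 342.3},
]

# The "workflow" is simply an ordered list of configured nodes.
workflow = [
    row_filter(lambda row: row["mol_weight"] <= 500),
    column_adder(
        "weight_class",
        lambda row: "light" if row["mol_weight"] < 250 else "medium",
    ),
]

result = run_workflow(workflow, compounds)
for row in result:
    print(row["id"], row["weight_class"])
```

Because each node returns a new table rather than modifying its input, nodes can be reordered, swapped or reused in other workflows, which is roughly the property that makes shared node collections practical.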
KNIME also serves as a regulated framework in which academic, industrial and not-for-profit groups can develop their tools and algorithms and share them freely with the scientific community as node collection contributions. Novartis (RDKit), Eli Lilly (Erl Wood KNIME Posts), Vernalis (Vernalis KNIME nodes), Max Planck Institute (MPI tools), OpenPHACTS (OpenPHACTS KNIME Resource) and EMBL-EBI (EBI KNIME Extensions) are examples of popular community contributions in the life sciences domain. At the same time, commercial software vendors often wrap their algorithms as KNIME nodes and license them to their customers. Having standardised nodes together with easily shared and transparent workflows on a freely available platform evidently boosts scientific collaboration and reproducibility (see here for an excellent example).
Are there any disadvantages? Of course! KNIME has a rather steep learning curve and beginners usually need to spend some time with it before they can appreciate how it all works. Furthermore, nodes are often treated as “black boxes” and thus users can be disconnected from the underlying algorithms, which may lead to misuse of some methodology. Despite the very large number of nodes available, there are times when the functionality that you are looking for, e.g. manipulation and standardisation of chemical structures, is simply not there. In such cases, one may have to develop proprietary nodes. However, as KNIME integrates nicely with popular scripting languages such as Perl, Python and R, developing new functionality is not that difficult.
At the ChEMBL group, we provide several examples of workflows that combine the open data analysis capabilities of KNIME with our open data, resources and tools, such as the ChEMBL web services, UniChem and myChEMBL. Please get in touch if you’d like more information on these. We also run a drug discovery course every year, which includes a hands-on workshop on KNIME.
Compiled by Thierry Hanser of Lhasa UK Ltd with contributions from the speakers listed below. The Committee would like to extend their thanks to everyone for their input.
Combining an expert system with machine learning to rank metabolites.
Edward Rosser (Lhasa Limited)
- Testa, B.; Balmat, A. L.; Long, A.; Judson, P. Predicting drug metabolism – an evaluation of the expert system METEOR. Chem. Biodivers. 2005, 2(7), 872-885.
- Button, W. G.; Judson, P. N.; Long, A.; Vessey, J. D. Using absolute and relative reasoning in the prediction of the potential metabolism of xenobiotics. J. Chem. Inf. Comput. Sci. 2003, 43(5), 1371-1377.
Using QSAR for metabolic network completion and metabolic engineering.
Jean-Loup Faulon (Institute of Systems & Synthetic Biology)
- Carbonell P, Parutto P, Herisson J, Pandit S.B, Faulon J.L. XTMS: pathway design in an eXTended metabolic space. Nucleic Acids Res. in press 2014, [PMID: 24792156].
- Faulon J.L., Misra M., Martin S., Sale K., Sapra R. Genome Scale Enzyme-metabolites and Drug-Target interaction predictions using the signature molecular descriptor, Bioinformatics, 24, 225-233, 2008.
- Fernández-Castané A, Fehér T, Carbonell P, Pauthenier C, Faulon J.L. Computer-aided design for metabolic engineering. J Biotechnol. in press 2014, [PMID: 24704607].
Drug Metabolism Prediction: Experiment and/or Computation?
Johannes Kirchmair (Center for Bioinformatics, University of Hamburg)
- Johannes Kirchmair, Andreas H. Göller, Dieter Lang, Jens Kunze, Bernard Testa, Ian D. Wilson, Robert C. Glen & Gisbert Schneider. Predicting drug metabolism: experiment and/or computation? Nature Reviews Drug Discovery (2015) doi:10.1038/nrd4581 Published online 24 April 2015
Use of Read-Across in Filling Data Gap for Assessments.
Nora Aptula (Safety and Environmental Assurance Centre)
Identification of novel anti-convulsant drugs using a larval zebrafish model
Simon Hand (University of Sheffield)
- Baxendale, C. J. Holdsworth, P. L. Meza Santoscoy, M. R. M. Harrison, J. Fox, C. A. Parkin, P. W. Ingham, and V. T. Cunliffe, Disease Models & Mechanisms, 2012, 01009, 1–12.
- C. Baraban, M. R. Taylor, P. A. Castro, and H. Baier, Neuroscience, 2005, 131, 759–768
Towards an MIE atlas – Tools for toxicity prediction
Tim Allen (Centre for Molecular Informatics, Department of Chemistry, University of Cambridge)
- T. Ankley, R.S. Bennett, R.J. Erickson, D.J. Hoff et al. (2007) Environ. Toxicol. Chem. 29; 730-741
- T.E.H. Allen, J.M. Goodman, S. Gutsell, P. Russell. (2014) Chem. Res. Toxicol. 27 (12); 2100-2112
Thierry Hanser, Lhasa UK Ltd
Lhasa Limited is a not-for-profit, registered charity that promotes the use of computer aided reasoning in chemistry and the life sciences. We provide decision support tools for our members involved in the investigation of toxicology, metabolism and degradation.
A pioneer in the production of knowledge-based systems for forward-thinking scientists, Lhasa Limited continues to draw on over 30 years of experience to create user-friendly, state-of-the-art in silico prediction and database systems.
Lhasa Limited’s software solutions include Derek Nexus and Sarah Nexus for predicting toxicity, Vitic Nexus for managing chemical information, Meteor Nexus for predicting metabolic fate and Zeneth for predicting forced degradation pathways. Lhasa also works with the wider scientific community on EU funded collaborative projects to advance understanding of in silico toxicology and metabolism. Lhasa actively seeks strategic projects with members and others, typically providing secure systems for the management of proprietary data, as well as the scientific expertise to interpret the data for computer modelling.
Throughout our history, Lhasa Limited has provided a framework to enable members to contribute knowledge to the development and refinement of structure-activity relationships in their field of interest, without compromising the confidentiality of their proprietary data. Today, Lhasa Limited continues to enable organisations to pool their resources, both financial and intellectual, for the benefit of the entire membership and the public at large. Lhasa Limited involves the membership as much as possible in all aspects of development, running frequent research activities that bring members together for discussion sessions or invite them individually to offer their opinions and preferences on specific development issues. In this way, member needs and wants remain the driving force behind continued development.
Lhasa Limited was founded on the principle of collaboration and we strongly believe in ‘Shared Knowledge, Shared Progress’. Our not-for-profit, member-driven status is designed to facilitate collaborative working and pre-competitive data sharing between organisations, achieving the greatest possible advantage from data that may otherwise be inaccessible. We run collaborative projects with industry, academia and regulatory bodies to continually advance predictive science for the public benefit.
Examples of our collaborations include:
- Acting as a central contributor to eTOX by hosting and managing confidential data shared by the major pharmaceutical companies.
- Contributing to the EU MIP-Drug Induced Liver Injury project through curation and hosting of commercially sensitive data, drawing on recognised expertise in chemical databases and data sharing projects.
- Leading and supporting cross-company learning through pre-competitive data sharing, in areas including excipients, intermediates, aromatic amines and aryl boronic acids.
Such collaborations often involve the sharing of sensitive intellectual property, which some of the largest companies in the scientific industry have consistently trusted Lhasa Limited to hold and use, resulting in new knowledge that can be used by Lhasa’s entire membership.
Becoming a member of Lhasa Limited presents a valuable opportunity to enter into a dialogue with other member organisations sharing common goals towards mutual benefits. Members are recognised as part of an internationally respected collaboration supporting the global scientific community in its efforts to improve human health, protect the environment and reduce animal testing.
Head Offices: Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, United Kingdom
Web site: http://www.lhasalimited.org
Senior Chemoinformatics Scientist, UCB, Slough
The following meetings may be of interest to our readers:
SCI ADMET 2015, 13th May 2015, London
CCG European User Group Meeting, 12-15th May 2015, London
Cambridge Chemoinformatics Network Meeting, 27th May 2015, Cambridge, UK
Cresset European User Group Meeting, 18-19th June 2015, Cambridge, UK
18th SCI/RSC Medicinal Chemistry Symposium, 13-16th September 2015, Cambridge, UK
UK QSAR & Chemoinformatics Group Autumn 2015 Meeting, 6th October 2015, Duxford, Cambridge (Hosted by Cresset)