ChemicalToolBoX: Cheminformatics inside the Galaxy

October 2012 • Björn Grüning

"Enable accessible, reproducible, and transparent computational research."
Galaxy :: workflow management system
Galaxy :: main view
reproducibility is key
"Cheminformatics is the use of computer and informational techniques, applied to a range of problems in the field of chemistry. These in silico techniques are used in pharmaceutical companies in the process of drug discovery."
Wikipedia

  • paint your structure

  • upload molecule files

  • download from PubChem


SMILES: CN1CCC23C4C1CC5=C2C(=C(C=C5)O)OC3C(C=C4)O
InChI-Key: BQJCRHHNABKAKU-KBQPJGBKSA-N
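An InChIKey such as the one above has a fixed layout: a 14-letter block hashed from the connectivity, a 10-letter block (8 hash letters for the remaining layers, a standard/non-standard flag, a version letter), and a single protonation letter. A minimal sketch of splitting a key into these blocks, using only the Python standard library; the function name `split_inchikey` is hypothetical:

```python
import re

# Layout: 14 letters - 8 letters + flag + version - 1 letter
INCHIKEY_RE = re.compile(r"^([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])-([A-Z])$")

def split_inchikey(key):
    """Return the blocks of an InChIKey, or raise ValueError if the layout is wrong."""
    match = INCHIKEY_RE.match(key)
    if match is None:
        raise ValueError(f"not a valid InChIKey layout: {key!r}")
    skeleton, other_layers, flag, version, protonation = match.groups()
    return {
        "skeleton": skeleton,          # hash of the connectivity layer
        "other_layers": other_layers,  # hash of stereo, isotope, ... layers
        "standard": flag == "S",       # 'S' = standard, 'N' = non-standard
        "version": version,
        "protonation": protonation,
    }

parts = split_inchikey("BQJCRHHNABKAKU-KBQPJGBKSA-N")
print(parts["skeleton"], parts["standard"])
```

Comparing only the first block is one way to group compounds that share a skeleton but differ in stereochemistry.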
molecule datatypes are first-class citizens

  • SMILES

  • InChI

  • MOL/SDF

  • MOL2

  • CML

fingerprint generation

  • Daylight Fingerprint

  • PubChem Fingerprint

  • MACCS

  • :: insert your chemical fingerprint here ::
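Fingerprints like these are commonly compared with the Tanimoto coefficient. A minimal sketch, assuming each fingerprint is represented as a Python set of its on-bit indices (a simplification chosen for illustration, not the format any particular tool emits):

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)

# Illustrative bit sets, not real fingerprints
query = {1, 4, 9, 16, 25}
candidate = {1, 4, 9, 36}
print(tanimoto(query, candidate))  # 3 shared bits / 6 distinct bits = 0.5
```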

large scale
postgresql :: pgchem
filtering
  • predefined rules (drug-like, lead-like, Lipinski-compliant, ...)

  • physicochemical properties

  • Similarity Search

  • Substructure Search

  • Pharmacophore Search

  • Spectrophore Search
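As an illustration of a predefined rule, here is a minimal sketch of a Lipinski rule-of-five filter over precomputed properties. The `Properties` class, the function names, and the example values are hypothetical; allowing one violation by default is a common reading of the rule, not necessarily the setting used in the toolbox.

```python
from dataclasses import dataclass

@dataclass
class Properties:
    name: str
    mol_weight: float   # g/mol
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors

def lipinski_violations(p):
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ])

def lipinski_filter(compounds, max_violations=1):
    """Keep compounds with at most max_violations rule violations."""
    return [c for c in compounds if lipinski_violations(c) <= max_violations]

# Illustrative values only, not measured data
library = [
    Properties("small_polar", 285.3, 0.9, 2, 4),
    Properties("heavy_greasy", 720.0, 6.5, 3, 8),
    Properties("borderline", 510.0, 4.2, 1, 9),
]
kept = [c.name for c in lipinski_filter(library)]
print(kept)  # ['small_polar', 'borderline']
```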

the big picture
  • 32 different tools

  • powered by open source projects

    • Open Babel

    • RDKit

    • Galaxy

    • numpy, matplotlib ...

  • offer a community starting point

some use cases from our lab
  • building comprehensive and specialised compound libraries

  • screening libraries for new drug candidates

  • exploring the chemical universe

ChemicalBoX
"... is a comprehensive collection of small compounds, consisting of data from other repositories. The focus of ChemicalBoX is to offer a ready- and easy-to-use compound library that extends current freely available compilations."

  • reproducibility/transparency

  • completeness

  • free availability

  • automation

  • 39 million filtered unique compounds

ftp://pharmaceutical-bioinformatics.org/chemicalbox/

git clone http://chemicalbox.pharmaceutical-bioinformatics.org/chemicalbox.git

PurchasableBoX
"... is a comprehensive collection of small compounds,
consisting of data from various vendors."

  • 44.6 million filtered unique compounds

  • 43.6 million drug-like compounds (QED > 0.2)

  • 1.2 million cluster centers (similarity > 0.85)

QED: G. Richard Bickerton et al. "Quantifying the chemical beauty of drugs";
Nature Chemistry 2012
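One simple way to obtain cluster centers at a similarity threshold is greedy leader clustering: a compound becomes a new center unless it is already more similar than the threshold to an existing center. The slides do not say which clustering method was used, so the sketch below is an assumption for illustration only; fingerprints are again modelled as sets of on-bit indices and the function names are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)

def pick_cluster_centers(fingerprints, threshold=0.85):
    """Greedy leader clustering; returns the names chosen as cluster centers."""
    centers = []
    for name, fp in fingerprints.items():
        # A compound joins an existing cluster if any center is similar enough;
        # otherwise it starts a new cluster and becomes its center.
        if all(tanimoto(fp, fingerprints[c]) <= threshold for c in centers):
            centers.append(name)
    return centers

# Illustrative bit sets, not real fingerprints
library = {
    "a": set(range(1, 11)),   # bits 1..10
    "b": set(range(1, 12)),   # bits 1..11, Tanimoto to "a" is 10/11
    "c": {20, 21, 22},        # shares no bits with "a"
}
centers = pick_cluster_centers(library)
print(centers)  # ['a', 'c']
```

The result depends on input order, and each compound is compared against every center found so far, so a library of tens of millions of compounds would need an indexed or database-backed variant.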

Merging ChemicalBoX and PurchasableBoX offers a compilation
of ~70 million unique filtered compounds.
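A minimal sketch of such a merge, assuming uniqueness is decided by the InChIKey (the slides do not state which identifier was used). The function name and the assignment of compounds to the two toy libraries are hypothetical.

```python
def merge_libraries(*libraries):
    """Merge (inchikey, smiles) records, keeping the first record seen per key."""
    merged = {}
    for library in libraries:
        for inchikey, smiles in library:
            merged.setdefault(inchikey, smiles)
    return merged

# Toy stand-ins for the two libraries
chemicalbox = [
    ("BQJCRHHNABKAKU-KBQPJGBKSA-N", "CN1CCC23C4C1CC5=C2C(=C(C=C5)O)OC3C(C=C4)O"),
    ("BSYNRYMUTXBXSQ-UHFFFAOYSA-N", "CC(=O)OC1=CC=CC=C1C(=O)O"),
]
purchasablebox = [
    ("BSYNRYMUTXBXSQ-UHFFFAOYSA-N", "CC(=O)OC1=CC=CC=C1C(=O)O"),
    ("RYYVLZVUVIJVGH-UHFFFAOYSA-N", "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"),
]

merged = merge_libraries(chemicalbox, purchasablebox)
print(len(merged))  # 3 unique compounds from 4 records
```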


  CAS, Chemical Abstracts Service:
  "65 Million: It's more than just a number."
Exploring new areas of the chemical universe.
  • fragmentation of 1,425 drugs

  • 1,895 fragments

  • 182,360 unique, merged compounds

  • 150,642 are drug-like (QED > 0.3)

QED: G. Richard Bickerton et al. "Quantifying the chemical beauty of drugs";
Nature Chemistry 2012

Other topics we are working on ...
  • methylation analysis of human immune cells

  • in silico screening (Bromodomain inhibitors)

  • genome analysis

  • text mining and data warehousing

  • systems biology of streptomycetes

Questions?

Thank You!
contact