Molsoft's Technology


Molsoft has developed new technology and proprietary algorithms for molecular modeling, with applications to protein and small-molecule structure prediction, docking and structure-based drug design; molecular visualization and animation; bioinformatics; cheminformatics; and intranet development.

Importance of protein homology modeling and structure prediction

With some 200 bacterial genomes almost completed and the human, mouse, yeast and other genomes essentially completed, the main obstacle to rational drug design is insufficient structural information in the Protein Data Bank (PDB). Only 0.1-1% of proteins have had their three-dimensional structure determined, and the growth rates of sequence and structure entries are dramatically different: the PDB doubles only every three years, while the sequence banks double in size every 17 months. In addition, each structure undergoes many essential rearrangements, large and small, upon binding to other proteins or chemical substrates. While ab initio protein structure prediction at a reasonable accuracy is still beyond reach, the good news is that partial structure prediction can already help to answer numerous questions in biology and rational drug design. Homology modeling and structure prediction technology starts from the 0.1-1% of proteins with known structures and builds usable structural models for up to 30-50% of all proteins. The models can be used for decision support in drug discovery, e.g. prioritizing targets by 'druggability', for docking and virtual ligand screening, for directing chemistry in lead optimization, for directing protein functional studies via mutagenesis, as search models for molecular replacement, etc.

The molecular modeling environment is implemented as modules of the ICM program. ICM stands for Internal Coordinate Mechanics and includes hundreds of algorithms unified by a common scripting language and graphical user interface.
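To make the gap between the two growth rates concrete, the following minimal Python sketch projects the fraction of sequences with a known structure forward in time. It assumes that both databases keep growing exponentially at the doubling times quoted above; the function names and the 1% starting fraction are illustrative choices, not figures from Molsoft.

```python
# Doubling times quoted in the text, in months.
PDB_DOUBLING_MONTHS = 36.0       # "the PDB doubles only every three years"
SEQUENCE_DOUBLING_MONTHS = 17.0  # "sequence banks double in size every 17 months"


def coverage_fraction(initial_fraction: float, months: float) -> float:
    """Fraction of sequences with a known structure after `months`.

    Assumption: both databases grow exponentially at the quoted rates.
    """
    structures_growth = 2.0 ** (months / PDB_DOUBLING_MONTHS)
    sequences_growth = 2.0 ** (months / SEQUENCE_DOUBLING_MONTHS)
    return initial_fraction * structures_growth / sequences_growth


def coverage_halving_time() -> float:
    """Months after which the covered fraction drops to half its value."""
    return 1.0 / (1.0 / SEQUENCE_DOUBLING_MONTHS - 1.0 / PDB_DOUBLING_MONTHS)


if __name__ == "__main__":
    print(f"coverage halves every {coverage_halving_time():.1f} months")
    for years in (0, 3, 6):
        # 0.01 is the upper end of the 0.1-1% range, used as a starting point.
        print(years, "years:", f"{coverage_fraction(0.01, 12 * years):.4%}")
```

Under these assumptions the experimentally covered fraction halves roughly every 32 months, which is the motivation for extending coverage through modeling.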

Homology Modeling

For about 30% of all protein sequences a good structural model can be built, and for another 20% a partial model can be built. Molsoft has developed proprietary technologies for:

  • template finding: sensitive sequence search (or threading) to identify one or several structural templates for further homology modeling, using full alignments with zero-end-gaps (ZEGA) and empirical structural statistical significance [Abagyan, Batalov J.Mol.Biol. 1997]
  • accurate threading, or sequence-structure alignment, using the ICM alignSS algorithm, which optimizes the sequence-structure match using residue accessibilities, secondary structures and functional sites of the template, plus the predicted secondary structure of the query sequence
  • fast homology model building and database loop searches with the build model function. This algorithm builds a full model with all the loops in seconds. Each loop is searched for in the full PDB database and selected on the basis of its interaction energy with the loop environment
  • loop prediction through local global optimization
  • model refinement using the ICM global optimization algorithm
  • local reliability prediction: to assign a reliability value to each residue in the model, we developed algorithms that use statistical potentials or full residue energies after refinement, together with the local properties of the alignments
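The idea of a full alignment with zero end gaps, mentioned under template finding, can be illustrated with a minimal Python sketch. This is a textbook dynamic-programming alignment in which gaps at either end of either sequence are free. It is not Molsoft's ZEGA implementation: the scoring values, the linear gap penalty and the function name are assumptions made for illustration, and the statistical significance step is not reproduced.

```python
def align_zero_end_gaps(a: str, b: str, match: int = 2,
                        mismatch: int = -1, gap: int = -2) -> int:
    """Best alignment score where end gaps in either sequence cost nothing.

    Illustrative scoring only: identity match/mismatch and a linear
    penalty for internal gaps.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # The first row and column stay at 0, so leading gaps are free.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trailing gaps are free: the alignment may end anywhere in the
    # last row or the last column.
    return max(max(score[-1]), max(row[-1] for row in score))


if __name__ == "__main__":
    # A short query embedded in a longer sequence is not penalized
    # for the unaligned flanks.
    print(align_zero_end_gaps("ACGT", "TTACGTTT"))
```

Leaving the ends unpenalized matters when a query domain matches only part of a longer template chain, since the unaligned flanks would otherwise dominate the score.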

The ICM homology modeling algorithms have been successfully used in modeling competitions [e.g. car95, hom97], benchmarks [ras97], and in many research projects [sch01, nor01, tom00, sch00, kel00, gan00, car98, pat98, sri98, yud97, yui97, mat97, etc.].

Global energy optimization

The core technology used in most of our structure prediction algorithms is global free energy optimization in a subset of internal coordinates that describes intra- or intermolecular geometry. For structure prediction and large-scale conformational sampling, ICM employs a family of new global optimization techniques such as Biased Probability Monte Carlo (Abagyan and Totrov, 1994), a pseudo-Brownian docking algorithm (Abagyan et al., 1994) and local deformation loop movements (Abagyan and Mazur, 1989).
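As a rough illustration of global optimization over a small set of internal coordinates, the sketch below runs a generic Metropolis Monte Carlo search with local minimization after each random move. It is not the Biased Probability Monte Carlo method cited above: the moves here are drawn uniformly rather than from biased distributions, and the energy function, step sizes and temperature are hypothetical toy values.

```python
import math
import random


def energy(angles):
    """Toy multi-minimum energy over torsion-like angles in radians.

    Hypothetical function for illustration, not an ICM force field.
    """
    return sum(math.cos(3 * x) + 0.5 * math.cos(x - 1.0) for x in angles)


def local_minimize(angles, step=0.05, iterations=200):
    """Crude coordinate descent: keep small shifts that lower the energy."""
    best = list(angles)
    best_e = energy(best)
    for _ in range(iterations):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                trial_e = energy(trial)
                if trial_e < best_e:
                    best, best_e, improved = trial, trial_e, True
        if not improved:
            break
    return best, best_e


def monte_carlo_minimize(n_angles=3, n_steps=300, temperature=0.5, seed=0):
    """Random move, local minimization, then a Metropolis acceptance test."""
    rng = random.Random(seed)
    current = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    current, current_e = local_minimize(current)
    best, best_e = list(current), current_e
    for _ in range(n_steps):
        trial = list(current)
        # Random move: reset one angle to a uniformly drawn value.
        trial[rng.randrange(n_angles)] = rng.uniform(-math.pi, math.pi)
        trial, trial_e = local_minimize(trial)
        # Metropolis criterion applied to the locally minimized energies.
        accept = trial_e <= current_e or rng.random() < math.exp(
            -(trial_e - current_e) / temperature)
        if accept:
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e


if __name__ == "__main__":
    angles, value = monte_carlo_minimize()
    print([round(a, 2) for a in angles], round(value, 3))
```

Working in a few torsion-like variables rather than all Cartesian coordinates keeps each random move meaningful, which is the general motivation for optimizing in internal coordinates.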

Receptor structure based prioritization of protein targets

The icmPocketFinder procedure identifies the substrate binding pockets in 98% of all cases (tested on over 10,000 pockets). This procedure is based on calculating the drug-binding density field and contouring it at a certain level. In 2001 [tar01] we published a fast procedure for accurate electrostatic calculation using the boundary element algorithm. A combination of "pocket density" with other physical properties such as electrostatic potential, hydrophobicity and hydrogen bonds is used to evaluate whether a particular protein target or protein-protein interface is "druggable" and to prioritize the targets. We developed a special procedure to improve the pocket models by co-optimization of flexible pockets with some of the known ligands.
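Contouring a density field at a fixed level can be pictured as thresholding a 3D grid and grouping the surviving grid points into connected regions. The Python sketch below does exactly that on a toy grid. It is an assumption-laden illustration, not the icmPocketFinder algorithm: the field values, the contour level and the function name `contour_pockets` are all hypothetical.

```python
from collections import deque

# Face-sharing neighbours on a cubic grid.
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
              (0, -1, 0), (0, 0, 1), (0, 0, -1))


def contour_pockets(field, level):
    """Return connected regions of grid points where field >= level.

    `field` is a nested list indexed as field[x][y][z]. Regions are
    lists of (x, y, z) indices, sorted from largest to smallest.
    """
    nx, ny, nz = len(field), len(field[0]), len(field[0][0])
    seen = set()
    pockets = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if field[x][y][z] < level or (x, y, z) in seen:
                    continue
                # Breadth-first flood fill from this seed point.
                region, queue = [], deque([(x, y, z)])
                seen.add((x, y, z))
                while queue:
                    cx, cy, cz = queue.popleft()
                    region.append((cx, cy, cz))
                    for dx, dy, dz in NEIGHBOURS:
                        px, py, pz = cx + dx, cy + dy, cz + dz
                        inside = 0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                        if (inside and (px, py, pz) not in seen
                                and field[px][py][pz] >= level):
                            seen.add((px, py, pz))
                            queue.append((px, py, pz))
                pockets.append(region)
    return sorted(pockets, key=len, reverse=True)


if __name__ == "__main__":
    grid = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    for point in ((0, 0, 0), (0, 0, 1), (0, 1, 0), (3, 3, 3)):
        grid[point[0]][point[1]][point[2]] = 1.0
    print([len(p) for p in contour_pockets(grid, level=0.5)])
```

Ranking the regions by size is only one possible choice; the text describes combining pocket density with electrostatics, hydrophobicity and hydrogen bonding when prioritizing targets.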

Accurate fully flexible compound docking to receptor pockets

We developed a fast and accurate algorithm for docking a continuously flexible ligand to a receptor pocket. In a benchmark study on 11 different receptors, the ICM flexible docking algorithm correctly docked 93% of all ligand-receptor pairs! There are two versions of the algorithm: one with the receptor represented by a series of grid potentials, and one with both ligand and receptor represented as flexible explicit molecules. ICM docking has been used extensively in many research and drug design projects.

Virtual ligand docking and screening of millions of compounds

A particularly fast implementation of the flexible docking algorithm is used to screen millions of compounds from vendor databases or in-house libraries. Our technology can index and convert to 3D any chemical database in .sdf, .mol or .mol2 format, then dock all the molecules and score them by estimated binding affinity. The main purpose of this procedure is to separate binders from non-binders and eliminate at least 99% of the compounds, which do not fit the pocket and do not need to be tested experimentally. We have several different scoring functions, including a score based on the potential of mean force. Consensus scoring reduces the number of false positives. Molsoft-ICM docking and virtual ligand screening have been tested in benchmarks, competitions and, most importantly, in several experimental lead discovery projects, including discovery of novel RAR agonists [sch01], antagonists [sch00], RNA binders [fil02], FGFR tyrosine kinase inhibitors, thyroid hormone receptor antagonists, and PTP1B inhibitors.
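One simple way to picture consensus scoring is to keep only the compounds that rank near the top under every scoring function. The Python sketch below shows that scheme under stated assumptions: a lower score is taken to mean better predicted binding, the compound names and scores are invented, and the intersection rule is one common consensus strategy rather than the one Molsoft necessarily uses.

```python
def rank_positions(scores):
    """Map compound id -> rank (0 = best).

    Assumption: a lower score means better predicted binding.
    """
    ordered = sorted(scores, key=scores.get)
    return {compound: rank for rank, compound in enumerate(ordered)}


def consensus_hits(score_tables, keep_fraction=0.01):
    """Keep compounds in the top `keep_fraction` of every scoring function.

    `score_tables` is a list of dicts, one per scoring function, each
    mapping the same compound ids to a score.
    """
    n_compounds = len(score_tables[0])
    cutoff = max(1, int(n_compounds * keep_fraction))
    hits = None
    for scores in score_tables:
        ranks = rank_positions(scores)
        top = {compound for compound, rank in ranks.items() if rank < cutoff}
        hits = top if hits is None else hits & top
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical scores from two scoring functions for four compounds.
    physics_score = {"cmpd_a": -30.0, "cmpd_b": -25.0, "cmpd_c": -5.0, "cmpd_d": 0.0}
    pmf_score = {"cmpd_a": -20.0, "cmpd_b": -1.0, "cmpd_c": -15.0, "cmpd_d": 0.0}
    print(consensus_hits([physics_score, pmf_score], keep_fraction=0.5))
```

With the default `keep_fraction` of 0.01, at least 99% of a large library is discarded, matching the filtering goal stated above; a compound that scores well under only one function is dropped as a likely false positive.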

Global optimization of compound geometries

In addition to an internal coordinate force field, the Molsoft-ICM platform supports global optimization and analysis of small-molecule geometries through free geometry optimization in Cartesian space, using the MMFF94 force field with fully automated atom type assignment. The conformational generation procedure accumulates a non-redundant set of representative molecular geometries.
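Accumulating a non-redundant set of geometries can be sketched as a greedy filter: visit conformers from lowest to highest energy and keep one only if it differs enough from everything already kept. The Python example below is a simplified illustration under stated assumptions: conformers are taken to be already superimposed with atoms in the same order, and the 0.5 threshold, the energies and the function names are hypothetical rather than taken from ICM.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two lists of (x, y, z) atoms.

    Assumes equal length, identical atom order and prior superposition.
    """
    total = sum((p - q) ** 2
                for atom_a, atom_b in zip(a, b)
                for p, q in zip(atom_a, atom_b))
    return math.sqrt(total / len(a))


def accumulate_conformers(candidates, threshold=0.5):
    """Greedy non-redundant selection from (energy, coordinates) pairs.

    Conformers are visited from lowest to highest energy; one is kept
    only if it lies at least `threshold` from every conformer kept so far.
    """
    kept = []
    for energy, coords in sorted(candidates, key=lambda item: item[0]):
        if all(rmsd(coords, other) >= threshold for _, other in kept):
            kept.append((energy, coords))
    return kept


if __name__ == "__main__":
    # Three hypothetical two-atom conformers with made-up energies.
    conformers = [
        (1.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
        (0.5, [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]),
        (2.0, [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]),
    ]
    for energy, coords in accumulate_conformers(conformers):
        print(energy, coords)
```

Sorting by energy first means that each cluster of near-identical geometries is represented by its lowest-energy member.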

Molsoft-ICM scripting language and molecular environment

Molsoft has developed more than just several focused applications: we designed and developed a whole computational environment for bioinformatics, cheminformatics, protein modeling, protein design, docking and screening. The environment is tied together by a common scripting language for molecules, numbers, strings, vectors, matrices, tables, sequences, alignments, profiles and maps. This environment also covers molecular graphics and the production of molecular animations.

Copyright © 2005 Molsoft LLC.
All rights reserved.