ICM GUI Manual
10.36 SAR Analysis

[ R-Group Decomposition | Free Wilson | SAR Table | Plot R Groups | SALI | Matched Pair Analysis ]

10.36.1 R-Group Decomposition



R-group decomposition splits a library into fragments based on a Markush scaffold. It is the opposite of R-group (Markush) enumeration.

10.36.2 Free Wilson Regression Analysis


Free Wilson regression analysis can identify interesting combinations of substituents that other SAR analysis methods might miss.

The method is based on the paper by Free and Wilson (J. Med. Chem., 1964).

Building a Linear Model for R-Group Decomposition and Activity Prediction

Input:

Processing Steps:

1. Unique Numbering of R-Groups

Each R-group from the input set is assigned a unique identifier.

2. Descriptor Vector Formation

For each compound, a descriptor vector X[i] is created:

3. Linear Model Training

A Partial Least Squares (PLS) regression model is trained in MolSoft ICM.

The model uses the descriptor matrix (X) and the activity column to learn the relationship.

4. Equation Representation

The model outputs weights (W) and a bias (B) to define the equation:

Activity = X · W + B
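Steps 1, 2 and 4 above can be sketched in a few lines of plain Python. This is only an illustration, not ICM's implementation: the 0/1 indicator encoding of the descriptor vector is an assumption (it is the usual Free Wilson encoding, but the manual does not spell it out), the example library is made up, and the weights and bias are hypothetical placeholders for the values that the PLS training in step 3 would produce.

```python
def number_r_groups(compounds):
    """Step 1: give every distinct (position, substituent) pair a unique index."""
    index = {}
    for groups in compounds:
        for position, substituent in sorted(groups.items()):
            index.setdefault((position, substituent), len(index))
    return index


def descriptor_vector(groups, index):
    """Step 2 (assumed encoding): 1 if the compound carries that R-group, else 0."""
    x = [0] * len(index)
    for position, substituent in groups.items():
        x[index[(position, substituent)]] = 1
    return x


def predict(x, weights, bias):
    """Step 4: Activity = X . W + B for a single compound."""
    return sum(xi * wi for xi, wi in zip(x, weights)) + bias


# Hypothetical decomposed library: one dict of R-groups per compound.
compounds = [
    {"R1": "H", "R2": "Cl"},
    {"R1": "CH3", "R2": "Cl"},
    {"R1": "H", "R2": "OCH3"},
]

index = number_r_groups(compounds)
X = [descriptor_vector(c, index) for c in compounds]

# Hypothetical weights and bias standing in for a trained model (step 3).
weights = [0.0, 0.4, 0.9, -0.3]
bias = 5.0
predictions = [predict(x, weights, bias) for x in X]
```

Because each descriptor entry is tied to one R-group, each weight can be read as the estimated contribution of that R-group to the activity.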

How to run Free Wilson in ICM:

Results

The results comprise the following:

Guide to the Free Wilson Results Table

The Free Wilson Results table provides key statistical outputs from a Free Wilson analysis using PLS (Partial Least Squares) regression. The following columns are included in the output:

These results are used to interpret the model's performance and understand the contributions of individual features in predicting biological responses based on chemical structures.

10.36.3 SAR Table


To generate a SAR table, you first need to decompose the library into R-groups.

Decompose your library. Decompose the library to extract the R-Groups or read in a decomposed SDF file.

Generate SAR table. Select Chemistry/SAR Analysis/Generate SAR Tables. One or more columns can be chosen for analysis, and multiple tables will be generated. The table can be colored by a selected column, e.g. activity.

SAR table. The SAR table will be displayed with an activity scale and R-group scaffold in the first cell (A1) and the R groups along each axis. You can use the table to investigate which R-group(s) contribute to activity.
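The layout of such a table can be illustrated with a small pivot in plain Python. This is a sketch of the idea only, not how ICM builds the table: the column names R1, R2 and activity and the example rows are hypothetical, and it assumes exactly two R-group positions.

```python
def sar_table(rows):
    """Pivot decomposed rows into a grid with R1 on one axis and R2 on the other."""
    r1_values = sorted({row["R1"] for row in rows})
    r2_values = sorted({row["R2"] for row in rows})
    activity = {(row["R1"], row["R2"]): row["activity"] for row in rows}
    # A cell is None when that R1/R2 combination is absent from the library.
    grid = [[activity.get((r1, r2)) for r2 in r2_values] for r1 in r1_values]
    return r1_values, r2_values, grid


# Hypothetical decomposed library.
rows = [
    {"R1": "H", "R2": "Cl", "activity": 5.4},
    {"R1": "CH3", "R2": "Cl", "activity": 6.3},
    {"R1": "H", "R2": "OCH3", "activity": 4.7},
]

r1_axis, r2_axis, grid = sar_table(rows)
```

Empty cells in the grid point to R-group combinations that have not been made or tested.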

10.36.4 Plot R-Groups


You can make a plot of R-groups from library decomposition. This is described in the plots section of the manual.

10.36.5 Generate SALI Table


Structure-activity landscape index (SALI): identifying and quantifying activity cliffs.

This method produces a SALI score as described in the publications by John Van Drie and co-workers. It will identify "structure-activity cliffs": pairs of molecules which are most similar but have the largest change in activity.
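In the published definition, the SALI value for a pair of molecules i and j is |A_i - A_j| / (1 - sim(i, j)), where A is the activity and sim is a structural similarity between 0 and 1. The sketch below applies that formula to a made-up data set. The fingerprints are hypothetical sets of feature identifiers compared by Tanimoto similarity; the manual does not state which fingerprints or similarity measure ICM uses, so treat those choices as assumptions.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two sets of feature identifiers."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def sali(activity_i, activity_j, similarity):
    """SALI = |A_i - A_j| / (1 - similarity); identical structures are special-cased."""
    if similarity >= 1.0:
        return float("inf") if activity_i != activity_j else 0.0
    return abs(activity_i - activity_j) / (1.0 - similarity)


# Hypothetical molecules: name -> (fingerprint features, activity).
molecules = {
    "A": ({1, 2, 3, 4}, 5.0),
    "B": ({1, 2, 3, 5}, 8.0),
    "C": ({6, 7, 8, 9}, 8.5),
}

# Score every pair and sort so the steepest activity cliffs come first.
pairs = sorted(
    (
        (sali(act1, act2, tanimoto(fp1, fp2)), name1, name2)
        for (name1, (fp1, act1)), (name2, (fp2, act2))
        in combinations(molecules.items(), 2)
    ),
    reverse=True,
)
top_score, top_first, top_second = pairs[0]
```

Here A and B are similar but differ in activity, so they rank as the steepest cliff, while the dissimilar pair A and C ranks lower despite a larger activity difference.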

10.36.6 Matched Pair Analysis


Matched Pair Analysis (MPA) can be used to study how small, well-defined structural modifications change chemical properties. The analysis is performed in two stages: compounds are first clustered by similarity, and then the Maximum Common Substructure is found for each pair in each cluster. The output is similar to the SALI output, but it also gives the actual R-group pair variation. The pair table is sorted by score, so the top hits should show small structural variations and large activity differences. To run MPA:

  • Read in a chemical spreadsheet containing a 'mol' column and activity data.
  • Chemistry/SAR Analysis/Matched Pair Analysis.
  • Enter the name of the Input Table.
  • If the table has activity data or some other property, select the option use Activity and choose the column name from the drop-down list.
  • Enter a cluster split distance for determining similarity.
  • The size of the R-groups for comparison can be set (default is 12 atoms).
  • R-groups can be filtered out of the analysis, optionally with a custom filter written as SMARTS.
  • Chirality can be ignored in the comparison if needed.

The results table (tableName_pairs) is interactively linked to both the plot and the original table. The hits are sorted by the score column, so the top hits should show the biggest activity change with the smallest groups. The plot shows activity 1 versus activity 2, with points colored and sized by score (red is better). The results table contains the following columns:
  • ix1 - index number (row number) of first chemical which is hyperlinked to original table.
  • mol1 - 2D structure of first chemical.
  • R1_1 - group difference between first (ix1) and second chemical (ix2).
  • ix2 - index number (row number) of second chemical which is hyperlinked to original table.
  • mol2 - 2D structure of second chemical.
  • R1_2 - group difference between second (ix2) and first chemical (ix1).
  • scfl - common scaffold.
  • dvisible - delta of your activity column between the two chemicals. The column name is 'd' + original_column_name (here the activity column is named 'visible').
  • score - calculated as normalized dActivity + (1 - normalized weight of both groups).
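The score formula can be illustrated with a short sketch. The manual does not say how the two terms are normalized or how the weight of the groups is measured, so the choices here are assumptions made only for illustration: both terms are divided by their maximum over all pairs, the absolute value of the activity delta is used, and the group weight is a single number supplied per pair.

```python
def mpa_scores(pairs):
    """Score pairs given as (d_activity, group_weight) tuples.

    Assumed normalization: divide each term by its maximum over all pairs.
    """
    max_delta = max(abs(delta) for delta, _ in pairs) or 1.0
    max_weight = max(weight for _, weight in pairs) or 1.0
    return [
        abs(delta) / max_delta + (1.0 - weight / max_weight)
        for delta, weight in pairs
    ]


# Hypothetical pairs: (activity delta, combined weight of the two R-groups).
pairs = [
    (2.0, 30.0),   # large activity change, small groups
    (2.0, 120.0),  # large activity change, large groups
    (0.5, 30.0),   # small activity change, small groups
]

scores = mpa_scores(pairs)
best = scores.index(max(scores))
```

Under these assumptions the first pair scores highest, which matches the stated intent that top hits combine a large activity change with small groups.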

