ICM Manual v.3.8
by Ruben Abagyan,Eugene Raush and Max Totrov
Copyright © 2018, Molsoft LLC
Apr 11 2018

S

[ sarray | script | sequence | segment | (ICM)-shell | site | skin | sln | stack | stick | string | Svariable | Factor | Slide | surface area ]

sarray


Array of strings: {"one","two","five","minus ten"}

To show them use: show sarray [simple] (option simple to skip the header information) or simply type their name and hit 'Enter'.

See also: Sarray , Tostring , read sarray

script


(or ICM script) means a collection of ICM commands stored in a file (or string) which can be called from ICM-shell.

Example:

 
 call _demo_fold # find demo_fold file and start the script 

Like any other script, an ICM script can also be called directly from the Unix shell if it has a correct header, e.g.


> cat icmscript
#!/home/icm/icm -s
print "This script name is " + File(last) 
quit

> icmscript 
 This script name is icmscript

Scripts as ICM shell strings

A script can also be stored in a string and run from the GUI. Use the newline character "\n" to separate several commands. Example:


scr = "print 333\nprint 444"
call string scr
 333
 444

To make it executable from GUI by double click use:


set property command scr  # now scr will be visible in the Workspace as a script. Double-click on it or right-click for the menu

Script files with Arguments

ICM scripts also understand arbitrary name=value arguments. Some of them can be used in a script and others automatically passed to other scripts. For example:


icm n=2 t="01:20"

These arguments can be extracted inside a script with the Getarg function. After certain arguments are extracted, all remaining arguments can be passed along to another script. See this example:

#!/home/ruben/icm/icm -s
#
macro bye
 print " Help: "+ File(last) +" a1=.. file=.. [a2=s (3.14)]"
 print "  a sample script to get three args and pass along remaining arguments"
 quit
endmacro

if Nof(Getarg(name))==0  bye
if Nof(NotInList(Getarg(name),{"a1","file"}))!=0 bye # must have a1 and file
a1=Getarg("a1" delete)  # a must-have string argument, no defaults
file=Getarg("file" delete )
if !Exist(file) then
  print " Error> File "+file+" not found" 
  bye
endif
a2=Getarg("a2",3.14, delete) # an optional real argument
#
# check the sanity of values and exit if anything is wrong:  if .. bye
#
show "These arguments have been extracted and tested: " a1,a2,file
show " unix IcmScript " + Getarg()  # all other arguments are passed along
quit
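The extract-and-pass-along pattern used in the script above can be illustrated outside ICM. The following is a minimal Python sketch, not ICM code: the helper names parse_args and take are hypothetical, and the behaviour of the delete option (remove the argument once it has been read) is assumed from the example above.

```python
def parse_args(argv):
    """Collect name=value tokens into a dict, keeping their order."""
    args = {}
    for token in argv:
        name, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not of the form name=value
            args[name] = value
    return args


def take(args, name, default=None):
    """Return one argument and remove it, so it is not passed along again."""
    if name in args:
        return args.pop(name)
    if default is None:
        raise KeyError("missing required argument: " + name)
    return default


args = parse_args(["a1=hello", "file=in.pdb", "n=2", "t=01:20"])
a1 = take(args, "a1")                # required, no default
file_name = take(args, "file")       # required, no default
a2 = float(take(args, "a2", 3.14))   # optional, falls back to 3.14
# Whatever was not extracted is forwarded to the next script unchanged.
rest = " ".join(f"{k}={v}" for k, v in args.items())
print(a1, file_name, a2, rest)
```

Removing each argument as it is read is what makes the final "pass everything else along" step a simple join over what is left.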

If the script runs during an interactive graphics session, the graphics can be used concurrently with the script execution.

Teaching your vim editor to highlight ICM syntax

Linux/Mac

1. create ~/.vim/syntax in your HOME

mkdir -p ~/.vim/syntax

2. create a link or copy icm.vim file there

ln -s $ICMHOME/icm.vim  ~/.vim/syntax/icm.vim

3. paste the line below into ~/.vimrc (create one if it's not there)

au BufRead,BufNewFile *.icm,_* set filetype=icm

vi/vim/gvim myscript.icm # enjoy syntax highlighting

Vim syntax highlighting for Windows

1. In your home folder ( '\Documents and Settings\' on XP or '\Users\' on Vista and Win7 ) create folder 'vimfiles' and folder 'syntax' inside it.

2. Copy icm.vim (\Program Files\Molsoft LLC\ICM-Pro\icm.vim) there.

3. In your home folder create (or edit the existing) file '_vimrc' and paste the line below:

au BufRead,BufNewFile *.icm,_* set filetype=icm

gvim myscript.icm # enjoy syntax highlighting

Running Scripts from ICM molecular documents

To call a script from an html-document inside ICM, use the #icm/script/scriptName path in href :


script1 = "print 1"
set property command script1

MyDocWithLink2Script  = "<html><a href=\"#icm/script/script1\">click to run script1</a></html>"
set property html MyDocWithLink2Script  

Now the document will contain a highlighted link. Clicking on that link will run script1 .



sequence


an ICM-shell object containing an amino-acid or DNA sequence. The ICM-shell is tuned to work with very large sets of millions of genomic sequences at once.

One can read a sequence from a sequence file in different formats, or create it with the Sequence() function, the make sequence command, or by assignment (e.g., aseq = bseq[2:18]; the new sequence aseq is the 2:18 fragment of sequence bseq). A valid amino-acid sequence is an uppercase string of one-character amino-acid codes. Please distinguish this ICM-shell object from the "sequence" in the ICM-sequence file, which contains detailed 3 (or 4)-character notations of residues from the icm residue library. One can concatenate two sequences ( seq1 // seq2 ) and extract a part of a sequence ( seq[15:67] ).

A sequence object may contain a secondary structure string (e.g. EEE___HHH_) of the same length as the sequence. It is automatically created by the make sequence command and the Sequence( ) function, or can be set directly with the set sstructure command. If the logical l_showSstructure is set to yes, the secondary structure string will be shown in alignments.
Examples:
 
 aseq=Sequence("ASSAARTYIP") 
 read sequences "aa.seq" 
 aseq[3:4]="WW" 
 
 read object "crn" 
 crn_seq = Sequence(a_/*) 
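For readers used to zero-based slicing, the range notation in these examples may need a moment of translation. The Python sketch below assumes, as the "2:18 fragment" wording suggests, that ICM ranges count residues from 1 and include both ends; the helper names fragment and replace_fragment are hypothetical and exist only for this illustration.

```python
def fragment(seq, first, last):
    """Residues first..last, counted from 1, both ends included."""
    return seq[first - 1:last]


def replace_fragment(seq, first, last, new):
    """Replace residues first..last (1-based, inclusive) with new."""
    return seq[:first - 1] + new + seq[last:]


aseq = "ASSAARTYIP"
print(fragment(aseq, 2, 5))                # prints SSAA
print(replace_fragment(aseq, 3, 4, "WW"))  # prints ASWWARTYIP
print(aseq + "GG")                         # concatenation, like seq1 // seq2
```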

Resetting sequence type
ICM tries to guess the sequence type. To set the sequence type explicitly, use the set type [protein|nucleotide] command. E.g.
 
  a=Sequence("AAAATAAAA") 
  set type a protein  # or if you change your mind 
  set type a nucleotide   

Properties of a sequence can be projected onto an alignment in which the sequence participates with the Rarray( R_property, seq_, ali_, r_gapDefault ) function. The opposite action, i.e. projecting from an alignment to a particular sequence, can be achieved with another form of the Rarray function: Rarray( R_ali ali_from seq_ | i_seqNumber )

Functions returning a sequence, operating on sequences, or related to sequences:

  • Align ( seq seq ) : returns alignment
  • Area ( seq ) : returns standard accessible area of a linear chain of those residues
  • Distance( seq seq .. ) and Score : derived from pairwise alignment
  • IcmSequence ( seq [ s_nter ] [ s_cter ] ) : returns 3-letter sequence for the icm build command.
  • Index( seq s_substr ) : position of a subsequence
  • Length( seq )
  • Mass( seq ) : array of amino acid masses.
  • Mol ( seq ) -> ms_ : molecular selection of 3d molecules with identical sequence
  • Namex( seq ) : sequence comment/description
  • Pattern( seq disulfide ) : Cys pattern eg "ACAACFW" -> "C??C?"
  • Reference( seq ) : returns the swissprot database reference
  • Rarray( seq , R_26aa_prop ) → R : use Smooth( R window ) if needed.
  • Sequence( rs ) : sequence of selected residues
  • Sequence( seq reverse ) : sequence in the opposite direction
  • Sequence( rs_ ) returns seq_
  • Sequence( seq [reverse] ) returns seq_ of DNA reverse complement
  • Sequence( ali_Mult ) seq_
  • Sequence( ali_Mult i_seq ) = seq_
  • Sequence( profile ) = seq_
  • Sequence( i_len R26_resProb ) = seq_
  • Sequence( I[n]_len R26_resProb ) = seqArr[n]_
  • Sstructure ( seq ) : returns or predicts the secondary structure
  • Shuffle ( seq ) : randomized sequence of the same length
  • Temperature( dnaSeq [ r_DNA_C_nM (0.25) r_Salt_C_mM(50.)] ) ⇒ r_meltingT
  • Tostring( seq )
  • Tr123( seq | s ) : translate one letter code to three letter code
  • Tr321( seq | s ) : the inverse to Tr123
  • Trans( seq_nucl [all|frame] ) : T_with_seq_translated_protein_seq
  • Trim( seq S_proteinTags ) : extract protein tags described in S_proteinTags sarray.
  • Turn ( seq | s ) : R_n_predictedProbOfTurn
  • Type ( seq 1) : sequence type
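As an illustration of what a DNA reverse complement (listed above for Sequence( seq [reverse] )) amounts to, here is a minimal Python sketch. It is not ICM code; the function name reverse_complement is hypothetical, and ambiguity codes other than N are deliberately left out.

```python
# Watson-Crick partners; N (unknown base) is its own complement.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(dna):
    """Read the strand backwards and swap each base for its partner."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


print(reverse_complement("AAAATGC"))  # prints GCATTTT
```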



segment


an element of the simplified representation of a protein topology in terms of its secondary structure elements ( Abagyan and Maiorov, 1988). One element (referred to as a segment) is a vector of the best axis of the element. Loop segments are represented by a straight line between the end of the previous segment and the beginning of the next one. This representation can be used for a fold search through a library of pre-calculated segment descriptions of the protein topologies (foldbank.seg). See also ribbonStyle.

(ICM)-shell


user-friendly, high-level command interpreter combined with a collection of tools allowing you to interact conveniently with the kernel of the ICM software.

The shell consists of commands separated by a carriage return or a semicolon, e.g.


read pdb "2ins"
a=1;b=2


site


[ Site table ]

ICM sequences and molecular objects may contain specific information about local sequence features, such as the location of binding sites, disulfide bonds, etc. This information is stored in the feature table (FT) section of the Swissprot protein sequence entries or after the SITE fields of pdb files. The sites in the feature table may look like this:

 
FT   ACT_SITE     15     15       ACTIVE SiTE HIS 
FT   TRANSMEM    309    332       PROBABLE 
FT   DOMAIN      333    362       CYTOPLASMIC TAIL. 
FT   DISULFID    125    188       BY SIMILARITY. 

We use one letter code (the second column) to specify the site type. The first column shows the priority value which is used by the display site command and the selection by site residue selection (e.g. a_/F ).

Priority  Char  SWISSPROT def.  Description
4         A     ACT_SITE        Amino acid(s) involved in the Activity of an enzyme.
2         B     BINDING         Binding site for any chem.group(co-enzyme,prosthetic group...)
5         C     CA_BIND         Extent of a Calcium-binding region.
5         D     DNA_BIND        Extent of a DNA-binding region.
4         F     SITE            Any other Feature on the sequence (i.e. SITE records in PDB).
2         G     CARBOHYD        Glycosylation site.
7         I     INIT_MET        The sequence is known to start with an initiator methionine.
2         L     LIPID           Covalent binding of a Lipidic moiety
2         M     METAL           Binding site for a Metal ion.
5         N     NP_BIND         Extent of a Nucleotide phosphate binding region.
6         O     PROPEP          Extent of a prOpeptide.
6         P     PEPTIDE         Extent of a released active Peptide.
5         R     REPEAT          Extent of an internal sequence Repetition.
6         S     SIGNAL          Extent of a Signal sequence (prepeptide).
5         T     TRANSMEM        Extent of a Transmembrane region.
1         V     VARIANT         Authors report that sequence Variants exist.
1         X     CONFLICT        Different papers report differing sequences.
5         Z     ZN_FING         Extent of a Zinc finger region.
6         c     CHAIN           Extent of a polypeptide Chain in the mature protein.
5         d     DOMAIN          Extent of a Domain of interest on the sequence.
3         e     THIOLEST        ThiolEster bond.
1         m     MUTAGEN         Site which has been experimentally altered.
2         p     MOD_RES         Post-translational modification of a residue.
3         s     DISULFID        DiSulfide bond.
3         t     THIOETH         Thioether bond.
1         v     VARSPLIC        Sequence Variants produced by alternative splicing.
6         z     TRANSIT         Transit peptide(mitochondrial,chloroplastic,cyanelle,microbody)
5         ~     SIMILAR         Extent of a similarity with another protein sequence.
4         -     NON_CONS        Non consecutive residues.
7         +     NON_TER         The residue at an extremity of seq.is not the terminal res.
4         ?     UNSURE          Uncertainties in the sequence
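To make the relation between a feature-table line and the table above concrete, here is a small Python sketch that splits an FT line into its fields and looks up the one-letter code and priority. It is only an illustration: the names SITE_CODES and parse_ft_line are hypothetical, just four of the keys are included, and whitespace-separated fields are assumed as in the sample lines shown earlier.

```python
# Subset of the table above: SWISSPROT key -> (one-letter code, priority).
SITE_CODES = {
    "ACT_SITE": ("A", 4),
    "TRANSMEM": ("T", 5),
    "DOMAIN": ("d", 5),
    "DISULFID": ("s", 3),
}


def parse_ft_line(line):
    """Split 'FT KEY from to description' into a dictionary."""
    fields = line.split(None, 4)
    if len(fields) < 4 or fields[0] != "FT":
        raise ValueError("not a feature-table line: " + line)
    key = fields[1]
    # Keys missing from the subset above get (None, None).
    char, priority = SITE_CODES.get(key, (None, None))
    return {
        "key": key,
        "from": int(fields[2]),
        "to": int(fields[3]),
        "desc": fields[4].strip() if len(fields) > 4 else "",
        "char": char,
        "priority": priority,
    }


site = parse_ft_line("FT   DISULFID    125    188       BY SIMILARITY. ")
print(site["char"], site["from"], site["to"], site["desc"])
```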

The sites can be

  • read from a swissprot entry with the read sequence swiss command
  • set to a sequence or a molecular object with the set site [seq_from [ali_] {seq_|ms_} [only] command, or with the copy site command
  • added one at a time with the set site s_siteString {seq_|ms_} [only] command (e.g. set site a_1.1 "FT SITE 15 15 important residue" )
  • deleted with the delete site {seq_|ms_} i_siteNumber command (e.g. delete site a_mol1 1 )
  • shown with the show sequence swiss command for sequences, and with the show site {seq_|ms_} command for objects
  • Sites assigned to molecular objects can be selected (and thereby visualized) with the a_/ F SiteString selection
  • Sites will be written to an object and restored upon reading under the OBJECT.site or OBJECT.auto preference.
The ICM-shell variable l_showSites toggles the appearance of the site information in the show sequence command.
The sites can be colored with the
 
color site rs_   
command, e.g.
 
color site a_/FA red # features/sites from the active site 

Example:
 
 read pdb "1hla"  # this object contains Ca atoms of 2 molecules 
 make bond chain a_//ca # link them into a chain 
 read sequence swiss web "1A02_HUMAN" 
 read sequence swiss web "B2MG_HUMAN" 
 set site a_1 1A02_HUMAN 
 set site a_2 B2MG_HUMAN 
 show site B2MG_HUMAN 
 ds wire a_ 
 ds cpk magenta a_/FV # display variants 
 ds cpk yellow a_/Fs  # display disulfides 

The following functions work with the sites of sequences:


Table( seq site ) gives you a table with all sites.
Index( seq site iSite )


#>T
#>-key---------fr----------to----------list--------desc-------
   TRANSMEM    10          20          ""          "predicted tm"

Nof(site seq ) returns the number of sites associated with the sequence. The same number can be returned by Nof(Table( seq , site ))


Index( site seq iSiteNum ) returns an iarray of the site limits, e.g. Index(site,a,1) returns {10,20}



Retrieving sequence site information


Table( site seq ) → T_allSites

Example:


read pdb "1f88"
make sequence
Table( 1f88_a site)
 #>T
 #>-key---------fr----------to----------list--------desc-------
    CARBOHYD    15          15          ""          "glycosylation site                                 "
    CARBOHYD    2           2           ""          "glycosylation site                                 "

skin


a solid graphical representation of the molecular surface, also referred to as the Connolly surface. It is a smooth envelope touching the van der Waals surface of atoms as the solvent probe of the waterRadius size rolls over the molecule. "Skin" is important for analysis of recognition, electrostatics, energetics, ligand binding and protein cavities. The surface is calculated with a new fast analytical contour-buildup algorithm ( Totrov and Abagyan, 1996) and can be generated as a general graphics object with the make grob skin command. 'Skin' consists of three types of elements: convex spherical elements, concave spherical elements, and torus-shaped elements. ICM allows the calculation of the volume confined by the 'skin' and its surface area. In a general case skin is defined by two atom-selections:

  1. atoms the skin is calculated for
  2. atoms surrounding the atoms from the previous selection
One can calculate/display only a patch within the context of the rest (as_part a_*), or the skin around one molecule as if the rest did not exist (as_part as_part):
 
 read object "complex" 
 display a_//ca,c,n  
 pocket = a_1//!h* & Sphere(a_2//!h*) 
 display skin pocket a_1//!h* # 5A sphere around the second subunit 
 set plane 2                  # or F2 : to avoid deletion of the previous patch  
 display skin a_2//!h* a_2//!h* green # ignore everything but the second molecule 
Colored molecular surface can be saved as:
ICM can also generate smooth Gaussian surfaces with the following commands:
 
 make map potential Box( a_ 3. )   # build Gaussian map 
 make grob m_atoms solid exact 0.5 # contour it 
 display g_atoms                   # display the envelope grob 

sln


Sybyl line notation, a string representation of molecular structure similar to SMILES. The sln string is returned by the String( as_ sln ) function.

stack


a set of conformations of a particular object. Two types of stacks are supported in ICM:

  • stacks of conformations of an ICM object stored as sets of internal coordinates
  • stacks of conformations of a non-ICM (PDB) object stored as a set of cartesian coordinates.
The stack can be just a place to store (with the store conf command) a number of complete descriptions of different conformations, regardless of the way they have been created. The properties of stack conformations are either set by the search procedure or can be set manually with the set stack energy|number|all|align Array commands. The maximal number of stack conformations is determined by the mnconf parameter. The stack conformations can be created manually in the course of an interactive procedure, or automatically as a result of a montecarlo run. The energies of stack conformations can be shown with the show stack [all] command. The stack can be saved into a .cnf file and read back with the read stack command.

In the Biased Probability Monte Carlo procedure the stack holds the best-energy representatives of different conformational families (see Abagyan and Argos, 1992). The measure of difference (or distance) is defined by the compare command and the vicinity parameter. The stack can influence the search via the following variables: mnvisits, mnhighEnergy, mnreject, visitsAction, highEnergyAction and rejectAction.

Stack stored in an object

The stack can be assigned to an object and saved/retrieved with the object using the store stack object and load stack object commands, copied with the copy os stack command, and deleted with the delete stack object command.

Cartesian stacks

The sets of coordinates from multiple models can also be stored in a special stack with the read pdb all stack s_multiModelPdbFile command.

The stack conformations can be pushed to a trajectory file with the store frame [ append ] command. Then the trajectory can be displayed in interpolated smooth fashion with the display trajectory command.
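The idea of a stack that keeps one best-energy representative per conformational family can be sketched in a few lines of Python. This is a conceptual illustration only and should not be read as ICM's actual bookkeeping: the function add_to_stack is hypothetical, its parameters vicinity and max_size merely stand in for the vicinity and mnconf parameters named above, and the distance function is supplied by the caller.

```python
def add_to_stack(stack, conf, energy, distance, vicinity, max_size):
    """Insert a conformation into a list of (energy, conf) pairs.

    A conformation closer than `vicinity` to a stored one belongs to the
    same family and replaces it only if its energy is lower.
    """
    for i, (stored_energy, stored_conf) in enumerate(stack):
        if distance(conf, stored_conf) < vicinity:
            if energy < stored_energy:
                stack[i] = (energy, conf)
            break
    else:
        stack.append((energy, conf))  # a new family
    stack.sort(key=lambda item: item[0])  # lowest energy first
    del stack[max_size:]                  # keep at most max_size families
    return stack


def gap(a, b):
    """Toy distance: conformations are single numbers in this example."""
    return abs(a - b)


stack = []
add_to_stack(stack, 1.0, -5.0, gap, vicinity=0.5, max_size=2)
add_to_stack(stack, 1.1, -7.0, gap, vicinity=0.5, max_size=2)  # same family, better
add_to_stack(stack, 3.0, -2.0, gap, vicinity=0.5, max_size=2)  # new family
print(stack)
```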




stick


graphical representation of a covalent bond as a solid cylinder. Its radius is defined by the GRAPHICS.stickRadius ICM-shell variable.

string


may exist in the ICM-shell as a named variable or a constant (e.g. "1crn", "A b\n c" ). There are a number of predefined string variables in the ICM-shell. You can concatenate strings ( "aaa" + "bbb" or "aaa" // "bbb" -> "aaabbb" ), add a number to a string ( "aaa"+4.5 -> "aaa4.5" ), and compare strings ( if ( s_pdbDir == "/data/pdb/" ) or if ( s1 > s2 ) ). Strings may be used in arithmetic expressions, commands and functions.
Examples:
 
 s  = "1crn" 
 s1 = s + ".brk" 
 if (s != "2ins") print "wrong protein" 

Converting a string into an executable command-file

To make an internally stored script:


 s = "print 'hello'"
 set property command s
# or
 set property command s auto  # for autoexec status

svariable , or ICM-shell variable


a named object stored in the program memory of one of the following types: integer (i), real (r), string (s), logical (l), preference (p), iarray (I), rarray (R), sarray (S), matrix (M), sequence (seq), profile (prf), alignment (ali), map (m), graphics object (grob) (g). They can be created by direct assignment of a constant (e.g. a={1 4 3 8} ) or a function result (e.g. a=Iarray(4) ), or read from a disk file (e.g. read iarray "a" ). Most ICM-shell variables can also be written to a disk file and shown. They can take part in arithmetic and logical expressions. For some of the variable types, subsets are defined (e.g. a[2:4] ).



structure factor (factor)


a named ICM-shell table containing information about reflections. A structure factor table header may contain maximal absolute values of h k and l.
 
#>I igd.HKL 
 31 36 37 
It will be calculated on the fly if absent and is important for Fourier transformation. You may also have any number of additional members in the header section for your convenience. For example, real values for the minimal and maximal resolution, etc.
The "column" part of a table contains mandatory integer arrays of h,k and l. Some of the other arrays with fixed names may be necessary for specific operations. They are:
  • fo : real array of observed amplitudes (used by the "xr" term)
  • fc : real array of calculated amplitudes. They are added and updated automatically by the "xr" term calculations.
  • ac and bc : real array of Real and Imaginary components of calculated structure factors. ac and bc may be read from a file, calculated in the ICM-session, and/or added and updated automatically by the "xr" term calculations. These two arrays are used as the input arrays for the make map factor command.
  • w : real array of weights of individual reflections which are used if defined in the "xr" term calculations. Note, that multiplicity will be automatically taken into account, do not multiply your weights by it to avoid double counting.
  • free : integer array of 0 and non-zeros to mark reflections for R-free calculations. Reflections marked with non-zeros will not be used in the "xr" term calculations. They will be used instead by the Rfree( T_factor) function.

One can add any number of additional arrays to the factor-table. Of course, the table can be read, written, sorted, shown, etc. You may also use powerful table arithmetics and expressions to generate new columns and specify subsets.
Examples:
 
                                   # new columns 
 group table append F Sqrt(F.ac*F.ac+F.bc*F.bc) \ 
       "fc" Atan2(F.bc,F.ac) "ph_calc"  
 
 F.ac = (2*F.fo-F.fc)*Cos(F.ph_calc) 
 F.bc = (2*F.fo-F.fc)*Sin(F.ph_calc) 
 make map factor F    # 2Fo - Fc map is ready 
 
 F1= F.fc > 1. # another table of strong reflections 
 F2= F.h < 20 & F.k < 30 & F.l < 20 # another subset 
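The arithmetic behind the 2Fo - Fc example can be checked for a single reflection with a short Python sketch. It is not ICM code; the function name two_fo_minus_fc is hypothetical, and the inputs correspond to one row of the fo, ac and bc columns described above.

```python
import math


def two_fo_minus_fc(fo, ac, bc):
    """Return the new (ac, bc) pair for one reflection.

    The calculated amplitude and phase are recovered from the real and
    imaginary parts, then the amplitude is replaced by 2*fo - fc while
    the calculated phase is kept.
    """
    fc = math.hypot(ac, bc)      # same as Sqrt(ac*ac + bc*bc)
    phase = math.atan2(bc, ac)   # calculated phase in radians
    amplitude = 2.0 * fo - fc
    return amplitude * math.cos(phase), amplitude * math.sin(phase)


# fc = 5 for ac = 3, bc = 4, so the new amplitude is 2*10 - 5 = 15
# and the result is approximately (9, 12).
new_ac, new_bc = two_fo_minus_fc(fo=10.0, ac=3.0, bc=4.0)
print(new_ac, new_bc)
```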
See also: How to manipulate with structure factors
The command word "factor" serves to read/write the XPLOR formatted structure-factor-files.

Slide


Slide is a recorded state of the Graphics window and other GUI windows. The slides are added to a table with a single array called slides . Each slide becomes an element:


 slideshow.slides[1] 
 slideshow.slides[2] 
  ..
The slides contain only the display attributes and need the full objects compatible with them to be present in the ICM shell. The matching occurs by name and number of elements in the object.

Here are the main operations on slides and related parameters.

add slide arguments # add new slide

store slide # replace the current slide

display slide [ i_slideNumber ]

set slide name slideArray s_oldname s_newname

SLIDE.ignoreBackgroundColor - a user preference to ignore the slide background color or background image. May be useful if you do not like the background of somebody else's slideshow.

SLIDE.ignoreFog - a user preference to skip enforcement of the fog setting (is useful for some graphics cards).

Changing object names in slides

This may be needed if a molecular object, a grob, a table, etc. changed its name in the shell:

set slide name oldname newname

E.g.

set slide name "1crn" "1abc"

Replacing other properties from a script

The simplest way to replace properties of a slide is to run a for loop like this:


for i=1,Nof(slideshow.slides)
  display slide i
  color residue label black 
endfor

Compressing the data and the file for fast network transfer

Do the following:

  • compress your grobs (meshes/surfaces) with the compress grob command.
  • delete unnecessary objects from the session with the delete all compress command.
  • write binary # to gzip the .icb file
  • compress the files externally with gzip and rename them from .icb.gz to .icb (ICM will recognize the need to uncompress on the fly).
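The last step above (gzip the file, then give it back its .icb name) can also be scripted. The Python sketch below uses only the standard library; the function name gzip_in_place is hypothetical and the file name in the usage comment is a placeholder.

```python
import gzip
import os
import shutil


def gzip_in_place(path):
    """Gzip a file and put the compressed data back under the same name."""
    tmp = path + ".gz"
    with open(path, "rb") as src, gzip.open(tmp, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.replace(tmp, path)  # e.g. session.icb.gz becomes session.icb


# Usage (placeholder file name):
# gzip_in_place("session.icb")
```

Writing to a temporary .gz file first and renaming afterwards means the original is only replaced once compression has finished.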

See also: Image

surface area


in the ICM-shell means a solvent-accessible surface (the surface traced by the center of the water sphere). Important: Do not confuse this surface with the molecular or Connolly surface, which is referred to as skin (see also: Acc function, Area function, display skin, display surface, show area surface, show area skin, show volume surface, "sf" term, set color surface).
Important: There are two ways to calculate the surface area: via the show area surface or the show energy "sf" commands. In both cases individual atomic accessibilities are calculated and assigned to individual atoms. These accessibilities can be shown with the show as_ command, or can be accessed with the Area( as_) function. However, the two commands use different atomic radii:

  • show area surface
    • uses van der Waals radii as defined in the icm.vwt file
    • calculates areas for all atoms including hydrogens
  • show energy "sf"
    • uses special radii designed for calculations of the solvation energy. The radii are defined in the icm.hdt file ;
    • employs a united atom model, in which hydrogens are ignored and radii increased accordingly;
    • calculates areas only for non-hydrogen atoms, ignores hydrogens.

Examples:
 
                    # dipeptide  
 build string "se nter ala his cooh"   
                    # fill out individual accessibilities 
                    # (incl. hydrogens) 
 show area surface  # takes all atoms w. vdWaals radii into account         
 show a_//*         # look at the accessibilities  
 show Area(a_//n*)  # extract atomic accessibilities for all nitrogens  
# 
 show energy "sf"   # only heavy atom accessibilities used in energy calc.  
 show a_//*         # look at these new accessibilities  
 show Area(a_//n*)  # "energy" accessibilities for nitrogens  



Copyright© 1989-2018, Molsoft,LLC - All Rights Reserved. This document contains proprietary and confidential information of Molsoft, LLC. The content of this document may not be disclosed to third parties, copied or duplicated in any form, in whole or in part, without the prior written permission from Molsoft, LLC.