|
This program performs hierarchical clustering on a set of
files. In the MMTSB Tool Set it is used by the
Cluster.pm package but can also be
called separately.
The command line parameters consist of a number of
options and the name of a file that lists the files to
which the clustering algorithm should be applied. If
no file name is given, the file names are read from
standard input.
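The input convention described here (a list file, or standard input when none is named) can be sketched in Python as follows. This only illustrates the described behavior and is not the program's actual implementation: the helper name read_file_list is hypothetical, and the one-file-name-per-line layout is an assumption.

```python
import io
import sys


def read_file_list(list_path=None, stdin=None):
    """Return the structure file names to cluster.

    Hypothetical helper: reads one file name per line (an assumption)
    from list_path, or from standard input when no path is given.
    """
    if list_path is not None:
        with open(list_path) as handle:
            lines = handle.readlines()
    else:
        stream = stdin if stdin is not None else sys.stdin
        lines = stream.readlines()
    # Drop blank lines and surrounding whitespace.
    return [line.strip() for line in lines if line.strip()]


# Example with an in-memory stand-in for standard input.
example_names = read_file_list(stdin=io.StringIO("model1.pdb\nmodel2.pdb\n\n"))
print(example_names)
```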
The following options are available: -mode
selects the clustering mode (rmsd: RMSD-based, or
contact: contact-map-based).
The maximum residue distance up to which
residue contacts are considered in the comparison can be
changed from the default value of 12 Å with -maxdist.
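As a rough illustration of what a distance cutoff means for a contact map, the following Python sketch builds a set of residue contacts and compares two such sets. It is a sketch under stated assumptions, not the program's method: the function names are hypothetical, one representative point per residue is assumed, the cutoff is treated as inclusive, and the symmetric-difference count is only a stand-in for whatever comparison the program actually performs.

```python
import math


def contact_map(coords, maxdist=12.0):
    """Return the set of residue index pairs (i, j), i < j, whose
    representative points lie within maxdist of each other.

    coords holds one (x, y, z) point per residue (an assumption).
    """
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= maxdist:
                contacts.add((i, j))
    return contacts


def contact_difference(map_a, map_b):
    """Count contacts present in exactly one of the two maps.

    A simple stand-in dissimilarity, not the program's metric.
    """
    return len(map_a ^ map_b)


# Two toy three-residue structures placed along the x axis.
map_a = contact_map([(0, 0, 0), (5, 0, 0), (20, 0, 0)])
map_b = contact_map([(0, 0, 0), (5, 0, 0), (15, 0, 0)])
print(map_a, map_b, contact_difference(map_a, map_b))
```

Lowering maxdist in this sketch removes longer-range pairs from both maps, so the comparison becomes dominated by local contacts.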
For RMSD-based clustering the options -ca,
-cb, -cab, -heavy, and
-all are used to specify which atoms are used
for the RMSD calculation. The option -lsqfit requests
a least-squares fit before RMSD values are calculated. For fragment/loop
modeling, a residue subset may be given with -l, and
-fitxl requests that the protein structure surrounding
the given residue subset be fit rather than the subset itself.
Alternatively, a different residue subset used only for fitting may
be given with the -fit option.
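The effect of the atom-selection options on an RMSD value can be sketched in Python as below. This is a minimal illustration, not the program's code: the mapping from option names to atom names (CA for -ca, CB for -cb, both for -cab, non-hydrogen for -heavy) is an assumption based on common usage, the hydrogen test is deliberately simplistic, the function name rmsd and its input layout are hypothetical, and no least-squares superposition is performed.

```python
import math

# Assumed meaning of the atom-selection options; the text names the
# options but does not define them.
SELECTORS = {
    "ca": lambda name: name == "CA",
    "cb": lambda name: name == "CB",
    "cab": lambda name: name in ("CA", "CB"),
    # Simplistic hydrogen test: names that start with "H".
    "heavy": lambda name: not name.startswith("H"),
    "all": lambda name: True,
}


def rmsd(atoms_a, atoms_b, selection="ca"):
    """RMSD without superposition between two structures.

    Each structure is a list of (atom_name, (x, y, z)) tuples in the
    same order; only atoms accepted by the selection contribute.
    """
    keep = SELECTORS[selection]
    squared = [
        sum((p - q) ** 2 for p, q in zip(xyz_a, xyz_b))
        for (name, xyz_a), (_, xyz_b) in zip(atoms_a, atoms_b)
        if keep(name)
    ]
    if not squared:
        raise ValueError("selection matched no atoms")
    return math.sqrt(sum(squared) / len(squared))


# Toy structures with three atoms each.
struct_a = [("CA", (0, 0, 0)), ("CB", (1, 0, 0)), ("HA", (0, 1, 0))]
struct_b = [("CA", (3, 0, 0)), ("CB", (1, 4, 0)), ("HA", (0, 1, 5))]
print(rmsd(struct_a, struct_b, "ca"), rmsd(struct_a, struct_b, "all"))
```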
The format of the input files can be either PDB format (selected
with -pdb) or SICHO chain format (-sicho).
A maximum number of clusters can be set with -maxnum.
Otherwise the maximum number is set to half of the number of input
structures.
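The default for the maximum number of clusters amounts to a one-line rule, sketched here in Python. The function name max_clusters is hypothetical, and rounding down for an odd number of structures is an assumption, since the text only says "half".

```python
def max_clusters(n_structures, maxnum=None):
    """Return the cluster cap: maxnum if given, else half the inputs."""
    if maxnum is not None:
        return maxnum
    # Integer division is an assumption; the text only says "half".
    return n_structures // 2


print(max_clusters(100), max_clusters(100, maxnum=10))
```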
By default, the output does not contain the centroid of each cluster. Centroids
may be requested with -centroid. Finally, -ref is
available to provide a reference RGB map, which is used for writing out the
centroid in contact-based clustering as a difference map with respect to that
reference.
An alternative clustering program is available in kclust.
|