Rst-wham

From PMFLib Wiki
Revision as of 12:27, 19 April 2016 by Xstepan3 (talk | contribs)
Jump to navigation Jump to search

Navigation: Documentation / Methods / Restrained Dynamics / RST:Utilities / rst-wham


RST Utilities

Name

rst-wham

Synopsis

wham [P|Ppi|Pval] hist_min hist_max num_bins tol temperature numpad metadatafile freefile [num_MC_trials randSeed]


Description These programs (wham and wham-2d) implement the Weighted Histogram Analysis Method of Kumar, et al ("Multidimensional free-energy calculations using the weighted histogram analysis method", J. Comput. Chem., 16:1339-1350, 1995). The code generally follows the notation used by Benoit Roux ("The calculation of the potential of mean force using computer simulations", Comput. Phys. Comm., 91:275-282, 1995). Consult these papers for the theoretical background and justification for the method.


Arguments

P|Ppi|Pval

The first (optional) argument specifies the periodicity of the reaction coordinate. For a nonperiodic reaction coordinate (a distance, for example), it should be left out. "P" means that the reaction coordinate has a periodicity of 360, appropriate for angles. "Ppi" specifies a periodicity of 2*pi, appropriate for angles measured in radians. "Pval" specifies periodicty of some arbitrary amount, val, which should be an integer or floating point number. For example, "P180.0" would be appropriate for an angle with twofold symmetry.

hist_min hist_max

Specify the boundaries of the histogram. As a rule, all data points outside the range (hist_min, hist_max) are silently ignored. The only exception is that if an entire trajectory is outside the range, the program halts with an error message. The solution is to remove that file from the metadata file. hist_min and hist_max should be floating point numbers.

num_bins

Specifies the number of bins in the histogram, and as a result the number of points in the final PMF. It should be an integer.

tol

tol is the convergence tolerance for the WHAM calculations. Specifically, the WHAM iteration is considered to be converged when no Fi value for any simulation window changes by more than tol on consecutive iterations. As the program runs, it prints the average change in the F values for the most recent iteration. Obviously, this number will be smaller than tol before the computation converges, because convergence is triggered by the largest change as opposed to the average.

temperature

temperature is a floating point number representing the temperature in Kelvin at which the weighted histogram calculation is performed. This does not have to be the temperature at which the simulations were performed (see below for discussion).

numpad

numpad specifies the number of "padding" values that should be printed for periodic PMFs. This number should be set to 0 for aperiodic reaction coordinates. It doesn't actually affect the calculation in any way. Rather, it just alters the final printout of the free energy, to make plotting of periodic reaction coordinates simpler. This is more important for wham-2d than wham.

metadatafile

metadatafile specifies the name of the metadata file. The format of this file is described below.

freefile

freefile is the name used for the file containing the final PMF and probability distribution.

num_MC_trials randSeed

num_MC_trials and randSeed (Optional) are both related to the performance of Monte Carlo bootstrap error analysis. If these values are not supplied, error analysis is not performed. num_MC_trials should be an integer specifying the number of fake data sets which should be generated. randSeed is an integer which controls the random number seed - the value you pick should be irrelevant, but I let the user set it primarily for debugging purposes.

File formats

metadata
Each line of the metadata file should be blank, begin with a "#" (marking a comment), or have the following format:
/path/to/timeseries/file loc_win_min spring [correl time] [temperature]

This first field is the name of one of the time series files (more on this in a moment). The second field, loc_win_min, is the location of the minimum of the biasing potential for this simulation, a floating point number. The third field, spring, is the spring constant for the biasing potential used in this simulation, assuming the biasing potential is of the format

Many simulation packages, including TINKER, AMBER, and CHARMM, do not include the [1/2] when they specify spring constants for their restraint terms. This is a common source of error (I'd love to change my code to match the other packages' behavior, but then experienced users who don't read the manual would get messed up). Also, the units for the spring constant must match those for the time series. So, if your time series is a distance recorded in Ångstroms, the spring constant must be in kcal/mol-Å2. AMBER users should take care when using angular restraints: the specification and output of angles is in degrees, but AMBER's spring constants use kcal/mol-rad^2.

The fourth argument ("correl time") specifies the decorrelation time for your time series, in units of time steps. It is only used when generating fake data sets for Monte Carlo bootstrap error analysis, where it modulates the number of points per fake data set. This argument is optional, and is ignored if you don't do error analysis. If you're doing multiple temperatures but not bootstrapping, set it to any integer value as a placeholder, and it'll be ignored. See section for more discussion about how to do bootstrapping.

Finally, the last (optional) field is the temperature for this simulation. If not supplied, the temperature specified on the command line is used. In the present version of the code, you must either leave the temperature unspecified for all simulations or specify it for all simulations.

The time series files must follow one of two formats, depending on whether the temperature was specified in the metadata file. If no temperature was specified, the file should contain two columns, where the first is the time (which isn't actually used), and the second is the position of the system along the reaction coordinate. Both numbers should be in floating point format. Lines beginning with "#" are ignored as comments. Additional columns of data are ignored. If the simulation temperature is specified, there must be a third column of data, containing the system's potential energy at that time point. It should be a floating point value.