Error Estimation and Pattern Recognition Techniques in Protein Crystallography
Thesis
by Drs. Peter Zwart
Summary
Automated model building techniques in macromolecular crystallography have
substantially reduced the need for manual intervention in the construction
of atomic models of complex macromolecules. The availability of these methods
plays a crucial role in the full automation of macromolecular crystallography,
a task considered necessary by structural genomics initiatives. The research
presented in this thesis has been carried out in order to enhance the performance
of model building routines available in the ARP/wARP suite. In the second
chapter, the distribution of the distance between two atoms, given a Gaussian
error in the positional parameters, is derived; it is denoted as the non-central
Maxwell distribution. The non-central Maxwell distribution is used to obtain
error-dependent distributions of the set of distances between nearest neighbours
in a protein structure and of the set of interatomic distances smaller than a certain
value. These two sets of distances can be obtained from an atomic model regardless
of whether a chemical interpretation is available for each of the atoms within
the model. These sets of distances can be used to estimate parameters of the
error model using the derived distributions. It is shown that the error of
the positional parameters is related to the quality of the electron density
calculated with phases from the erroneous model. A key parameter in this relation
is the width of the electron density of an atom. An empirical relation between
this quantity and some global characteristics of the X-ray data set is presented.
The width of the electron density of an atom is related to the width of the
Patterson origin peak and can be used as a single number classifying an X-ray
data set. In the third chapter, the effect of geometrical features in real space
on the shape of the Wilson plot is discussed. When an error on the positional
parameters is present, the geometrical features change and so does the shape
of the Wilson plot. By relating this change in the shape of the Wilson plot to the
variance of a Gaussian error model, the quality of an atomic model and the
corresponding phases can be estimated on the basis of a comparison between
the Wilson plot from calculated structure factor amplitudes and the Wilson
plot calculated from experimental data. This method is, in effect, a
reciprocal-space extension of the method presented in the second chapter. In the last
chapter of this thesis, a novel method for building molecular fragments in
electron density maps is proposed. The procedures developed are based on
the developments in the first chapters of this thesis. The error model is assumed
to be approximately known in advance and therefore inferences on error-free
geometrical features given a set of erroneous atoms are carried out. This
results in a flexible model building paradigm.
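
To make the distance distribution of the second chapter concrete, the following is a minimal sketch and not the thesis implementation. It assumes that the difference vector between two atoms carries an isotropic Gaussian error with standard deviation sigma per Cartesian component, in which case the distance follows a scaled non-central chi distribution with three degrees of freedom; this is taken here as the form of the non-central Maxwell distribution. The function names (log_pdf, pdf, simulate_distances, estimate_sigma), the known error-free distance d0 and the grid-search maximum-likelihood estimate are illustrative assumptions; the thesis itself works with sets of nearest-neighbour distances and may parameterise the error differently.

```python
import math
import random


def log_pdf(r, d0, sigma):
    """Log density of r = |d0_vec + e|, e ~ N(0, sigma^2 I) in 3-D (r, d0 > 0).

    Assumption: sigma is the standard deviation of each Cartesian component
    of the *difference* vector, not of a single atomic position.
    """
    s2 = sigma * sigma
    # f(r) = r / (d0 sigma sqrt(2 pi))
    #        * [exp(-(r - d0)^2 / (2 s2)) - exp(-(r + d0)^2 / (2 s2))],
    # rewritten so the difference of exponentials cannot underflow to zero.
    return (math.log(r)
            - math.log(d0 * sigma * math.sqrt(2.0 * math.pi))
            - (r - d0) ** 2 / (2.0 * s2)
            + math.log(-math.expm1(-2.0 * r * d0 / s2)))


def pdf(r, d0, sigma):
    """Density of the observed distance r; zero for non-positive r."""
    return math.exp(log_pdf(r, d0, sigma)) if r > 0.0 else 0.0


def simulate_distances(d0, sigma, n, seed=0):
    """Draw n distances by perturbing a difference vector of length d0."""
    rng = random.Random(seed)
    distances = []
    for _ in range(n):
        x = d0 + rng.gauss(0.0, sigma)  # component along the true vector
        y = rng.gauss(0.0, sigma)       # two perpendicular components
        z = rng.gauss(0.0, sigma)
        distances.append(math.sqrt(x * x + y * y + z * z))
    return distances


def estimate_sigma(distances, d0, lo=0.05, hi=1.0, steps=200):
    """Grid-search maximum-likelihood estimate of sigma for a known d0."""
    best_sigma, best_ll = lo, -math.inf
    for i in range(steps + 1):
        sigma = lo + (hi - lo) * i / steps
        ll = sum(log_pdf(r, d0, sigma) for r in distances)
        if ll > best_ll:
            best_sigma, best_ll = sigma, ll
    return best_sigma


if __name__ == "__main__":
    # Hypothetical numbers: a 1.5 A error-free distance and a 0.3 A error.
    observed = simulate_distances(d0=1.5, sigma=0.3, n=1000, seed=1)
    print("estimated sigma:", estimate_sigma(observed, d0=1.5))
```

The grid search is used only to keep the sketch free of external dependencies; any one-dimensional optimiser could take its place.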