Error Estimation and Pattern Recognition Techniques in Protein Crystallography.


Thesis by Drs. Peter Zwart 

Summary



Automated model building techniques in macromolecular crystallography have substantially reduced the need for manual intervention in the construction of atomic models of complex macromolecules. The availability of these methods plays a crucial role in the full automation of macromolecular crystallography, a task considered necessary by structural genomics initiatives. The research presented in this thesis was carried out to enhance the performance of the model building routines available in the ARP/wARP suite.

In the second chapter, the distribution of the distance between two atoms, given a Gaussian error on the positional parameters, is derived; it is denoted the non-central Maxwell distribution. This distribution is used to obtain error-dependent distributions of two sets of distances: the distances between nearest neighbours in a protein structure, and the interatomic distances smaller than a certain value. Both sets can be obtained from an atomic model regardless of whether a chemical interpretation is available for each of the atoms within the model, and both can be used to estimate the parameters of the error model using the derived distributions. It is shown that the error of the positional parameters is related to the quality of the electron density calculated with phases from the erroneous model. A key parameter in this relation is the width of the electron density of an atom. An empirical relation between this quantity and some global characteristics of the X-ray data set is presented. The width of the electron density of an atom is related to the width of the Patterson origin peak and can be used as a single number classifying an X-ray data set.

In the third chapter, the effect of geometrical features in real space on the shape of the Wilson plot is discussed. When an error on the positional parameters is present, the geometrical features change, and so does the shape of the Wilson plot.
By relating this change in the shape of the Wilson plot to the variance of a Gaussian error model, the quality of an atomic model and of the corresponding phases can be estimated from a comparison between the Wilson plot from calculated structure factor amplitudes and the Wilson plot calculated from experimental data. This method is, in effect, a reciprocal-space extension of the method presented in the second chapter.

In the last chapter of this thesis, a novel method for building molecular fragments in electron density maps is proposed. The developed procedures are based on the developments in the earlier chapters of this thesis. The error model is assumed to be approximately known in advance, and therefore inferences on error-free geometrical features, given a set of erroneous atoms, are carried out. This results in a flexible model building paradigm.
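
As an illustration of the distance distribution described for the second chapter, the following Python sketch evaluates one standard closed form under simplifying assumptions that are not taken from the thesis: both atoms carry independent, isotropic Gaussian errors with the same standard deviation sigma per coordinate, so that the interatomic vector has a per-coordinate standard deviation s = sigma * sqrt(2). The function names, the parameter values and the Monte Carlo check are hypothetical, and the parameterisation used in the thesis may differ.

```python
import math
import random


def noncentral_maxwell_pdf(r, d0, s):
    """Density of the length r of a 3D Gaussian vector whose mean has
    length d0 and whose per-coordinate standard deviation is s.
    Assumes r > 0, d0 > 0 and s > 0."""
    norm = r / (d0 * s * math.sqrt(2.0 * math.pi))
    near = math.exp(-((r - d0) ** 2) / (2.0 * s * s))
    far = math.exp(-((r + d0) ** 2) / (2.0 * s * s))
    return norm * (near - far)


def integrate(f, a, b, steps=4000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def simulate_distances(d0, sigma, n, seed=0):
    """Monte Carlo distances between two atoms that are d0 apart when
    error-free; every coordinate of both atoms receives independent
    Gaussian noise with standard deviation sigma (an assumption)."""
    rng = random.Random(seed)
    distances = []
    for _ in range(n):
        a = [rng.gauss(0.0, sigma) for _ in range(3)]
        b = [d0 + rng.gauss(0.0, sigma),
             rng.gauss(0.0, sigma),
             rng.gauss(0.0, sigma)]
        distances.append(math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))
    return distances


if __name__ == "__main__":
    d0, sigma = 1.5, 0.2           # illustrative values, in Angstrom
    s = sigma * math.sqrt(2.0)     # spread of the interatomic vector
    upper = d0 + 10.0 * s          # the density is negligible beyond this

    def pdf(r):
        return noncentral_maxwell_pdf(r, d0, s)

    print("normalisation:", integrate(pdf, 0.0, upper))
    print("analytic  <r^2>:", integrate(lambda r: r * r * pdf(r), 0.0, upper))
    sample = simulate_distances(d0, sigma, 20000)
    print("simulated <r^2>:", sum(x * x for x in sample) / len(sample))
```

Under these assumptions the second moment equals d0^2 + 3 s^2, which gives a simple consistency check between the closed form and the simulation. As d0 tends to zero the density reduces to the ordinary Maxwell distribution, which presumably motivates the name.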
