Contribution to the Direct Phase Determination in Protein Crystallography

Thesis by Dr. C. E. Kyriakidis

Summary

For small molecules, the phase problem can be solved directly from statistical and algebraic relations among the intensities. For macromolecules, these direct methods alone have so far proved inconclusive. The reason is clear: to a first approximation, the joint probability distribution of three structure factors depends on N^(-1/2) (N: number of atoms in the unit cell), so the distribution becomes increasingly flat as N grows. On the other hand, large structures such as proteins have been solved by the method of isomorphous replacement with heavy atoms. Differences in diffracted intensities between the derivative and native crystals are used to locate the heavy atoms. The calculated contributions from these atoms serve as a reference for evaluating the phases of the native crystal structure. In general, the phase information from a single isomorphous pair is insufficient. Multiple isomorphous replacements, or supplementary information such as that from anomalous scattering, are therefore used to resolve the ambiguity. Anomalous scattering measurements made at several appropriate wavelengths can also provide a definitive experimental solution to the phase problem. This raises the question of why direct methods fail while other techniques succeed. An efficient way to improve the applicability of direct methods is to reduce the number of variables (N) involved. This reduction can be achieved, for instance, by using data from isomorphous replacement and/or anomalous scattering.
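The flattening with growing N can be illustrated numerically. The sketch below is not taken from the thesis: it assumes the standard Cochran form for a non-centrosymmetric triplet in an equal-atom structure, P(phi) proportional to exp(kappa cos(phi)) with kappa = 2 N^(-1/2) |E1 E2 E3|, and the function names and the normalized magnitudes |E| = 2.0 are purely illustrative choices. It prints kappa and the expected value of cos(phi) for several values of N.

```python
import math


def cochran_kappa(n_atoms, e1, e2, e3):
    """Concentration parameter kappa = 2 * N^(-1/2) * |E1 E2 E3|.

    Assumption: N equal atoms in the unit cell (equal-atom Cochran formula).
    """
    return 2.0 * e1 * e2 * e3 / math.sqrt(n_atoms)


def mean_cos_triplet(kappa, steps=20000):
    """Expected cos(phi) under P(phi) ~ exp(kappa * cos(phi)).

    Evaluated with a midpoint rule over [-pi, pi); a value near 1 means a
    sharply peaked distribution, a value near 0 means an almost flat one.
    """
    num = 0.0
    den = 0.0
    width = 2.0 * math.pi / steps
    for i in range(steps):
        phi = -math.pi + (i + 0.5) * width
        weight = math.exp(kappa * math.cos(phi))
        num += weight * math.cos(phi)
        den += weight
    return num / den


# Illustrative magnitudes only: three strong reflections with |E| = 2.0.
for n in (50, 500, 5000, 50000):
    kappa = cochran_kappa(n, 2.0, 2.0, 2.0)
    print(n, round(kappa, 3), round(mean_cos_triplet(kappa), 3))
```

Under these assumptions, the same three strong reflections give a steadily less reliable triplet estimate as N moves from small-molecule to protein-sized values, which is why reducing the effective N matters.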
In chapter 2 it is shown that the concept of isomorphous structure factors can be useful for estimating the doublet and triplet phase sums present amongst them. A diffraction ratio is proposed that predicts whether a structure solution via a pair of isomorphously related data sets may in principle be feasible. A lower limit to the diffraction ratio turns out to be required in order to get a triplet phase sum error level comparable to that of small structures which are solved routinely by direct methods. The diffraction ratio can be used to maximize the triplet phase sum reliability before collecting the data, by choosing the optimal wavelength in a single anomalous scattering experiment, by selecting the most suitable heavy-atom derivative in a single isomorphous replacement experiment, or by selecting the optimal wavelength combination in a multiwavelength experiment. Another remarkable characteristic of the diffraction ratio is its linear relationship to the average doublet phase sum. It is argued that the doublets are essential for an accurate estimation of the triplet phase sums.
In chapter 3 several probabilistic and algebraic techniques for estimating the doublets are discussed. The combination of an algebraic estimation technique and a new difference Patterson synthesis, the maxima of which are used to improve the doublet phase sums iteratively, is shown to be successful. If the diffraction ratio is too low, no useful estimates can be obtained from the joint probability distribution of isomorphous structure factors, even for small structures. If the differences between isomorphous structure factors become too small, the normal mathematical procedure turns out to be inadequate because the very small quantities cannot be expressed in terms of the usual variables. This led to the introduction of a different type of random variable: the single difference of isomorphous structure factors.
In chapter 4, based on the use of the single differences as random variables, an efficient procedure is presented for the derivation of joint probability distributions of the triplet phase sums present among two isomorphous data sets. It is shown that the usual probabilistic techniques, applied to these random variables, finally result in the joint probability distribution of three isomorphous structure factors, comprising three doublet and eight triplet phase sums respectively. A major advantage of the new technique is that the inherent correlation between the isomorphous data sets is removed if the mathematical procedure is set up for the small differences themselves.
In chapter 5 the technique of single differences has been applied to the estimation of quartet phase sums leading to very reliable results even without the use of the cross terms. The error level reduction for the triplet and quartet phase sums leads to a phase error small enough for direct-method applications without knowledge of the heavy atom substructure.
In chapter 6 the doublet phase sums are re-examined in order to solve their sign ambiguity. The single-difference procedure succeeds in solving the problem of the negative doublets for two isomorphous data sets that include anomalous scattering effects and for data sets collected at two different wavelengths. In the single isomorphous replacement case, however, the problem cannot be solved directly, because of the large number of negative doublets. A method is developed which solves the problem if three isomorphous data sets are available. Methods are also proposed for the case of only two isomorphous data sets.
The available evidence suggests that structure solution, even of large molecules, is possible using the methods described in this thesis. Our next step will be a new version of the SIMPEL program set which incorporates the theoretical developments described here.



 
