Identifying functional non-synonymous Single Nucleotide Variants
We have recently developed VarMod, a method for assessing the functional effects of non-synonymous single nucleotide variants (SNVs).
VarMod uses structural modelling and analysis of protein-protein interfaces and protein-ligand binding sites to identify SNVs that have functional effects.
This builds upon our recent research demonstrating that disease-associated SNVs frequently occur at protein-protein interfaces -
David et al., 2012, Human Mutation, 33, 359–363
Analysis of disease-causing non-synonymous SNPs
In recent years genome-wide association studies have identified many SNPs that may be associated with disease. These studies tend to identify regions of the genome that are linked to disease, so many SNPs in such regions may be associated with disease. With collaborators at Imperial College London we focussed on identifying the functional effects of the disease-associated SNPs identified in genome-wide association studies, using structural modelling and the prediction of ligand binding sites.
Our interests are now in using genomic features to assess the functional effects of SNPs and in applying this to the understanding of disease and its treatment.
(Refs: Chambers et al., Nature Genetics, 2009; 2009; 2010; 2011).
Prediction of Protein-Protein interactions using protein docking
Working with Alfonso Valencia we have demonstrated that a protein docking method can be used to predict protein interaction partners (pairs of proteins that interact). Protein docking programs are generally used to predict the shape of the complex formed between pairs of proteins that are known to interact.
This is a difficult problem and docking methods often fail to generate accurate models of complexes. It had therefore been widely thought that it was beyond the scope of such programs to predict whether two proteins interact.
For a set of known complexes we demonstrated that it is possible to distinguish the docking scores of the real complex from those obtained when the individual proteins are docked with a large set of decoy proteins. For more details see Wass et al., 2011,
Mol Syst Biol 7:469.
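The idea of separating a real complex from decoys by docking score can be illustrated with a minimal sketch. This is not the published protocol: the function names, the example scores and the assumption that a higher score means a better docking result are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def decoy_z_score(true_score, decoy_scores):
    """Z-score of the true pair's docking score against the decoy distribution.

    Assumes (hypothetically) that higher docking scores are better.
    """
    sigma = stdev(decoy_scores)  # sample standard deviation of decoy scores
    if sigma == 0:
        raise ValueError("decoy scores have zero variance")
    return (true_score - mean(decoy_scores)) / sigma


def rank_among_decoys(true_score, decoy_scores):
    """1-based rank of the true pair; rank 1 means it outscored every decoy."""
    return 1 + sum(1 for score in decoy_scores if score >= true_score)


# Made-up scores: one protein docked with its real partner and with five decoys.
example_decoys = [10.0, 12.0, 11.0, 9.0, 13.0]
example_true = 20.0
z = decoy_z_score(example_true, example_decoys)
rank = rank_among_decoys(example_true, example_decoys)
```

A large positive z-score or a rank close to 1 would suggest that the real partner stands out from the decoy background. Any threshold for calling an interaction would have to be calibrated on known complexes.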
Modelling of Ligand binding sites
Our work on predicting ligand binding sites started with participation in CASP8 (while based at Imperial College). We made manual predictions of ligand binding sites by identifying homologous
structures and superimposing them on models of the predicted structure of the target protein. This effectively superimposed the ligands of the homologues onto the model
of the target protein from which we predicted the binding site of the target protein. More details are available in our paper in the CASP8 Special Issue of Proteins -
Wass, M.N. and Sternberg, M.J. (2009) Proteins, 77 Suppl 9:147-51.
After CASP8 we automated the manual predictive approach in the 3DLigandSite server - http://www.sbg.bio.ic.ac.uk/3dligandsite.
3DLigandSite combines protein modelling and structural searches to predict ligand binding sites. The approach identifies structures homologous to a query protein that have bound ligands and superimposes their ligands onto a model of the query protein. The superimposed ligands are used to predict a binding site on the query protein. 3DLigandSite was published in the 2010 Nucleic Acids Research Webserver edition - Wass et al. (2010) NAR 38, W469-73.
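The final step, going from superimposed ligands to predicted binding-site residues, can be sketched as follows. This is an illustrative simplification and not the 3DLigandSite implementation: the function name, the 4.0 Å distance cutoff and the 50% ligand-support threshold are assumptions, and the ligand coordinates are taken to be already superimposed onto the model.

```python
import math
from collections import defaultdict


def predict_binding_site(residue_atoms, ligands, cutoff=4.0, min_fraction=0.5):
    """Return residue ids contacted by at least min_fraction of the ligands.

    residue_atoms: dict mapping residue id -> list of (x, y, z) atom coordinates
                   from the model of the query protein.
    ligands:       list of ligands, each a list of (x, y, z) atom coordinates,
                   assumed to be already superimposed onto the model.
    cutoff, min_fraction: hypothetical parameters chosen for illustration.
    """
    if not ligands:
        return []
    contacts = defaultdict(int)
    for ligand in ligands:
        for res_id, atoms in residue_atoms.items():
            # A residue contacts a ligand if any atom pair lies within the cutoff.
            if any(math.dist(a, b) <= cutoff for a in atoms for b in ligand):
                contacts[res_id] += 1
    needed = min_fraction * len(ligands)
    return sorted(res for res, count in contacts.items() if count >= needed)


# Toy model with three residues and three superimposed ligands (made-up data).
model = {10: [(0.0, 0.0, 0.0)], 11: [(3.0, 0.0, 0.0)], 50: [(30.0, 0.0, 0.0)]}
superimposed = [[(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(1.5, 1.0, 0.0)]]
site = predict_binding_site(model, superimposed)
```

In this toy case residues 10 and 11 lie close to all three ligands and are returned, while the distant residue 50 is not.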
Protein Function Prediction - CombFunc & ConFunc
The thousands of sequenced genomes and millions of sequences identified by metagenomics projects make the prediction of protein function an important problem. While function prediction can be relatively simple when sequences share high levels of similarity, it is in cases where sequences have only more remote homologues that current function prediction methods are ineffective. Therefore the aim of my PhD has been the development of function prediction methods that complement existing tools by performing well for these more difficult cases.
We have more recently developed CombFunc - http://www.sbg.bio.ic.ac.uk/combfunc/ - which uses multiple data sources to predict protein function. The data used include sequence, interaction, domain and expression data -
Wass, M.N. (2012) Nucleic Acids Research, 40, W466-470.
Convergent Evolution of Enzyme active sites
The phenomenon of convergent evolution of enzyme active sites demonstrates the remarkable ability of unrelated enzymes to evolve identical catalytic machinery to perform the same reaction. There are a number of well-known cases of convergent evolution, particularly the serine proteases. We have performed a systematic analysis to identify cases of convergent evolution of enzyme active sites using the PDB, the Catalytic Site Atlas and the E.C. classification. Our analysis demonstrates that convergent evolution is not a rare phenomenon, as it is present in approximately 15% of three-digit E.C. numbers. Gherardini et al., JMB 2007.
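The bookkeeping behind a figure such as "a fraction of three-digit E.C. numbers" can be sketched as below. This only illustrates grouping enzymes by three-digit E.C. number and counting the groups that contain more than one family; the function names, the family labels and the example annotations are hypothetical. Deciding whether two active sites are genuinely convergent requires a structural comparison that is not shown here.

```python
from collections import defaultdict


def three_digit_ec(ec_number):
    """Truncate a full E.C. number such as '3.4.21.4' to '3.4.21'."""
    return ".".join(ec_number.split(".")[:3])


def fraction_with_multiple_families(annotations):
    """Fraction of three-digit E.C. numbers covered by more than one family.

    annotations: iterable of (ec_number, family_id) pairs, where family_id is a
                 hypothetical label for a group of evolutionarily related enzymes.
    """
    families = defaultdict(set)
    for ec_number, family_id in annotations:
        families[three_digit_ec(ec_number)].add(family_id)
    if not families:
        return 0.0
    multi = sum(1 for members in families.values() if len(members) > 1)
    return multi / len(families)


# Made-up annotations: two unrelated families share 3.4.21, one family covers 1.1.1.
example_annotations = [
    ("3.4.21.4", "family_A"),
    ("3.4.21.62", "family_B"),
    ("1.1.1.1", "family_C"),
    ("1.1.1.2", "family_C"),
]
fraction = fraction_with_multiple_families(example_annotations)
```

Here one of the two three-digit classes contains more than one family, so the computed fraction is 0.5.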