USD 50 /hr
Hire Dr. Reuben P.
United Kingdom

Expert in genomic data analysis - making data biologically meaningful

Profile Summary
Subject Matter Expertise
Services
Writing: Technical Writing
Research: Feasibility Study
Consulting: Scientific and Technical Consulting
Data & AI: Statistical Analysis, Data Visualization
Work Experience

Lecturer in Computational Genetics

University of Southampton

September 2017 - Present

Research Fellow

University of Southampton

January 2015 - September 2017

Postgraduate Researcher

University of Southampton

September 2012 - September 2015

Intern Biologist

Selcia Ltd

June 2010 - July 2011

Education

PhD Human Genetics (Faculty of Medicine)

University of Southampton

September 2012 - December 2015

MBiol (hons) Molecular and Cellular Biology (Department of Biology and Biochemistry)

University of Bath

September 2008 - July 2012

Certifications
  • Certification details not provided.
Publications
JOURNAL ARTICLE
Clare Horscroft, Reuben Pengelly, T.J. Sluckin, Andrew Collins (2020). zalpha: an R package for the identification of regions of the genome under selection. The Journal of Open Source Software. 5. (56). doi:10.21105/joss.02638
Abstract: Detecting evidence of selection and evolution in population genomes is crucial to understanding the history and the selective pressures experienced by a population. While there are many statistics for identifying regions of the genome under selection, there is a need for software to enable reproducible, standardised results. The statistics implemented in the zalpha R package use the relationships and correlations in genetic variation to find patterns that could be indicative of a selective sweep. The methods contained within this R package are a development of the statistics published by Jacobs et al. (2016). This package allows users to run a range of selection statistics on genetic data, which previously were not made publicly available in software. The software is designed to be flexible to allow users to efficiently combine statistics and is open source. The package also allows users to utilise a linkage disequilibrium (LD) profile, taking into account expected relationships between alleles, ultimately increasing the power of the statistics. This is important as LD varies immensely along the genome, with recombination the biggest contributor to LD fluctuations (Jeffreys et al., 2001).
Liliya Nazlamova, N. Simon Thomas, Man-Kim Cheung, Jelmer Legebeke, Jenny Lord, Reuben Pengelly, William Tapper, Gabrielle Wheway (2020). A CRISPR and high-content imaging assay compliant with ACMG/AMP guidelines for clinical variant interpretation in ciliopathies. Human Genetics. Springer Verlag. ISSN 0340-6717. doi:10.1007/s00439-020-02228-1
Abstract: Ciliopathies are a broad range of inherited developmental and degenerative diseases associated with structural or functional defects in motile or primary non-motile cilia. There are around 200 known ciliopathy disease genes and whilst genetic testing can provide an accurate diagnosis, 24-60% of ciliopathy patients who undergo genetic testing do not receive a genetic diagnosis. This is partly because following current guidelines from the American College of Medical Genetics and the Association for Molecular Pathology it is difficult to provide a confident clinical diagnosis of disease caused by missense or non-coding variants, which account for more than one third of cases of disease. Mutations in PRPF31 are the second most common cause of the degenerative retinal ciliopathy autosomal dominant retinitis pigmentosa. Here we present a high-throughput high content imaging assay providing quantitative measure of effect of missense variants in PRPF31 which meets the recently published criteria for a baseline standard in vitro test for clinical variant interpretation. This assay utilizes a new PRPF31+/- human retinal cell line generated using CRISPR gene editing to provide a stable cell line with significantly fewer cilia in which novel missense variants are expressed and characterised. We show that high content imaging of cells expressing missense variants in a ciliopathy gene on a null background can allow characterisation of variants according to the cilia phenotype. We hope that this will be a useful tool for clinical characterisation of PRPF31 variants of uncertain significance and can be extended to variant classification in other ciliopathies.
Keywords: CRISPR/Cas gene editing, Screening, cilia, ciliopathies, genetics
Reuben Pengelly, Daniel Ward, David Hunt, Christopher Mattocks, Sarah Ennis (2020). Comparison of Mendeliome exome capture kits for use in clinical diagnostics. Scientific Reports. 10. (1). Nature Publishing Group. ISSN 2045-2322. doi:10.1038/s41598-020-60215-y
Abstract: Next generation sequencing has disrupted genetic testing, allowing far more scope in the tests applied. The appropriate sections of the genome to be tested can now be readily selected, from single mutations to whole-genome sequencing. One product offering within this spectrum are focused exomes, targeting ~5,000 genes known to be implicated in human disease. These are designed to offer a flexible platform offering high diagnostic yield with a reduction in sequencing requirement compared to whole exome sequencing. Here, we have undertaken sequencing of control DNA samples and compare two kits, the Illumina TruSight One and the Agilent SureSelect Focused Exome. Characteristics of the kits are comprehensively evaluated. Despite the larger design region of the Agilent kit, we find that the Illumina kit performs better in terms of gene coverage, as well as coverage of clinically relevant loci. We provide exhaustive coverage statistics for each kit to aid the assessment of their suitability and provide read data for control DNA samples to allow for bioinformatic benchmarking by users developing pipelines for these data.
Jörg Scheffel, Niklas A. Mahnke, Zonne Hofman, Steven de Maat, Jim Wu, Hanna Bonnekoh, Reuben Pengelly, Sarah Ennis, John Holloway, Marieluise Kirchner, Philipp Mertins, Martin Church, Marcus Maurer, Coen Maas, Karoline Krause (2020). Cold-induced urticarial autoinflammatory syndrome related to factor XII activation. Nature Communications. 11. (1). Nature Publishing Group. ISSN 2041-1723. doi:10.1038/s41467-019-13984-8
Abstract: Hereditary autoinflammatory diseases are caused by gene mutations of the innate immune pathway, e.g. nucleotide receptor protein 3 (NLRP3). Here, we report a four-generation family with cold-induced urticarial rash, arthralgia, chills, headache and malaise associated with an autosomal-dominant inheritance. Genetic studies identify a substitution mutation in gene F12 (T859A, resulting in p.W268R) which encodes coagulation factor XII (FXII). Functional analysis reveals enhanced autocatalytic cleavage of the mutated protein and spontaneous FXII activation in patient plasma and in supernatant of transfected HEK293 cells expressing recombinant W268R-mutated proteins. Furthermore, we observe reduced plasma prekallikrein, cleaved high molecular weight kininogen and elevated plasma bradykinin. Neutrophils are identified as a local source of FXII. Interleukin-1β (IL-1β) is upregulated in lesional skin and mononuclear donor cells exposed to recombinant mutant proteins. Treatment with icatibant (bradykinin-B2-antagonist) or anakinra (interleukin-1-antagonist) reduces disease activity in patients. In conclusion, our findings provide a link between contact system activation and cytokine-mediated inflammation.
M.R. Jabalameli, Clare Horscroft, Norma Alejandra Vergara Lope Gracia, Reuben Pengelly, Andrew Collins (2019). Gene-dense autosomal chromosomes show evidence for increased selection. Heredity. 123. (6). p. 774-783. Nature Publishing Group. ISSN 0018-067X. doi:10.1038/s41437-019-0272-5
Abstract: Purifying selection tends to reduce nucleotide and haplotype diversity leading to increased linkage disequilibrium. However, detection of evidence for selection is difficult as the signature is confounded by wide variation in the recombination rate which has a complex relationship with selection. The effective bottleneck time (the ratio of the linkage disequilibrium map to the genetic map in Morgans) controls for variability in recombination rate. Reduced effective bottleneck times indicate stronger residual linkage disequilibrium, consistent with increased selection. Using whole genome sequence data from one European and three Sub-Saharan African human populations we find, in the African samples, strong correlations between high gene densities and reduced effective bottleneck time for autosomal chromosomes. Although previous studies have shown unusually strong linkage disequilibrium for the sex chromosomes, variation within the autosomes has not been recognised. The strongest relationship is between effective bottleneck time and the density of essential genes, which are likely targets of greater selective pressure (p = 0.006, for the 22 autosomes). The magnitude of the reduction in chromosome-specific effective bottleneck times from the least to the most gene-dense autosomes is ~17-21% for Sub-Saharan African populations. The effect size is greater in Sub-Saharan African populations, compared to a European sample, consistent with increased efficiency of selection in populations with larger effective population sizes which have not been subject to intense population bottlenecks as experienced by populations of European ancestry. The findings highlight the value of deeper analyses of selection within Sub-Saharan African populations.
Clare Horscroft, Sarah Ennis, Reuben J. Pengelly, T.J. Sluckin, Andrew Collins (2019). Sequencing-era methods for identifying signatures of selection in the genome. Briefings in Bioinformatics. 20. (6). p. 1997-2008. Oxford University Press. ISSN 1467-5463. doi:10.1093/bib/bby064
Abstract: Insights into genetic loci which are under selection and their functional roles contribute to increased understanding of the patterns of phenotypic variation we observe today. The availability of whole genome sequence data, for humans and other species, provides opportunities to investigate adaptation and evolution at unprecedented resolution. Many analytical methods have been developed to interrogate these large datasets and characterise signatures of selection in the genome. We review here recently developed methods and consider the impact of increased computing power and data availability on the detection of selection signatures. Consideration of demography, recombination and other confounding factors is important, and use of a range of methods in combination is a powerful route to resolving different forms of selection in genome sequence data. Overall, a substantial improvement in methods for application to whole genome sequencing is evident, although further work is required to develop robust and computationally efficient approaches which may increase reproducibility across studies.
Keywords: natural selection, Machine Learning, selective sweep, genome sequence, recombination
Norma Alejandra Vergara Lope Gracia, M. Reza Jabalameli, Clare Horscroft, Sarah Ennis, Andrew Collins, Reuben J. Pengelly (2019). Linkage disequilibrium maps for European and African populations constructed from whole genome sequence data. Scientific Data. 6. p. 1-4. Nature Publishing Group
Enrico Mossotto, James Ashton, Luke O'Gorman, Reuben Pengelly, R. Mark Beattie, Benjamin Macarthur, Sarah Ennis (2019). GenePy - a score for estimating gene pathogenicity in individuals using next-generation sequencing data. BMC Bioinformatics. 20. (1). p. 1-15. BioMed Central. ISSN 1471-2105. doi:10.1186/s12859-019-2877-3
Abstract:
Background: Next-generation sequencing is revolutionising diagnosis and treatment of rare diseases, however its application to understanding common disease aetiology is limited. Rare disease applications binarily attribute genetic change(s) at a single locus to a specific phenotype. In common diseases, where multiple genetic variants within and across genes contribute to disease, binary modelling cannot capture the burden of pathogenicity harboured by an individual across a given gene/pathway. We present GenePy, a novel gene-level scoring system for integration and analysis of next-generation sequencing data on a per-individual basis that transforms NGS data interpretation from variant-level to gene-level. This simple and flexible scoring system is intuitive and amenable to integration for machine learning, network and topological approaches, facilitating the investigation of complex phenotypes.
Results: Whole-exome sequencing data from 508 individuals were used to generate GenePy scores. For each variant a score is calculated incorporating: i) population allele frequency estimates; ii) individual zygosity, determined through standard variant calling pipelines and; iii) any user defined deleteriousness metric to inform on functional impact. GenePy then combines scores generated for all variants observed into a single gene score for each individual. We generated a matrix of ~14,000 GenePy scores for all individuals for each of sixteen popular deleteriousness metrics. All per-gene scores are corrected for gene length. The majority of genes generate GenePy scores < 0.01 although individuals harbouring multiple rare highly deleterious mutations can accumulate extremely high GenePy scores. In the absence of a comparator metric, we examine GenePy performance in discriminating genes known to be associated with three common, complex diseases. A Mann-Whitney U test conducted on GenePy scores for this positive control gene in cases versus controls demonstrates markedly more significant results (p = 1.37 × 10⁻⁴) compared to the most commonly applied association tool that combines common and rare variation (p = 0.003).
Conclusions: Per-gene per-individual GenePy scores are intuitive when assessing genetic variation in individual patients or comparing scores between groups. GenePy outperforms the currently accepted best practice tools for combining common and rare variation. GenePy scores are suitable for downstream data integration with transcriptomic and proteomic data that also report at the gene level.
Norma Alejandra Vergara Lope Gracia, Sarah Ennis, Igor Vorechovsky, Reuben J. Pengelly, Andrew Collins (2019). Heterogeneity in extent of linkage disequilibrium amongst exonic, intronic, non-coding RNA and intergenic chromosome regions. European Journal of Human Genetics. 27. (9). p. 1436-1444. Nature Publishing Group
Reuben Pengelly, Andrew Collins (2019). Linkage disequilibrium maps to guide contig ordering for genome assembly. Bioinformatics. 35. (4). p. 541-545. Oxford University Press. ISSN 1367-4803. doi:10.1093/bioinformatics/bty687
Abstract:
Motivation: Efforts to establish reference genome sequences by de novo sequence assembly have to address the difficulty of linking relatively short sequence contigs to form much larger chromosome assemblies. Efficient strategies are required to span gaps and establish contig order and relative orientation. We consider here the use of linkage disequilibrium (LD) maps of sequenced contigs and the utility of LD for ordering, orienting and positioning linked sequences. LD maps are readily constructed from population data and have at least an order of magnitude higher resolution than linkage maps providing the potential to resolve difficult areas in assemblies. We empirically evaluate a linkage disequilibrium map-based method using single nucleotide polymorphism genotype data in a ~216 kilobase region of human 6p21.3 from which three shorter contigs are formed.
Results: LD map length is most informative about the correct order and orientation and is suggested by the shortest LD map where the residual error variance is close to one. For regions in strong LD this method may be less informative for correcting inverted contigs than for identifying correct contig orders. For positioning two contigs in linkage disequilibrium with each other the inter-contig distances may be roughly estimated by this method.
Availability: The LDMAP program is written in C for a linux platform and is available at https://www.soton.ac.uk/genomicinformatics/research/ld.page
Sunayna Best, Amelia Shoemark, Bruna Rubbo, Mitali P. Patel, Mahmoud R. Fassad, Mellisa Dixon, Andrew V. Rogers, Robert A. Hirst, Andrew Rutman, Sarah Ollosson, Claire L. Jackson, Patricia Goggin, Simon Thomas, Reuben Pengelly, Thomas Cullup, Eleni Pissaridou, Jane Hayward, Alexandros Onoufriadis, Christopher O'Callaghan, Michael R. Loebinger, Robert Wilson, Eddie MK Chung, Priti Kenia, Victoria L. Doughty, Julena S. Carvalho, Jane S. Lucas, Hannah M. Mitchison, Claire Hogg (2019). Risk factors for situs defects and congenital heart disease in primary ciliary dyskinesia. Thorax. 74. (2). p. 203-205. BMJ Publishing Group. ISSN 0040-6376. doi:10.1136/thoraxjnl-2018-212104
Abstract: Primary ciliary dyskinesia (PCD) is associated with abnormal organ positioning (situs) and congenital heart disease (CHD). This study investigated genotype–phenotype associations in PCD to facilitate risk predictions for cardiac and laterality defects. This retrospective cohort study of 389 UK patients with PCD found 51% had abnormal situs and 25% had CHD and/or laterality defects other than situs inversus totalis. Patients with biallelic mutations in a subset of nine PCD genes had normal situs. Patients with consanguineous parents had higher odds of situs abnormalities than patients with non-consanguineous parents. Patients with abnormal situs had higher odds of CHD and/or laterality defects.
Reuben J. Pengelly, Alejandra Vergara Lope, Dareen Alyousfi, M. Reza Jabalameli, Andrew Collins (2019). Understanding the disease genome: gene essentiality and the interplay of selection, recombination and mutation. Briefings in Bioinformatics. 20. (1). p. 267-273. Oxford University Press. ISSN 1467-5463. doi:10.1093/bib/bbx110
Abstract: Despite the identification of many genetic variants contributing to human disease (the 'disease genome'), establishing reliable molecular diagnoses remains challenging in many cases. The ability to sequence the genomes of patients has been transformative but difficulty in interpretation of voluminous genetic variation often confounds recognition of underlying causal variants. There are numerous predictors of pathogenicity for individual DNA variants but their utility is reduced because many plausibly pathogenic variants are probably neutral. The rapidly increasing quantity and quality of information on the properties of genes suggests that gene-specific information might be useful for prediction of causal variation when used alongside variant-specific predictors of pathogenicity. The key to understanding the role of genes in disease relates in part to gene essentiality which has recently been approximated, for example, by quantifying the degree of intolerance of individual genes to loss-of-function variation. Increasing understanding of the interplay between genetic recombination, selection and mutation and their relationship to gene essentiality suggests that gene-specific information may be useful for the interpretation of sequenced genomes. Considered alongside additional distinctive properties of the disease genome, such as the timing of the evolutionary emergence of genes and the roles of their products in protein networks, the case for using gene-specific measures to guide filtering of sequenced genomes seems strong.
Keywords: Disease genome, Gene essentiality, gene-specific filtering, next generation sequencing
Georgios Tsokolas, Hussein Almuhtaseb, Helen Griffiths, Fatima Shawkat, Reuben Pengelly, Sarah Ennis, Andrew Lotery (2018). Long term follow-up of a family with GUCY2D dominant cone dystrophy. International Journal of Ophthalmology. 11. (12). p. 1945-1950. Press of International Journal of Ophthalmology. ISSN 2222-3959. doi:10.18240/ijo.2018.12.12
Abstract: The aim of this manuscript is to describe long term follow-up in a family with GUCY2D dominant cone dystrophy. OCT scans (Triton/OCT-2000 Topcon Ltd, Tokyo, Japan) and Fundus Autofluorescence (FAF) images (Spectralis Heidelberg Engineering, Heidelberg, Germany) were obtained. Goldmann Visual Field (GVF) testing was utilised to monitor the progression of central field loss. Flash and pattern electroretinograms (ERG) and occipital pattern reversal visual evoked potentials (VEP) were recorded in accordance with International Society for Clinical Electrophysiology of Vision (ISCEV) standards. Two members of the same family (father and son) were identified to have the heterozygous R838C mutation in the GUCY2D gene. The father presented at the age of 45 with bilateral bull's eye maculopathy and temporal disc pallor. Over 13 years of serial follow up visits, the bull's eye maculopathy progressed gradually into macular atrophy. The cone ERGs and VEPs were significantly degraded suggesting poor macular function. Spectral Domain Optical Coherence Tomography (SD-OCT) scans showed progressive loss and disruption of the ellipsoid layer at the foveal level. Autofluorescence showed a central annular area of hypo-autofluorescence corresponding to macular atrophy and retinal pigment epithelial (RPE) loss with a surrounding ring of hyper-autofluorescence indicating the transitional zone between the abnormal-normal tissue. GVFs showed enlargement of a central scotoma. His son presented at the age of 16 with bilateral granular RPE changes in both maculae. Electrophysiological testing was initially borderline normal but has gradually deteriorated to show reduced cone ERGs and macula function. SD-OCT demonstrated gradual macular thinning and atrophy bilaterally. Unlike his father, there was no disruption of the ellipsoid layer. GVFs showed progression of central visual field loss. In conclusion, both family members with cone dystrophy exhibited gradual changes in their fundi, electrophysiological testing and multimodal imaging. Changes were milder than those observed in other mutations of the same gene.
Christine Gast, Anthony Marinaki, Monica Arenas-Hernandez, Sara Campbell, Eleanor G. Seaby, Reuben Pengelly, Daniel P. Gale, Thomas M. Connor, David J. Bunyan, Katerina Hodanova, Martina Zivna, Sarah Ennis, G. Venkat-Raman (2018). Autosomal dominant tubulointerstitial kidney disease-UMOD is the most frequent non polycystic genetic kidney disease. BMC Nephrology. 19. (301). p. 1-11. BioMed Central. ISSN 1471-2369. doi:10.1186/s12882-018-1107-y
Abstract:
Background: Autosomal dominant tubulointerstitial kidney disease (ADTKD) caused by mutations in the UMOD gene (ADTKD-UMOD) is considered rare and often remains unrecognised. We aimed to establish the prevalence of genetic kidney diseases, ADTKD and ADTKD-UMOD in adult chronic kidney disease (CKD) patients, and to investigate characteristic features.
Methods: We sent questionnaires on family history to all patients with CKD stages 3–5 in our tertiary renal centre to identify patients with inherited renal disease. Details on clinical and family history were obtained from patient interviews and clinical records. Sanger sequencing of the UMOD gene was performed from blood or saliva samples.
Results: 2027 of 3770 sent questionnaires were returned. 459 patients reported a family history, which was consistent with inherited kidney disease in 217 patients. 182 non-responders with inherited kidney diseases were identified through a database search. Of these 399 individuals, 252 had autosomal dominant polycystic kidney disease (ADPKD), 28 had ADTKD, 25 had Alports, and 44 were unknown, resulting in 11% of CKD 3–5 patients and 19% of end-stage renal disease patients with genetic kidney diseases. Of the unknown, 40 were genotyped, of whom 31 had findings consistent with ADTKD. 30% of unknowns and 39% of unknowns with ADTKD had UMOD mutations. Altogether, 35 individuals from 18 families were found to have ten distinct UMOD mutations (three novel), making up 1% of patients with CKD 3–5, 2% of patients with end-stage renal disease, 9% of inherited kidney diseases and 56% with ADTKD. ADTKD-UMOD was the most common genetic kidney disease after ADPKD with a population prevalence of 9 per million. Less proteinuria and haematuria, but not hyperuricaemia or gout were predictive of ADTKD-UMOD. The main limitations of the study are the single-centre design and a predominantly Caucasian population.
Conclusions: The prevalence of genetic kidney diseases and ADTKD-UMOD is significantly higher than previously described. Clinical features poorly predicted ADTKD-UMOD, highlighting the need for genetic testing guided by family history alone.
Reuben Pengelly, Babatunde Rowaiye, Karen Pickard, Brendan J. Moran, Sanjeev Dayal, William Tapper, Alexander Mirnezami, Tom Cecil, Faheez Mohamed, Norman Carr, Sarah Ennis (2018). Analysis of mutation and loss of heterozygosity by whole-exome sequencing yields insights into pseudomyxoma peritonei. Journal of Molecular Diagnostics. 20. (5). p. 635-642. Association of Molecular Pathology. ISSN 1525-1578. doi:10.1016/j.jmoldx.2018.05.002
Abstract: Pseudomyxoma peritonei is a clinical syndrome characterized by gross mucinous ascites originating from a disseminated intraperitoneal neoplasm. Although typically confined to the abdomen, mortality is high if untreated. Biomarkers, including genetic mutation profiles, may aid treatment selection and decision making. In order to investigate this, we have applied whole-exome sequencing to five patients diagnosed with low grade appendiceal mucinous neoplasms, utilising paired tumour and germline samples. Multiple bioinformatic approaches were applied to these data to assess both somatic mutation profiles and loss of heterozygosity events. Mutation profiles of the tumours were consistent with deamination of methylcytosine being the prevailing mechanism. We identify pathogenic mutations in both KRAS and GNAS in all samples, as well as further mutations in genes implicated in PMP, namely FGFR2, APC, SMAD2 and FAT4. No TP53 somatic mutations were identified, matching expectations for low-grade tumours. Four of five samples exhibited clonal loss of heterozygosity; these regions were further examined and found to contain genes harbouring pathogenic somatic mutations in some samples. RNF43 was hereby implicated in the pathogenesis of PMP of appendiceal origin, having previously been demonstrated to increase sensitivity to Wnt signalling and have involvement in similar mucinous tumours. In conclusion, we have investigated the mutation profile of pseudomyxoma peritonei of appendiceal origin, and provided the first report of RNF43 involvement in its progression.
Amel Albibas, Matthew Rose-Zerilli, Chester Lai, Reuben J Pengelly, Gabrielle Lockett, Jeffery Theaker, Sarah Ennis, John Holloway, Eugene Healy (2018). Subclonal evolution of cancer-related gene mutations in p53 immunopositive patches in human skin. Journal of Investigative Dermatology. 138. (1). p. 189-198. Nature Publishing Group. ISSN 0022-202X. doi:10.1016/j.jid.2017.07.844
Short title: Gene Mutations in p53 Immunopositive Patches
Abstract: Normal sun-exposed skin contains numerous epidermal patches that stain positive for p53 protein (p53 immunopositive patches; PIPs), which are considered potential early precursors of skin cancer. Whilst the TP53 gene is mutated in many PIPs, it is unclear whether PIPs contain any other cancer-related mutations. Here we report that PIPs, predominantly <3,000 p53 immunopositive cells in size, within normal chronically exposed skin contain mutations in multiple genes which are mutated in cutaneous squamous cell cancers. These mutations in the PIPs were not detected within the non-PIP epidermis of corresponding normal chronically exposed skin. Although some of these genetic alterations are clonal in the PIPs, many of the mutations are subclonal within these lesions. Similar mutations are seen in later precancers (actinic keratoses and Bowen's disease). Our results demonstrate that PIPs in chronically exposed skin contain multiple mutations in cancer-related genes. In addition, the results indicate that the clonal evolution of mutations which are seen within later precancerous lesions and in established malignancy can also occur in PIPs within normal human skin.
Keywords: p53 immunopositive patches, p53 mutant patches, p53 mutant clones, actinic keratosis, Bowen's disease
Clinical efficacy of a next-generation sequencing gene panel for primary immunodeficiency diagnostics @article{d50bbcb7d4e34623b822ea579106f7a2, title = "Clinical efficacy of a next-generation sequencing gene panel for primary immunodeficiency diagnostics", abstract = "Primary immunodeficiencies (PIDs) are rare monogenic inborn errors of immunity that result in impairment of functions of the human immune system. PIDs have a broad phenotype with increased morbidity and mortality and treatment choices are often complex. With increased accessibility of next-generation sequencing the rate of discovery of genetic causes for PID has increased exponentially. Identification of an underlying monogenic diagnosis provides important clinical benefits for patients with the potential to alter treatments, facilitate genetic counselling, and pre-implantation diagnostics. We investigated a next-generation sequencing PID panel of 242 genes within clinical care across a range of PID phenotypes. We also evaluated Phenomizer to predict causal genes from human phenotype ontology (HPO) terms. 27 participants were recruited and a total of 15 reportable variants were identified in 48% (13/27) of the participants. The panel results had implications for treatment in 37% (10/27) of participants. Phenomizer identified the genes harbouring variants from HPO terms in 33% (9/27) of participants. This study demonstrates the clinical efficacy that genetic testing has in the care of PID. 
However, it also highlights some of the disadvantages of gene panels in the rapidly moving field of PID genomics and current challenges in HPO term assignment for PID.", author = "William Rae and Daniel Ward and Christopher Mattocks and Pengelly, {Reuben J.} and Efrem Eren and Patel, {Sanjay V.} and Faust, {Saul N.} and David Hunt and Williams, {Anthony P.}", year = "2017", month = oct, day = "27", doi = "10.1111/cge.13163", language = "English", journal = "Clinical Genetics", issn = "0009-9163", publisher = "Wiley-Blackwell", } . Clinical Genetics.
Evaluating phenotype-driven approaches for genetic diagnoses from exomes in a clinical setting @article{ab616f768ef24cf9885d67809ec905c8, title = "Evaluating phenotype-driven approaches for genetic diagnoses from exomes in a clinical setting", abstract = "Next generation sequencing is transforming clinical medicine and genome research, providing a powerful route to establishing molecular diagnoses for genetic conditions; however, challenges remain given the volume and complexity of genetic variation. A number of methods integrate patient phenotype and genotypic data to prioritise variants as potentially causal. Some methods have a clinical focus while others are more research-oriented. With clinical applications in mind we compare results from alternative methods using 21 exomes for which the disease causal variant has been previously established through traditional clinical evaluation. In this case series we find that the PhenIX program is the most effective ranking the true causal variant at between 1 and 10 in 85% of these cases. This is a significantly higher proportion than the combined results from five alternative methods tested (P=0.003). The next best method is Exomiser (hiPHIVE), in which the causal variant is ranked 1-10 in 25% of cases. The widely different targets of these methods (more clinical focus, considering known Mendelian genes, in PhenIX, versus gene discovery in Exomiser) is perhaps not fully appreciated but may impact strongly on their utility for molecular diagnosis using clinical exome data.", author = "Pengelly, {Reuben J.} and Thahmina Alom and Zijian Zhang and David Hunt and Sarah Ennis and Andrew Collins", year = "2017", month = oct, day = "18", doi = "10.1038/s41598-017-13841-y", language = "English", volume = "7", journal = "Scientific Reports", issn = "2045-2322", publisher = "Nature Publishing Group", } . Scientific Reports.
Identification of a functionally significant tri-allelic genotype in the Tyrosinase gene (TYR) causing hypomorphic oculocutaneous albinism (OCA1B) @article{c99b19e1306b4d4aa3ebc32935a4f3e6, title = "Identification of a functionally significant tri-allelic genotype in the Tyrosinase gene (TYR) causing hypomorphic oculocutaneous albinism (OCA1B)", abstract = "Oculocutaneous albinism (OCA) and ocular albinism (OA) are inherited disorders of melanin biosynthesis, resulting in loss of pigment and severe visual deficits. OCA encompasses a range of subtypes with overlapping, often hypomorphic phenotypes. OCA1 is the most common cause of albinism in European populations and is inherited through autosomal recessive mutations in the Tyrosinase (TYR) gene. However, there is a high level of reported missing heritability, where only a single heterozygous mutation is found in TYR. This is also the case for other OCA subtypes including OCA2 caused by mutations in the OCA2 gene. Here we have interrogated the genetic cause of albinism in a well phenotyped, hypomorphic albinism population by sequencing a broad gene panel and performing segregation studies on phenotyped family members. Of eighteen probands we can confidently diagnose three with OA and OCA2, and one with a PAX6 mutation. Of six probands with only a single heterozygous mutation in TYR, all were found to have the two common variants S192Y and R402Q. Our results suggest that a combination of R402Q and S192Y with a deleterious mutation in a {\textquoteleft}tri-allelic genotype{\textquoteright} can account for missing heritability in some hypomorphic OCA1 albinism phenotypes.", keywords = "Albinism, Nystagmus, Sequencing, Diagnosis, Tyrosinase", author = "Norman, {Chelsea S.} and Luke O'Gorman and Jane Gibson and Pengelly, {Reuben J.} and Diana Baralle and Ratnayaka, {J. 
Arjuna} and Helen Griffiths and Matthew Rose-Zerilli and Megan Ranger and David Bunyan and Helena Lee and Rhiannon Page and Tutte Newall and Fatima Shawkat and Christopher Mattocks and Daniel Ward and Sarah Ennis and Self, {Jay E.}", year = "2017", month = jun, day = "30", doi = "10.1038/s41598-017-04401-5", language = "English", volume = "7", pages = "1--9", journal = "Scientific Reports", issn = "2045-2322", publisher = "Nature Publishing Group", } . Scientific Reports.
Unexpected findings in a child with atypical HUS: an example of how genomics is changing the clinical diagnostic paradigm @article{3aad7ce92fc143e488ea08c40be5219f, title = "Unexpected findings in a child with atypical HUS: an example of how genomics is changing the clinical diagnostic paradigm", abstract = "CBL is a tumour suppressor gene on chromosome 11 encoding a multivalent adaptor protein with E3 ubiquitin ligase activity. Germline CBL mutations are dominant, with pathogenic de novo mutations reported that can phenotypically overlap Noonan syndrome. Some patients with CBL mutations go on to develop juvenile myelomonocytic leukaemia (JMML), an aggressive malignancy that usually necessitates bone marrow transplantation. Using whole exome sequencing methods, we identified a known mutation in CBL in a 4-year-old Caucasian boy with atypical haemolytic uraemic syndrome (aHUS), moyamoya phenomenon and dysmorphology consistent with a mild Noonan-like phenotype. Exome data revealed loss of heterozygosity across chromosome 11q consistent with JMML but in the absence of clinical leukaemia. Our finding challenges conventional clinical diagnostics since we have identified a pathogenic variant in the CBL gene previously only ascertained in children presenting with leukaemia. The increasing affordability of expansive sequencing is likely to increase the scope of clinical profiles observed for previously identified pathogenic variants and calls into question the interpretability and indications for clinical management.", author = "Seaby, {Eleanor G.} and Gilbert, {Rodney D.} and Gaia Andreoletti and Pengelly, {Reuben J} and Catherine Mercer and David Hunt and Sarah Ennis", year = "2017", month = may, day = "22", doi = "10.3389/fped.2017.00113", language = "English", volume = "5", journal = "Frontiers in Pediatrics", issn = "2296-2360", publisher = "Frontiers in Bioscience", } . Frontiers in Pediatrics.
Mutations specific to the Rac-GEF domain of TRIO causes intellectual disability and microcephaly @article{db280a872a8d4b4da71129d2942b4d62, title = "Mutations specific to the Rac-GEF domain of TRIO causes intellectual disability and microcephaly", abstract = "Background: Neurodevelopmental disorders have challenged clinical genetics for decades, with over 700 genes implicated and many whose function remains unknown. The application of whole-exome sequencing is proving pivotal in closing the genotype/phenotype gap through the discovery of new genes and variants that help to unravel the pathogenic mechanisms driving neuropathogenesis. One such discovery includes TRIO, a gene recently implicated in neurodevelopmental delay. Trio is a Dbl family guanine nucleotide exchange factor (GEF) and a major regulator of neuronal development, controlling actin cytoskeleton dynamics by activating the GTPase Rac1. Methods: Whole-exome sequencing was undertaken on a family presenting with global developmental delay, microcephaly and mild dysmorphism. Father/daughter exome analysis was performed, followed by confirmatory Sanger sequencing and segregation analysis on four individuals. Three further patients were recruited through the deciphering developmental disorders (DDD) study. Functional studies were undertaken using patient-specific Trio protein mutations. Results: We identified a frameshift deletion in TRIO that segregated autosomal dominantly. By scrutinising data from DDD, we further identified three unrelated children with a similar phenotype who harboured de novo missense mutations in TRIO. Biochemical studies demonstrated that in three out of four families, the Trio mutations led to a markedly reduced Rac1 activation. Conclusions: We describe an inherited global developmental delay phenotype associated with a frameshift deletion in TRIO.
Additionally, we identify pathogenic de novo missense mutations in TRIO associated with the same consistent phenotype, intellectual disability, microcephaly and dysmorphism with striking digital features. We further functionally validate the importance of the GEF domain in Trio protein function. Our study demonstrates how genomic technologies are yet again proving prolific in diagnosing and advancing the understanding of neurodevelopmental disorders.", author = "Reuben Pengelly and Stephanie Greville-Heygate and Susanne Schmidt and Seaby, {Eleanor G} and Jabalameli, {M Reza} and Mehta, {Sarju G} and Parker, {Michael J} and David Goudie and Christine Fagotto-Kaufmann and Catherine Mercer and Anne Debant and Sarah Ennis and Diana Baralle", year = "2016", month = nov, doi = "10.1136/jmedgenet-2016-103942", language = "English", volume = "53", pages = "735--742", journal = "Journal of Medical Genetics", issn = "0022-2593", publisher = "BMJ Publishing Group", number = "11", } . Journal of Medical Genetics.
Precision molecular diagnosis defines specific therapy in combined immunodeficiency with megaloblastic anaemia secondary to MTHFD1 deficiency @article{de677af362a34fda98400dfd9b9812fd, title = "Precision molecular diagnosis defines specific therapy in combined immunodeficiency with megaloblastic anaemia secondary to MTHFD1 deficiency", abstract = "Background: Methylenetetrahydrofolate dehydrogenase (MTHFD1) deficiency has recently been reported to cause a folate-responsive syndrome displaying a phenotype that includes megaloblastic anemia and severe combined immunodeficiency. Objective: To describe our investigative approach to the molecular diagnosis and evaluation of immune dysfunction in a family with MTHFD1 deficiency. Methods: The methods used were exome sequencing and analysis of variants in genes involved in the folate metabolic pathway in a family with 2 affected siblings. Routine laboratory and research data were analyzed to gain an in-depth understanding of innate, humoral, and cell-mediated immune function before and after folinic acid supplementation. Results: Interrogation of exome data for concordant variants between the siblings in the genes involved in folate metabolic pathway identified a heterozygous mutation in exon 3 of the MTHFD1 gene that was shared with their mother. In view of highly suggestive phenotype, we extended our bioinformatics interrogation for structural variants in the MTHFD1 gene by manual evaluation of the exome data for sequence depth coverage of all the exons. A deletion involving exon 13 that was shared with their father was identified. Routine laboratory data showed lymphopenia involving all subsets and poor response to vaccines. In vitro analysis of dendritic cell and lymphocyte function was comparable to that in healthy volunteers.
Treatment with folinic acid led to immune reconstitution, enabling discontinuation of all prophylactic therapies. Conclusions: Exome sequencing demonstrated MTHFD1 deficiency as a novel cause of a combined immunodeficiency. Folinic acid was established as precision therapy to reverse the clinical and laboratory phenotype of this primary immunodeficiency.", author = "Kesava Ramakrishnan and Reuben Pengelly and Yifang Gao and Mary Morgan and Sanjay Patel and E.Graham Davies and Sarah Ennis and Saul Faust and Tony Williams", year = "2016", month = nov, day = "1", doi = "10.1016/j.jaip.2016.07.014", language = "English", volume = "4", pages = "1160--1166.e10", journal = "Journal of Allergy and Clinical Immunology: In Practice", issn = "2213-2198", publisher = "Elsevier", number = "6", } . The Journal of Allergy and Clinical Immunology: In Practice.
Commercial chicken breeds exhibit highly divergent patterns of linkage disequilibrium @article{b112c73a831343269cd80ebe77ffa6a5, title = "Commercial chicken breeds exhibit highly divergent patterns of linkage disequilibrium", abstract = "The analysis of linkage disequilibrium (LD) underpins the development of effective genotyping technologies, trait mapping and understanding of biological mechanisms such as those driving recombination and the impact of selection. We apply the Mal{\'e}cot-Morton model of LD to create additive LD maps that describe the high-resolution LD landscape of commercial chickens. We investigated LD in chickens (Gallus gallus) at the highest resolution to date for broiler, white egg and brown egg layer commercial lines. There is minimal concordance between breeds of fine-scale LD patterns (correlation coefficient <0.21), and even between discrete broiler lines. Regions of LD breakdown, which may align with recombination hot spots, are enriched near CpG islands and transcription start sites (P<2.2 × 10^-16), consistent with recent evidence described in finches, but concordance in hot spot locations between commercial breeds is only marginally greater than random. As in other birds, functional elements in the chicken genome are associated with recombination but, unlike evidence from other bird species, the LD landscape is not stable in the populations studied. The development of optimal genotyping panels for genome-led selection programmes will depend on careful analysis of the LD structure of each line of interest.
Further study is required to fully elucidate the mechanisms underlying highly divergent LD patterns found in commercial chickens.", author = "Reuben Pengelly and Gheyas, {Almas A.} and Richard Kuo and Enrico Mossotto and Seaby, {Eleanor G.} and Burt, {David W.} and Sarah Ennis and Andrew Collins", year = "2016", month = nov, doi = "10.1038/hdy.2016.47", language = "English", volume = "117", pages = "375--382", journal = "Heredity", issn = "0018-067X", publisher = "Nature Publishing Group", number = "5", } . Heredity.
Exome sequencing explained: a practical guide to its clinical application @article{13c89d7320014ceaa2d7bc0477bf8cbe, title = "Exome sequencing explained: a practical guide to its clinical application", abstract = "Next-generation sequencing has catapulted healthcare into a revolutionary genomics era. One such technology, whole-exome sequencing, which targets the protein-coding regions of the genome, has proven success in identifying new causal mutations for diseases of previously unknown etiology. With a successful diagnostic rate approaching 25% for rare disease in recent studies, its clinical utility is becoming increasingly popular. However, the interpretation of whole-exome sequencing data requires expertise in genomic informatics and clinical medicine to ensure the accurate and safe reporting of findings back to the bedside. This is challenged by vast amounts of sequencing data harbouring approximately 25,000 variants per sequenced individual. Computational strategies and fastidious filtering frameworks are thus required to extricate candidate variants in a sea of common polymorphisms. Once prioritized, identified variants require intensive scrutiny at a biological level, and require judicious assessment alongside the clinical phenotype. In the final step, all evidence is collated and documented alongside pathogenicity guidelines to produce an exome report that returns to the clinic. This review provides a practical guide for clinicians and genomic informaticians on the clinical application of whole-exome sequencing.
We address sequencing capture and methodology, quality control parameters at different stages of sequencing analysis and propose an exome data filtering strategy that includes primary filtering (for the removal of probable benign variants) and secondary filtering for the prioritization of remaining candidates.", keywords = "whole-exome sequencing, next-generation sequencing, clinical genomics", author = "Seaby, {Eleanor G.} and Reuben Pengelly and Sarah Ennis", year = "2016", month = sep, doi = "10.1093/bfgp/elv054", language = "English", volume = "15", pages = "374--384", journal = "Briefings in Functional Genomics", issn = "2041-2649", publisher = "Oxford University Press", number = "5", } . Briefings in Functional Genomics.
Deleterious coding variants in multi-case families with non-syndromic cleft lip and/or palate phenotypes @article{861fec2131c94169894bac448d5a6a78, title = "Deleterious coding variants in multi-case families with non-syndromic cleft lip and/or palate phenotypes", abstract = "Nonsyndromic Cleft Lip and/or Palate (NSCLP) is regarded as a multifactorial condition in which clefting is an isolated phenotype, distinguished from the largely monogenic, syndromic forms which include clefts among a spectrum of phenotypes. Nonsyndromic clefting has been shown to arise through complex interactions between genetic and environmental factors. However, there is increasing evidence that the broad NSCLP classification may include a proportion of cases showing familial patterns of inheritance and contain highly penetrant deleterious variation in specific genes. Through exome sequencing of multi-case families ascertained in Bogota, Colombia, we identify 28 non-synonymous single nucleotide variants that are considered damaging by at least one predictive score. We discuss the functional impact of candidate variants identified. In one family we find a coding variant in the MSX1 gene which is predicted damaging by multiple scores. This variant is in exon 2, a highly conserved region of the gene. Previous sequencing has suggested that mutations in MSX1 may account for ~2% of NSCLP. 
Our analysis further supports evidence that a proportion of NSCLP cases arise through monogenic coding mutations, though further work is required to unravel the complex interplay of genetics and environment involved in facial clefting.", author = "Reuben Pengelly and Liliana Arias and Julio Martinez and Rosanna Upstill-Goddard and Seaby, {Eleanor G.} and Jane Gibson and Sarah Ennis and Andrew Collins and Ignacio Briceno", year = "2016", month = jul, day = "26", doi = "10.1038/srep30457", language = "English", volume = "6", pages = "1--8", journal = "Scientific Reports", issn = "2045-2322", publisher = "Nature Publishing Group", } . Scientific Reports.
Collagen (COL4A) mutations are the most frequent mutations underlying adult focal segmental glomerulosclerosis @article{47766ff78be3421f9a70e0d8f3c699f7, title = "Collagen (COL4A) mutations are the most frequent mutations underlying adult focal segmental glomerulosclerosis", abstract = "Background Multiple genes underlying focal segmental glomerulosclerosis (FSGS) and/or steroid-resistant nephrotic syndrome (SRNS) have been identified, with the recent inclusion of collagen IV mutations responsible for Alport disease (AD) or thin basement membrane nephropathy (TBMN). We aimed to investigate the distribution of gene mutations in adult patients with primary FSGS/SRNS by targeted next generation sequencing (NGS). Methods Eighty-one adults from 76 families were recruited; 24 families had a history of renal disease. A targeted NGS panel was designed and applied, covering 39 genes implicated in FSGS/SRNS including COL4A3-5. Results Confirmed pathogenic mutations were found in 10 patients (6 with family history) from 9 families (diagnostic rate 12%). Probably pathogenic mutations were identified in an additional six patients (combined diagnostic rate 20%). Definitely pathogenic mutations were identified in 22% of patients with family history and 10% without. Mutations in COL4A3-5 were present in eight patients from six families, representing 56% of definitely pathogenic mutations, and establishing a diagnosis of AD in six patients and TBMN in two patients. Collagen mutations were identified in 38% of families with familial FSGS, and 3% with sporadic FSGS, with over half the mutations occurring in COL4A5. Patients with collagen mutations were younger at presentation and more likely to have family history, haematuria and glomerular basement membrane abnormalities. Conclusions We show that collagen IV mutations, including COL4A5, frequently underlie FSGS and should be considered, particularly with a positive family history. 
Targeted NGS improves diagnostic efficiency by investigating many candidate genes in parallel. ", author = "Christine Gast and Reuben Pengelly and Matthew Lyon and Bunyan, {David J.} and Seaby, {Eleanor G.} and Nikki Graham and Gopalakrishnan Venkat-Raman and Sarah Ennis", year = "2016", month = jun, doi = "10.1093/ndt/gfv325", language = "English", volume = "31", pages = "961--970", journal = "Nephrology, Dialysis, Transplantation", issn = "0931-0509", publisher = "Oxford University Press", number = "6", } . Nephrology, Dialysis, Transplantation.
Aarskog-Scott syndrome: phenotypic and genetic heterogeneity @article{9f084a6ecf8b4791a283ca074cb1d7b5, title = "Aarskog-Scott syndrome: phenotypic and genetic heterogeneity", abstract = "Aarskog-Scott syndrome (AAS) is a rare developmental disorder which primarily affects males and has a relative prevalence of 1 in 25,000 in the general population. AAS patients usually present with developmental complications including short stature and facial, skeletal and urogenital anomalies. The spectrum of genotype-phenotype correlations in AAS is unclear and mutations of the FGD1 gene on the proximal short arm of chromosome X account for only 20% of the incidence of the disorder. Failure to identify pathogenic variants in patients referred for FGD1 screening suggests heterogeneity underlying pathophysiology of the condition. Furthermore, overlapping features of AAS with several other developmental disorders increase the complexity of diagnosis. Cytoskeletal signaling may be involved in the pathophysiology of AAS. The FGD1 protein family has a role in activation of CDC42 (Cell Division Control protein 42 homolog) which has a core function in remodeling of extracellular matrix and the transcriptional activation of many modulators of development. Therefore, mutations in components in the EGFR1 (Epidermal Growth Factor Receptor 1) signaling pathway, to which CDC42 belongs, may contribute to pathophysiology. Parallel sequencing strategies (so-called next generation sequencing or high throughput sequencing) enable simultaneous production of millions of sequencing reads that enormously facilitate cost-effective identification of cryptic mutations in heterogeneous monogenic disorders. Here we review the source of phenotypic and genetic heterogeneity in the context of AAS and discuss the applicability of next generation sequencing for identification of novel mutations underlying AAS.", author = "Jabalameli, {M. 
Reza} and Ignacio Briceno and Julio Martinez and Reuben Pengelly and Sarah Ennis and Andrew Collins", year = "2016", month = mar, day = "29", doi = "10.3934/genet.2016.1.49", language = "English", volume = "3", pages = "49--59", journal = "AIMS Genetics", issn = "2377-1143", number = "1", } . AIMS Genetics.
Resolving clinical diagnoses for syndromic cleft lip and/or palate phenotypes using whole-exome sequencing @article{d7c79105197047f7a80ee5e0a4e24931, title = "Resolving clinical diagnoses for syndromic cleft lip and/or palate phenotypes using whole-exome sequencing", abstract = "Individuals from three families ascertained in Bogota, Colombia, showing syndromic phenotypes, including cleft lip and/or palate, were exome sequenced. In each case sequencing revealed underlying causal variation confirming or establishing diagnoses. The findings include very rare and novel variants providing insights into genotype and phenotype relationships. These include the molecular diagnosis of an individual with Nager syndrome and a family exhibiting an atypical Incontinentia Pigmenti phenotype with a missense mutation in IKBKG. IKBKG mutations are typically associated with pre-term male death but this variant is associated with survival for 8–15 days. The third family exhibits unusual phenotypic features and the proband received a provisional diagnosis of Pierre Robin Sequence (PRS). Affected individuals share a novel deleterious mutation in IRF6. Mutations in IRF6 cause Van der Woude and Popliteal pterygium syndrome and contribute to nonsyndromic cleft lip phenotypes but have not previously been associated with a PRS phenotype. Exome sequencing followed by in silico screening to identify candidate causal variant(s), and functional assay in some cases, offers a powerful route to establishing molecular diagnoses. 
This approach is invaluable for conditions showing phenotypic and/or genetic heterogeneity including cleft lip and/or palate phenotypes where many underlying causal genes have not been identified.", keywords = "cleft lip and palate, exome sequencing, incontinentia pigmenti, nager syndrome, pierre robin sequence, syndromic disease", author = "Reuben Pengelly and Rosanna Upstill-Goddard and Liliana Arias and Julio Martinez and Jane Gibson and Marcin Knut and Collins, {Amanda L.} and Sarah Ennis and Andrew Collins and Ignacio Briceno", year = "2015", month = nov, doi = "10.1111/cge.12547", language = "English", volume = "88", pages = "441--449", journal = "Clinical Genetics", issn = "0009-9163", publisher = "Wiley-Blackwell", number = "5", } . Clinical Genetics.
Whole genome sequences are required to fully resolve the linkage disequilibrium structure of human populations @article{a137ea90f2d54427b9afa80ee1e1cd58, title = "Whole genome sequences are required to fully resolve the linkage disequilibrium structure of human populations", abstract = "BACKGROUND: An understanding of linkage disequilibrium (LD) structures in the human genome underpins much of medical genetics and provides a basis for disease gene mapping and investigating biological mechanisms such as recombination and selection. Whole genome sequencing (WGS) provides the opportunity to determine LD structures at maximal resolution. RESULTS: We compare LD maps constructed from WGS data with LD maps produced from the array-based HapMap dataset, for representative European and African populations. WGS provides up to 5.7-fold greater SNP density than array-based data and achieves much greater resolution of LD structure, allowing for identification of up to 2.8-fold more regions of intense recombination. The absence of ascertainment bias in variant genotyping improves the population representativeness of the WGS maps, and highlights the extent of uncaptured variation using array genotyping methodologies. The complete capture of LD patterns using WGS allows for higher genome-wide association study (GWAS) power compared to array-based GWAS, with WGS also allowing for the analysis of rare variation. The impact of marker ascertainment issues in arrays has been greatest for Sub-Saharan African populations where larger sample sizes and substantially higher marker densities are required to fully resolve the LD structure. CONCLUSIONS: WGS provides the best possible resource for LD mapping due to the maximal marker density and lack of ascertainment bias. WGS LD maps provide a rich resource for medical and population genetics studies.
The increasing availability of WGS data for large populations will allow for improved research utilising LD, such as GWAS and recombination biology studies.", author = "Reuben Pengelly and William Tapper and Jane Gibson and Marcin Knut and Rick Tearle and Andrew Collins and Sarah Ennis", year = "2015", month = sep, day = "3", doi = "10.1186/s12864-015-1854-0", language = "English", volume = "16", pages = "1--10", journal = "BMC Genomics", issn = "1471-2164", publisher = "BioMed Central", number = "1", } . BMC Genomics.
Adult-onset leukoencephalopathy with axonal spheroids and pigmented glia caused by a novel R782G mutation in CSF1R @article{cdf7ac010419491aa1209cb290cf6fc7, title = "Adult-onset leukoencephalopathy with axonal spheroids and pigmented glia caused by a novel R782G mutation in CSF1R", abstract = "We report a new family with autosomal dominant inheritance of a late onset rapidly progressive leukodystrophy in which exome sequencing has revealed a novel mutation p.R782G in the Colony-Stimulating Factor 1 Receptor gene (CSF1R). Neuropathology of two affected family members showed cerebral white matter degeneration with axonal swellings and pigmented macrophages. The few recently reported families with CSF1R mutations had been previously labelled {"}hereditary diffuse leukoencephalopathy with axonal spheroids{"} (HDLS) and {"}pigmentary orthochromatic leukodystrophy{"} (POLD), disorders which now appear to form a disease continuum. The term {"}adult-onset leukoencephalopathy with axonal spheroids and pigmented glia{"} (ALSP) has been proposed to encompass this spectrum. As CSF1R regulates microglia this mutation implies that dysregulation of microglia is the primary cause of the disease.", author = "Nicola Foulds and Reuben Pengelly and Hammans, {Simon R.} and Nicoll, {James A.R.} and Ellison, {David W.} and Adam Ditchfield and Sarah Beck and Sarah Ennis", year = "2015", month = may, day = "15", doi = "10.1038/srep10042", language = "English", volume = "5", pages = "10042", journal = "Scientific Reports", issn = "2045-2322", publisher = "Nature Publishing Group", } . Scientific Reports.
Exome sequence read depth methods for identifying copy number changes @article{956e6aa15bf44fe6b9282e4e7bf01f6d, title = "Exome sequence read depth methods for identifying copy number changes", abstract = "Copy number variants (CNVs) play important roles in a number of human diseases and in pharmacogenetics. Powerful methods exist for CNV detection in whole genome sequencing (WGS) data, but such data are costly to obtain. Many disease causal CNVs span or are found in genome coding regions (exons), which makes CNV detection using whole exome sequencing (WES) data attractive. If reliably validated against WGS-based CNVs, exome-derived CNVs have potential applications in a clinical setting. Several algorithms have been developed to exploit exome data for CNV detection and comparisons made to find the most suitable methods for particular data samples. The results are not consistent across studies. Here, we review some of the exome CNV detection methods based on depth of coverage profiles and examine their performance to identify problems contributing to discrepancies in published results. We also present a streamlined strategy that uses a single metric, the likelihood ratio, to compare exome methods, and we demonstrated its utility using the VarScan 2 and eXome Hidden Markov Model (XHMM) programs using paired normal and tumour exome data from chronic lymphocytic leukaemia patients. We use array-based somatic CNV (SCNV) calls as a reference standard to compute prevalence-independent statistics, such as sensitivity, specificity and likelihood ratio, for validation of the exome-derived SCNVs. We also account for factors known to influence the performance of exome read depth methods, such as CNV size and frequency, while comparing our findings with published results.", author = "L. Kadalayil and S. Rafiq and Rose-Zerilli, {M. J. J.} and R.J. Pengelly and H. Parker and D. Oscier and Strefford, {J. C.} and Tapper, {W. J.} and J. Gibson and S. Ennis and A. 
Collins", year = "2015", month = may, doi = "10.1093/bib/bbu027", language = "English", volume = "16", pages = "380--392", journal = "Briefings in Bioinformatics", issn = "1467-5463", publisher = "Oxford University Press", number = "3", } . Briefings in Bioinformatics.
Immuno-genomic profiling of patients with inflammatory bowel disease @article{3f3a01b7648748e892329cb2d1ebcfbf, title = "Immuno-genomic profiling of patients with inflammatory bowel disease", abstract = "Background: Over the last 2 decades, there has been an ever-expanding catalog of genetic variants implicated in inflammatory bowel disease (IBD) through genome-wide association studies and next generation sequencing. In this article, we highlight the remarkable developments in understanding the genetic and immunological basis of IBD. The main objective of the study was to perform a systematic review of published literature detailing functional/immunological studies in patients known to harbor genetic variations in the implicated genes. Methods: A panel of 71 candidate genes implicated in IBD was prioritized using 5 network connectivity in silico methods. An electronic search using MEDLINE and EMBASE from 1996 to February 2014 for each of the selected genes was conducted. Only studies describing genotyped IBD cohorts with concurrent in vivo functional studies were included. Results: Between the reviewers, a total of 35,142 potentially eligible publications were identified. Only 8 genes had publications meeting the inclusion criteria. A total of 67 studies were identified across the selected genes. The NOD2 gene had the most number with 41 studies followed by IL-10 with 11 eligible studies. A meta-analysis was not practical given the heterogeneity of the study design and the number of implicated genes with diverse immunological and physiological functions. Conclusions: There is a clear lack of functional studies in humans to assess the in vivo impact of the various genetic variants implicated. A collaborative approach merging genomics and functional studies will help to unravel the obscure mechanisms involved in IBD.", author = "Tracy Coelho and Gaia Andreoletti and Ashton, {James J.} and Reuben Pengelly and Yifang Gao and Ananth RamaKrishnan and Akshay Batra and Beattie, {Robert M.} and Williams, {Anthony P.} and Sarah Ennis", year = "2014", month = oct, doi = "10.1097/MIB.0000000000000174", language = "English", volume = "20", pages = "1813--1819", journal = "Inflammatory Bowel Diseases", issn = "1536-4844", publisher = "John Wiley and Sons Inc.", number = "10", } . Inflammatory Bowel Diseases.
Collins, A and Arias, L and Pengelly, R and Martínez, J and Briceño, I and Ennis, S (2013). The potential for next-generation sequencing to characterise the genetic variation underlying non-syndromic cleft lip and palate phenotypes. OA Genetics.
The potential for next generation sequencing to characterise the genetic variation underlying nonsyndromic cleft lip and palate phenotypes @article{3b6f23301d854e68af90ae9eebed0c49, title = "The potential for next generation sequencing to characterise the genetic variation underlying nonsyndromic cleft lip and palate phenotypes", abstract = "Introduction: Next-generation sequencing is revolutionising the study of genetic variation and its role in disease. Individual DNA samples can now be sequenced cost-effectively enabling analysis of the complete spectrum of genetic variation. This technology has the potential to contribute significantly to the understanding of non-syndromic cleft lip and/or palate. This condition occurs with relatively high frequency and only a proportion of the underlying genetic causal factors have been identified. Many of the genes implicated have been found through genome-wide association studies but further progress is limited because these approaches consider only common genetic variants and neglect rarer variations. Because many of the causal genetic variants remain unknown, the role of gene-environment and gene-gene interaction is difficult to characterise. The identification of novel, low frequency, variants will provide new insights into the biological mechanisms and pathways involved in the condition. Sequence-based analysis will also be invaluable for fine mapping causal variants in the larger regions already identified by linkage and association studies for which positive identification of causal genetic variants has proven difficult. This review considers the available evidence for the genes involved and current understanding of how genetic variation interacts with environmental factors known to influence risk. 
Only by characterising the underlying genetic factors will the effort to understand gene-environment interaction and underlying functional processes be successful. Conclusion: Success with next-generation sequencing will lead to improvements in prediction, prevention, and treatment for cleft lip and palate patients.", author = "A. Collins and L. Arias and R. Pengelly and I. Martinez and S. Ennis", year = "2013", month = sep, day = "1", language = "English", volume = "1", pages = "1--6", journal = "OA Genetics", issn = "2054-197X", number = "1", }. OA Genetics.
Fragment screening using capillary electrophoresis (CEfrag) for hit identification of heat shock protein 90 ATPase inhibitors @article{d3792495ee9b4bca8db2519020cca85e, title = "Fragment screening using capillary electrophoresis (CEfrag) for hit identification of heat shock protein 90 ATPase inhibitors", abstract = "CEfrag is a new fragment screening technology based on affinity capillary electrophoresis (ACE). Here we report on the development of a mobility shift competition assay using full-length human heat shock protein 90α (Hsp90α), radicicol as the competitor probe ligand, and successful screening of the Selcia fragment library. The CEfrag assay was able to detect weaker affinity (IC(50) >500 µM) fragments than were detected by a fluorescence polarization competition assay using FITC-labeled geldanamycin. The binding site of selected fragments was determined by co-crystallization with recombinant Hsp90α N-terminal domain and X-ray analysis. The results of this study confirm that CEfrag is a sensitive microscale technique enabling detection of fragments binding to the biological target in near-physiological solution.", keywords = "Adenosine Triphosphatases, Adenosine Triphosphate, Binding Sites, Crystallography, X-Ray, Drug Discovery, Drug Evaluation, Preclinical, Electrophoresis, Capillary, Electrophoretic Mobility Shift Assay, HSP90 Heat-Shock Proteins, Humans, Macrolides, Protein Binding, Protein Structure, Tertiary, Small Molecule Libraries, Journal Article", author = "Carol Austin and Pettit, {Simon N} and Magnolo, {Sharon K} and Jonathan Sanvoisin and Wenjie Chen and Wood, {Stephen P} and Freeman, {Lauren D} and Pengelly, {Reuben J} and Hughes, {Dallas E}", year = "2012", month = aug, doi = "10.1177/1087057112445785", language = "English", volume = "17", pages = "868--76", journal = "Journal of Biomolecular Screening", issn = "1087-0571", publisher = "SAGE Publications", number = "7", } . Journal of Biomolecular Screening.
CONFERENCE POSTER
Analysis of loss of heterozygosity by whole-exome sequencing yields insights into pseudomyxoma peritonei @conference{c4a428b3843e4ab594bf87559b67e4d8, title = "Analysis of loss of heterozygosity by whole-exome sequencing yields insights into pseudomyxoma peritonei", author = "Reuben Pengelly", year = "2019", month = oct, day = "4", language = "English", note = "International Workshop on Pseudomyxoma Peritonei ; Conference date: 04-10-2019 Through 04-10-2019", }. International Workshop on Pseudomyxoma Peritonei, Spain, 4/10/19.
Reuben J. Pengelly (2016). Linkage disequilibrium patterns for the identification of functional regions of the non-coding genome. Functional genetic variation in the non-coding genome, London, United Kingdom, 10/11/16.
Reuben J. Pengelly (2015). Primary immunodeficiency caused by a novel compound heterozygote mutation in MTHFD1. Stratified, personalised or P4 medicine: a new direction, Southampton, United Kingdom, 12/05/15.
Reuben J. Pengelly (2015). Linkage disequilibrium maps highlight differential selection pressures between three breeds of Gallus gallus. Breeding for Bacon, Beer and Biofuels, Edinburgh, United Kingdom, 16/04/15.
Reuben J. Pengelly (2014). High resolution linkage disequilibrium maps derived from whole-genome sequencing data. 1000 Genomes Project and Beyond, Cambridge, United Kingdom, 24/06/14.
OTHER
The contribution of common population variants to hypomorphic oculocutaneous albinism phenotypes: a novel tri-allelic TYR genotype @article{3aa3179ed9ae4385ac72ea81be25f960, title = "The contribution of common population variants to hypomorphic oculocutaneous albinism phenotypes: a novel tri-allelic TYR genotype", abstract = "Purpose : Oculocutaneous albinism type 1 (OCA1) is the most common cause of albinism in European populations and is inherited through autosomal recessive mutations in the Tyrosinase (TYR) gene. The high level of reported missing heritability in OCA1 can obstruct confident diagnoses. We have used next generation sequencing techniques to interrogate the genotype for a cohort of patients with hypomorphic albinism and examined the hypothesis that common population variants can contribute to the hypomorphic albinism phenotype. Methods : We have interrogated the genetic cause of albinism in a well phenotyped, hypomorphic albinism cohort of 18 patients. Hypomorphic phenotypes were diagnosed as those with at least two, but not all features of albinism. DNA was sequenced on the TruSight One {\textquoteleft}clinical exome{\textquoteright} panel to search for causal variants in all OCA genes, as well as being analysed through multiple ligation dependent probe amplification (MLPA) to determine the presence of any large deletions in the TYR or OCA2 gene. Further analysis of OCA1 genotypes was carried out through segregation studies in phenotyped family members. Probands and family members were sequenced for TYR rare variants as well as the common variants S192Y and R402Q, both of which have a population frequency greater than 20%. Results : Of eighteen probands we confidently diagnose one with ocular albinism, two with OCA type 2, and one with a PAX6 mutation. Of six probands with only a single heterozygous mutation in TYR, all were found to have the two common variants S192Y and R402Q, figure 1. 
The predicted frequency of p.[S192Y;R402Q] in cis is 1.1%, however a cis versus trans distinction cannot be made in all cases. Conclusions : Our segregation results suggest that a combination of R402Q and S192Y with a deleterious mutation in a {\textquoteleft}tri-allelic genotype{\textquoteright} can account for missing heritability in a substantial number of hypomorphic OCA1B albinism phenotypes. The work also describes subtle phenotypic features that could be missed when a patient presents with nystagmus, thus leading to improvements in the overall diagnosis of congenital nystagmus.", author = "Norman, {Chelsea Sarah} and Luke O'Gorman and Jane Gibson and Pengelly, {Reuben J.} and Diana Baralle and Ratnayaka, {J. Arjuna} and Sarah Ennis and Self, {James E.}", year = "2018", month = jul, day = "1", language = "English", volume = "59", pages = "5784", journal = "Investigative Ophthalmology & Visual Science", number = "9", }. Investigative Ophthalmology & Visual Science.
DISSERTATION THESIS
Genomic data analysis: populations, patients and pipelines @phdthesis{25216740e62e4c7386981eb7075c0a8b, title = "Genomic data analysis: populations, patients and pipelines", abstract = "Methods for the ascertainment of genotype data have become more cost efficient by orders of magnitude with the use of high-density genotyping arrays and the advent of next generation sequencing (NGS). The resulting deluge of data has required ever advancing analytical approaches in order for the maximal information to be gleaned from these extensive data. In this work, many applications of NGS to clinical research are discussed. This includes the application of targeted gene sequencing to a cohort of 83 patients with chronic kidney disease, whole-exome investigations of eight families with cleft lip/palate phenotypes, as well as five cases where analytical lessons can be learned from exome sequenced cases harbouring pathogenic variants refractory to identification. Additionally, a novel QC tool for the unambiguous tracking of samples undergoing exome sequencing is presented. Furthermore, work is presented investigating the linkage disequilibrium (LD) patterns in populations applying the Malecot-Morton model. We demonstrate that array genotyping is insufficient for the accurate determination of fine LD patterns in the human genome, with whole-genome sequencing providing more representative LD maps. Finally, we apply similar methods to Gallus gallus, generating the highest resolution maps of LD presented to date, showing that the patterns are highly discordant between commercial lines, and define features associated with recombination. Overall, we highlight the diversity of ways in which genetic data can be utilised effectively in the age of genomic `big data', and present tools which may be of benefit to other researchers utilising these technologies.", author = "Reuben Pengelly", year = "2015", month = oct, language = "English", school = "University of Southampton", }.