Abstract

Background: The recent epidemic of the novel coronavirus (SARS-CoV-2) has triggered a rising global health emergency that demands urgent analysis of its genome and solutions for detection and therapy.

Methods: We used a comparative pangenomic analysis of the Betacoronavirus genomes sequenced thus far to detect the core and accessory gene clusters of this genus, including SARS-CoV-2. We then annotated the functions of the encoded proteins, confirmed these annotations by structural analysis, and predicted the potential location of these proteins within host cells.

Findings: We found five accessory gene clusters common to the SARS clade, including SARS-CoV-2, that perform functions supporting their pathogenicity. Phylogenetic analysis showed one of the accessory gene clusters, the protein E, to be present across the inferred evolutionary pathway of the SARS clade, including that of the horseshoe bat virus (Hp-betacoronavirus/Zhejiang2013), inferred to be the parental member of the clade. The E protein is highly conserved in the clade: SARS and SARS-CoV-2 differ by a single amino acid substitution and a single amino acid insertion that is present in SARS but absent from SARS-CoV-2. The Betacoronavirus pangenomedb is available at http://pangenomedb.cbrc.kaust.edu.sa.

Interpretation: The characterization and functional assessment of the SARS-CoV-2 envelope (E) protein in gene cluster 6, together with previous findings on this protein for SARS, lead us to recommend that detection of COVID-2019 be developed based on the SARS-CoV-2 E protein and that treatment using mutated SARS-CoV-2 lacking the E protein be explored as a promising candidate for a vaccine.

Funding: This research was funded by the King Abdullah University of Science and Technology (KAUST) through funding allocated to the Computational Bioscience Research Center (CBRC) and another KAUST award under the award number FCC/1/1976-25-01.
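The core/accessory distinction used in the Methods can be illustrated with a minimal sketch. It assumes gene clusters have already been computed and are given as a mapping from cluster name to the set of genomes in which the cluster occurs. The function name, the toy genome labels and the cluster names are all hypothetical and are not taken from the authors' pipeline or data.

```python
def classify_clusters(cluster_members, all_genomes):
    """Split gene clusters into core and accessory sets.

    A cluster is treated as 'core' when it occurs in every genome and as
    'accessory' when it occurs in at least one genome but not in all.
    This is a common working definition, assumed here for illustration.
    """
    genomes = set(all_genomes)
    core, accessory = [], []
    for cluster_id, members in sorted(cluster_members.items()):
        present = set(members) & genomes
        if present == genomes:
            core.append(cluster_id)
        elif present:
            accessory.append(cluster_id)
    return core, accessory


# Toy input (invented for illustration, not real Betacoronavirus data).
genomes = ["genome_A", "genome_B", "genome_C"]
clusters = {
    "cluster_1": {"genome_A", "genome_B", "genome_C"},  # in all genomes
    "cluster_2": {"genome_A", "genome_B"},              # in a subset
    "cluster_3": {"genome_C"},                          # in one genome
}

core, accessory = classify_clusters(clusters, genomes)
print("core:", core)
print("accessory:", accessory)
```

In this toy input only `cluster_1` is core; a cluster shared by just one clade, as the abstract describes for the SARS clade, would fall into the accessory list.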
Research in context

Evidence before this study: The response to the recent epidemic of the novel coronavirus 2019-nCoV, now named SARS-CoV-2, and its disease, named COVID-2019, requires elucidating the possible origin, gene functions and potential treatment options. The genome sequence released for SARS-CoV-2 is a key resource for these efforts. We used all available whole-virus genomes released for the Betacoronavirus to conduct a functional pangenomic analysis of these viruses, including that of SARS-CoV-2. SARS-CoV-2 was previously reported in publications as 2019-nCoV. NCBI produced a dedicated resource for SARS-CoV-2 related publications and sequence data (https://www.ncbi.nlm.nih.gov/genbank/SARS-CoV-2-seqs/). We searched this resource for the terms “2019-nCoV AND envelope” (1 hit), “2019-nCoV AND orf10” (0 hits) and “2019-nCoV pangenome” (0 hits) to find any published work related to these genes in SARS-CoV-2. We retrieved all Betacoronavirus (taxon id 694002) whole genome sequences from NCBI (on January 26, 2020, from the web page https://www.ncbi.nlm.nih.gov/assembly) using the search term “txid694002[Organism:exp]”. This search retrieved a total of 22 whole genomes, of which 18 were unique, including 4 for SARS-CoV-2 (see supplementary table 1).

Added value of this study: We report and make available an interactive pangenome analysis resource (defining core and accessory genome components) for the genus Betacoronavirus (https://pangenomedb.cbrc.kaust.edu.sa), including SARS-CoV-2. We identify and define all core and accessory gene alignments and phylogenetic trees for this genus. We then explore these data to provide insights into potential functions for uncharacterized genes (ORFs 6, 8 and 10) unique to SARS-like coronaviruses, and suggest potential options for detection and therapy of COVID-2019. This analysis points to protein E, highly conserved across all SARS-like coronaviruses, as a promising candidate for COVID-2019 detection and therapy.
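The genome retrieval described under "Evidence before this study" was done through the NCBI Assembly web page. As a sketch of how the same search term could be issued programmatically, the snippet below builds a request URL for the NCBI E-utilities `esearch` endpoint. Using E-utilities is an assumption made here for illustration and is not the route the authors report; the helper name and the `retmax` default are hypothetical. The snippet only constructs the URL and performs no network request, and a query run today would not necessarily return the 22 genomes found on January 26, 2020.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_assembly_query(term, retmax=100):
    """Return an esearch URL that queries the NCBI Assembly database.

    Hypothetical helper: it only assembles the query string and does not
    contact NCBI.
    """
    params = {
        "db": "assembly",    # database searched in the source text
        "term": term,        # search term, URL-encoded by urlencode
        "retmax": retmax,    # maximum number of record IDs to return
        "retmode": "json",   # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Search term quoted in the text (Betacoronavirus, taxon id 694002).
url = build_assembly_query("txid694002[Organism:exp]")
print(url)
```

The returned URL can be fetched with any HTTP client. NCBI asks automated clients to respect its usage guidelines, so a real script would typically add rate limiting and an API key.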
Implications of all the available evidence: The Betacoronavirus pangenome data we made public allow exploration of the core and accessory elements of these viruses to understand the pathogenicity of SARS-CoV-2, thereby expediting progress in responding to the pandemic. The characterization and functional assessment of the SARS-CoV-2 envelope (E) protein in gene cluster 6, together with previous findings on this protein for SARS, lead us to recommend that detection of COVID-2019 be developed based on the SARS-CoV-2 E protein and that treatment using mutated SARS-CoV-2 lacking the E protein be explored as a promising candidate for a vaccine.
Intikhab Álam, Allan Kamau, Maxat Kulmanov, Stefan T. Arold, Arnab Pain, Takashi Gojobori, Carlos M. Duarte (2020). Functional pangenome analysis provides insights into the origin, function and pathways to therapy of SARS-CoV-2 coronavirus.
Type: Preprint
Year: 2020
Authors: 7
Language: en