Preprint
en
2020

Functional pangenome analysis provides insights into the origin, function and pathways to therapy of SARS-CoV-2 coronavirus

Authors

Intikhab Álam, Allan Kamau, Maxat Kulmanov, Stefan T. Arold, Arnab Pain, Takashi Gojobori, Carlos M. Duarte (King Abdullah University of Science and Technology)

Abstract

Background: The recent epidemic of the novel coronavirus (SARS-CoV-2) has triggered a rising global health emergency that demands urgent analysis of its genome and solutions for detection and therapy.

Methods: We used a comparative pangenomic analysis of the Betacoronavirus genomes sequenced thus far to detect the core and accessory gene clusters of this genus, including SARS-CoV-2. We then annotated the functions, which were confirmed by structural analysis, and predicted the potential location of these proteins within the host cells.

Findings: We found five accessory gene clusters common to the SARS clade, including SARS-CoV-2, that perform functions supporting their pathogenicity. Phylogenetic analysis showed one of the accessory gene clusters, the protein E, to be present across the inferred evolutionary pathway of the SARS clade, including that of the horseshoe bat virus (Hp-betacoronavirus/Zhejiang2013), inferred to be the parental member of the clade. The E protein is highly conserved in the clade: SARS and SARS-CoV-2 differ by a single amino acid substitution and a single amino acid insertion that is present in SARS but absent from SARS-CoV-2. Betacoronavirus pangenomedb is available at http://pangenomedb.cbrc.kaust.edu.sa.

Interpretation: The characterization and functional assessment of the SARS-CoV-2 envelope (E) protein in gene cluster 6, together with previous findings on this protein for SARS, lead us to recommend that detection of COVID-2019 be developed based on the SARS-CoV-2 E protein and that treatment using mutated SARS-CoV-2 lacking the E protein be explored as a promising candidate for a vaccine.

Funding: This research was funded by the King Abdullah University of Science and Technology (KAUST) through funding allocated to the Computational Bioscience Research Center (CBRC) and another KAUST award under the award number FCC/1/1976-25-01.

Research in context

Evidence before this study: The response to the recent epidemic of the novel coronavirus 2019-nCoV, now named SARS-CoV-2, and its disease, named COVID-2019, requires elucidating the possible origin, gene functions and potential treatment options. The genome sequence released for SARS-CoV-2 is a key resource for these efforts. We used all available whole-virus genomes released for Betacoronavirus to conduct a functional pangenomic analysis of these viruses, including SARS-CoV-2. SARS-CoV-2 was previously reported in publications as 2019-nCoV. NCBI produced a dedicated resource for SARS-CoV-2 related publications and sequence data (https://www.ncbi.nlm.nih.gov/genbank/SARS-CoV-2-seqs/). We searched this resource for the terms “2019-nCoV AND envelope” (1 hit), “2019-nCoV AND orf10” (0 hits) and “2019-nCoV pangenome” (0 hits) to find any published work related to these genes in SARS-CoV-2. We retrieved all Betacoronavirus (taxon id 694002) whole genome sequences from NCBI (on January 26, 2020, from the web page https://www.ncbi.nlm.nih.gov/assembly) using the search term “txid694002[Organism:exp]”. This search retrieved a total of 22 whole genomes, of which 18 were unique genomes, including 4 for SARS-CoV-2 (see supplementary table 1).

Added value of this study: We report and make available an interactive pangenome analysis resource (defining core and accessory genome components) for the genus Betacoronavirus (https://pangenomedb.cbrc.kaust.edu.sa), including SARS-CoV-2. We identify and define all core and accessory gene alignments and phylogenetic trees for this genus. We then explore these data to provide insights into potential functions for uncharacterized genes (ORFs 6, 8 and 10) unique to SARS-like coronaviruses, and suggest potential options for detection and therapy of COVID-2019. This analysis points to protein E, highly conserved across all SARS-like coronaviruses, as a promising candidate for COVID-2019 detection and therapy.
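
The genome retrieval described under "Evidence before this study" was done through the NCBI Assembly web page. As a rough illustration only, the same search term could be sent programmatically to NCBI's public E-utilities ESearch endpoint. The endpoint, the function name `build_assembly_query`, and the `retmax` and `retmode` values are assumptions of this sketch and are not taken from the study; only the search term is quoted from the text.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (an assumption of this sketch;
# the study itself reports using the Assembly web page).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Search term quoted in the text: all Betacoronavirus entries (taxon id 694002).
BETACORONAVIRUS_TERM = "txid694002[Organism:exp]"


def build_assembly_query(term: str, retmax: int = 100) -> str:
    """Return an ESearch URL that queries the NCBI Assembly database.

    Hypothetical helper: it only builds the URL and performs no network I/O.
    """
    params = {
        "db": "assembly",   # database searched in the study
        "term": term,       # Entrez query string
        "retmax": retmax,   # maximum number of IDs to return (illustrative)
        "retmode": "json",  # ask for a JSON response (illustrative)
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


query_url = build_assembly_query(BETACORONAVIRUS_TERM)
print(query_url)
```

Fetching `query_url` today would not be expected to reproduce the 22 genomes reported for January 26, 2020, since the database has grown considerably since then.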

Implications of all the available evidence: The Betacoronavirus pangenome data we made public allow exploration of core and accessory elements of these viruses to understand the pathogenicity of SARS-CoV-2, thereby expediting progress in responding to the pandemic. The characterization and functional assessment of the SARS-CoV-2 envelope (E) protein in gene cluster 6, together with previous findings on this protein for SARS, lead us to recommend that detection of COVID-2019 be developed based on the SARS-CoV-2 E protein and that treatment using mutated SARS-CoV-2 lacking the E protein be explored as a promising candidate for a vaccine.
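
The abstract distinguishes core from accessory gene clusters without stating the criteria or software used. A minimal sketch of the usual convention follows, assuming a cluster is "core" when it occurs in every genome and "accessory" when it occurs in only some. The function name `split_core_accessory` and all genome and cluster names are invented for illustration and are not data from the study.

```python
from typing import Dict, Set, Tuple


def split_core_accessory(
    clusters: Dict[str, Set[str]], genomes: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Split gene clusters into core and accessory sets.

    clusters maps a cluster name to the set of genomes containing it.
    Assumed convention: core = present in all genomes,
    accessory = present in at least one genome but not all.
    """
    core = {name for name, members in clusters.items() if genomes <= members}
    accessory = {
        name
        for name, members in clusters.items()
        if members and not genomes <= members
    }
    return core, accessory


# Toy presence/absence data, invented for illustration only.
genomes = {"genome_A", "genome_B", "genome_C"}
clusters = {
    "cluster_1": {"genome_A", "genome_B", "genome_C"},  # in every genome
    "cluster_2": {"genome_A", "genome_B"},              # in a subset
    "cluster_3": {"genome_C"},                          # in a single genome
}

core, accessory = split_core_accessory(clusters, genomes)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

Real pangenome pipelines often relax the "every genome" rule to a high-percentage threshold, so the strict subset test here should be read as one possible choice.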

How to cite this publication

Intikhab Álam, Allan Kamau, Maxat Kulmanov, Stefan T. Arold, Arnab Pain, Takashi Gojobori, Carlos M. Duarte (2020). Functional pangenome analysis provides insights into the origin, function and pathways to therapy of SARS-CoV-2 coronavirus.

Publication Details

Type: Preprint
Year: 2020
Authors: 7
Datasets: 0
Total Files: 0
Language: en
