dbSNP (Single Nucleotide Polymorphism Database)

KaanBerkAkyüz
Feb 24, 2022


Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation among humans. Each SNP represents a change in a single nucleotide, the building block of DNA. For example, an SNP may replace the nucleotide cytosine (C) with thymine (T) at a particular position in a DNA sequence. On average, a variation occurs about once every 1,000 nucleotides. Sequence variations are located at defined positions within genomes and contribute to individual phenotypic traits, including a person’s propensity to develop complex diseases such as heart disease and cancer. They can be used for gene mapping, for describing population structure, and for functional studies.

The Single Nucleotide Polymorphism database (dbSNP) is a public archive for a large collection of genetic polymorphisms. It was established in August 1999 as a database of small-scale nucleotide variants, and its size has grown exponentially since then. dbSNP is designed to support research and submissions on a wide range of biological problems, including physical mapping, functional analysis, pharmacogenomics, association studies, and evolutionary studies. Although the name of the database refers only to single nucleotide polymorphisms, it actually includes several classes of molecular variation:
(1) SNPs, (2) short deletion and insertion polymorphisms (indels), (3) microsatellite markers or short tandem repeats (STRs), (4) multinucleotide polymorphisms (MNPs), and (5) heterozygous sequences.
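
To make the difference between some of these classes concrete, here is a minimal Python sketch that labels a variant by comparing the lengths of its reference and alternate alleles. The function name `classify_variant` and the length-based rules are illustrative assumptions, not dbSNP's own classification logic, and microsatellites and heterozygous sequences are not handled.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label a variant from its reference and alternate alleles.

    Simplified, hypothetical rules for illustration only.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "no variation"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"      # one nucleotide replaced by another, e.g. C -> T
    if len(ref) == len(alt):
        return "MNP"      # several adjacent nucleotides replaced
    return "indel"        # allele lengths differ: insertion or deletion


for ref, alt in [("C", "T"), ("AG", "TC"), ("ATG", "A"), ("A", "ACC")]:
    print(ref, alt, classify_variant(ref, alt))
# C T SNP
# AG TC MNP
# ATG A indel
# A ACC indel
```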

Since 2017, dbSNP has accepted only human data. It holds nearly 2 billion submissions representing over 675 million distinct variants. dbSNP accepts submissions from a wide variety of sources, including individual research laboratories, large-scale genome sequencing centers, other SNP databases, and private businesses. Each submitted variation receives a submitted SNP ID (“ss#”), an accession number that is a stable and unique identifier. A given variation may be submitted more than once, so dbSNP groups identical submitted SNP records into a single reference SNP record and assigns it a reference SNP ID (“rs#”; the “refSNP set”), which is also a unique and stable identifier. To submit variations to dbSNP, a submitter first needs a submitter handle, which identifies the laboratory responsible for the submission. The submitter then fills out a submission file containing the relevant information and data.
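
The relationship between ss and rs identifiers can be illustrated with a small sketch. It assumes, as a simplification, that two submissions describe the same variant when they share a chromosome, position, and set of alleles; the `Submission` type, the `cluster_submissions` function, and all IDs and coordinates below are made up for illustration and do not reflect dbSNP's actual pipeline.

```python
from collections import defaultdict
from typing import NamedTuple


class Submission(NamedTuple):
    ss_id: str           # submitted SNP ID (hypothetical values below)
    chrom: str
    pos: int
    alleles: frozenset   # observed alleles, order-independent


def cluster_submissions(submissions, first_rs=1):
    """Group submissions describing the same variant under one rs-style ID."""
    clusters = defaultdict(list)
    for sub in submissions:
        key = (sub.chrom, sub.pos, sub.alleles)
        clusters[key].append(sub.ss_id)
    # Number the clusters in order of first appearance.
    return {f"rs{first_rs + i}": ss_ids
            for i, ss_ids in enumerate(clusters.values())}


subs = [
    Submission("ss1", "1", 1000, frozenset("CT")),
    Submission("ss2", "1", 1000, frozenset("TC")),  # same variant as ss1
    Submission("ss3", "2", 500, frozenset("AG")),
]
print(cluster_submissions(subs))
# {'rs1': ['ss1', 'ss2'], 'rs2': ['ss3']}
```

Each ss ID stays attached to its own submission, while the rs ID names the cluster as a whole.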

Genome sequences are improved over time, so reference SNPs (“refSNPs”) from previous builds, as well as newly submitted SNPs, are remapped to the current genome sequence. If two refSNP cluster records are found to map to the same location (i.e. they are identical), dbSNP merges them. The smaller refSNP number (i.e. the oldest record) then represents both records, and the larger refSNP numbers are retired. Merging records reduces redundancy in dbSNP. Clinically significant refSNPs that are cited in the literature are termed “precious”; a merge that would remove such a refSNP is never performed, because it could cause confusion later on.

dbSNP can be searched using the Entrez SNP search tool. Various queries, such as gene name, allele, or experimental method, can be used for the search. Several tools are available for looking more closely at a set of refSNPs. Map view shows the variation’s position in the genome and other variations nearby. Gene view reports the location of the variation within a gene (if it lies in one), the old and new codons, and the amino acids encoded by each. Sequence viewer shows the variant’s position relative to introns, exons, and other nearby and distant variants.

According to earlier figures, of the 23.7 million refSNP entries for humans, 14.5 million had been validated and the remaining 9.2 million were classified as candidate SNPs. The dbSNP Reference SNP (rs or RefSNP) number is a locus accession for a variant type, assigned by dbSNP.
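
The merge rule for refSNPs that map to the same location can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `merge_refsnps`, its `protected` parameter (standing in for clinically significant refSNPs that are never removed), and the rs numbers and coordinates are all hypothetical.

```python
def merge_refsnps(locations, protected=frozenset()):
    """Return {retired rs number: surviving rs number}.

    locations: dict mapping rs number (int) -> (chromosome, position)
               after remapping to the current genome sequence.
    protected: rs numbers that must never be removed by a merge.
    """
    by_location = {}
    for rs_number, location in locations.items():
        by_location.setdefault(location, []).append(rs_number)

    retired = {}
    for rs_numbers in by_location.values():
        survivor = min(rs_numbers)   # smallest (oldest) number is kept
        for rs_number in rs_numbers:
            if rs_number != survivor and rs_number not in protected:
                retired[rs_number] = survivor
    return retired


locations = {10: ("1", 1000), 25: ("1", 1000), 40: ("1", 1000), 7: ("2", 500)}
print(merge_refsnps(locations, protected={40}))
# {25: 10}
```

Here rs25 is folded into rs10, while rs40 is left alone because it is marked as protected, and rs7 is untouched because nothing else maps to its location.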

For more detailed information about dbSNP, and to try it yourself, you can visit the official site: https://www.ncbi.nlm.nih.gov/snp/
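
Besides the web interface, Entrez databases can also be queried programmatically through NCBI's E-utilities. The sketch below builds an `esearch` request against the `snp` database and, optionally, fetches the matching IDs. The helper names `build_snp_search_url` and `search_snp` are my own, the search term is just an example, and the response handling assumes the usual `esearchresult`/`idlist` JSON layout; check the E-utilities documentation and usage guidelines before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_search_url(term: str, retmax: int = 20) -> str:
    """Build an E-utilities esearch URL for the Entrez SNP database."""
    params = {"db": "snp", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


def search_snp(term: str, retmax: int = 20) -> list:
    """Run the search and return the list of matching IDs (needs network)."""
    with urlopen(build_snp_search_url(term, retmax)) as response:
        data = json.load(response)
    return data["esearchresult"]["idlist"]


print(build_snp_search_url("BRCA1", retmax=5))
# To actually run the query (requires internet access):
# ids = search_snp("BRCA1", retmax=5)
```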

References

https://www.ncbi.nlm.nih.gov/snp/docs/RefSNP_about/ (accessed 17/02/2022)

https://ftp.ncbi.nlm.nih.gov/pub/factsheets/Factsheet_SNP.pdf (accessed 20/02/2022)

Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001;29(1):308–311. doi:10.1093/nar/29.1.308

KaanBerkAkyüz

I hold a bachelor's degree in Molecular Biology and Genetics. I am a master's student in Health Informatics at Gazi University.