Help Docs


SNPs and Genes


Opus23 Pro uses SNPs and genes from 23andMe raw data to display information about your client. Some technical information about genes and SNPs is provided in the user interface to give more information about mutations and their effects. This page offers some basic explanations about the relevant terms.

Genes
Genes are sections of DNA that provide instructions for making proteins or RNA. Protein-coding genes hold their instructions in the form of codons, sequences of three nucleotides selected from adenine, cytosine, guanine and thymine (A, C, G, T). RNA polymerase transcribes the gene into messenger RNA (mRNA), which carries the information from the nucleus to the ribosomes. There the mRNA is read in frames of three nucleotides, and each codon specifies a particular amino acid, which is delivered by a transfer RNA (tRNA) and added to the growing peptide chain. The resulting polypeptides can later fold into proteins.
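As a rough illustration of the first steps of this process, the Python sketch below transcribes a short coding-strand sequence into mRNA and splits it into codons. The function names and the example sequence are hypothetical, chosen only for illustration.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding (sense) strand written 5'->3'.

    The mRNA has the same sequence as the coding strand, with uracil (U)
    in place of thymine (T).
    """
    return coding_strand.upper().replace("T", "U")


def split_codons(mrna: str) -> list:
    """Split a sequence into consecutive 3-nucleotide codons.

    Trailing nucleotides that do not fill a complete codon are dropped.
    """
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical example: start codon, tryptophan, proline, stop.
mrna = transcribe("ATGTGGCCCTAA")
print(mrna)                # AUGUGGCCCUAA
print(split_codons(mrna))  # ['AUG', 'UGG', 'CCC', 'UAA']
```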

Sections between coding genes were once called 'junk DNA', as they had no known function and were thought to be the residue of viral and bacterial DNA that had been inserted into the genome during early evolution and had become part of the human genome. These areas are now called "intergenic", comprising about 75% of the genome, and some parts are now known to be transcribed into non-coding RNA molecules (e.g. transfer RNA, ribosomal RNA, and regulatory RNAs). Intergenic DNA can also contain pseudogenes, and can provide transcriptional and translational regulation of protein-coding sequences, scaffold attachment regions, origins of DNA replication, centromeres and telomeres. Certain mutations in intergenic regions can therefore have significant consequences for metabolic function.

The 5′ (5 prime) and 3′ (3 prime) ends of a nucleic acid strand, which carry a phosphate group and a hydroxyl group respectively, define the direction in which a gene is transcribed and translated. Nucleic acids can only be synthesized in the 5′-to-3′ direction in living cells. Because the two strands of DNA run in opposite directions, a gene can lie on either strand, so different genes may be transcribed in opposite directions along the same chromosome.
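Because the two strands of the double helix run in opposite directions, the same stretch of DNA reads differently depending on which strand is written 5′ to 3′. A minimal Python sketch of that conversion follows; the function name and the example sequence are hypothetical.

```python
# Map each base to the base it pairs with on the opposite strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written in its own 5'->3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGTGG"))  # CCACAT
```

Applying the function twice returns the original sequence, which is a quick sanity check when comparing data reported on different strands.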

A promoter is a region of DNA that initiates transcription of a particular gene. Promoter regions are located upstream of the transcribed sequence (towards the 5′ end of the coding strand). Promoters, and the enhancer sequences that work with them, contain binding sites for transcription factors, which are proteins that bind to specific DNA sequences (DNA binding motifs), thereby controlling the rate of transcription of genetic information from DNA to mRNA.

The 5′-untranslated region (5′-UTR) is a region of a gene that is transcribed into mRNA and is located at the 5′ end of the mRNA, before the start codon. This region may or may not be translated, but is usually involved in the regulation of translation. Some SNPs in the 5′-UTR can therefore affect how efficiently a gene is translated, while SNPs in the neighbouring promoter region can affect its transcription.

The 3′ untranslated region (3′-UTR) is the section of mRNA found immediately after the translation stop codon. It is transcribed from the DNA along with the rest of the mRNA, but is not translated into protein. The 3′-UTR often contains regulatory regions that influence gene expression after transcription. Longer 3′-UTRs are associated with lower levels of gene expression. Sequences within the 3′-UTR can also promote either the degradation or the stabilization of the mRNA.

3′-UTR mutations can have significant consequences because one nucleotide change can be responsible for the altered expression of many genes. Usually a mutation in a coding region of a gene would affect only the gene in which it is located. However, since 3′-UTR binding proteins also function in the processing and export of mRNA from the nucleus, a mutation in the 3′-UTR can also affect other unrelated genes.

SNPs
SNPs (single nucleotide polymorphisms) are specific positions at which changes in the spelling of the genetic code are known to occur. A SNP is a type of point mutation: one nucleotide is substituted for a different nucleotide. Within a coding region, the resulting codon may specify a different amino acid, which may affect the function of the enzyme that is produced.

A synonymous substitution (silent mutation) changes the codon, but it still codes for the same amino acid, and so typically has no functional effect on the protein. A non-synonymous SNP causes an amino acid change, but the effect on the protein is highly variable. In general, the effect depends on where the change falls in the protein: a substitution in or near an active or binding site tends to have more influence on enzyme function than one elsewhere.

The change in the peptide chain may affect the way the protein is folded. This can result in a protein shape that fits less well with its intended target, and therefore a less efficient protein. For example, a non-synonymous SNP in the CBS gene may cause the cystathionine beta synthase enzyme to fold in a way that no longer fits its activator, SAMe (S-adenosylmethionine). An amino acid substitution can also occur in a region of the protein where it does not significantly affect the structure or function of the protein. Some non-synonymous SNPs make the protein less stable at higher temperatures, and are known as thermolabile variants.

Mutations
Each genetic location (locus) usually has two copies (alleles), one inherited from each parent, with one copy on each chromosome of a pair. As males have only one X chromosome, SNPs on the X chromosome affect the function of the sole copy of that gene. A single mutation at a locus may or may not have an effect on the protein encoded by the gene, depending on many factors. A mutation in both alleles at a locus is a homozygous mutation, and a single mutation on the X chromosome in males is a hemizygous mutation. In either case, if the mutation alters protein function, the gene may be partially or completely disabled. A mutation in only one allele on an autosome (a non-sex chromosome) is known as heterozygous, and its influence on metabolic function depends on factors such as how much of the protein is needed.
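The terms above can be sketched as a small classifier. This is only an illustration. It assumes genotype calls arrive as short strings in the style of raw genotype files: two letters for two alleles, one letter for a single-copy call, and "--" for a no-call. The function name and the `variant_allele` parameter are hypothetical; which allele counts as the mutation must be supplied from elsewhere.

```python
def zygosity(genotype: str, variant_allele: str) -> str:
    """Classify a genotype call relative to a chosen variant allele.

    genotype: one or two letters, e.g. "AG", "GG", or "G" for a single-copy
        call such as the X chromosome in males. Calls containing "-" are
        treated as no-calls.
    variant_allele: the allele regarded as the mutation. This is an
        assumption supplied by the caller, not something the genotype
        itself contains.
    """
    calls = genotype.strip().upper()
    variant = variant_allele.upper()
    if not calls or "-" in calls:
        return "no call"
    copies = calls.count(variant)
    if len(calls) == 1:
        return "hemizygous" if copies == 1 else "no variant"
    if copies == 2:
        return "homozygous"
    if copies == 1:
        return "heterozygous"
    return "no variant"


print(zygosity("AG", "G"))  # heterozygous
print(zygosity("GG", "G"))  # homozygous
print(zygosity("G", "G"))   # hemizygous
```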

The way mutations affect a codon sequence depends on the other nucleotides in the codon. Some examples of codon sequences are TGG (thymine, guanine, guanine), coding for tryptophan; and TAA (thymine, adenine, adenine), TAG (thymine, adenine, guanine) and TGA (thymine, guanine, adenine), each of which is a stop codon, marking the end of the portion that is read. A change in a single nucleotide from TGG (tryptophan) to TGA (stop) could make a significant difference to gene function by prematurely ending the reading of the protein instructions (a 'nonsense' mutation), rather than making the amino acid tryptophan. Another example is the codons CCT, CCC, CCA and CCG, which all code for proline, so any change to the nucleotide after CC would make no difference to the amino acid produced (a synonymous mutation). A mutation in the middle of the CCC codon, changing it from CCC to CAC, would make histidine instead of proline (a non-synonymous mutation). These two amino acids have different chemical properties (proline is nonpolar, while histidine can carry a positive charge), potentially affecting protein folding.

Transfer RNA (tRNA) is the physical link between the mRNA and the amino acid sequence of proteins, carrying an amino acid to the ribosome. This becomes important during translation, when different codons specify the same amino acid. Different codons for the same amino acid can be read by different tRNA molecules, so in some cases a silent or synonymous mutation can still affect how efficiently the mRNA is translated.
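The codon examples above can be checked with a short Python sketch that builds the standard genetic code for DNA codons and compares a reference codon with its mutated form. The function name and the three category labels are hypothetical, and the sketch deliberately ignores other cases (for example, a stop codon changing into an amino acid is simply reported as non-synonymous).

```python
from itertools import product

BASES = "TCAG"
# Amino acids (one-letter codes) for the 64 codons in TCAG order;
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label the change from ref_codon to alt_codon by its protein effect."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "non-synonymous"


print(classify_substitution("TGG", "TGA"))  # nonsense
print(classify_substitution("CCC", "CCA"))  # synonymous
print(classify_substitution("CCC", "CAC"))  # non-synonymous
```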



The RNA codon table is essentially identical to that for DNA codons, but with T (thymine) replaced by U (uracil).
RNA codons are read by ribosomes and translated into polypeptides, which then fold into proteins. (Ref: Wikimedia Commons)


A nonsense mutation (see above) is a SNP in a sequence of DNA that results in a premature stop codon (a nonsense codon) in the transcribed mRNA, leading to a truncated, incomplete, and usually nonfunctional protein product. It differs from a missense mutation, which is a SNP where a single nucleotide change causes a different amino acid to be incorporated.

A deletion mutation is a polymorphism in which a part of a chromosome or a sequence of DNA is lost during DNA replication. Any number of nucleotides can be deleted, from a single base to an entire piece of a chromosome. Deletions that do not occur in multiples of three nucleotides can cause a frameshift mutation by changing the 3-nucleotide (codon) protein reading frame of the genetic sequence.
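The multiple-of-three rule can be illustrated with a short Python sketch. The function names and the example sequence are hypothetical. The sketch only splits sequences into codons to show how the reading frame moves; it does not translate them.

```python
def is_frameshift(length_change: int) -> bool:
    """True if inserting or deleting this many nucleotides shifts the frame.

    Use a negative number for a deletion and a positive one for an insertion.
    """
    return length_change % 3 != 0


def codons(seq: str) -> list:
    """Split a sequence into 3-nucleotide codons, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


original = "ATGTGGCCCTAA"
one_deleted = original[:3] + original[4:]    # lose 1 base: frame shifts
three_deleted = original[:3] + original[6:]  # lose 1 codon: frame is kept

print(codons(original))       # ['ATG', 'TGG', 'CCC', 'TAA']
print(codons(one_deleted))    # ['ATG', 'GGC', 'CCT']
print(codons(three_deleted))  # ['ATG', 'CCC', 'TAA']
```

In the single-base deletion every codon after the change is different and the stop codon is lost from the frame, whereas deleting a whole codon removes one amino acid and leaves the rest intact.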

An insertion mutation is the addition of one or more nucleotide base pairs into a DNA sequence. This often happens in microsatellite regions, which are sections of DNA in which short DNA motifs (2–5 base pairs) are repeated a variable number of times. Insertions that do not occur in multiples of three nucleotides can also cause a frameshift mutation.
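To make the idea of a variable repeat count concrete, the following Python sketch counts the longest run of back-to-back copies of a motif in a sequence. The function name and the two example alleles are hypothetical.

```python
import re


def longest_repeat_run(seq: str, motif: str) -> int:
    """Return the largest number of consecutive copies of motif in seq."""
    if not motif:
        raise ValueError("motif must not be empty")
    pattern = re.compile("(?:" + re.escape(motif.upper()) + ")+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


# Two hypothetical alleles that differ only in the number of CA repeats.
allele_a = "GG" + "CA" * 4 + "TT"
allele_b = "GG" + "CA" * 6 + "TT"
print(longest_repeat_run(allele_a, "CA"))  # 4
print(longest_repeat_run(allele_b, "CA"))  # 6
```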

OPUS 23™ IS A REGISTERED TRADEMARK ® OF DATAPUNK BIOINFORMATICS, LLC. COPYRIGHT © 2015-2023. ALL RIGHTS RESERVED.