Cytosine (/ˈsaɪtəˌsiːn, -ˌziːn, -ˌsɪn/; C) is one of the four main bases found in DNA and RNA, along with adenine, guanine, and thymine (uracil in RNA). It is a pyrimidine derivative, with a heterocyclic aromatic ring and two substituents attached (an amine group at position 4 and a keto group at position 2). The nucleoside of cytosine is cytidine. In Watson-Crick base pairing, it forms three hydrogen bonds with guanine.
|Molar mass||111.10 g/mol|
|Density||1.55 g/cm³ (calculated)|
|Melting point||320 to 325 °C (608 to 617 °F; 593 to 598 K) (decomposes)|
|Acidity (pKa)||4.45 (secondary), 12.2 (primary)|
Except where otherwise noted, data are given for materials in their standard state (at 25 °C [77 °F], 100 kPa).
Cytosine was discovered and named by Albrecht Kossel and Albert Neumann in 1894, when it was obtained by hydrolysis of calf thymus tissue. A structure was proposed in 1903, and the compound was synthesized (and the structure thus confirmed) in the laboratory in the same year.
In 1997, cytosine was used in an early demonstration of quantum information processing, when Oxford University researchers implemented the Deutsch-Jozsa algorithm on a two-qubit nuclear magnetic resonance quantum computer (NMRQC).
In March 2015, NASA scientists reported the formation of cytosine, along with uracil and thymine, from pyrimidine under space-like laboratory conditions. This is of interest because pyrimidine has been found in meteorites, although its origin is unknown.
Cytosine can be found as part of DNA, as part of RNA, or as a part of a nucleotide. As cytidine triphosphate (CTP), it can act as a co-factor to enzymes, and can transfer a phosphate to convert adenosine diphosphate (ADP) to adenosine triphosphate (ATP).
In DNA and RNA, cytosine is paired with guanine. However, it is inherently unstable, and can change into uracil (spontaneous deamination). This can lead to a point mutation if not repaired by the DNA repair enzymes such as uracil glycosylase, which cleaves a uracil in DNA.
When found third in a codon of RNA, cytosine is synonymous with uracil, as they are interchangeable as the third base. When found as the second base in a codon, the third is always interchangeable. For example, UCU, UCC, UCA and UCG are all serine, regardless of the third base.
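The third-base interchangeability described above can be checked against the standard genetic code. The following is a minimal sketch in Python; the names `CODON_TABLE` and `third_base_c_and_u_are_synonymous` are illustrative, and the table is the standard code written as one-letter amino acid symbols (`*` marks a stop codon), so organism-specific variant codes are not covered.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order
# with the third base varying fastest (UUU, UUC, UUA, UUG, UCU, ...).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_base_c_and_u_are_synonymous():
    """True if every codon pair NNC / NNU encodes the same amino acid."""
    return all(
        CODON_TABLE[first + second + "C"] == CODON_TABLE[first + second + "U"]
        for first, second in product(BASES, repeat=2)
    )


def second_base_c_is_fully_degenerate():
    """True if, with C in the second position, the third base never matters."""
    return all(
        len({CODON_TABLE[first + "C" + third] for third in BASES}) == 1
        for first in BASES
    )


print([CODON_TABLE[c] for c in ("UCU", "UCC", "UCA", "UCG")])  # ['S', 'S', 'S', 'S']
print(third_base_c_and_u_are_synonymous())   # True
print(second_base_c_is_fully_degenerate())   # True
```

Here `S` is the one-letter symbol for serine, matching the UCU, UCC, UCA and UCG example in the text.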
Cytosine can also be methylated into 5-methylcytosine by an enzyme called DNA methyltransferase, or be methylated and hydroxylated to make 5-hydroxymethylcytosine. Active enzymatic deamination of cytosine or 5-methylcytosine by the APOBEC family of cytosine deaminases could have both beneficial and detrimental implications for various cellular processes as well as for organismal evolution. The implications of deamination of 5-hydroxymethylcytosine, on the other hand, remain less understood.
Cytosine has not been found in meteorites, which suggests the first strands of RNA and DNA had to look elsewhere to obtain this building block. Cytosine likely formed within some meteorite parent bodies, but did not persist within these bodies because it is readily deaminated into uracil.
5-Methylcytosine is a methylated form of the DNA base cytosine that may be involved in the regulation of gene transcription. When cytosine is methylated, the DNA maintains the same sequence, but the expression of methylated genes can be altered (the study of this is part of the field of epigenetics). 5-Methylcytosine is incorporated in the nucleoside 5-methylcytidine.
In 5-methylcytosine, a methyl group is attached to the 5th atom in the 6-atom ring (counting counterclockwise from the NH nitrogen at the six o'clock position, not the 2 o'clock). This methyl group distinguishes 5-methylcytosine from cytosine.
Base pair
A base pair (bp) is a unit consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both DNA and RNA. Dictated by specific hydrogen bonding patterns, Watson-Crick base pairs (guanine-cytosine and adenine-thymine) allow the DNA helix to maintain a regular helical structure that is subtly dependent on its nucleotide sequence. The complementary nature of this base-paired structure provides a redundant copy of the genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base-pairing patterns that identify particular regulatory regions of genes.
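The redundancy that complementary pairing provides can be illustrated with a short Python sketch: given one DNA strand, the partner strand is fully determined, and applying the same operation twice recovers the original. The function name `reverse_complement` and the plain-string representation (uppercase A, C, G, T, written 5' to 3') are assumptions made for this example.

```python
# Watson-Crick partners in DNA: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the partner strand, read 5' to 3'.

    The two strands of a duplex run antiparallel, so the complemented
    sequence is reversed to keep the conventional reading direction.
    """
    return strand.translate(COMPLEMENT)[::-1]


strand = "ATGC"
partner = reverse_complement(strand)
print(partner)                       # GCAT
print(reverse_complement(partner))   # ATGC
```

Because either strand can be rebuilt from the other, damage to one strand can in principle be corrected using its partner as a template.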
Intramolecular base pairs can occur within single-stranded nucleic acids. This is particularly important in RNA molecules (e.g., transfer RNA), where Watson-Crick base pairs (guanine-cytosine and adenine-uracil) permit the formation of short double-stranded helices, and a wide variety of non-Watson-Crick interactions (e.g., G-U or A-A) allow RNAs to fold into a vast range of specific three-dimensional structures. In addition, base-pairing between transfer RNA (tRNA) and messenger RNA (mRNA) forms the basis for the molecular recognition events that result in the nucleotide sequence of mRNA becoming translated into the amino acid sequence of proteins via the genetic code.
The size of an individual gene or an organism's entire genome is often measured in base pairs because DNA is usually double-stranded. Hence, the number of total base pairs is equal to the number of nucleotides in one of the strands (with the exception of non-coding single-stranded regions of telomeres). The haploid human genome (23 chromosomes) is estimated to be about 3.2 billion bases long and to contain 20,000–25,000 distinct protein-coding genes. A kilobase (kb) is a unit of measurement in molecular biology equal to 1000 base pairs of DNA or RNA. The total amount of related DNA base pairs on Earth is estimated at 5.0 × 10³⁷ and weighs 50 billion tonnes. In comparison, the total mass of the biosphere has been estimated to be as much as 4 TtC (trillion tons of carbon).
Cytarabine
Cytarabine, also known as cytosine arabinoside (ara-C), is a chemotherapy medication used to treat acute myeloid leukemia (AML), acute lymphocytic leukemia (ALL), chronic myelogenous leukemia (CML), and non-Hodgkin's lymphoma. It is given by injection into a vein, under the skin, or into the cerebrospinal fluid. There is a liposomal formulation for which there is tentative evidence of better outcomes in lymphoma involving the meninges.
Common side effects include bone marrow suppression, vomiting, diarrhea, liver problems, rash, ulcer formation in the mouth, and bleeding. Other serious side effects include loss of consciousness, lung disease, and allergic reactions. Use during pregnancy may harm the baby. Cytarabine is in the antimetabolite and nucleoside analog families of medication. It works by blocking the function of DNA polymerase.
Cytarabine was patented in 1960 and approved for medical use in 1969. It is on the World Health Organization's List of Essential Medicines, the most effective and safe medicines needed in a health system. The wholesale cost in the developing world is about US$4.27 to US$5.70 per 500 mg vial. This dose in the United Kingdom costs the NHS about GB£50.00 while the liposomal form is GB£1,223.75 per 50 mg vial.
Cytidine
Cytidine is a nucleoside molecule that is formed when cytosine is attached to a ribose ring (also known as a ribofuranose) via a β-N1-glycosidic bond. Cytidine is a component of RNA.
If cytosine is attached to a deoxyribose ring, the resulting nucleoside is known as deoxycytidine.
Cytidine deaminase
Cytidine deaminase is an enzyme that in humans is encoded by the CDA gene.
This gene encodes an enzyme involved in pyrimidine salvaging. The encoded protein forms a homotetramer that catalyzes the irreversible hydrolytic deamination of cytidine and deoxycytidine to uridine and deoxyuridine, respectively. It is one of several deaminases responsible for maintaining the cellular pyrimidine pool. Mutations in this gene are associated with decreased sensitivity to the cytosine nucleoside analogue cytosine arabinoside used in the treatment of certain childhood leukemias.
A related activation-induced (cytidine) deaminase (AID) regulates antibody diversification, especially the process of somatic hypermutation.
DNA methylation
DNA methylation is a process by which methyl groups are added to the DNA molecule. Methylation can change the activity of a DNA segment without changing the sequence. When located in a gene promoter, DNA methylation typically acts to repress gene transcription. In mammals DNA methylation is essential for normal development and is associated with a number of key processes including genomic imprinting, X-chromosome inactivation, repression of transposable elements, aging, and carcinogenesis.
Two of DNA's four bases, cytosine and adenine, can be methylated. Cytosine methylation is widespread in both eukaryotes and prokaryotes, even though the rate of cytosine DNA methylation can differ greatly between species: 14% of cytosines are methylated in Arabidopsis thaliana, 8% in Physarum, 4% in Mus musculus, 2.3% in Escherichia coli, 0.03% in Drosophila, 0.006% in Dictyostelium and virtually none (< 0.0002%) in Caenorhabditis or yeast species such as Saccharomyces cerevisiae and S. pombe (but not N. crassa). Adenine methylation has been observed in bacterial, plant, and recently in mammalian DNA, but has received considerably less attention.
Methylation of cytosine to form 5-methylcytosine occurs at the same 5 position on the pyrimidine ring where the DNA base thymine's methyl group is located; the same position distinguishes thymine from the analogous RNA base uracil, which has no methyl group. Spontaneous deamination of 5-methylcytosine converts it to thymine. This results in a T:G mismatch. Repair mechanisms then correct it back to the original C:G pair; alternatively, they may replace the G with an A, turning the original C:G pair into a T:A pair, effectively changing a base and introducing a mutation. This misincorporated base will not be corrected during DNA replication, as thymine is a DNA base. If the mismatch is not repaired and the cell enters the cell cycle, the strand carrying the T will be complemented by an A in one of the daughter cells, such that the mutation becomes permanent. The near-universal replacement of uracil by thymine in DNA, but not RNA, may have evolved as an error-control mechanism, to facilitate the removal of uracils generated by the spontaneous deamination of cytosine.
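The difference between the two deamination outcomes can be sketched in a few lines of Python. This is a toy model only: the one-letter symbol `M` for 5-methylcytosine and all function names are hypothetical conventions chosen for this example, and no real repair pathway is simulated.

```python
# Products of spontaneous deamination. "M" stands for 5-methylcytosine
# (a hypothetical symbol used only in this sketch).
DEAMINATION_PRODUCT = {"C": "U", "M": "T"}

# Partner a polymerase inserts opposite each template base.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G", "M": "G", "U": "A"}


def deaminate(base):
    """Return the deamination product, or the base itself if unaffected."""
    return DEAMINATION_PRODUCT.get(base, base)


def is_foreign_to_dna(base):
    """Uracil is not a normal DNA base, so it can be recognised as damage."""
    return base == "U"


def pair_after_replication(template_base):
    """Base pair produced when the given template base is copied."""
    return (template_base, PARTNER[template_base])


from_cytosine = deaminate("C")        # "U": recognisable as damage in DNA
from_methylcytosine = deaminate("M")  # "T": an ordinary DNA base

print(is_foreign_to_dna(from_cytosine))             # True
print(is_foreign_to_dna(from_methylcytosine))       # False
print(pair_after_replication(from_methylcytosine))  # ('T', 'A')
```

The last line shows why an unrepaired T:G mismatch becomes a fixed T:A pair once the strand carrying the T has been replicated.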
DNA methylation, along with many of its contemporary DNA methyltransferases, is thought to have evolved from primitive, early-world RNA methylation activity, a view supported by several lines of evidence.
In plants and other organisms, DNA methylation is found in three different sequence contexts: CG (or CpG), CHG or CHH (where H corresponds to A, T or C). In mammals, however, DNA methylation is almost exclusively found in CpG dinucleotides, with the cytosines on both strands usually being methylated. Non-CpG methylation can, however, be observed in embryonic stem cells, and has also been indicated in neural development. Furthermore, non-CpG methylation has been observed in hematopoietic progenitor cells, where it occurred mainly in a CpApC sequence context.
DNA methyltransferase
In biochemistry, the DNA methyltransferase (DNA MTase) family of enzymes catalyzes the transfer of a methyl group to DNA. DNA methylation serves a wide variety of biological functions. All the known DNA methyltransferases use S-adenosyl methionine (SAM) as the methyl donor.
DNMT3B
DNA (cytosine-5-)-methyltransferase 3 beta is an enzyme that in humans is encoded by the DNMT3B gene. Mutations in this gene are associated with immunodeficiency, centromere instability and facial anomalies syndrome.
Deamination
Deamination is the removal of an amine group from a molecule, such as an amino acid. Enzymes that catalyze this reaction are called deaminases.
In the human body, deamination takes place primarily in the liver, although glutamate is also deaminated in the kidneys. In situations of excess protein intake, deamination is used to break down amino acids for energy. The amine group is removed from the amino acid and converted to ammonia. The rest of the amino acid is made up of mostly carbon and hydrogen, and is recycled or oxidized for energy. Ammonia is toxic to the human system, and enzymes convert it to urea or uric acid by addition of carbon dioxide molecules (which is not considered a deamination process) in the urea cycle, which also takes place in the liver. Urea and uric acid can safely diffuse into the blood and then be excreted in urine.
Erwin Chargaff
Erwin Chargaff (11 August 1905 – 20 June 2002) was an Austro-Hungarian biochemist who immigrated to the United States during the Nazi era and was a professor of biochemistry at Columbia University medical school. Through careful experimentation, Chargaff discovered two rules that helped lead to the discovery of the double helix structure of DNA.
The first rule was that in DNA the number of guanine units is equal to the number of cytosine units, and the number of adenine units is equal to the number of thymine units. This hinted at the base pair makeup of DNA.
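Chargaff's first rule follows directly from base pairing, which a short Python sketch can make concrete. The function name `duplex_base_counts` and the example sequence are hypothetical; the sketch simply pairs every base of one strand with its Watson-Crick partner and tallies all bases in the resulting duplex.

```python
from collections import Counter

# Watson-Crick partners in DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def duplex_base_counts(strand):
    """Count each base across both strands of the duplex built on `strand`."""
    partner = strand.translate(COMPLEMENT)
    return Counter(strand + partner)


single = "ATGCGGCTAAC"  # arbitrary example sequence
counts = duplex_base_counts(single)

# A single strand need not have equal amounts of A and T ...
print(single.count("A"), single.count("T"))  # 3 2
# ... but the duplex always does, and likewise for G and C.
print(counts["A"], counts["T"])              # 5 5
print(counts["G"], counts["C"])              # 6 6
```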
The second rule was that the relative amounts of guanine, cytosine, adenine and thymine bases vary from one species to another. This hinted that DNA rather than protein could be the genetic material.
GC-content
In molecular biology and genetics, GC-content (or guanine-cytosine content) is the percentage of nitrogenous bases in a DNA or RNA molecule that are either guanine or cytosine (out of four possible bases, the others being adenine and thymine in DNA and adenine and uracil in RNA). This may refer to a certain fragment of DNA or RNA, or to the whole genome. When it refers to a fragment of the genetic material, it may denote the GC-content of a section of a gene (domain), a single gene, a group of genes (or gene cluster), or even a non-coding region.
Guanine
Guanine (G, Gua) is one of the four main nucleobases found in the nucleic acids DNA and RNA, the others being adenine, cytosine, and thymine (uracil in RNA). In DNA, guanine is paired with cytosine. The guanine nucleoside is called guanosine.
With the formula C₅H₅N₅O, guanine is a derivative of purine, consisting of a fused pyrimidine-imidazole ring system with conjugated double bonds. Being unsaturated, the bicyclic molecule is planar.
Mycobacterium chitae
Mycobacterium chitae is a species of the phylum Actinobacteria (Gram-positive bacteria with high guanine and cytosine content, one of the dominant phyla of all bacteria), belonging to the genus Mycobacterium.
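The "high guanine and cytosine content" used to characterise this phylum is the GC-content described earlier: the percentage of bases in a sequence that are G or C. A minimal Python sketch of that calculation follows; the function name `gc_content` and the example sequences are hypothetical and are not real Mycobacterium data.

```python
def gc_content(sequence):
    """Return the percentage of bases in `sequence` that are G or C.

    Works for DNA or RNA given as a string; letter case is ignored.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_bases = seq.count("G") + seq.count("C")
    return 100.0 * gc_bases / len(seq)


print(gc_content("ATGC"))  # 50.0
print(gc_content("GGCC"))  # 100.0
print(gc_content("AUAU"))  # 0.0
```

The same function can be applied to a gene fragment or to a whole-genome sequence, matching the two usages of the term.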
Type strain: strain ATCC 19627 = CCUG 39504 = CIP 105383 = DSM 44633 = JCM 12403 = NCTC 10485.
Mycobacterium gastri
Mycobacterium gastri is a species of the phylum Actinobacteria (Gram-positive bacteria with high guanine and cytosine content, one of the dominant phyla of all bacteria), belonging to the genus Mycobacterium.
Nucleobase
Nucleobases, also known as nitrogenous bases or often simply bases, are nitrogen-containing biological compounds that form nucleosides, which in turn are components of nucleotides, with all of these monomers constituting the basic building blocks of nucleic acids. The ability of nucleobases to form base pairs and to stack one upon another leads directly to long-chain helical structures such as ribonucleic acid (RNA) and deoxyribonucleic acid (DNA).
Five nucleobases, adenine (A), cytosine (C), guanine (G), thymine (T), and uracil (U), are called primary or canonical. They function as the fundamental units of the genetic code, with the bases A, G, C, and T being found in DNA while A, G, C, and U are found in RNA. Thymine and uracil are identical except that T includes a methyl group that U lacks.
Adenine and guanine have a fused-ring skeletal structure derived from purine, hence they are called purine bases. Similarly, the simple-ring structure of cytosine, uracil, and thymine is derived from pyrimidine, so those three bases are called the pyrimidine bases. Each of the base pairs in a typical double-helix DNA comprises a purine and a pyrimidine: either an A paired with a T or a C paired with a G. These purine-pyrimidine pairs, which are called base complements, connect the two strands of the helix and are often compared to the rungs of a ladder. The pairing of purines and pyrimidines may result, in part, from dimensional constraints, as this combination enables a geometry of constant width for the DNA spiral helix. The A-T and C-G pairings form double or triple hydrogen bonds, respectively, between the amine and carbonyl groups on the complementary bases.
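The pairing rules in this paragraph (one purine with one pyrimidine, two hydrogen bonds for A-T and three for C-G) can be written down as a small Python sketch. All names here are illustrative, and only canonical Watson-Crick DNA pairs are modelled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}

# Hydrogen bonds per Watson-Crick DNA pair: A-T has two, C-G has three.
HYDROGEN_BONDS = {frozenset("AT"): 2, frozenset("CG"): 3}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_purine_pyrimidine_pair(base_a, base_b):
    """True if exactly one base is a purine and the other a pyrimidine."""
    return (base_a in PURINES and base_b in PYRIMIDINES) or (
        base_a in PYRIMIDINES and base_b in PURINES
    )


def hydrogen_bonds(base_a, base_b):
    """Hydrogen bonds in a canonical pair, or 0 if the bases do not pair."""
    return HYDROGEN_BONDS.get(frozenset((base_a, base_b)), 0)


def duplex_hydrogen_bonds(strand):
    """Total hydrogen bonds when `strand` is paired with its complement."""
    return sum(hydrogen_bonds(base, COMPLEMENT[base]) for base in strand)


print(is_purine_pyrimidine_pair("A", "T"))  # True
print(is_purine_pyrimidine_pair("A", "G"))  # False
print(hydrogen_bonds("G", "C"))             # 3
print(duplex_hydrogen_bonds("ATGC"))        # 10
```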
In August 2011, a report based on NASA studies of meteorites suggested that nucleobases such as adenine, guanine, xanthine, hypoxanthine, purine, 2,6-diaminopurine, and 6,8-diaminopurine may have formed in outer space as well as on Earth.
The origin of the term base reflects these compounds' chemical properties in acid-base reactions, but those properties are not especially important for understanding most of the biological functions of nucleobases.
Pyrimidine
Pyrimidine is an aromatic heterocyclic organic compound similar to pyridine. One of the three diazines (six-membered heterocyclics with two nitrogen atoms in the ring), it has the nitrogen atoms at positions 1 and 3 in the ring. The other diazines are pyrazine (nitrogen atoms at the 1 and 4 positions) and pyridazine (nitrogen atoms at the 1 and 2 positions). In nucleic acids, three types of nucleobases are pyrimidine derivatives: cytosine (C), thymine (T), and uracil (U).
Pyrimidine dimer
Pyrimidine dimers are molecular lesions formed from thymine or cytosine bases in DNA via photochemical reactions. Ultraviolet light (UV) induces the formation of covalent linkages between consecutive bases along the nucleotide chain in the vicinity of their carbon–carbon double bonds. The dimerization reaction can also occur among pyrimidine bases (uracil or cytosine) in double-stranded RNA (dsRNA). Two common UV products are cyclobutane pyrimidine dimers (CPDs) and 6–4 photoproducts. These premutagenic lesions alter the structure and possibly the base-pairing. Up to 50–100 such reactions per second might occur in a skin cell during exposure to sunlight, but are usually corrected within seconds by photolyase reactivation or nucleotide excision repair. Uncorrected lesions can inhibit polymerases, cause misreading during transcription or replication, or lead to arrest of replication. Pyrimidine dimers are the primary cause of melanomas in humans.
Thymine
Thymine (T, Thy) is one of the four nucleobases in the nucleic acid of DNA that are represented by the letters G–C–A–T. The others are adenine, guanine, and cytosine. Thymine is also known as 5-methyluracil, a pyrimidine nucleobase. In RNA, thymine is replaced by the nucleobase uracil. Thymine was first isolated in 1893 by Albrecht Kossel and Albert Neumann from calves' thymus glands, hence its name.
Uracil
Uracil (U) is one of the four nucleobases in the nucleic acid of RNA that are represented by the letters A, G, C and U. The others are adenine (A), cytosine (C), and guanine (G). In RNA, uracil binds to adenine via two hydrogen bonds. In DNA, the uracil nucleobase is replaced by thymine. Uracil is a demethylated form of thymine.
Uracil is a common and naturally occurring pyrimidine derivative. The name "uracil" was coined in 1885 by the German chemist Robert Behrend, who was attempting to synthesize derivatives of uric acid. Originally discovered in 1900 by Alberto Ascoli, it was isolated by hydrolysis of yeast nuclein; it was also found in bovine thymus and spleen, herring sperm, and wheat germ. It is a planar, unsaturated compound that has the ability to absorb light.
Based on 12C/13C isotopic ratios of organic compounds found in the Murchison meteorite, it is believed that uracil, xanthine and related molecules can also be formed extraterrestrially.
In 2012, an analysis of data from the Cassini mission orbiting in the Saturn system showed that Titan's surface composition may include uracil.