In molecular biology, a transcription factor (TF) (or sequence-specific DNA-binding factor) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. The function of TFs is to regulate—turn on and off—genes in order to make sure that they are expressed in the right cell at the right time and in the right amount throughout the life of the cell and the organism. Groups of TFs function in a coordinated fashion to direct cell division, cell growth, and cell death throughout life; cell migration and organization (body plan) during embryonic development; and intermittently in response to signals from outside the cell, such as a hormone. There are up to 2600 TFs in the human genome.
TFs work alone or with other proteins in a complex, by promoting (as an activator), or blocking (as a repressor) the recruitment of RNA polymerase (the enzyme that performs the transcription of genetic information from DNA to RNA) to specific genes.
A defining feature of TFs is that they contain at least one DNA-binding domain (DBD), which attaches to a specific sequence of DNA adjacent to the genes that they regulate. TFs are grouped into classes based on their DBDs. Other proteins such as coactivators, chromatin remodelers, histone acetyltransferases, histone deacetylases, kinases, and methylases are also essential to gene regulation, but lack DNA-binding domains, and therefore are not TFs.
TFs are of interest in medicine because TF mutations can cause specific diseases, and medications can be potentially targeted toward them.
Transcription factors are essential for the regulation of gene expression and are, as a consequence, found in all living organisms. The number of transcription factors found within an organism increases with genome size, and larger genomes tend to have more transcription factors per gene.
There are approximately 2600 proteins in the human genome that contain DNA-binding domains, and most of these are presumed to function as transcription factors, though other studies indicate it to be a smaller number. Therefore, approximately 10% of genes in the genome code for transcription factors, which makes this family the single largest family of human proteins. Furthermore, genes are often flanked by several binding sites for distinct transcription factors, and efficient expression of each of these genes requires the cooperative action of several different transcription factors (see, for example, hepatocyte nuclear factors). Hence, the combinatorial use of a subset of the approximately 2000 human transcription factors easily accounts for the unique regulation of each gene in the human genome during development.
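To give a sense of scale for the combinatorial argument above, the following minimal sketch counts how many distinct unordered sets of factors can be drawn from a pool of roughly 2000. The set sizes (1 to 4) and the function name are illustrative assumptions, and the count ignores the fact that not every combination is biologically meaningful.

```python
from math import comb

N_FACTORS = 2000  # approximate number of human transcription factors cited above


def factor_combinations(n_factors: int, set_size: int) -> int:
    """Number of distinct unordered sets of `set_size` factors."""
    return comb(n_factors, set_size)


# Illustrative set sizes only; real regulatory regions vary widely.
for k in (1, 2, 3, 4):
    print(f"sets of {k} factors: {factor_combinations(N_FACTORS, k):,}")
```

Even sets of only three factors give more than a billion possible combinations, so a small subset of factors per gene is in principle enough to specify a distinct regulatory input for each gene.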
Transcription factors bind to either enhancer or promoter regions of DNA adjacent to the genes that they regulate. Depending on the transcription factor, the transcription of the adjacent gene is either up- or down-regulated. Transcription factors use a variety of mechanisms for the regulation of gene expression. These mechanisms include:
Transcription factors are one of the groups of proteins that read and interpret the genetic "blueprint" in the DNA. They bind to the DNA and help initiate a program of increased or decreased gene transcription. As such, they are vital for many important cellular processes. Below are some of the important functions and biological roles transcription factors are involved in:
In eukaryotes, an important class of transcription factors called general transcription factors (GTFs) is necessary for transcription to occur. Many of these GTFs do not actually bind DNA, but rather are part of the large transcription preinitiation complex that interacts with RNA polymerase directly. The most common GTFs are TFIIA, TFIIB, TFIID (see also TATA binding protein), TFIIE, TFIIF, and TFIIH. The preinitiation complex binds to promoter regions of DNA upstream of the gene that it regulates.
Other transcription factors differentially regulate the expression of various genes by binding to enhancer regions of DNA adjacent to regulated genes. These transcription factors are critical to making sure that genes are expressed in the right cell at the right time and in the right amount, depending on the changing requirements of the organism.
Many transcription factors in multicellular organisms are involved in development. Responding to stimuli, these transcription factors turn on or off the transcription of the appropriate genes, which, in turn, allows for changes in cell morphology or activities needed for cell fate determination and cellular differentiation. The Hox transcription factor family, for example, is important for proper body pattern formation in organisms as diverse as fruit flies and humans. Another example is the transcription factor encoded by the Sex-determining Region Y (SRY) gene, which plays a major role in determining sex in humans.
Cells can communicate with each other by releasing molecules that produce signaling cascades within another receptive cell. If the signal requires upregulation or downregulation of genes in the recipient cell, often transcription factors will be downstream in the signaling cascade. Estrogen signaling is an example of a fairly short signaling cascade that involves the estrogen receptor transcription factor: Estrogen is secreted by tissues such as the ovaries and placenta, crosses the cell membrane of the recipient cell, and is bound by the estrogen receptor in the cell's cytoplasm. The estrogen receptor then goes to the cell's nucleus and binds to its DNA-binding sites, changing the transcriptional regulation of the associated genes.
Not only do transcription factors act downstream of signaling cascades related to biological stimuli but they can also be downstream of signaling cascades involved in environmental stimuli. Examples include heat shock factor (HSF), which upregulates genes necessary for survival at higher temperatures, hypoxia inducible factor (HIF), which upregulates genes necessary for cell survival in low-oxygen environments, and sterol regulatory element binding protein (SREBP), which helps maintain proper lipid levels in the cell.
Many transcription factors, especially some that are proto-oncogenes or tumor suppressors, help regulate the cell cycle and as such determine how large a cell will get and when it can divide into two daughter cells. One example is the Myc oncogene, which has important roles in cell growth and apoptosis.
Transcription factors can also be used to alter gene expression in a host cell to promote pathogenesis. A well-studied example is the transcription activator-like effectors (TAL effectors) secreted by Xanthomonas bacteria. When injected into plants, these proteins can enter the nucleus of the plant cell, bind plant promoter sequences, and activate transcription of plant genes that aid in bacterial infection. TAL effectors contain a central repeat region in which there is a simple relationship between the identity of two critical residues in sequential repeats and sequential DNA bases in the TAL effector’s target site. This property likely makes it easier for these proteins to evolve in order to better compete with the defense mechanisms of the host cell.
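The "simple relationship" between repeat residues and DNA bases can be sketched as a lookup table. The specific pairings below (NI to A, HD to C, NG to T, NN to G) are the commonly reported preferences from the TAL effector literature and are an assumption brought in for illustration; they are not stated in this article, and real repeats show some degeneracy.

```python
# Commonly reported preferences of the two critical residues in each repeat
# (assumption from the TAL effector literature, not from this article).
# NN is also reported to recognize A; only its G preference is modeled here.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def predict_target(rvds):
    """Translate an ordered list of two-residue codes into a predicted DNA site."""
    bases = []
    for rvd in rvds:
        if rvd not in RVD_TO_BASE:
            raise ValueError(f"unknown residue pair: {rvd}")
        bases.append(RVD_TO_BASE[rvd])
    return "".join(bases)


# Hypothetical five-repeat effector, one residue pair per repeat.
print(predict_target(["HD", "NI", "NG", "NN", "NI"]))  # prints CATGA
```

Because each repeat contributes one base independently, changing a single repeat changes a single base of the predicted target, which is consistent with the idea that such proteins can evolve new specificities easily.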
It is common in biology for important processes to have multiple layers of regulation and control. This is also true with transcription factors: Not only do transcription factors control the rates of transcription to regulate the amounts of gene products (RNA and protein) available to the cell but transcription factors themselves are regulated (often by other transcription factors). Below is a brief synopsis of some of the ways that the activity of transcription factors can be regulated:
Transcription factors (like all proteins) are transcribed from a gene on a chromosome into RNA, and then the RNA is translated into protein. Any of these steps can be regulated to affect the production (and thus activity) of a transcription factor. An implication of this is that transcription factors can regulate themselves. For example, in a negative feedback loop, the transcription factor acts as its own repressor: If the transcription factor protein binds the DNA of its own gene, it down-regulates the production of more of itself. This is one mechanism to maintain low levels of a transcription factor in a cell.
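The negative feedback loop described above can be illustrated with a toy model. This is a minimal sketch assuming a Hill-type repression term and first-order degradation; the equation form, parameter names, and parameter values are all illustrative assumptions rather than measured quantities.

```python
def simulate(beta=10.0, K=1.0, n=2, alpha=1.0, repressed=True, dt=0.01, steps=2000):
    """Integrate dx/dt = production - alpha * x with the forward Euler method.

    x         : transcription factor level (arbitrary units)
    beta      : maximal production rate
    K, n      : repression threshold and Hill coefficient (hypothetical values)
    alpha     : degradation/dilution rate
    repressed : if True, the factor represses its own production
    """
    x = 0.0
    for _ in range(steps):
        if repressed:
            production = beta / (1.0 + (x / K) ** n)  # self-repression
        else:
            production = beta  # constitutive production, no feedback
        x += dt * (production - alpha * x)
    return x


print("with autorepression:   ", round(simulate(repressed=True), 3))
print("without autorepression:", round(simulate(repressed=False), 3))
```

With these illustrative parameters the self-repressed factor settles near 2 units, compared with about 10 units without feedback, showing how autorepression can hold a factor at a low level.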
In eukaryotes, transcription factors (like most proteins) are transcribed in the nucleus but are then translated in the cell's cytoplasm. Many proteins that are active in the nucleus contain nuclear localization signals that direct them to the nucleus. But, for many transcription factors, this is a key point in their regulation. Important classes of transcription factors such as some nuclear receptors must first bind a ligand while in the cytoplasm before they can relocate to the nucleus.
Transcription factors may be activated (or deactivated) through their signal-sensing domain by a number of mechanisms including:
In eukaryotes, DNA is organized with the help of histones into compact particles called nucleosomes, where sequences of about 147 DNA base pairs make ~1.65 turns around histone protein octamers. DNA within nucleosomes is inaccessible to many transcription factors. Some transcription factors, so-called pioneering factors, are still able to bind their DNA binding sites on the nucleosomal DNA. For most other transcription factors, the nucleosome must be actively unwound by molecular motors such as chromatin remodelers. Alternatively, the nucleosome can be partially unwrapped by thermal fluctuations, allowing temporary access to the transcription factor binding site. In many cases, a transcription factor needs to compete for binding to its DNA binding site with other transcription factors and histones or non-histone chromatin proteins. Pairs of transcription factors and other proteins can play antagonistic roles (activator versus repressor) in the regulation of the same gene.
Most transcription factors do not work alone. Many large TF families form complex homotypic or heterotypic interactions through dimerization. For gene transcription to occur, a number of transcription factors must bind to DNA regulatory sequences. This collection of transcription factors, in turn, recruits intermediary proteins such as cofactors that allow efficient recruitment of the preinitiation complex and RNA polymerase. Thus, for a single transcription factor to initiate transcription, all of these other proteins must also be present, and the transcription factor must be in a state where it can bind to them if necessary. Cofactors are proteins that modulate the effects of transcription factors. Cofactors are interchangeable between specific gene promoters; the protein complex that occupies the promoter DNA and the amino acid sequence of the cofactor determine its spatial conformation. For example, certain steroid receptors can exchange cofactors with NF-κB, which is a switch between inflammation and cellular differentiation; thereby steroids can affect the inflammatory response and function of certain tissues.
The transactivation domain (TAD) is the domain of a transcription factor that binds other proteins such as transcription coregulators. Proteins containing TADs include Gal4, Gcn4, Oaf1, Leu3, Rtg3, Pho4, and Gln3 in yeast and p53, NFAT, and NF-κB in mammals, as well as the herpes simplex virus protein VP16. Many TADs are as short as 9 amino acids (present in, e.g., p53, VP16, MLL, E2A, HSF1, NF-IL6, NFAT1, and NF-κB, as well as Gal4, Pdr1, Oaf1, Gcn4, Pho4, Msn2, Ino2, and P201).
The portion (domain) of the transcription factor that binds DNA is called its DNA-binding domain. Below is a partial list of some of the major families of DNA-binding domains/transcription factors:
|basic helix-loop-helix||InterPro: IPR001092||Pfam PF00010||SCOP 47460|
|basic-leucine zipper (bZIP)||InterPro: IPR004827||Pfam PF00170||SCOP 57959|
|C-terminal effector domain of the bipartite response regulators||InterPro: IPR001789||Pfam PF00072||SCOP 46894|
|AP2/ERF/GCC box||InterPro: IPR001471||Pfam PF00847||SCOP 54176|
|homeodomain proteins, which are encoded by homeobox genes, are transcription factors. Homeodomain proteins play critical roles in the regulation of development.||InterPro: IPR009057||Pfam PF00046||SCOP 46689|
|lambda repressor-like||InterPro: IPR010982||SCOP 47413|
|srf-like (serum response factor)||InterPro: IPR002100||Pfam PF00319||SCOP 55455|
|winged helix||InterPro: IPR013196||Pfam PF08279||SCOP 46785|
|* multi-domain Cys2His2 zinc fingers||InterPro: IPR007087||Pfam PF00096||SCOP 57667|
|* Zn2/Cys6||SCOP 57701|
|* Zn2/Cys8 nuclear receptor zinc finger||InterPro: IPR001628||Pfam PF00105||SCOP 57716|
Transcription factors interact with their binding sites using a combination of electrostatic (of which hydrogen bonds are a special case) and van der Waals forces. Due to the nature of these chemical interactions, most transcription factors bind DNA in a sequence-specific manner. However, not all bases in the transcription factor-binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are capable of binding a subset of closely related sequences, each with a different strength of interaction.
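One common way to represent "a subset of closely related sequences, each with a different strength" is a position weight matrix (PWM). The sketch below is a minimal illustration under stated assumptions: the aligned sites are made-up data for an imaginary factor, the background base frequency is taken as uniform, and a pseudocount of 1 is an arbitrary choice. The log-odds score is only a rough proxy for binding strength.

```python
import math

BASES = "ACGT"

# Hypothetical aligned binding sites for an imaginary factor (illustrative only).
SITES = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTAA"]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix: one dict of base -> score per site position."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds scores for a sequence of the same length."""
    return sum(column[base] for column, base in zip(pwm, seq))


def scan(pwm, dna):
    """Score every window of the PWM's width along one strand of `dna`."""
    width = len(pwm)
    return [(i, score(pwm, dna[i:i + width])) for i in range(len(dna) - width + 1)]


pwm = build_pwm(SITES)
for candidate in ("TGACTCA", "TGACTAA", "AAAAAAA"):
    print(candidate, round(score(pwm, candidate), 2))
```

Sequences closer to the consensus score higher, while a variant at a weakly conserved position loses only a little score, mirroring the graded binding described above.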
Because transcription factors can bind a set of related sequences and these sequences tend to be short, potential transcription factor binding sites can occur by chance if the DNA sequence is long enough. It is unlikely, however, that a transcription factor will bind all compatible sequences in the genome of the cell. Other constraints, such as DNA accessibility in the cell or availability of cofactors may also help dictate where a transcription factor will actually bind. Thus, given the genome sequence it is still difficult to predict where a transcription factor will actually bind in a living cell.
Additional recognition specificity, however, may be obtained through the use of more than one DNA-binding domain (for example tandem DBDs in the same transcription factor or through dimerization of two transcription factors) that bind to two or more adjacent sequences of DNA.
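The points about chance occurrences and about added specificity from a second DNA-binding domain can be made concrete with a back-of-the-envelope estimate. This sketch assumes a random sequence with equal, independent base frequencies and an exact-match site, and uses a genome length of about 3 billion base pairs as a rough figure for the human genome; all of these are simplifying assumptions, and palindromic sites are double-counted when both strands are included.

```python
def expected_chance_sites(genome_length, site_length, both_strands=True):
    """Expected number of exact matches to one fixed site in random DNA."""
    positions = genome_length - site_length + 1
    strands = 2 if both_strands else 1
    return strands * positions * 0.25 ** site_length


GENOME_LENGTH = 3_000_000_000  # rough human genome size (assumption)

for length in (6, 8, 12, 16):
    hits = expected_chance_sites(GENOME_LENGTH, length)
    print(f"{length:2d}-bp site: about {hits:,.0f} chance matches")
```

Doubling the recognized length, for example from a 6-bp monomer site to a 12-bp dimer site, reduces the expected number of chance matches by a factor of about 4^6 = 4096 under these assumptions.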
Transcription factors are of clinical significance for at least two reasons: (1) mutations can be associated with specific diseases, and (2) they can be targets of medications.
Many transcription factors are either tumor suppressors or oncogenes, and, thus, mutations or aberrant regulation of them is associated with cancer. Three groups of transcription factors are known to be important in human cancer: (1) the NF-kappaB and AP-1 families, (2) the STAT family and (3) the steroid receptors.
Below are a few of the better-studied examples:
|Rett syndrome||Mutations in the MECP2 transcription factor are associated with Rett syndrome, a neurodevelopmental disorder.||Xq28|
|Diabetes||A rare form of diabetes called MODY (Maturity onset diabetes of the young) can be caused by mutations in hepatocyte nuclear factors (HNFs) or insulin promoter factor-1 (IPF1/Pdx1).||multiple|
|Developmental verbal dyspraxia||Mutations in the FOXP2 transcription factor are associated with developmental verbal dyspraxia, a disease in which individuals are unable to produce the finely coordinated movements required for speech.||7q31|
|Autoimmune diseases||Mutations in the FOXP3 transcription factor cause a rare form of autoimmune disease called IPEX.||Xp11.23-q13.3|
|Li-Fraumeni syndrome||Caused by mutations in the tumor suppressor p53.||17p13.1|
|Breast cancer||The STAT family is relevant to breast cancer.||multiple|
|Multiple cancers||The HOX family are involved in a variety of cancers.||multiple|
Approximately 10% of currently prescribed drugs directly target the nuclear receptor class of transcription factors. Examples include tamoxifen and bicalutamide for the treatment of breast and prostate cancer, respectively, and various types of anti-inflammatory and anabolic steroids. In addition, transcription factors are often indirectly modulated by drugs through signaling cascades. It might be possible to directly target other less-explored transcription factors such as NF-κB with drugs. Transcription factors outside the nuclear receptor family are thought to be more difficult to target with small-molecule therapeutics, since it is not clear that they are "druggable", but progress has been made on Pax2 and the Notch pathway.
Gene duplications have played a crucial role in the evolution of species. This applies particularly to transcription factors. Once a transcription factor gene is duplicated, one copy can accumulate mutations without negatively affecting the regulation of downstream targets. However, changes in the DNA-binding specificity of the single-copy LEAFY transcription factor, which occurs in most land plants, have recently been elucidated. These show that a single-copy transcription factor can also undergo a change of specificity through a promiscuous intermediate without losing function. Similar mechanisms have been proposed in the context of all alternative phylogenetic hypotheses and for the role of transcription factors in the evolution of all species.
There are different technologies available to analyze transcription factors. On the genomic level, DNA sequencing and database research are commonly used. The protein version of the transcription factor is detectable by using specific antibodies, with the sample detected on a western blot. By using an electrophoretic mobility shift assay (EMSA), the activation profile of transcription factors can be detected. A multiplex approach for activation profiling is a TF chip system, in which several different transcription factors can be detected in parallel.
The most commonly used method for identifying transcription factor binding sites is chromatin immunoprecipitation (ChIP). This technique relies on chemical fixation of chromatin with formaldehyde, followed by co-precipitation of DNA and the transcription factor of interest using an antibody that specifically targets that protein. The DNA sequences can then be identified by microarray or high-throughput sequencing (ChIP-seq) to determine transcription factor binding sites. If no antibody is available for the protein of interest, DamID may be a convenient alternative.
As described in more detail below, transcription factors may be classified by their (1) mechanism of action, (2) regulatory function, or (3) sequence homology (and hence structural similarity) in their DNA-binding domains.
There are two mechanistic classes of transcription factors:
|Examples of specific transcription factors|
|Factor||Structural type||Recognition sequence||Binds as|
|Heat shock factor||Basic zipper||5'-XGAAX-3'||Trimer|
(G/C) = G or C; X = A, T, G or C
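The degenerate notation in the table legend maps naturally onto regular-expression character classes. The sketch below is a minimal illustration: the helper names are hypothetical, the test DNA string is made up, and only one strand is searched. The second pattern, `TGA(G/C)TCA`, is a hypothetical site used solely to exercise the "(G/C)" notation.

```python
import re


def site_to_regex(site):
    """Convert table notation (X = any base, (G/C) = G or C) to a regex pattern."""
    pattern = site.replace("X", "[ACGT]")
    # Turn alternatives such as "(G/C)" into a character class such as "[GC]".
    return re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        pattern,
    )


def find_sites(site, dna):
    """Return (start, matched sequence) for every match, including overlaps."""
    # A lookahead with a capture group reports overlapping matches.
    regex = re.compile(f"(?=({site_to_regex(site)}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(dna.upper())]


print(site_to_regex("XGAAX"))
print(find_sites("XGAAX", "ttGAAcGAAg"))
```

A fuller search would also scan the reverse complement, since a factor can bind a site on either strand.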
Transcription factors have been classified according to their regulatory function:
Activator protein 1 (AP-1) is a transcription factor that regulates gene expression in response to a variety of stimuli, including cytokines, growth factors, stress, and bacterial and viral infections. AP-1 controls a number of cellular processes including differentiation, proliferation, and apoptosis. The structure of AP-1 is a heterodimer composed of proteins belonging to the c-Fos, c-Jun, ATF and JDP families.
ATF1
Cyclic AMP-dependent transcription factor ATF-1 is a protein that in humans is encoded by the ATF1 gene.
This gene encodes an activating transcription factor, which belongs to the ATF subfamily and bZIP (basic-region leucine zipper) family. It influences cellular physiologic processes by regulating the expression of downstream target genes, which are related to growth, survival, and other cellular activities. This protein is phosphorylated at serine 63 in its kinase-inducible domain by serine/threonine kinases, cAMP-dependent protein kinase A, calmodulin-dependent protein kinase I/II, mitogen- and stress-activated protein kinase and cyclin-dependent kinase 3 (cdk-3). Its phosphorylation enhances its transactivation and transcriptional activities, and enhances cell transformation.
Activating transcription factor
Activating transcription factor, ATF, is a group of bZIP transcription factors, which act as homodimers or heterodimers with a range of other bZIP factors. They were first described as members of the CREB/ATF family, although it later turned out that some of them might be more similar to AP-1-like factors such as c-Jun or c-Fos. In general, ATFs are known to respond to extracellular signals, which suggests that they have an important role in maintaining homeostasis. Some of these ATFs, such as ATF3, ATF4, and ATF6, are known to play a role in stress responses. Another example of ATF function is ATFx, which can suppress apoptosis.
Genes include ATF1, ATF2, ATF3, ATF4, ATF5, ATF6, ATF7, ATFx.
Activating transcription factor 2
Activating transcription factor 2, also known as ATF2, is a protein that, in humans, is encoded by the ATF2 gene.
COUP-TFI
COUP-TF1 (COUP Transcription Factor 1), also known as NR2F1 (Nuclear Receptor subfamily 2, group F, member 1), is a protein that in humans is encoded by the NR2F1 gene. This protein is a member of the nuclear hormone receptor family of steroid hormone receptors.
Chicken ovalbumin upstream promoter-transcription factor
The chicken ovalbumin upstream promoter transcription factor (COUP-TF) proteins are members of the nuclear receptor family of intracellular transcription factors. There are two variants of the COUP-TFs, labeled COUP-TFI and COUP-TFII, encoded by the NR2F1 and NR2F2 genes respectively.
COUP-TFs play critical roles in the development of organisms.
ERM transcription factor
ERM transcription factor is a transcription factor generated in Sertoli cells, which are found in the testes and play a crucial role in spermatogenesis.
General transcription factor
General transcription factors (GTFs), also known as basal transcriptional factors, are a class of protein transcription factors that bind to specific sites (promoter) on DNA to activate transcription of genetic information from DNA to messenger RNA. GTFs, RNA polymerase, and the mediator (a multi-protein complex) constitute the basic transcriptional apparatus that first bind to the promoter, then start transcription. GTFs are also intimately involved in the process of gene regulation, and most are required for life.
A transcription factor is a protein that binds to specific DNA sequences (enhancer or promoter), either alone or with other proteins in a complex, to control the rate of transcription of genetic information from DNA to messenger RNA by promoting (serving as an activator) or blocking (serving as a repressor) the recruitment of RNA polymerase. As a class of protein, general transcription factors bind to promoters along the DNA sequence or form a large transcription preinitiation complex to activate transcription. General transcription factors are necessary for transcription to occur.
NK2 homeobox 1
NK2 homeobox 1 (NKX2-1), also known as thyroid transcription factor 1 (TTF-1), is a protein which in humans is encoded by the NKX2-1 gene.
Octamer transcription factor
An octamer transcription factor is a transcription factor which binds to the "ATTTGCAT" sequence. Examples include:
Oct-1 - POU2F1
Oct-2 - POU2F2
Oct-3/4 - POU5F1
Oct-6 - POU3F1
Oct-7 - POU3F2
Oct-8 - POU3F3
Oct-9 - POU3F4
Oct-11 - POU2F3
Sp1 transcription factor
Transcription factor Sp1, also known as specificity protein 1, is a protein that in humans is encoded by the SP1 gene.
Sp8 transcription factor
Transcription factor Sp8, also known as specificity protein 8 (SP-8) or Btd transcription factor (buttonhead), is a protein that in humans is encoded by the SP8 gene. Sp8 is a transcription factor in the Sp/KLF family.
Transcription factor II A
Transcription factor TFIIA is a nuclear protein involved in the RNA polymerase II-dependent transcription of DNA. TFIIA is one of several general (basal) transcription factors (GTFs) that are required for all transcription events that use RNA polymerase II. Other GTFs include TFIID, a complex composed of the TATA binding protein TBP and TBP-associated factors (TAFs), as well as the factors TFIIB, TFIIE, TFIIF, and TFIIH. Together, these factors are responsible for promoter recognition and the formation of a transcription preinitiation complex (PIC) capable of initiating RNA synthesis from a DNA template.
Transcription factor II B
Transcription factor II B (TFIIB) is a general transcription factor that is involved in the formation of the RNA polymerase II preinitiation complex (PIC) and aids in stimulating transcription initiation. TFIIB is localised to the nucleus and provides a platform for PIC formation by binding and stabilising the DNA-TBP (TATA-binding protein) complex and by recruiting RNA polymerase II and other transcription factors. It is encoded by the TFIIB gene, and is homologous to both archaeal transcription factor B and more distantly to bacterial sigma factors.
Transcription factor II D
Transcription factor II D (TFIID) is one of several general transcription factors that make up the RNA polymerase II preinitiation complex. RNA polymerase II holoenzyme is a form of eukaryotic RNA polymerase II that is recruited to the promoters of protein-coding genes in living cells. It consists of RNA polymerase II, a subset of general transcription factors, and regulatory proteins known as SRB proteins. Before the start of transcription, the transcription factor II D (TFIID) complex binds to the TATA box in the core promoter of the gene.
Transcription factor II E
Transcription factor II E (TFIIE) is one of several general transcription factors that make up the RNA polymerase II preinitiation complex. It is a tetramer of two alpha and two beta chains and interacts with TAF6/TAFII80, ATF7IP, and varicella-zoster virus IE63 protein. TFIIE recruits TFIIH to the initiation complex and stimulates the RNA polymerase II C-terminal domain kinase and DNA-dependent ATPase activities of TFIIH. Both TFIIH and TFIIE are required for promoter clearance by RNA polymerase. Transcription factor II E is encoded by the GTF2E1 and GTF2E2 genes. TFIIE is thought to be involved in DNA melting at the promoter: it contains a zinc ribbon motif that can bind single-stranded DNA.
Transcription factor II F
Transcription factor IIF (TFIIF) is one of several general transcription factors that make up the RNA polymerase II preinitiation complex. Transcription factor IIF is encoded by the GTF2F1, GTF2F2, and GTF2F2L genes. TFIIF binds to RNA polymerase II when the enzyme is not bound to any other transcription factor, thus preventing it from contacting DNA outside the promoter. Furthermore, TFIIF stabilizes RNA polymerase II while it is contacting TBP and TFIIB.
Transcription factor II H
Transcription factor II H (TFIIH) is an important protein complex, having roles in transcription of various protein-coding genes and DNA nucleotide excision repair (NER) pathways. TFIIH first came to light in 1989 when general transcription factor-δ or basic transcription factor 2 was characterized as an indispensable transcription factor in vitro. This factor was also isolated from yeast and finally named TFIIH in 1992. TFIIH consists of ten subunits, seven of which (ERCC2/XPD, ERCC3/XPB, GTF2H1/p62, GTF2H4/p52, GTF2H2/p44, GTF2H3/p34 and GTF2H5/TTDA) form the core complex. The cyclin-activating kinase subcomplex (CDK7, MAT1, and cyclin H) is linked to the core via the XPD protein. Two of the subunits, ERCC2/XPD and ERCC3/XPB, have helicase and ATPase activities and help create the transcription bubble. In a test tube these subunits are only required for transcription if the DNA template is not already denatured or if it is supercoiled.
Two other TFIIH subunits, CDK7 and cyclin H, phosphorylate serine amino acids on the RNA polymerase II C-terminal domain and possibly other proteins involved in the cell cycle. In addition to its vital function in transcription initiation, TFIIH is also involved in nucleotide excision repair.
Twist transcription factor
Twist-related protein 1 (TWIST1) also known as class A basic helix-loop-helix protein 38 (bHLHa38) is a basic helix-loop-helix transcription factor that in humans is encoded by the TWIST1 gene.