Genetics/Response element classes

From Wikiversity

Identifying a bona fide response element takes more than simple inspection of a sequence. Before a response element can be attributed to a candidate sequence, observations are needed from molecular, biological and biophysical methods and from functional approaches. Such findings may indicate that a response element in the promoter is a functional element.[1]

A likely response element found by simple inspection may also be inactive due to methylation.

Response Elements: "Nucleotide sequences, usually upstream, which are recognized by specific regulatory transcription factors, thereby causing gene response to various regulatory agents. These elements may be found in both promoter and enhancer regions."[2]

"Under conditions of stress, a transcription activator protein binds to the response element and stimulates transcription. If the same response element sequence is located in the control regions of different genes, then these genes will be activated by the same stimuli, thus producing a coordinated response."[3]

Def. nucleotide "sequences, usually upstream, which are recognized by specific regulatory transcription factors, thereby causing gene response to various regulatory agents", [that] "may be found in both promoter and enhancer regions"[4] are called response elements.

Basic helix–loop–helix

Basic helix–loop–helix structural motif of aryl hydrocarbon receptor nuclear translocator (ARNT). Two α-helices (blue) are connected by a short turn loop (red).[5] Credit: Thomas Splettstoesser.

A basic helix–loop–helix (bHLH) is a protein structural motif that characterizes one of the largest families of dimerizing transcription factors.[6][7][8][9]

bHLH transcription factors are often important in development or cell activity. For example, BMAL1-Clock is a core transcription complex in the molecular circadian clock. Others, such as c-Myc and HIF-1, have been linked to cancer because of their effects on cell growth and metabolism.

The motif is characterized by two α-helices connected by a loop. Transcription factors containing this domain are generally dimeric, each subunit having one helix with basic amino acid residues that facilitate DNA binding.[10] One helix is usually smaller and, owing to the flexibility of the loop, allows dimerization by folding and packing against another helix. The larger helix typically contains the DNA-binding regions. bHLH proteins typically bind to a consensus sequence called an E-box, CANNTG, where N is any nucleotide.[11] The canonical E-box is the palindromic CACGTG; however, some bHLH transcription factors, notably those of the bHLH-PAS (PAS domain) family, bind to related non-palindromic sequences that resemble the E-box. bHLH TFs may homodimerize or heterodimerize with other bHLH TFs, forming a large variety of dimers, each with specific functions.[12]
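As a rough illustration of what "simple inspection" for an E-box amounts to, the following Python sketch scans a DNA string for the consensus CANNTG and marks which hits are palindromic (equal to their own reverse complement), as CACGTG is. The function names and the example sequence are hypothetical and chosen only for illustration. A match found this way is a candidate site, not evidence of function.

```python
import re

# Consensus from the text: CANNTG, with N read as any of A, C, G, T.
# The lookahead allows overlapping matches to be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_eboxes(seq):
    """Return (position, site, is_palindromic) for every CANNTG match."""
    seq = seq.upper()
    hits = []
    for match in EBOX.finditer(seq):
        site = match.group(1)
        hits.append((match.start(), site, site == reverse_complement(site)))
    return hits


if __name__ == "__main__":
    # Made-up sequence, for illustration only.
    example = "TTCACGTGAACAGGTGTTCATATGCC"
    for pos, site, palindromic in find_eboxes(example):
        label = "palindromic" if palindromic else "non-palindromic"
        print(pos, site, label)
```

Because the reverse complement of any CANNTG site is itself a CANNTG site, scanning one strand is enough to find candidates on both strands.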

Basic helix-loop-helix leucine zipper

Basic helix-loop-helix leucine zipper (bHLH-ZIP) transcription factors contain both a basic helix-loop-helix (bHLH) motif and a leucine zipper (ZIP) motif.

Basic helix-span-helix

Basic helix-span-helix (bHSH) is a basic domain response element.

Basic leucine zipper domain

CREB (top) is a transcription factor capable of binding DNA via the bZIP domain (bottom) and regulating gene expression. Credit: Yikrazuul.

Proteins containing the basic leucine zipper (bZIP) domain are transcription factors.[13][14]

bZIP transcription factors are found in all eukaryotes and form one of the largest families of dimerizing TFs.[15][16] An evolutionary study from 2008 found that the genome of the most recent common ancestor of all plants encoded four bZIP genes.[17] Interactions between bZIP transcription factors are numerous and complex[18][19][15] and play important roles in cancer development in epithelial tissues,[20] steroid hormone synthesis by cells of endocrine tissues,[21] factors affecting reproductive functions,[22] and several other phenomena that affect human health.

β-Scaffold factors with minor groove contacts


This class includes Rel homology region, STAT, p53-like, MADS box, TATA-binding protein, high-mobility group, Grainyhead, cold-shock domain, and Runt factors.

Catabolite activators

General regulatory factors

"General regulatory factors (GRFs), such as Reb1, Abf1, Rap1, Mcm1, and Cbf1, positionally organize yeast chromatin through interactions with a core consensus DNA sequence."[23]

"These factors (Reb1, Abf1, Mcm1, Rap1, and Cbf1) organize nucleosomes and are referred to as general regulatory factors (GRFs) (Yu and Morse 1999; Yarragudi et al. 2004; Raisner et al. 2005; Badis et al. 2008; Hartley and Madhani 2009; Hughes and de Boer 2013). By directing nucleosome organization, GRFs help maintain nucleosome-free promoter regions (NFRs) (Badis et al. 2008; Hartley and Madhani 2009), thereby giving the transcription machinery access to the DNA. While GRF binding and their cognate sites are enriched within promoter NFRs (Rhee and Pugh 2011), thousands of additional seemingly equivalent motifs are not bound. They reside both within NFR regions and in nucleosome-encased gene bodies. While, in principle, other proteins might prevent binding, this premise has not been experimentally verified on a genomic scale."[23]

Two "general regulatory factors, Abf1 and Rap1, [contribute] to nucleosome occupancy in Saccharomyces cerevisiae. These factors have each been shown to bind to a few hundred promoters, but [...] thousands of loci show localized regions of altered nucleosome occupancy within 1 h of loss of Abf1 or Rap1 binding, and that altered chromatin structure can occur via binding sites having a wide range of affinities."[24]

"DNA-binding transcription factors can be inhibited from binding nucleosomal sites in some cases, but in other circumstances can out-compete histones for their binding sites, thus creating regions of open chromatin (19,20). Factors in the latter category have the potential to dictate chromatin structure at a significant portion of the genome if their binding sites are widespread. In yeast, a small group of multifunctional, DNA-binding proteins termed General Regulatory Factors (GRFs), including Abf1, Rap1 and Reb1, have this potential".[24]

SGT1 is a protein that in humans is encoded by the ECD gene.[25][26][27]


Helix-turn-helix

The λ repressor of bacteriophage lambda employs two helix-turn-helix motifs (left; green) to bind DNA (right; blue and red). The λ repressor protein in this image is a dimer. Credit: Zephyris.

The helix-turn-helix (HTH) is a major structural motif capable of binding DNA. Each monomer incorporates two α helices, joined by a short strand of amino acids, that bind to the major groove of DNA. The motif occurs in many proteins that regulate gene expression.[28] It was discovered through similarities between several genes encoding transcription regulatory proteins from bacteriophage lambda and Escherichia coli: Cro, catabolite activator protein (CAP), and cI protein (λ repressor), which were found to share a common 20–25 amino acid sequence that facilitates DNA recognition.[29][30][31][32]

Recognition of and binding to DNA by helix-turn-helix proteins is carried out by the two α helices, one at the N-terminal end of the motif and the other at the C-terminus. In most cases, such as in the Cro repressor, the second helix contributes most to DNA recognition and is therefore often called the "recognition helix". It binds to the major groove of DNA through a series of hydrogen bonds and various van der Waals interactions with exposed bases. The other α helix stabilizes the interaction between protein and DNA but does not play a particularly strong role in recognition.[29] The recognition helix and its preceding helix always have the same relative orientation.[33]

Several attempts have been made to classify the helix-turn-helix motifs based on their structure and the spatial arrangement of their helices.[33][34][35] Some of the main types are described below.

The di-helical helix-turn-helix motif is the simplest helix-turn-helix motif. A fragment of Engrailed homeodomain encompassing only the two helices and the turn was found to be an ultrafast independently folding protein domain.[36]

An example of the tri-helical helix-turn-helix motif is found in the transcriptional activator Myb.[37]

The tetra-helical helix-turn-helix motif has an additional C-terminal helix compared to the tri-helical motifs. These include the LuxR-type DNA-binding HTH domain found in bacterial transcription factors and the helix-turn-helix motif found in the TetR repressors.[38] Multihelical versions with additional helices also occur.[39]

The winged helix-turn-helix (wHTH) motif is formed by a 3-helical bundle and a 3- or 4-strand beta-sheet (wing). The topology of helices and strands in wHTH motifs may vary. In the transcription factor ETS, the wHTH folds into a helix-turn-helix motif on a four-stranded anti-parallel beta-sheet scaffold arranged in the order α1-β1-β2-α2-α3-β3-β4, where the third helix is the DNA recognition helix.[40][41]

Other derivatives of the helix-turn-helix motif include the DNA-binding domain found in MarR, a regulator of multiple antibiotic resistance, which forms a winged helix-turn-helix with an additional C-terminal alpha helix.[35][42]

Nuclear factor I (NF-I) is a family of closely related transcription factors that constitutively bind as dimers to specific sequences of DNA with high affinity.[43] Family members contain an unusual DNA binding domain that binds to the recognition sequence TTGGCXXXXXGCCAA.[44]
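The recognition sequence above can be turned into a simple pattern search. The sketch below assumes that each X in TTGGCXXXXXGCCAA stands for any nucleotide; the function name and the test sequence are hypothetical. It reports the position, the full matched site, and the five-nucleotide spacer for each candidate.

```python
import re

# Assumption: X in the text means "any nucleotide" (A, C, G or T).
# The lookahead allows overlapping candidates to be reported.
NFI_SITE = re.compile(r"(?=(TTGGC([ACGT]{5})GCCAA))")


def find_nfi_sites(seq):
    """Return (position, site, spacer) for each TTGGC-N5-GCCAA match."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in NFI_SITE.finditer(seq)]


if __name__ == "__main__":
    # Made-up sequence, for illustration only.
    print(find_nfi_sites("aaTTGGCatcgaGCCAAtt"))
```

The two half-sites, TTGGC and GCCAA, are reverse complements of each other, which is consistent with the factor binding as a dimer. As with any pattern search, a hit is only a candidate until it is tested experimentally.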

Pocket domains

The pocket protein family consists of three proteins:[45]

  • RB – Retinoblastoma protein
  • p107 – Retinoblastoma-like protein 1
  • p130 – Retinoblastoma-like protein 2

WD-40 repeat family

Ribbon diagram of the C-terminal WD40 domain of Tup1 (a transcriptional corepressor in yeast), which adopts a 7-bladed beta-propeller fold. Ribbon is colored from blue (N-terminus) to red (C-terminus).[46] Credit: WillowW.

"Receptor for activated C kinase (RACK1) is a highly conserved, eukaryotic protein of the WD-40 repeat family. [...] During Phaseolus vulgaris root development, RACK1 (PvRACK1) mRNA expression was induced by auxins, abscissic acid, cytokinin, and gibberellic acid."[47]

The WD40 repeat (also known as the WD or beta-transducin repeat) is a short structural motif of approximately 40 amino acids, often terminating in a tryptophan-aspartic acid (W-D) dipeptide.[48]

WD40 domain-containing proteins have 4 to 16 repeating units, all of which are thought to form a circularised beta-propeller structure (see the Tup1 ribbon diagram above).[49][50]

WD40-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control, autophagy and apoptosis.[51] The underlying common function of all WD40-repeat proteins is coordinating multi-protein complex assemblies, where the repeating units serve as a rigid scaffold for protein interactions. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (the beta subunit is a beta-propeller), the TAFII transcription factor, and E3 ubiquitin ligase.[49][50]

Zinc finger DNA-binding domains

Cys2His2, Cys4 (including nuclear receptors), Cys6, alternating composition, and WRKY.

See also

