Gene transcriptions/Boxes/CAATs

From Wikiversity
As representative of the Metazoa, here is an image of a twaite shad. Credit: Hans Hillewaert.

A "CCAAT box (also sometimes abbreviated a CAAT box or CAT box) is a distinct pattern of nucleotides"[1] along the template strand of DNA in eukaryotes.



A "repeating sequence of nucleotides that forms a transcription or a regulatory signal"[2] is a box.

Consensus sequences


In the direction of transcription on the template strand, the consensus sequence for a CAAT box is 3'-GGCCAATCT-5'.[1]

On the coding strand "(T/C)G ATTGG (T/C)(T/C)(A/G) was the sequence that favored CBF binding [in the mouse pro-α2(1) collagen promoter]."[3] On the template strand, this is 3'-(C/T)(A/G)(A/G)CCAATC(A/G)-5'. "[T]he favorable sequence for CBF binding was TG ATTGG (T/C)(T/C)(A/G)."[3]
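The relationship between the coding-strand and template-strand forms quoted above can be checked mechanically. The following is a minimal Python sketch, not part of the cited work: the helper names (`parse_consensus`, `reverse_complement`, `to_regex`) are hypothetical, and the sketch handles only the bracketed `(X/Y)` notation used on this page. It compares base order only and does not check the 3'/5' end labels, which follow this page's own convention.

```python
import re

# Complement table for single DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def parse_consensus(text):
    """Split a string such as '(T/C)G ATTGG (A/G)' into sets of allowed bases."""
    text = text.replace(" ", "").replace("-", "")
    positions = []
    for group, base in re.findall(r"\(([ACGT/]+)\)|([ACGT])", text):
        # `group` is filled for '(T/C)' style positions, `base` for single letters.
        positions.append(frozenset(group.split("/")) if group else frozenset(base))
    return positions

def reverse_complement(positions):
    """Complement every position, then reverse the order of the positions."""
    return [frozenset(COMPLEMENT[b] for b in p) for p in positions][::-1]

def to_regex(positions):
    """Render the positions as a regular expression with character classes."""
    return "".join(
        next(iter(p)) if len(p) == 1 else "[" + "".join(sorted(p)) + "]"
        for p in positions
    )

cbf_coding = parse_consensus("(T/C)G ATTGG (T/C)(T/C)(A/G)")
print(to_regex(cbf_coding))                      # [CT]GATTGG[CT][CT][AG]
print(to_regex(reverse_complement(cbf_coding)))  # [CT][AG][AG]CCAATC[AG]
```

Under these assumptions, the reverse complement of the coding-strand motif has the same base order as the template-strand string (C/T)(A/G)(A/G)CCAATC(A/G) given above.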

Core promoters


Notation: let the symbol CBF represent the CAAT-box binding factor.

A CAAT box when present occurs "upstream by 75-80 bases to the initial transcription site."[1]

"In many eukaryotic class II promoters, CCAAT motifs are often found between 50 and 100 nucleotides upstream of the transcription start site (17-20), and these motifs are recognized by different classes of CCAAT-binding proteins, one of which is CBF."[4]

"In many higher eukaryotic class II promoters, CCAAT motifs (or ATTGG motifs in the opposite strand), are often found between −50 and −110 relative to the start of transcription (1-4). The precise location of these CCAAT motifs and the promoter sequences around the motif of a specific gene are highly conserved during evolution."[3]

"In metazoa, the CBF-DNA complex is characterized by its requirement for a high degree of conservation within the binding motif CCAAT (7, 21, 22), and sequences surrounding the pentameric motif contribute to the binding specificity (Ref. 16 and references therein)."[4]

"Computer analysis of 502 unrelated RNA polymerase II promoter regions showed that approximately 30% of the promoters contained a CCAAT sequence (or ATTGG sequence on the complementary strand) and that in a large number of vertebrate promoters the CCAAT motif was located around nucleotide −80 upstream of the transcription start site (4)."[3]
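The positional statements quoted above (CCAAT or ATTGG between −50 and −110, often near −80) suggest a simple scan of the region upstream of a transcription start site. The sketch below is illustrative only: the function name `ccaat_positions`, the default window, and the toy sequence are assumptions made for this example, not data from the cited studies.

```python
import re

def ccaat_positions(upstream, window=(-110, -50)):
    """Return (position, motif) pairs for CCAAT or ATTGG inside the window.

    `upstream` is assumed to be the coding-strand sequence that ends
    immediately before the transcription start site, so its last base
    is at position -1.
    """
    hits = []
    n = len(upstream)
    # The lookahead lets overlapping motifs be reported as well.
    for m in re.finditer(r"(?=(CCAAT|ATTGG))", upstream.upper()):
        pos = m.start() - n  # first base of the motif, relative to the start site
        if window[0] <= pos <= window[1]:
            hits.append((pos, m.group(1)))
    return hits

# Toy sequence (made up): 120 bases with CCAAT starting at position -80.
toy = "G" * 40 + "CCAAT" + "G" * 75
print(ccaat_positions(toy))  # [(-80, 'CCAAT')]
```

Reporting ATTGG alongside CCAAT covers the case where the motif lies on the opposite strand, as the quoted passages describe.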

"[I]n most of these promoters the flanking sequences of ATTGG were TG on the 5′ side and (T/C)(T/C)(A/G) on the 3′ side".[3]

"[T]he CCAAT-flanking sequences [occur] around the CCAAT motifs in most eukaryotic promoters harboring a CCAAT sequence in these proximal promoters."[3]

"In contrast to many animal CCAAT motifs, the majority of the plant sequences contain only one C or lack a CAAT-box completely."[4]

Gene transcriptions


"Genes that have this element seem to require it for the gene to be transcribed in sufficient quantities. It is frequently absent from genes that encode proteins used in virtually all cells. This box along with the GC box is known for binding general transcription factors. CAAT and GC are primarily located in the region from 100-150bp upstream from the TATA box. Both of these consensus sequences belong to the regulatory promoter. Full gene expression occurs when transcription activator proteins bind to each module within the regulatory promoter. Protein specific binding is required for the CCAAT box activation. These proteins are known as CCAAT box binding proteins/CCAAT box binding factors."[1]



"Transcriptional downregulation of E-cadherin appears to be an important event in the progression of various epithelial tumors. SIP1 (ZEB-2) is a Smad-interacting, multi-zinc finger protein that shows specific DNA binding activity. [Expression] of wild-type but not of mutated SIP1 downregulates mammalian E-cadherin transcription via binding to both conserved E2 boxes of the minimal E-cadherin promoter."[5]

"Analysis of mouse and human E-cadherin promoters revealed a conserved modular structure with positive regulatory elements including two E2 boxes (CACCTG) with a potential repressor role (Behrens et al., 1991; Giroldi et al., 1997)."[5]

"The two E2 boxes in the mouse and human E-cadherin promoter sequences were demonstrated to play a crucial role in the epithelial-specific expression of E-cadherin (Behrens et al., 1991; Giroldi et al., 1997). Mutation of these sequence elements results in upregulation of the E-cadherin promoter in dedifferentiated cancer cells, whereas the wild-type promoter shows low activity in such cells. Recently, it was shown that the zinc finger transcriptional repressor Snail can downregulate E-cadherin by binding to the E boxes in the E-cadherin promoter (Batlle et al., 2000; Cano et al., 2000). Human Snail belongs to a family of zinc finger proteins, which contain four or five zinc finger domains of the C2H2 type at their C-terminal end. These zinc fingers bind to the CANNTG sequence in E box motifs."[5]

"δEF1 and SIP1 have been shown to bind spaced CACCT DNA sequences, including E2 boxes (CACCTG), by their zinc finger clusters (Remacle et al., 1999)."[5]

"To address the specificity of SIP1 action, mutagenesis of the E-cadherin promoter in either its upstream E2 box 1 (−75) or its downstream E2 box 3 (−25), or in both E2 boxes was performed [...]."[5]

Wild-type "SIP1 represses the E-cadherin promoter, likely through binding via both zinc finger clusters to spaced E2 boxes as demonstrated previously (Remacle et al., 1999) and confirmed here by a DNA-mediated pull-down assay of SIP1 protein [...]. Wild-type but not mutated SIP1 from transfected human cells could be efficiently precipitated by biotinylated E-cadherin promoter oligonucleotides, comprising two wild-type E2 box sequences. Mutation of the E2 boxes resulted in the loss of SIP1 binding."[5]

Human E2 boxes are E2-box 1 (GCAGGTGA), E2-box 2 (TGGCCGGC) and E2-box 3 (TCACCTGG).[5]
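The three human sequences listed above can be compared with the E2 box core (CACCTG) and the general E box motif (CANNTG) quoted earlier in this section. The sketch below is a minimal check under the assumption that a box may carry the core either as written or on the opposite strand; the names `classify` and `human_e2_boxes` are hypothetical.

```python
import re

E_BOX = re.compile(r"CA[ACGT]{2}TG")  # general E box motif, CANNTG
E2_CORE = "CACCTG"                    # E2 box core from the quoted text

def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def classify(seq):
    """Report how, if at all, the E2 core appears in `seq`."""
    if E2_CORE in seq:
        return "CACCTG as written"
    if E2_CORE in reverse_complement(seq):
        return "CACCTG on the opposite strand"
    return "no CACCTG core"

# Sequences as listed on this page for the human E-cadherin promoter.
human_e2_boxes = {
    "E2-box 1": "GCAGGTGA",
    "E2-box 2": "TGGCCGGC",
    "E2-box 3": "TCACCTGG",
}

for name, seq in human_e2_boxes.items():
    print(name, seq, classify(seq), "| matches CANNTG:", bool(E_BOX.search(seq)))
```

With the sequences as listed, E2-box 1 and E2-box 3 carry the core in opposite orientations and E2-box 2 carries none, which is consistent with the remark quoted in this section that E2 box 2 is not conserved in the human sequence.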

"Alignment of the E-cadherin promoter sequences of dog, mouse, and man. Conserved regulatory elements are indicated: E2 boxes 1 and 3, CCAAT box, and GC box. The E2 box 2 has been described as part of a palindromic E-pal sequence in the mouse E-cadherin promoter (Behrens et al., 1991), but is conserved neither in canine nor in human sequences."[5]

Human NeuroD (BETA2/BHF1) genes


"There was no consensus CAAT box. [...] In addition, we performed mutation analyses of the E2 box and the E3 box to evaluate whether the E2 and E3 boxes regulate the transcriptional activity of the human NeuroD gene [...]."[6]

Human glucocerebrosidase genes


The "5′ genomic sequences revealed promoter elements containing a TATA box at nucleotides −23 to −27 and a CAAT box between nucleotides [...] and an E2 box [...]."[7]

Cap signal elements


"Studies have reported that the cap signal element with the TATA-box, CAAT-box, and GC-box is the most general element of the POL II promoter and exists in major protein [...]."[8]


  1. A1BG is not transcribed by a CAAT box.

A1BG samplings


A CCAAT box (also sometimes abbreviated a CAAT box or CAT box) is a distinct pattern of nucleotides along the template strand of DNA in eukaryotes.

On the template strand, the CAAT box consensus sequence is 3'-(C/T)(A/G)(A/G)CCAATC(A/G)-5'.

The Basic programs (starting with SuccessablesCAAT.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The sequences the programs looked for, and the numbers found, are:

  1. CAAT - 3'-(C/T)(A/G)(A/G)CCAATC(A/G)-5', -- there are zero, -+ there are zero, +- there are zero, and ++ there are zero.
  2. CAAT - 3'-(A/G)(C/T)(C/T)GGTTAG(C/T)-5', complement, -- there are zero, -+ there are zero, +- there are zero, and ++ there are zero.
  3. CAAT - 3'-(A/G)CTAACC(A/G)(A/G)(C/T)-5', inverse, -- there are zero, -+ there are zero, +- there are zero, and ++ there are zero.
  4. CAAT - 3'-(C/T)GATTGG(C/T)(C/T)(A/G)-5', complement inverse, -- there are zero, -+ there are zero, +- there are zero, and ++ there are zero.

With each SuccessablesCAAT.bas extended from 958 to 4445 nts starting just beyond ZNF497, there are no changes in results.

No CAAT boxes occur on either side of A1BG.
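The four search patterns in the list above (as written, complement, inverse, complement inverse) can all be derived from the first one. The following Python sketch shows one way to generate them and count matches in a single sequence string. It is not SuccessablesCAAT.bas and makes no claim about how that program works; the names `variants`, `to_regex`, and `count_hits` and the toy sequence are assumptions made for this example. Covering the four strand and direction combinations (--, -+, +-, ++) would mean running the count on each of the corresponding sequences, which are not reproduced here.

```python
import re

COMP = str.maketrans("ACGT", "TGCA")

# Consensus from the list above; each position is a string of allowed bases.
CAAT = ["CT", "AG", "AG", "C", "C", "A", "A", "T", "C", "AG"]

def variants(positions):
    """Build the four patterns named in the list from the first one."""
    comp = ["".join(sorted(p.translate(COMP))) for p in positions]
    return {
        "as written": positions,
        "complement": comp,
        "inverse": positions[::-1],
        "complement inverse": comp[::-1],
    }

def to_regex(positions):
    """Render positions as a regular expression with character classes."""
    return "".join(p if len(p) == 1 else f"[{p}]" for p in positions)

def count_hits(sequence, positions):
    """Count matches, including overlapping ones, in one sequence string."""
    return len(re.findall(f"(?={to_regex(positions)})", sequence.upper()))

# Toy sequence (made up) containing one match to the pattern as written.
toy = "AAACGACCAATCGTTT"
for name, pattern in variants(CAAT).items():
    print(name, to_regex(pattern), count_hits(toy, pattern))
```

A count of zero for every pattern on every strand and direction would correspond to the result reported above for the A1BG region.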


References
  1. "CAAT box". San Francisco, California: Wikimedia Foundation, Inc. April 8, 2013. Retrieved 2013-04-14.
  2. "Box (disambiguation)". San Francisco, California: Wikimedia Foundation, Inc. May 23, 2013. Retrieved 2013-06-15.
  3. Weimin Bi, Ling Wu, Françoise Coustry, Benoit de Crombrugghe and Sankar N. Maity (October 17, 1997). "DNA Binding Specificity of the CCAAT-binding Factor CBF/NF-Y". The Journal of Biological Chemistry 272 (42): 26562-72. doi:10.1074/jbc.272.42.26562. Retrieved 2013-04-14.
  4. Victor Kusnetsov, Martin Landsberger, Jörg Meurer and Ralf Oelmüller (December 10, 1999). "The Assembly of the CAAT-box Binding Complex at a Photosynthesis Gene Promoter Is Regulated by Light, Cytokinin, and the Stage of the Plastids". The Journal of Biological Chemistry 274 (50): 36009-14. doi:10.1074/jbc.274.50.36009. Retrieved 2013-04-14.
  5. Joke Comijn, Geert Berx, Petra Vermassen, Kristin Verschueren, Leo van Grunsven, Erik Bruyneel, Marc Mareel, Danny Huylebroeck and Frans van Roy (June 2001). "The Two-Handed E Box Binding Zinc Finger Protein SIP1 Downregulates E-Cadherin and Induces Invasion". Molecular Cell 7 (6): 1267-78. doi:10.1016/S1097-2765(01)00260-X. Retrieved 2019-01-11.
  6. Takafumi Miyachi, Hirofumi Maruyama, Takeshi Kitamura, Shigenobu Nakamura and Hideshi Kawakami (June 8, 1999). "Structure and regulation of the human NeuroD (BETA2/BHF1) gene". Molecular Brain Research 69 (2): 223-231. doi:10.1016/S0169-328X(99)00112-6. Retrieved 2019-02-02.
  7. Dan Moran, Emilia Galperin and Mia Horowitz (July 31, 1997). "Identification of factors regulating the expression of the human glucocerebrosidase gene". Gene 194 (2): 201-213. Retrieved 2019-02-02.
  8. Hyun-Jun Jang, Jin Won Choi, Young Min Kim, Sang Su Shin, Kichoon Lee and Jae Yong Han (November 2011). "Reactivation of Transgene Expression by Alleviating CpG Methylation of the Rous sarcoma virus Promoter in Transgenic Quail Cells". Molecular Biotechnology 49 (3): 222-228. doi:10.1007/s12033-011-9393-7. Retrieved 2019-02-02.