Gene transcriptions/Boxes/ATAs/Laboratory

Gene transcriptions are usually initiated by a transcription factor. The gene being considered in this laboratory is GeneID: 1, "A1BG alpha-1-B glycoprotein [ Homo sapiens ]". The transcription factors are the ATA boxes.

A laboratory is a specialized activity where a student, teacher, or researcher can have hands-on, or as close to hands-on as possible, experience actively analyzing an entity, source, or object of interest.

Usually, expensive equipment, instruments, and/or machinery are available for taking the entity apart to see and accurately record how it works, what it's made of, and where it came from. This may involve simple experiments to test reality, collect data, and attempt to make some sense out of it.

Expensive equipment can often be replaced with more readily available tools.


This is another gene transcription laboratory to test the transcription of A1BG by the transcription factor: ATA boxes.

Notations

Main source: Notations

You are free to create your own notation or use that already presented. A method to statistically assess your locator is also needed.

Laboratory control group

A laboratory control group of some large number of laboratory test subjects or results may be used to define normal limits for the presence of an effect.
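
One simple way to set a baseline for such limits is to ask how often a fixed six-nucleotide sequence would turn up by chance. The sketch below is only an illustration: the function name `expected_hits` is hypothetical, the model assumes independent and equally likely bases (real genomic sequence is not uniform), and the lengths 4460 and 858 nts are the intergenic distances quoted later on this page.

```python
def expected_hits(seq_length: int, motif_length: int, alphabet_size: int = 4) -> float:
    """Expected number of exact matches of one fixed motif in a random sequence.

    Assumption: bases are independent and equally likely, which is a
    simplification and not a property of real promoter sequence.
    """
    # Number of positions at which a motif of this length can start.
    windows = max(seq_length - motif_length + 1, 0)
    return windows / alphabet_size ** motif_length


# Intergenic lengths quoted on this page for the two sides of A1BG.
for label, length in (("ZSCAN22 side", 4460), ("ZNF497 side", 858)):
    print(label, round(expected_hits(length, 6), 2))
```

Under this simplified model, a count of about one match per six-nucleotide query on the longer side would not by itself be surprising, so observed counts are best compared against a baseline of this kind.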

Instructions

This laboratory is an activity for you to explore, create a method for examining, or examine an entity of your choosing. While it is part of the Gene project, it is also independent.

Some suggested entities to consider are

  1. available classification,
  2. human genes,
  3. eukaryotes,
  4. nucleotides,
  5. classical physics quantities,
  6. known gene expressions, or
  7. geometry.

More importantly, there are your entities.

You may choose to define your entities or use those already available.

Usually, research follows someone else's ideas of how to do something. But, in this laboratory you can create these too.

This is a gene project laboratory, but you may define what a laboratory, or a Gene project, is.

This laboratory is structured.

I will provide an example. The rest is up to you.

Questions, if any, are best placed on the Discuss page.

To include your participation in each of these laboratories, register at Wikiversity, create a subpage of your user page, and use this subpage, for example, your online name/laboratory effort.

Enjoy learning by doing!

Hypotheses

Main source: Hypotheses
  1. A1BG is not transcribed by an ATA box.
  2. ATA boxes have a role as downstream signal transducers in A1BG.
  3. ATA boxes may assist transcription of A1BG by other transcription factors.

Introduction

Main source: Introductions

The ATA box is a variant of the TATA box that appears in the globin and other genes. Instead of a sequence TATA as in the TATA box, the ATA box lacks the first thymine (T) and may be tissue specific.

"The 3' flanking area contained the highly conserved hexanucleotide sequence A-A-T-A-A-A found in eukaryotic messages between the terminator codon and the polyadenylylation site (44)."[1]

"ATA boxes [AATAAA] can be clearly identified in the chicken αA- and αD-globin genes about 70 bp upstream from the initiator ATG codon [...] The sequences of the proposed cap sites agree with those determined for other globin genes (Fig. 6A; Refs. 15, 24, and 32) as do their positions relative to the ATA boxes"[2]

An ATA box may have the sequence AAATAT.[3] The CArG box has the sequence CCTATTATGG.[3]

"The [Sminthopsis crassicaudata putative embryonic β-globin gene] ATA box, located 30 bp 5' to the putative cap site, is of the form AAATAAAA typically found in eutherian embryonic β-like globin genes. In sequence comparisons with ATA boxes from human, mouse, and [Didelphis virginiana] adult and embryonic β-like globin genes, the S.c-ε ATA box was found to most closely resemble that found in the D. virginiana ε-globin gene (Fig 4)."[4]

This suggests a consensus sequence of 3'-AAATA(A/T)A-5' on the template strand, or perhaps 3'-(A/C/G/T)AATA(A/T)A-5'.
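
A consensus written this way can be turned into a regular expression for searching. The following is a minimal sketch, assuming the parenthesized `(A/T)` notation used above; the helper name `consensus_to_regex` is hypothetical and not part of the laboratory's own programs.

```python
import re


def consensus_to_regex(consensus: str) -> str:
    """Turn a consensus such as 'AAATA(A/T)A' into the regex 'AAATA[AT]A'."""
    # Each parenthesized group of alternatives becomes a character class.
    return re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


pattern = re.compile(consensus_to_regex("AAATA(A/T)A"))
for candidate in ("AAATAAA", "AAATATA", "AAATACA"):
    # fullmatch requires the whole candidate to fit the consensus.
    print(candidate, bool(pattern.fullmatch(candidate)))
```

The looser form `(A/C/G/T)AATA(A/T)A` converts the same way, to `[ACGT]AATA[AT]A`.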

Human genes

Main sources: Genes/Human and Human genes

GeneID: 3043 HBB hemoglobin subunit beta [ Homo sapiens (human) ] The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5'-epsilon -- gamma-G -- gamma-A -- delta -- beta--3'.

Cystatin genes

The "four cystatin genes [GeneID: 1469 CST1, GeneID: 1470 CST2, GeneID: 1471 CST3, and GeneID: 1472 CST4] contain the ATA-box sequence (ATAAA) in their 5'-flanking regions; however, the CAT-box sequence (CAT), a binding site of the transcription factor, CTF, is found only in the 5'-flanking region of the S-type cystatin genes."[5]

β-thalassemia

"DNA sequence analysis of a cloned β-globin gene from a Chinese patient with β-thalassemia revealed a single nucleotide substitution (A→ G) within the ATA box homology and 28 base pairs upstream from the cap site."[6]

"Comparison of the level of β-globin transcripts in a variety of deletion mutants shows that for efficient transcription, both the ATA or Goldberg–Hogness box, and a region between 100 and 58 base pairs in front of the site at which transcription is initiated, are required. Deletion of either of these regions results in a decrease in the level of β-globin transcripts by an order of magnitude; deletion of the ATA box causes an additional loss in the specificity of the site of initiation of RNA synthesis. The DNA sequences downstream from the ATA box, including the natural β-globin mRNA cap site, are dispensable for transcription in vivo."[7]

"The first is a sequence rich in the nucleic acids adenine and thymine (the Goldberg-Hogness, "TATA," or "ATA" box) which is located 20-30 base pairs upstream from the RNA initiation site (the cap site which is the transcriptional start site for the mRNA) and is characterized by a concensus sequence (5'-TATAA-ATA-3')."[8]

Core promoters

The diagram shows an overview of the four core promoter elements B recognition element (BRE), TATA box, initiator element (Inr), and downstream promoter element (DPE), with their respective consensus sequences and their distance from the transcription start site.[9] Credit: Jennifer E.F. Butler & James T. Kadonaga.

The core promoter extends approximately 34 nts upstream from the TSS (to about -34).

From the first nucleotide just after ZSCAN22 to the first nucleotide just before A1BG are 4460 nucleotides. The core promoter on this side of A1BG extends from approximately 4425 to the possible transcription start site at nucleotide number 4460.

From the first nucleotide just after ZNF497 to the first nucleotide just before A1BG are 858 nucleotides. The core promoter on this side of A1BG extends from approximately 824 to the possible transcription start site at nucleotide number 858.

Def. "the factors, including RNA polymerase II itself, that are minimally essential for transcription in vitro from an isolated core promoter" is called the basal machinery, or basal transcription machinery.[10]

Proximal promoters

Def. a "promoter region [juxtaposed to the core promoter that] binds transcription factors that modify the affinity of the core promoter for RNA polymerase.[12][13]"[11] is called a proximal promoter.

The proximal sequence upstream of the gene that tends to contain primary regulatory elements is a proximal promoter.

It is approximately 250 base pairs [or nucleotides, nts] upstream of the [transcription] start site.

The proximal promoter begins about nucleotide number 4210 in the negative direction.

The proximal promoter begins about nucleotide number 608 in the positive direction.

Distal promoters

The "upstream regions of the human CYP11A and bovine CYP11B genes [have] a distal promoter in each gene. The distal promoters are located at −1.8 to −1.5 kb in the upstream region of the CYP11A gene and −1.5 to −1.1 kb in the upstream region of the CYP11B gene."[12]

"Using cloned chicken βA-globin genes, either individually or within the natural chromosomal locus, enhancer-dependent transcription is achieved in vitro at a distance of 2 kb with developmentally staged erythroid extracts. This occurs by promoter derepression and is critically dependent upon DNA topology. In the presence of the enhancer, genes must exist in a supercoiled conformation to be actively transcribed, whereas relaxed or linear templates are inactive. Distal protein–protein interactions in vitro may be favored on supercoiled DNA because of topological constraints."[13]

Distal promoter regions may be a relatively small number of nucleotides, fairly close to the TSS such as (-253 to -54)[14] or several regions of different lengths, many nucleotides away, such as (-2732 to -2600) and (-2830 to -2800).[15]

The "[d]istal promoter is not a spacer element."[16]

Using an estimate of 2 knts, a distal promoter to A1BG would be expected after nucleotide number 2460.

If there are any distal ATA boxes between ZNF497 and A1BG, they are inside the gene for ZNF497 as there are only 858 nts between them.
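
The nucleotide numbers quoted in the core, proximal, and distal promoter sections follow from subtracting a region length from the TSS position. The sketch below restates that arithmetic; the constant and function names are hypothetical, and the lengths (34, 250, and 2000 nts) are the approximate figures used on this page, so results agree with the quoted numbers only to within a nucleotide or so.

```python
CORE_LENGTH = 34        # nts upstream of the TSS (core promoter)
PROXIMAL_LENGTH = 250   # nts upstream of the TSS (proximal promoter)
DISTAL_ESTIMATE = 2000  # the 2 knts estimate for a distal promoter


def promoter_starts(tss: int) -> dict:
    """Approximate start of each region, counted from the neighboring gene.

    A negative value means the region would begin before the first
    intergenic nucleotide, i.e. inside or beyond the neighboring gene.
    """
    return {
        "core": tss - CORE_LENGTH,
        "proximal": tss - PROXIMAL_LENGTH,
        "distal": tss - DISTAL_ESTIMATE,
    }


print("ZSCAN22 side:", promoter_starts(4460))
print("ZNF497 side:", promoter_starts(858))
```

On the ZNF497 side the distal value comes out negative, which is the arithmetic behind the statement that any distal ATA boxes there would lie inside ZNF497.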

Samplings

Main sources: Models/Samplings and Samplings

Once you've decided on an entity, source, or object, compose a method, way, or procedure to explore it.

One way is to perceive (see, feel, hear, taste, or touch, for example) if there is more than one of them.

Ask some questions about it.

Does it appear to have a spatial extent?

Is there any change over time?

Can it be profiled with a kind of spectrum for example, by emitted radiation? Sample by plotting two or more apparent variables against each other, like intensity versus wavelength.

Is there some location, time, intensity, where there isn't one?

Regarding hypothesis 1: A1BG is not transcribed by an ATA box.

Basic programs (starting with SuccessablesATA.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). The programs, the sequence each looks for, the number of matches found, and the positions of the matches are:

  1. negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesATA--.bas, looking for 3'-AATAAA-5', found 1 at 1726,
  2. negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesATA-+.bas, looking for 3'-AATAAA-5', found 0,
  3. positive strand in the negative direction is SuccessablesATA+-.bas, looking for 3'-AATAAA-5', found 3 at 3014, 3335, and 4072,
  4. positive strand in the positive direction is SuccessablesATA++.bas, looking for 3'-AATAAA-5', found 0,
  5. complement, negative strand, negative direction is SuccessablesATAc--.bas, looking for 3'-TTATTT-5', found 3 at 3014, 3335, and 4072,
  6. complement, negative strand, positive direction is SuccessablesATAc-+.bas, looking for 3'-TTATTT-5', found 0,
  7. complement, positive strand, negative direction is SuccessablesATAc+-.bas, looking for 3'-TTATTT-5', found 1 at 1726,
  8. complement, positive strand, positive direction is SuccessablesATAc++.bas, looking for 3'-TTATTT-5', found 0,
  9. inverse complement, negative strand, negative direction is SuccessablesATAci--.bas, looking for 3'-TTTATT-5', found 5 at 3013, 3334, 4071, 4075, and 4221,
  10. inverse complement, negative strand, positive direction is SuccessablesATAci-+.bas, looking for 3'-TTTATT-5', found 1 at 703,
  11. inverse complement, positive strand, negative direction is SuccessablesATAci+-.bas, looking for 3'-TTTATT-5', found 1 at 4537,
  12. inverse complement, positive strand, positive direction is SuccessablesATAci++.bas, looking for 3'-TTTATT-5', found 0,
  13. inverse, negative strand, negative direction, is SuccessablesATAi--.bas, looking for 3'-AAATAA-5', found 1 at 4537,
  14. inverse, negative strand, positive direction, is SuccessablesATAi-+.bas, looking for 3'-AAATAA-5', found 0,
  15. inverse, positive strand, negative direction, is SuccessablesATAi+-.bas, looking for 3'-AAATAA-5', found 5 at 3013, 3334, 4071, 4075, and 4221,
  16. inverse, positive strand, positive direction, is SuccessablesATAi++.bas, looking for 3'-AAATAA-5', found 1 at 703.
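
The four search forms in the list (direct, complement, inverse, and inverse complement) and the scan itself can be sketched in a few lines of Python. This is not the original Basic code, which is not reproduced on this page; the names `variants` and `find_all`, the toy sequence, and the choice to report the 1-based position of the last nucleotide of each match are all assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def variants(motif: str) -> dict:
    """Return the four forms of a motif searched for in the list above."""
    return {
        "direct": motif,
        "complement": motif.translate(COMPLEMENT),
        "inverse": motif[::-1],
        "inverse complement": motif.translate(COMPLEMENT)[::-1],
    }


def find_all(sequence: str, motif: str) -> list:
    """1-based position of the last nucleotide of every match (overlaps allowed).

    Assumption: the reporting convention is a guess; the original programs
    may number matches differently.
    """
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start + len(motif))
        start = sequence.find(motif, start + 1)  # step by one to keep overlaps
    return hits


toy = "GGAAATAAATAAGG"  # made-up sequence, not A1BG data
for name, form in variants("AATAAA").items():
    print(name, form, find_all(toy, form))
```

In the toy sequence the inverse form AAATAA matches twice, four nucleotides apart, with a direct AATAAA match one nucleotide after the first; overlapping matches of this kind would give the same spacing as the reported positions 4071, 4072, and 4075.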

Regarding hypothesis 2: ATA boxes have a role as downstream signal transducers in A1BG.

Regarding hypothesis 3: ATA boxes may assist transcription of A1BG by other transcription factors.

Verifications

To verify that your sampling has explored something, you may need a control group. Perhaps where, when, or without your entity, source, or object may serve.

Another verifier is reproducibility. Can you replicate something about your entity in your laboratory more than 3 times? Five times is usually a beginning number to provide statistics (data) about it.

For an apparent one time or perception event, document or record as much information coincident as possible. Was there a butterfly nearby?

Has anyone else perceived the entity and recorded something about it?

Gene ID: 1 includes the nucleotides between neighboring genes and A1BG. These nucleotides can be loaded into files from either gene toward A1BG, and from template and coding strands. These nucleotide sequences can be found in Gene transcriptions/A1BG. Copying the ATA boxes discovered above and pasting the sequences into a text search ("⌘F") locates these sequences at the same nucleotide positions as found by the computer programs.
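
A further internal check is available because the two strands are complementary: searching one strand for a sequence should give the same positions as searching the other strand for its complement. The sketch below illustrates this on a made-up sequence; the function name `positions` and the toy strand are hypothetical, and 1-based start positions are used only for convenience, since the comparison does not depend on the numbering convention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def positions(sequence: str, motif: str) -> list:
    """1-based start positions of every occurrence of motif in sequence."""
    width = len(motif)
    return [
        i + 1
        for i in range(len(sequence) - width + 1)
        if sequence[i:i + width] == motif
    ]


positive = "CCAATAAAGGTTTATTCC"            # made-up coding strand
negative = positive.translate(COMPLEMENT)  # matching template strand

# Direct search on one strand versus complement search on the other.
print(positions(positive, "AATAAA"), positions(negative, "TTATTT"))
# Inverse-complement search on one strand versus inverse search on the other.
print(positions(positive, "TTTATT"), positions(negative, "AAATAA"))
```

Each printed pair should agree, which is the same pattern as programs that report identical positions for a sequence on one strand and its complement on the other.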

Core promoters ATA boxes

From the first nucleotide just after ZSCAN22 to the first nucleotide just before A1BG are 4460 nucleotides. The core promoter on this side of A1BG extends from approximately 4425 to the possible transcription start site at nucleotide number 4460.

There are no ATA boxes on the negative strand, negative direction in the core promoter between 4425 and 4460.

There is the following inverse ATA box on the negative strand, negative direction: 1, 3'-AAATAA-5' at 4537.

From the first nucleotide just after ZNF497 to the first nucleotide just before A1BG are 858 nucleotides. The core promoter on this side of A1BG extends from approximately 824 to the possible transcription start site at nucleotide number 858.

There are no ATA boxes in the positive direction in the core promoter between 824 and 858.

Proximal promoter ATA boxes

The proximal promoter begins about nucleotide number 4210 in the negative direction.

There is the following inverse ATA box on the positive strand, negative direction: 3'-AAATAA-5' at 4221.

The proximal promoter begins about nucleotide number 708 in the positive direction.

There are no ATA boxes in the proximal promoter between 708 and 858.

Distal promoter ATA boxes

Using an estimate of 2 knts, a distal promoter to A1BG would be expected after nucleotide number 2460.

There is the following ATA box on the negative strand in the negative direction: 1, 3'-AATAAA-5' at 1726 nts from ZSCAN22.

There are the following ATA boxes on the positive strand in the negative direction: 3, 3'-AATAAA-5' at 3014, 3'-AATAAA-5' at 3335, and 3'-AATAAA-5' at 4072.

There are the following inverse ATA boxes on the positive strand, negative direction: 4, 3'-AAATAA-5' at 3013, 3'-AAATAA-5' at 3334, 3'-AAATAA-5' at 4071, 3'-AAATAA-5' at 4075.

There is the following inverse ATA box on the positive strand, positive direction: 1, 3'-AAATAA-5' at 703.

Additional distal ATA boxes in the positive direction may be inside ZNF497 or beyond.

Transcribed ATA boxes

The "four cystatin genes [GeneID: 1469 CST1, GeneID: 1470 CST2, GeneID: 1471 CST3, and GeneID: 1472 CST4] contain the ATA-box sequence (ATAAA) in their 5'-flanking regions".[5]

For "efficient transcription [of the rabbit β-globin gene in vivo], both the ATA or Goldberg-Hogness box, and a region between 100 and 58 base pairs in front of the site at which transcription is initiated, are required."[7]

The "DNA sequences downstream from the Goldberg-Hogness box are not necessary for specific in vitro transcription because a variety of mutants lacking sequences extending from -20 (or about 4 nucleotides to the 3' side of the ATA box) to as far as + 164 (in the small intron) are transcribed effectively in vitro.12"[7]

"The promoter region of the gene contains an ATA box and a CCAAT box, which are located 32 and 74 bp upstream, respectively, from the transcription initiation site."[17]

"The conserved ATA box fixes the initiation of transcription to a point 30 bp downstream from its position in the 5' flanking sequence.67,86,87"[18]

"As is the case in BhO and phl, the ATA box in eucaryotic genes is typically located in an AT-rich region, approximately 30 bases from the mRNA cap site (Efstratiadis et al., 1980)."[19]

"The putative promoter region of the mouse K19 gene is highly homologous to the corresponding sequences of the human and bovine K19 genes. It contains an ATA box, a CAAT box and two potential Sp1-binding sites."[20]

"The analysis of the promoter region indicated that a putative ATA box is located 54 nucleotides upstream from the transcription start site".[21]

"In the promoter region, an ATA box and a CCAAT box are located 32 and 74 base pairs (bp), respectively, upstream from the transcription initiation site."[22]

Laboratory reports

Below is an outline for sections of a report, paper, manuscript, log book entry, or lab book entry. You may create your own, of course.

ATA boxes transcription laboratory

by Marshallsumter, 23:32, 18 November 2017 (UTC)

Abstract

Three hypotheses regarding the possibility of ATA boxes being transcription factors for GeneID: 1 alpha-1-B glycoprotein (A1BG) have been examined: (1) A1BG is not transcribed by any ATA boxes, (2) ATA boxes have a role as downstream signal transducers in A1BG, and (3) ATA boxes may assist transcription of A1BG by other transcription factors. These have been tested by searching the literature for articles that report ATA boxes in the promoter region of a particular human gene and by using a simple computer program to look for ATA boxes in the nucleotide sequences on either side of the A1BG gene. Both the template DNA strand and the coding strand have been checked. To show that these ATA boxes can be used during or for transcription of A1BG, at least one transcription factor has been found.

Introduction

According to one source, A1BG is transcribed from the direction of ZNF497: 3' - 58864890: CGAGCCACCCCACCGCCCTCCCTTGG+1GGCCTCATTGCTGCAGACGCTCACCCCAGACACTCACTGCACCGGAGTGAGCGCGACCATCATG : 58866601-5', where the second 'G' at left of four Gs in a row is the TSS, per Michael David Winther, Leah Christine Knickle, Martin Haardt, Stephen John Allen, Andre Ponton, Roberto Justo De Antueno, Kenneth Jenkins, Solomon O. Nwaka, and Y. Paul Goldberg, Fat Regulated Genes, Uses Thereof and Compounds for Modulating Same, US Patent Office, July 29, 2004, at http://www.google.com/patents?hl=en&lr=&vid=USPATAPP10416914&id=7iaVAAAAEBAJ&oi=fnd&printsec=abstract#v=onepage&q&f=false. Transcription was triggered in cell cultures and the transcription start site was found using reverse transcriptase, but the mechanism for transcription is unknown.

Controlling the transcription of A1BG may have significant immune function against snake envenomation. A1BG forms a complex that is similar to those formed between toxins from snake venom and A1BG-like plasma proteins. These inhibit the toxic effect of snake venom metalloproteinases or myotoxins and protect the animal from envenomation.[23]

Many transcription factors (TFs) may occur upstream and occasionally downstream of the transcription start site (TSS), in this gene's promoter. The following have been examined so far: (1) AGC boxes (GCC boxes), (2) CArG boxes, (3) enhancer boxes, (4) HY boxes, (5) metal responsive elements (MREs), and (6) STAT5s.

An AGC box was found in the distal promoter of either gene ZSCAN22 or A1BG on both the template and coding strands. But, as the only known transcription of A1BG occurs between Gene ID: 162968 ZNF497 and Gene ID: 1 A1BG, it is unlikely that this AGC box is naturally used to transcribe A1BG.

A full web search produced several references including a GeneCard[24] for "zinc finger protein 497" and "GCC box", including "May be involved in transcriptional regulation."[24] Zinc fingers are mentioned in association with GCC boxes in plants. It seems unlikely that an AGC box is involved in any way with the transcription of A1BG.

By combining a literature search with computer analysis of each promoter between ZSCAN22 and A1BG and ZNF497 and A1BG, CArG boxes have been found. To show that these CArG boxes may be used during or for transcription of A1BG at least one transcription factor has been affirmed.

A literature search of more recent results discovered: "Of the [Flowering Locus C] FLC binding sites, 69% contained at least one CArG-box motif with the core consensus sequence CCAAAAAT(G/A)G and an AAA extension at the 3′ end [. Three] other MADS-box flowering-time regulators, SOC1, SVP, and AGAMOUS-LIKE 24 (AGL24), bind to two different CArG-box motifs at 502 bp (CTAAATATGG) and 287 bp (CAATAATTGG) upstream of the translation start in the SEP3 gene (24), consistent with different specificities for the different MADS-box proteins."[25]

These together with the core motif CC(A/T)6GG suggest a more general CArG-box motif of (C(C/A/T)(A/T)6(A/G)G). Subsequent computer-program testing revealed two more general CArG boxes: 3'-CAAAAAAAAG-5' at 1399 nts from ZSCAN22 and 3'-CATTAAAAGG-5' at 3441 nts from ZSCAN22, but none within 958 nts toward A1BG from ZNF497.
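
The more general motif can be written directly as a regular expression and checked against the sequences quoted above. This is a small sketch for illustration; the constant name `GENERAL_CARG` is hypothetical, and the last entry in the list is a made-up non-matching sequence added as a negative control.

```python
import re

# C(C/A/T)(A/T)6(A/G)G written as a regular expression.
GENERAL_CARG = re.compile(r"C[CAT][AT]{6}[AG]G")

quoted = [
    "CCAAAAATGG",  # core consensus CCAAAAAT(G/A)G with G
    "CCAAAAATAG",  # core consensus CCAAAAAT(G/A)G with A
    "CTAAATATGG",  # SEP3 site at 502 bp
    "CAATAATTGG",  # SEP3 site at 287 bp
    "CAAAAAAAAG",  # found at 1399 nts from ZSCAN22
    "CATTAAAAGG",  # found at 3441 nts from ZSCAN22
    "CGAAAAAAGG",  # made-up negative control: G is not allowed in position 2
]
for box in quoted:
    print(box, bool(GENERAL_CARG.fullmatch(box)))
```

All of the quoted sequences fit the general motif, while the control does not, because its second position is G.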

These results show that the presence of CArG boxes on the ZSCAN22 side of A1BG implies their use when transcribing A1BG, although they may be pointing toward ZSCAN22. These suggest that the hypothesis (A1BG is not transcribed by a CArG box) is false. Regarding the second hypothesis (The lack of a CArG box on either side of A1BG does not prove that it is not actively used to transcribe A1BG), the presence of more general CArG boxes in the distal promoter tentatively confirms this hypothesis.

CArG boxes do occur in the distal promoter of A1BG. And, it is likely that a CArG box is involved in some way with the transcription of A1BG.

The presence of many enhancer boxes on both sides of A1BG demonstrates that the hypothesis: "A1BG is not transcribed by an enhancer box", is false.

The finding by literature search of evidence verifying that at least one transcription factor can enhance or inhibit the transcription of A1BG using one or more enhancer boxes disproves the hypothesis: "Existence of an enhancer box on either side of A1BG does not prove that it is actively used to transcribe A1BG".

Enhancer boxes do occur in the proximal and distal promoters of A1BG. And, it is likely that an enhancer box is involved in some way with the transcription of A1BG.

HY boxes were not found in either core promoters or the proximal promoters in either direction. However, HY boxes were found in the distal promoters between ZSCAN22 and A1BG. No genes are described in the literature so far as transcribed from HY boxes in any distal promoters.

Either A1BG can be transcribed by HY boxes in the distal promoter, or A1BG is not transcribed by HY boxes. As a Google Scholar advanced search found no literature confirming possible transcription from distal promoters, wet chemistry experiments are needed to test the possibility.

By combining a literature search with computer analysis of the promoter between ZSCAN22 and A1BG and ZNF497 and A1BG, metal responsive elements have been found. Literature search has also discovered at least three post-translational isoforms including the unaltered precursor. Although no metal responsive elements overlap any enhancer boxes in the distal promoter, there are elements in the distal promoter.

"The human genome is estimated to contain 700 zinc-finger genes, which perform many key functions, including regulating transcription. [Four] clusters of zinc-finger genes [occur] on human chromosome 19".[26]

Nearby zinc-fingers on chromosome 19 include ZNF497 (GeneID: 162968), ZNF837 (GeneID: 116412), and ZNF8 (GeneID: 7554).

"In rodents and in humans, about one third of the zinc-finger genes carry the Krüppel-associated box (KRAB), a potent repressor of transcription (Margolin et al. 1994), [...]. There are more than 200 KRAB-containing zinc-finger genes in the human genome, about 40% of which reside on chromosome 19 and show a clustered organization suggesting an evolutionary history of duplication events (Dehal et al. 2001)."[26]

ZNF8 is in cluster V along with A1BG.[26]

"In contrast to the four clusters considered [I through IV], one that occurs at the telomere of chromosome 19, which we will call cluster V, has been very stable [over mouse, rat, and human]."[26]

"Apart from the somewhat unexpected location of Zfp35 on mouse chromosome 18 and of the AIBG orthologs on mouse chromosome 15 and rat chromosome 7, there has been little rearrangement."[26]

So far no article has reported any linkage between zinc, including various zinc fingers, or cadmium, and A1BG.

Regarding additional isoforms, mention has been made of "new genetic variants of A1BG."[27]

"Proteomic analysis revealed that [a circulating] set of plasma proteins was α 1 B-glycoprotein (A1BG) and its post-translationally modified isoforms."[28]

Pharmacogenomic variants have been reported. There are A1BG genotypes.[29]

The A1BG variant rs893184 is included in a genetic risk score.[29]

"A genetic risk score, including rs16982743, rs893184, and rs4525 in F5, was significantly associated with treatment-related adverse cardiovascular outcomes in whites and Hispanics from the INVEST study and in the Nordic Diltiazem study (meta-analysis interaction P=2.39×10−5)."[29]

"rs893184 causes a histidine (His) to arginine (Arg) [nonsynonymous single nucleotide polymorphism (nsSNP), A (minor) for G (major)] substitution at amino acid position 52 in A1BG."[29]

For example, GeneID: 9 has isoforms: a, b, X1, and X2. Isoforms a and b each have variants. Variants 1-6 and 9 all encode the same isoform (a).

Variants 7, 8 and 10 all encode isoform b. Isoforms X1 and X2 are predicted.

Variants can differ in promoters, untranslated regions, or exons. For GeneID: 9: This variant (1) represents the longest transcript but encodes the shorter isoform (a). This variant is transcribed from a promoter known as P1, promoter 2, or NATb promoter.

This variant (2, also known as Type IID) lacks an alternate exon in the 5' UTR, compared to variant 1. This variant is transcribed from a promoter known as P1, promoter 2, or NATb promoter.

This variant (9, also known as Type IA) has a distinct 5' UTR and represents use of an alternate promoter known as the NATa or P3 promoter, compared to variant 1.

But, A1BG in NCBI Gene lists only one isoform, the gene locus itself, and the protein transcribed is a precursor subject to translational or more likely post-translational modifications.

The presence of multiple MREs coupled with experimental results from the literature indicating post-translational isoforms tends to confirm the existence of two or more isoforms for A1BG.

Regarding hypothesis 1: STAT5s have a role as downstream signal transducers in A1BG.

The only known TSS for A1BG lies at 858 nts from ZNF497 toward A1BG. A STAT5 transcription site lies at 3'-TTCCGGGAA-5' at 808 in the proximal promoter, i.e. from 800 (-58) to 808 (-50). This suggests that STAT5 assists in the transcription of A1BG.

"Computer analysis of the 2.3 kb rat a1bg promoter fragment revealed [a] Stat5 [site] at [...] −69/−61 [...]."[30]

The murine downstream promoter element is only 11 nts displaced from the human one. This suggests a STAT5 participation in human gene transcription of A1BG.

Regarding hypothesis 2: A1BG is not transcribed by any STAT5s.

"Computer analysis of the 2.3 kb rat a1bg promoter fragment revealed two putative Stat5 sites [...] at −2077/−2069 [and] −69/−61 [...]."[30]

There are two STAT5s on the negative strand in the negative direction, 3'-AAGCAACTT-5' at 3506 (-954) and 3'-AAGGGACTT-5' at 3782 (-678) in the distal promoter between ZSCAN22 and A1BG. Although much closer than their likely murine counterparts, they are on the other side of A1BG from the STAT5 site confirming hypothesis 1. If active in humans or murine-like STAT5s occur within or beyond ZNF497 in this distal promoter, then human A1BG is transcribed using STAT5 promoters disproving hypothesis 2.
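
The offsets in parentheses are the signed distance of each position from the TSS on its side of A1BG. A minimal sketch of that conversion follows; the function name `offset_from_tss` is hypothetical, and the positions and TSS values are the ones quoted in the text above.

```python
def offset_from_tss(position: int, tss: int) -> int:
    """Signed distance of a position from the TSS (negative means upstream)."""
    return position - tss


# (position, TSS) pairs quoted above: the ZNF497 side uses TSS 858,
# the ZSCAN22 side uses TSS 4460.
for position, tss in ((800, 858), (808, 858), (3506, 4460), (3782, 4460)):
    print(position, offset_from_tss(position, tss))
```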

A Google Scholar search using ZNF497 with STAT5 found no articles discussing STAT5 sites inside or associated with ZNF497. To confirm they exist, a data file going say 3,000 nts away from A1BG into ZNF497 needs to be created and tested for a distal promoter on this side.

Regarding hypothesis 3: STAT5s may assist transcription of A1BG by other transcription factors.

Literature search has found that STAT5s assist transcription of A1BG by other transcription factors.[30] The proximal STAT5 promoter is -58 to -50 from A1BG TSS. If another STAT5 promoter is at -2.3 kb, it is about -1.4 kb inside ZNF497 which is 3212 nts long. Per analogy to the rat this would be expected.[30]

Per earlier laboratories, transcription factors may occur in the distal promoters on the ZNF497 side of A1BG for

  1. AGC boxes,
  2. CArG boxes,
  3. Enhancer boxes,
  4. HY boxes,
  5. MREs and
  6. STAT5s.

The STAT5 promoter on the other side of A1BG (at about +3 kb) is way beyond -2.1 kb through ZNF497, unless the DNA is folded to allow the STAT5 on the ZSCAN22 side to be used in analogy to the STAT5 on the same side as in the rat.[30]

It isn't known which, if any, assist in locating and affixing the transcription mechanism for A1BG. This examination is the first to test one such DNA-occurring TF: the ATA boxes.

Experiments

Computer programs were written and run on the positive and negative strands between ZSCAN22 or ZNF497 and A1BG.

Regarding hypothesis 1: A1BG is not transcribed by any ATA boxes.

Here, the experiments have two parts: (1) are there any ATA box promoters? and (2) are any of these used to transcribe A1BG?

The Basic programs (starting with SuccessablesATA.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-), or coding strand (+), of the DNA, in the negative direction (-), or the positive direction (+) looking for 16 possible types of promoters.
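
The 16 types arise from four search forms, two strands, and two directions. The sketch below enumerates them and rebuilds the program file names following the naming pattern used in the Samplings section; the dictionary and function names are hypothetical, and the suffix scheme is inferred from those file names.

```python
from itertools import product

# Suffixes inferred from the program file names used in this laboratory.
VARIANTS = {"": "direct", "c": "complement", "ci": "inverse complement", "i": "inverse"}
STRANDS = {"-": "negative (template) strand", "+": "positive (coding) strand"}
DIRECTIONS = {"-": "negative direction (from ZSCAN22)", "+": "positive direction (from ZNF497)"}


def program_names(base: str = "SuccessablesATA") -> list:
    """Build one file name per combination of variant, strand, and direction."""
    return [
        f"{base}{variant}{strand}{direction}.bas"
        for variant, strand, direction in product(VARIANTS, STRANDS, DIRECTIONS)
    ]


names = program_names()
print(len(names))
for name in names:
    print(name)
```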

Regarding hypothesis 2: ATA boxes have a role as downstream signal transducers in A1BG.

To test for possible downstream signal transducers, each program tested at least 100 nts downstream of each TSS.
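
Selecting the matches that fall in this downstream window is a simple filter on position. The following sketch assumes a 100-nt window and uses match positions reported on this page for the ZSCAN22 side (TSS at 4460); the function name `downstream_hits` is hypothetical.

```python
def downstream_hits(hits, tss, window=100):
    """Keep hits lying 1 to `window` nts past the TSS, with their offsets."""
    return [(pos, pos - tss) for pos in hits if 0 < pos - tss <= window]


# Positions reported for the inverse complement form (all before the TSS).
print(downstream_hits([3013, 3334, 4071, 4075, 4221], 4460))
# Position reported for the inverse form on the negative strand.
print(downstream_hits([4537], 4460))
```

Only the match at 4537 passes the filter, at 77 nts past the TSS.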

Regarding hypothesis 3: ATA boxes may assist transcription of A1BG by other transcription factors.

Experiments consist of literature searches which seek transcription of A1BG by ATA boxes and other transcription factors already found.

Results

Regarding hypothesis 1: A1BG is not transcribed by any ATA boxes.

There are no ATA boxes on the negative strand, negative direction in the core promoter between 4425 and 4460.

There are no ATA boxes in the positive direction in the core promoter between 824 and 858.

There is the following inverse complement ATA box on the negative strand, negative direction in the proximal promoter: 3'-TTTATT-5', 4221.

There are no ATA boxes in the proximal promoter between 708 and 858. But, there is an inverse complement ATA box in the negative strand, positive direction of the proximal promoter: 3'-TTTATT-5' at 703.

There are the following complement ATA boxes on the negative strand, negative direction in the distal promoter: 3'-TTATTT-5' at 3014, 3335, and 4072.

There are the following inverse ATA boxes on the negative strand, negative direction in the distal promoter: 3'-TTTATT-5' at 3013, 3334, 4071, and 4075.

There is the following ATA box on the negative strand, negative direction in the distal promoter: 3'-AATAAA-5' at 1726 nts from ZSCAN22.

There is the following inverse ATA box on the positive strand, positive direction: 3'-AAATAA-5' at 703. Additional distal ATA boxes in the positive direction may be inside ZNF497 or beyond.

"The conserved ATA box fixes the initiation of transcription to a point 30 bp downstream from its position in the 5' flanking sequence.67,86,87"[18]

"As is the case in βh0 and βh1, the ATA box in eucaryotic genes is typically located in an AT-rich region, approximately 30 bases from the mRNA cap site (Efstratiadis et al., 1980)."[19]

"The analysis of the promoter region indicated that a putative ATA box is located 54 nucleotides upstream from the transcription start site".[21]

"In the promoter region, an ATA box and a CCAAT box are located 32 and 74 base pairs (bp), respectively, upstream from the transcription initiation site."[22]

Regarding hypothesis 2: ATA boxes have a role as downstream signal transducers in A1BG.

There is the following inverse ATA box on the negative strand, negative direction: 3'-AAATAA-5' at 4537.
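The negative-direction positions reported in these Results can be checked against the stated core promoter window (4425 to 4460) with a short Python sketch. The function name and the treatment of the window boundaries as inclusive are assumptions; the positions and the window are copied from the Results.

```python
def locate(position, start, end):
    """Report where a position falls relative to an inclusive window."""
    if position < start:
        return "before"
    if position > end:
        return "after"
    return "inside"


# Core promoter window on the negative strand, negative direction.
CORE_WINDOW = (4425, 4460)

# Positions reported for the negative direction under hypotheses 1 and 2.
reported = [1726, 3013, 3014, 3334, 3335, 4071, 4072, 4075, 4221, 4537]

placement = {pos: locate(pos, *CORE_WINDOW) for pos in reported}

if __name__ == "__main__":
    for pos, where in placement.items():
        print(pos, where)
```

None of the reported positions falls inside the window, and only 4537 lies after it, which would place it downstream if coordinates on this side increase toward A1BG.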

"The promoter region of the gene contains an ATA box and a CCAAT box, which are located 32 and 74 bp upstream, respectively, from the transcription initiation site."[17]

"The putative promoter region of the mouse K19 gene is highly homologous to the corresponding sequences of the human and bovine K19 genes. It contains an ATA box, a CAAT box and two potential Sp1-binding sites."[20]

Regarding hypothesis 3: ATA boxes may assist transcription of A1BG by other transcription factors.

"The promoter region of the gene contains an ATA box and a CCAAT box, which are located 32 and 74 bp upstream, respectively, from the transcription initiation site."[17]

For "efficient transcription [of the rabbit β-globin gene in vivo], both the ATA or Goldberg-Hogness box, and a region between 100 and 58 base pairs in front of the site at which transcription is initiated, are required."[7]

Discussion

Conclusions

Laboratory evaluations

To assess your example, including your justification, analysis and discussion, I will provide such an assessment of my example for comparison and consideration.

Evaluation

No wet chemistry experiments were performed to confirm that Gene ID: 1 is transcribed from either side using ATA boxes, especially in the distal promoters. The NCBI database is generalized, whereas individual human genome testing could demonstrate that A1BG is transcribed from either side. It would be a good check to add sufficient nts to the data sets for the ZNF497 side to confirm transcription of A1BG per analogy to the rat.

See also

References

  1. Stephen A. Liebhaber, Michel J. Goossens, and Yuet Wai Kan (December 1980). "Cloning and complete nucleotide sequence of human 5'-α-globin gene". Proceedings of the National Academy of Sciences USA 77 (12): 7054-8. http://www.pnas.org/content/77/12/7054.full.pdf. Retrieved 2013-06-28.
  2. Jerry B. Dodgson and James Douglas Engel (10 April 1983). "The nucleotide sequence of the adult chicken alpha-globin genes". The Journal of Biological Chemistry 258 (7): 4623-9. http://www.jbc.org/content/258/7/4623.full.pdf. Retrieved 2017-02-04. 
  3. Shigemi Kimura, Kuniya Abe, Misao Suzuki, Masakatsu Ogawa, Kowashi Yoshioka, Tadasi Kaname, Teruhisa Miike and Ken-ichi Yamamura (June 1997). "A 900 bp genomic region from the mouse dystrophin promoter directs lacZ reporter expression only to the right heart of transgenic mice". Development, Growth & Differentiation 39 (3): 257-65. doi:10.1046/j.1440-169X.1997.t01-2-00001.x. http://onlinelibrary.wiley.com/doi/10.1046/j.1440-169X.1997.t01-2-00001.x/full. Retrieved 2013-06-28.
  4. Steven J.B. Cooper and Rory M. Hope (December 1993). "Evolution and expression of a beta-like globin gene of the Australian marsupial Sminthopsis crassicaudata". Proceedings of the National Academy of Sciences USA 90: 11777-81. http://www.pnas.org/content/90/24/11777.full.pdf. Retrieved 2017-02-04.
  5. Eiichi Saitoh and Satoko Isemura (January 1, 1993). "Molecular Biology of Human Salivary Cysteine Proteinase Inhibitors". Critical Reviews in Oral Biology and Medicine 4 (3/4): 487-93. doi:10.1177/10454411930040033301. http://cro.sagepub.com/content/4/3/487.full.pdf. Retrieved 2013-06-28.
  6. Stuart H. Orkin, Julianne P. Sexton, Tu-chen Cheng, Sabra C. Goff, Patricia J. V. Giardina, I. Lee Joseph and Haig H. Kazazian Jr. (1983). "ATA box transcription mutation in β-thalassemia". Nucleic Acids Research 11 (14): 4727-34. doi:10.1093/nar/11.14.4727. http://nar.oxfordjournals.org/content/11/14/4727.short. Retrieved 2014-05-29.
  7. G. C. Grosveld, E. De Boer, C. K. Shewmaker, & R. A. Flavell (January 14, 1982). "DNA sequences necessary for transcription of the rabbit β-globin gene in vivo". Nature 295 (5845): 120-6. doi:10.1038/295120a0. http://www.nature.com/nature/journal/v295/n5845/abs/295120a0.html. Retrieved 2014-05-29.
  8. GE Smith, MD Summers (1988). "Method for producing a recombinant baculovirus expression vector". US Patent (4,745,051). http://www.google.com/patents/US4745051. Retrieved 2014-05-29. 
  9. Jennifer E.F. Butler, James T. Kadonaga (October 15, 2002). "The RNA polymerase II core promoter: a key component in the regulation of gene expression". Genes & Development 16 (20): 2583–92. doi:10.1101/gad.1026202. PMID 12381658. http://genesdev.cshlp.org/content/16/20/2583.full.
  10. Stephen T. Smale and James T. Kadonaga (July 2003). "The RNA Polymerase II Core Promoter". Annual Review of Biochemistry 72 (1): 449-79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. http://www.lps.ens.fr/~monasson/Houches/Kadonaga/CorePromoterAnnuRev2003.pdf. Retrieved 2012-05-07. 
  11. Thomas Shafee and Rohan Lowe (09 March 2017). "Eukaryotic and prokaryotic gene structure". WikiJournal of Medicine 4 (1): 2. doi:10.15347/wjm/2017.002. https://upload.wikimedia.org/wikiversity/en/0/0c/Eukaryotic_and_prokaryotic_gene_structure.pdf. Retrieved 2017-04-06. 
  12. Koichi Takayama, Ken-ichirou Morohashi, Shin-ichiro Honda, Nobuyuki Hara and Tsuneo Omura (1 July 1994). "Contribution of Ad4BP, a Steroidogenic Cell-Specific Transcription Factor, to Regulation of the Human CYP11A and Bovine CYP11B Genes through Their Distal Promoters". The Journal of Biochemistry 116 (1): 193–203. doi:10.1093/oxfordjournals.jbchem.a124493. https://academic.oup.com/jb/article-abstract/116/1/193/780029. Retrieved 2017-08-16.
  13. Michelle Craig Barton, Navid Madani, and Beverly M. Emerson (8 July 1997). "Distal enhancer regulation by promoter derepression in topologically constrained DNA in vitro". Proceedings of the National Academy of Sciences of the United States of America 94 (14): 7257-62. http://www.pnas.org/content/94/14/7257.short. Retrieved 2017-08-16. 
  14. A Aoyama, T Tamura, K Mikoshiba (March 1990). "Regulation of brain-specific transcription of the mouse myelin basic protein gene: function of the NFI-binding site in the distal promoter". Biochemical and Biophysical Research Communications 167 (2): 648-53. doi:10.1016/0006-291X(90)92074-A. http://www.sciencedirect.com/science/article/pii/0006291X9092074A. Retrieved 2012-12-13. 
  15. J Gao and L Tseng (June 1996). "Distal Sp3 binding sites in the hIGBP-1 gene promoter suppress transcriptional repression in decidualized human endometrial stromal cells: identification of a novel Sp3 form in decidual cells". Molecular Endocrinology 10 (6): 613-21. doi:10.1210/me.10.6.613. http://mend.endojournals.org/content/10/6/613.short. Retrieved 2012-12-13. 
  16. Peter Pasceri, Dylan Pannell, Xiumei Wu, and James Ellis (July 15, 1998). "Full activity from human β-globin locus control region transgenes requires 5′ HS1, distal β-globin promoter, and 3′ β-globin sequences". Blood 92 (2): 653-63. http://bloodjournal.hematologylibrary.org/content/92/2/653.short. Retrieved 2012-12-13. 
  17. Lily C. Hsu, Wen-Chung Chang and Akira Yoshida (November 1989). "Genomic structure of the human cytosolic aldehyde dehydrogenase gene". Genomics 5 (4): 857-865. doi:10.1016/0888-7543(89)90127-4. http://www.sciencedirect.com/science/article/pii/0888754389901274. Retrieved 2017-11-17.
  18. D.R. Higgs, M.A. Vickers, A.O.M. Wilkie, I.-M. Pretorius, A.P. Jarman, and D.J. Weatherall (April 1989). "A Review of the Molecular Genetics of the Human α-Globin Gene Cluster". Blood, The Society Journal of The American Society of Hematology 73 (5): 1081-1104. https://www.researchgate.net/profile/Andrew_Jarman/publication/20507387_A_review_of_the_molecular_genetics_of_the_human_a-Globin_gene_cluster/links/0c960537ce65ab3c24000000/A-review-of-the-molecular-genetics-of-the-human-a-Globin-gene-cluster.pdf. Retrieved 2017-11-17.
  19. A Hill, S C Hardies, S J Phillips, M G Davis, C A Hutchison 3rd and M H Edgell (25 March 1984). "Two mouse early embryonic beta-globin gene sequences. Evolution of the nonadult beta-globins". The Journal of Biological Chemistry 259 (6): 3739-3747. http://www.jbc.org/content/259/6/3739.full.pdf. Retrieved 2017-11-17.
  20. Marc Lussier, Mario Filion, John G. Compton, Joseph H. Nadeau, Line Lapointe and André Royal (15 November 1990). "The mouse keratin 19-encoding gene: sequence, structure and chromosomal assignment". Gene 95 (2): 203-213. doi:10.1016/0378-1119(90)90363-V. http://www.sciencedirect.com/science/article/pii/037811199090363V. Retrieved 2017-11-17.
  21. Annie Charbonneau and Van-Luu The (26 January 2001). "Genomic organization of a human 5β-reductase and its pseudogene and substrate selectivity of the expressed enzyme". Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression 1517 (2): 228-235. doi:10.1016/S0167-4781(00)00278-5. http://www.sciencedirect.com/science/article/pii/S0167478100002785. Retrieved 2017-11-17.
  22. Shelley M. Moore, Tiebing Liang, Tamara J. Graves, Kevin M. McCall, Lucinda G. Carr and Cindy L. Ehlers (1 July 2009). "Identification of a novel cytosolic aldehyde dehydrogenase allele, ALDH1A1*4". Human Genomics 3 (4): 304. doi:10.1186/1479-7364-3-4-304. https://humgenomics.biomedcentral.com/articles/10.1186/1479-7364-3-4-304. Retrieved 2017-11-17.
  23. Udby L, Sørensen OE, Pass J, Johnsen AH, Behrendt N, Borregaard N, Kjeldsen L. (October 2004). "Cysteine-rich secretory protein 3 is a ligand of alpha1B-glycoprotein in human plasma". Biochemistry 43 (40): 12877-86. doi:10.1021/bi048823e. PMID 15461460. https://www.ncbi.nlm.nih.gov/pubmed/15461460. Retrieved 2011-11-28. 
  24. Weizmann Institute of Science (2017). "Zinc Finger Protein 497". Israel: Weizmann Institute of Science. Retrieved 2017-08-20.
  25. Weiwei Deng, Hua Ying, Chris A. Helliwell, Jennifer M. Taylor, W. James Peacock, and Elizabeth S. Dennis (19 April 2011). "FLOWERING LOCUS C (FLC) regulates development pathways throughout the life cycle of Arabidopsis". Proceedings of the National Academy of Sciences of the United States of America 108 (16): 6680–6685. doi:10.1073/pnas.1103175108. http://www.pnas.org/content/108/16/6680.short. Retrieved 2017-09-17.
  26. Deena Schmidt and Rick Durrett (1 December 2004). "Adaptive Evolution Drives the Diversification of Zinc-Finger Binding Domains". Molecular Biology and Evolution 21 (12): 2326–2339. doi:10.1093/molbev/msh246. https://academic.oup.com/mbe/article/21/12/2326/1071065/Adaptive-Evolution-Drives-the-Diversification-of. Retrieved 2017-10-16.
  27. H Eiberg, ML Bisgaard, J Mohr (01 December 1989). "Linkage between alpha 1B-glycoprotein (A1BG) and Lutheran (LU) red blood group system: assignment to chromosome 19: new genetic variants of A1BG". Clinical Genetics 36 (6): 415-8. PMID 2591067. http://europepmc.org/abstract/MED/2591067. Retrieved 2017-10-08.
  28. John R. Stehle Jr., Mark E. Weeks, Kai Lin, Mark C. Willingham, Amy M. Hicks, John F. Timms, Zheng Cui (January 2007). "Mass spectrometry identification of circulating alpha-1-B glycoprotein, increased in aged female C57BL/6 mice". Biochimica et Biophysica Acta (BBA) - General Subjects 1770 (1): 79-86. http://www.sciencedirect.com/science/article/pii/S0304416506001826. Retrieved 2017-10-08. 
  29. Caitrin W. McDonough, Yan Gong, Sandosh Padmanabhan, Ben Burkley, Taimour Y. Langaee, Olle Melander, Carl J. Pepine, Anna F. Dominiczak, Rhonda M. Cooper-DeHoff, Julie A. Johnson (June 2013). "Pharmacogenomic Association of Nonsynonymous SNPs in SIGLEC12, A1BG, and the Selectin Region and Cardiovascular Outcomes". Hypertension 62 (1): 48-54. doi:10.1161/HYPERTENSIONAHA.111.00823. PMID 23690342. http://hyper.ahajournals.org/content/hypertensionaha/early/2013/05/20/HYPERTENSIONAHA.111.00823.full.pdf. Retrieved 2017-10-08.
  30. Cissi Gardmo and Agneta Mode (1 December 2006). "In vivo transfection of rat liver discloses binding sites conveying GH-dependent and female-specific gene expression". Journal of Molecular Endocrinology 37 (3): 433-441. doi:10.1677/jme.1.02116. http://jme.endocrinology-journals.org/content/37/3/433.full. Retrieved 2017-09-01.

External links

{{Gene project}}