Article objectives

  • To describe general mechanisms of gene expression.
  • To differentiate between a cis-regulatory element and a trans-acting factor.
  • To define a transcription factor.
  • To define an operon.
  • To describe how the lac operon regulates transcription.
  • To describe the role of the TATA box.
  • To express the importance of gene regulation during development.
  • To describe the role of homeobox genes and gap genes.
  • To discuss gene regulation in terms of the development of cancer.
  Each of your cells has about 22,000 genes. In fact, all of your cells have the same genes. So do all of your cells make the same proteins? Do all 22,000 genes get turned into proteins in every cell? Of course not. If they did, then all your cells would do the same thing. You have cells with different functions because you have cells with different proteins. And your cells have different proteins because they “use” different genes. The regulation of gene expression, or gene regulation, includes the mechanism to turn genes “on” and transcribe the gene into RNA. Any aspect of a gene’s expression may be regulated, from the onset of transcription to the post-translational modification of a protein. It is this regulation that determines when and how much of a protein to make, giving a cell its specific structure and function.

    Mechanisms of Regulation

    Any step of gene expression may be modulated, from the DNA-RNA transcription step to post-translational modification of a protein. Following is a list of stages where gene expression is regulated:

    • Chemical and structural modification of DNA or chromatin
    • Transcription
    • Translation
    • Post-transcriptional modification
    • RNA transport
    • mRNA degradation
    • Post-translational modifications

    We will now focus on regulation at the level of transcription. During transcription, RNA polymerase reads the DNA template to make a complementary strand of RNA. Which genes RNA polymerase binds to is highly regulated. When RNA polymerase binds to a gene, it binds to the promoter, a segment of DNA that allows a gene to be transcribed. The promoter helps RNA polymerase find the start of a gene.

    Gene regulation at the level of transcription controls when transcription occurs as well as how much RNA is created. This regulation is controlled by cis-regulatory elements and trans-acting factors. A cis-regulatory element is a region of DNA that regulates the expression of a gene or multiple genes located on the same DNA molecule. These cis-regulatory elements are often the binding sites of one or more trans-acting factors, usually regulatory proteins that interact with RNA polymerase. A cis-regulatory element may be located in a gene’s promoter region, in an intron, or in the 3’ region.

    A regulatory protein, or a transcription factor, is a protein involved in regulating gene expression. It is usually bound to a cis-regulatory element. Regulatory proteins often must be bound to a cis-regulatory element to switch a gene on (activator), or to turn a gene off (repressor).

    Transcription of a gene by RNA polymerase can be regulated by at least five mechanisms:

    • Specificity factors (proteins) alter the specificity of RNA polymerase for a promoter or set of promoters, making it more or less likely to bind to the promoter and begin transcription.
    • Repressors (proteins) bind to non-coding sequences on the DNA that are close to or overlap the promoter region, impeding RNA polymerase’s progress along the strand.
    • Basal factors are transcription factors that help position RNA polymerase at the start of a gene.
    • Enhancers are sites on the DNA strand that are bound by activators in order to loop the DNA, bringing a specific promoter to the initiation complex. An initiation complex is composed of RNA polymerase and trans-acting factors.
    • Activators (proteins) enhance the interaction between RNA polymerase and a particular promoter.

    As organisms grow more sophisticated, gene regulation becomes more complex, though prokaryotic organisms possess some highly regulated systems. Some human genes are controlled by many activators and repressors working together. A mutation in a cis-regulatory region, such as the promoter, can greatly affect the proper expression of a gene. It may keep the gene permanently off, such that no protein can be made, or it can keep the gene permanently on, such that the corresponding protein is constantly made. Both of these can have detrimental effects on the cell.

    Prokaryotic Gene Regulation

    In prokaryotes, a combination of activators and repressors determines whether a gene is transcribed. As you know, prokaryotes are fairly simple organisms with much less DNA than eukaryotes. Many prokaryotic genes are arranged in operons. An operon is a region of DNA with a promoter, an operator (defined below), and one or more genes that encode proteins needed to perform a certain task. To maintain homeostasis (and survive), the organism must quickly adapt to changing environmental conditions. The regulation of transcription plays a key role in this process.

    For bacteria, many aspects of gene regulation depend on the presence or absence of certain nutrients. In prokaryotes, repressors bind to regions called operators that are generally located immediately downstream from the promoter. Activators bind to the upstream portion of the promoter.

    The Lac Operon

    The lac operon (Figure 1) is an operon required for the transport and metabolism of lactose in E. coli. The lac operon is regulated by the availability of lactose. The lac operon consists of a promoter, an operator, three adjacent structural genes which code for enzymes and a terminator. The three genes are: lacZ, lacY, and lacA. All three genes are controlled by the same regulatory elements.

    Figure 1: The lac operon. The lac operon contains genes for three enzymes, lacZ, lacY, and lacA, as well as the promoter, operator, and terminator regulatory regions.

    In bacteria, the lac repressor protein blocks the synthesis of enzymes that digest lactose when there is no lactose present (Figure 2). When lactose is present, a lactose metabolite binds to the repressor, causing it to detach from the DNA strand.

    Specific control of the lac operon depends on the availability of lactose. The enzymes needed to metabolize lactose are not produced when lactose is not present. When lactose is available, and therefore needs to be metabolized, the operon is turned on, RNA polymerase binds to the promoter, and the three genes are transcribed into a single mRNA molecule. However, if lactose is not present (and therefore does not need to be metabolized), the operon is turned off by the lac repressor protein (Figure 2).

    The lacI gene, which encodes the lac repressor, lies near the lac operon and is always expressed (constitutive). Therefore, the lac repressor protein is always present in the bacterium. In the absence of lactose, the lac repressor protein binds to the operator, just past the promoter in the lac operon. The repressor blocks the binding of RNA polymerase to the promoter, keeping the operon turned off (Figure 2).

    When lactose is available, a lactose metabolite called allolactose binds to the repressor. This interaction causes a conformational change in the repressor shape and the repressor falls off the operator, allowing RNA polymerase to bind to the promoter and initiate transcription (Figure 2).

    Figure 2: Regulation of the lac operon. When lactose is present, RNA polymerase (red) binds to the promoter (P) and the three genes are expressed, producing a single mRNA for the three genes. When lactose is unavailable, the lac repressor (yellow) binds to the operator (O) and inhibits the binding of RNA polymerase to the promoter. The three genes are not expressed.
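
    The on/off behavior of the lac operon can be sketched as a small piece of Boolean logic. The following is a toy model under the assumptions stated in this section only: the repressor is always made, and lactose (through allolactose) is the only signal. The function name lac_operon_transcribed is made up for illustration.

```python
def lac_operon_transcribed(lactose_present: bool) -> bool:
    """Toy model of the lac operon as described in this article."""
    # lacI is constitutive, so the repressor protein is always present.
    repressor_present = True
    # Allolactose, a lactose metabolite, binds the repressor and
    # releases it from the operator.
    repressor_on_operator = repressor_present and not lactose_present
    # RNA polymerase can transcribe lacZ, lacY, and lacA only when
    # the operator is free.
    return not repressor_on_operator


for lactose in (False, True):
    state = "on" if lac_operon_transcribed(lactose) else "off"
    print(f"lactose present: {lactose} -> operon {state}")
```

    The sketch deliberately ignores how much mRNA is made; it only captures whether the three genes are transcribed at all.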

    Eukaryotic Gene Regulation

    All your cells have the same DNA (and therefore the same genes), yet they have different proteins because they express different genes. In eukaryotic cells, the start of transcription is one of the most complex aspects of gene regulation. Transcriptional regulation involves the formation of an initiation complex involving interactions between a number of transcription factors, cis-regulatory elements, and enhancers, distant regions of DNA that can loop back to interact with a gene’s promoter. These regulatory elements occur in unique combinations within a given cell type, resulting in only necessary genes being transcribed in certain cells. Transcription factors bind to a DNA strand, allowing RNA polymerase to bind and start transcription.

    Each gene has unique cis-regulatory sequences, only allowing specific transcription factors to bind. However, there are common regulatory sequences found in many genes. The TATA box is a cis-regulatory element found in the promoter of many eukaryotic genes. It has the DNA sequence 5’-TATAAA-3’ or a slight variant, and has been highly conserved throughout evolution. When the appropriate cellular signals are present, the initiation complex assembles at the TATA box. A number of transcription factors first bind to the TATA box, while other transcription factors bind to the previously attached factors, forming a multi-protein complex. It is only when all the appropriate factors are bound that RNA polymerase will recognize the complex and bind to the DNA, completing the initiation complex and initiating transcription.
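
    To make the idea of a conserved sequence element concrete, here is a minimal sketch that scans a DNA string for the TATAAA consensus given above. The function name and the example sequence are invented for illustration, and the search is an exact match, so it will miss the slight variants that real promoters often carry.

```python
import re
from typing import List

# Consensus sequence quoted in the text (5'-TATAAA-3').
TATA_CONSENSUS = "TATAAA"


def find_tata_boxes(sequence: str) -> List[int]:
    """Return 0-based start positions of exact TATAAA matches."""
    seq = sequence.upper()
    # A lookahead pattern reports overlapping matches as well.
    return [m.start() for m in re.finditer(f"(?={TATA_CONSENSUS})", seq)]


# Made-up sequence, not taken from any real gene.
example_promoter = "GGCCTATAAAGGCTCGATCG"
print(find_tata_boxes(example_promoter))
```

    Finding the motif in a sequence does not by itself show that the site is functional; in the cell, that depends on which transcription factors are present.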

    Some of the most complex eukaryotic gene regulation occurs during development. What genes must be turned on during development so that tissues and organs form from simple cells?

    Regulation of Gene Expression During Development

    What makes the heart form during development? What makes the skin form? What makes a structure become an arm instead of a leg? These processes occur during development because of a highly specific pattern of gene expression. This intensely regulated pattern of gene expression turns genes on in the right cell at the right time, such that the resulting proteins can perform their necessary functions to ensure proper development. Transcription factors play an extremely important role during development. Many of these proteins can be considered master regulatory proteins, in the sense that they either activate or deactivate the transcription of other genes and, in turn, these secondary gene products can regulate the expression of still other genes in a regulatory cascade. Homeobox and gap genes are important transcription factors during development.

    Homeobox Genes

    Homeobox genes contain a highly conserved DNA sequence known as a homeobox and are involved in the regulation of genes important to development. A homeobox is about 180 base pairs long; it encodes a 60 amino acid domain within the protein (known as the homeodomain), which can bind DNA. Proteins with a homeodomain are therefore transcription factors. These factors typically switch on a series of other genes, for instance, the genes needed to encode the proteins to make a leg.
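
    The relationship between the 180 base pair homeobox and the 60 amino acid homeodomain follows from the genetic code, in which each amino acid is specified by a codon of three bases. A short worked check, with an illustrative function name:

```python
BASES_PER_CODON = 3  # each codon of three bases specifies one amino acid


def amino_acids_encoded(base_pairs: int) -> int:
    """Number of amino acids encoded by a coding stretch of this length."""
    if base_pairs % BASES_PER_CODON != 0:
        raise ValueError("coding length must be a multiple of 3")
    return base_pairs // BASES_PER_CODON


print(amino_acids_encoded(180))  # prints 60
```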

    A particular subgroup of homeobox genes is the Hox genes. Protein products of Hox genes function in patterning the body, providing the placement of certain body parts during development. In other words, Hox genes determine where limbs and other body segments will grow in a developing fetus or larva. Mutations in any one of these genes can lead to the growth of extra, typically non-functional body parts in invertebrates. For example, the Antennapedia mutation in Drosophila results in a leg growing from the head in place of an antenna. A mutation in a vertebrate Hox gene usually results in miscarriage.

    Gap Genes

    A gap gene controls the shape of a developing zygote early in its development. The products of these genes produce gaps in a rather uniform arrangement of cells (Figure 3). One example of this is the Kruppel gene, which regulates the activity of a number of other genes. Gap genes encode transcription factors, and the Kruppel gene encodes a zinc-finger protein. A zinc finger is a DNA binding region within the protein. A zinc finger consists of two antiparallel β strands and an α helix with a zinc ion, which is important for the stability of this region. Gap genes control the expression of other genes within specific regions of cells in the developing organism. This allows specific genes to be expressed in certain cells at the appropriate stage of development.

    Figure 3: Gap gene expression. Shown is the expression pattern of four gap genes, Kruppel, Giant, Knirps, and Tailless, in a developing Drosophila embryo. Note how the expression of these genes creates a unique pattern resulting in gaps in what was a rather uniform arrangement of cells.

    Regulation of Gene Expression in Cancer

    Carcinogenesis depends on both the activation of oncogenes and deactivation of tumor suppressor genes. At least two separate mutations are necessary to develop cancer. For example, a mutation in a proto-oncogene would not necessarily lead to cancer, as normally functioning tumor suppressor genes would counteract the effects of the oncogene. It is the second mutation in the tumor suppressor gene that could lead to uncontrolled cell growth and possibly cancer. Both oncogenes and tumor suppressor genes play an important role in gene regulation and cell proliferation (Figure 4).

    Figure 4: Signal transduction pathways. Ras (upper middle section) activates a number of pathways, but an especially important one seems to be the mitogen-activated protein kinases (MAPK). MAPKs transmit signals downstream to other protein kinases and gene regulatory proteins. Note that many of these pathways are initiated when a signal binds to its receptor outside the cell. Most pathways end with altered gene regulation and cell proliferation. The p53 tumor suppressor protein is shown at the lower section of the figure stimulating p21. The complexity of the pathways demonstrates the significant role these play in the cell.

    Oncogenes

    The products of proto-oncogenes are required for normal growth, repair and homeostasis. However, when these genes are mutated, they turn into oncogenes and play a role in the development of cancer. Proto-oncogenes may encode growth factors, transcription factors, or other proteins involved in regulation. The product of a very common oncogene, ras, is normally a regulatory GTPase that switches a signal transduction chain on and off. Ras and Ras-related proteins are products of oncogenes found in 20% to 30% of human tumors.

    Ras is a G protein, a regulatory GTP hydrolase that cycles between an activated and inactivated form. When a growth factor binds to its receptor on the outside of the cell, a signal is relayed to Ras. As a G protein, Ras is activated when GTP is bound to it. The active Ras then passes the signal to a series of protein kinases, regulatory proteins that eventually activate transcription factors to alter gene expression and produce proteins that stimulate the cell cycle (Figure 4). Many of the genes and proteins involved in signal transduction pathways are interconnected to ras. Any mutation that makes ras more active or otherwise interrupts the normal signal transduction pathways (Figure 4) may result in excessive cell division and cancer.
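
    The switching behavior of Ras can be pictured as a two-state model. The sketch below is illustrative only: the class and method names are invented, and the can_hydrolyze_gtp flag is an assumed stand-in for one kind of mutation that keeps Ras in its active form.

```python
from dataclasses import dataclass


@dataclass
class RasSwitch:
    """Toy model of Ras as a GTP-dependent on/off switch."""
    bound_nucleotide: str = "GDP"   # GDP-bound Ras is the inactive form
    can_hydrolyze_gtp: bool = True  # False models an activating mutation (assumption)

    @property
    def is_active(self) -> bool:
        # Ras is active only while GTP is bound.
        return self.bound_nucleotide == "GTP"

    def receive_growth_signal(self) -> None:
        # A growth factor signal leads to GTP binding.
        self.bound_nucleotide = "GTP"

    def hydrolyze(self) -> None:
        # Normal Ras hydrolyzes GTP to GDP and switches itself off.
        if self.can_hydrolyze_gtp:
            self.bound_nucleotide = "GDP"


normal = RasSwitch()
mutant = RasSwitch(can_hydrolyze_gtp=False)
for ras in (normal, mutant):
    ras.receive_growth_signal()
    ras.hydrolyze()

print(normal.is_active, mutant.is_active)  # prints: False True
```

    In this model the mutant stays switched on after a single signal, which corresponds to the continued stimulation of the cell cycle described above.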

    Tumor Suppressor Genes

    An example of a tumor suppressor gene is p53, which encodes a 53,000 dalton protein. p53 is activated by DNA damage. DNA may be damaged by ultraviolet light, for example, and any damaged DNA may be harmful to the cell. So that damaged DNA is not replicated, the cell cycle must be temporarily stopped while the DNA is repaired. The p53 tumor suppressor gene encodes a transcription factor that regulates the synthesis of cell cycle inhibiting proteins (Figure 4). p53 often activates a gene named p21, whose protein product temporarily stops the cell cycle. If the DNA cannot be repaired, p53 activates other genes that lead to cell death, or apoptosis. This prevents the cell from passing on damaged DNA. If the p53 tumor suppressor gene is defective, as by mutation, DNA damage in the cell may accumulate and the cell may survive to replicate the damaged DNA. The damaged DNA would then be passed to other cells through many cell divisions, and cancer could develop. More generally, mutations causing problems with any of the components of Figure 4 may lead to the development of cancer.
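
    The decisions attributed to p53 in this section can be summarized as a simple decision function. This is a rough sketch of the logic as described here, not a model of the real pathway; the function name, parameters, and outcome strings are all made up for illustration.

```python
def p53_response(dna_damaged: bool, repairable: bool,
                 p53_functional: bool = True) -> str:
    """Outcome for a cell, following the logic described in the text."""
    if not dna_damaged:
        return "cell cycle continues"
    if not p53_functional:
        # Without working p53, nothing halts the cycle or triggers cell death.
        return "damaged DNA may be replicated"
    if repairable:
        # p53 activates p21, whose product temporarily stops the cell cycle.
        return "cell cycle paused for repair"
    # Unrepairable damage: p53 activates genes leading to cell death.
    return "apoptosis"


print(p53_response(dna_damaged=True, repairable=True))
print(p53_response(dna_damaged=True, repairable=False))
print(p53_response(dna_damaged=True, repairable=False, p53_functional=False))
```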

    Images courtesy of:

    http://en.wikipedia.org/wiki/Image:Lac_operon1.png. Public Domain.

    http://en.wikipedia.org/wiki/Image:Lac_operon.png. GFDL.

    http://commons.wikimedia.org/wiki/Image:Gap_ene_expression.png. Public Domain.

    http://en.wikipedia.org/wiki/Image:Signal_transduction_v1.png. GNU-FDL.