
(10) King C.C. (1992) Modular Transposition and the Structure of Eucaryote Regulatory Evolution. Genetica 86, 127-142.


This paper examines a model in which transposable elements provide a modular architecture for the cellular genome, complemented by cellular recombinational transformations, arising in turn as a dynamical consequence of this modular structure. It is proposed that the ecology of transposable elements in a given organism is a function of recombinational protocols of the evolving cellular genome. In mammals this is proposed to involve coordinated meiosis-phased activation of LINEs, SINEs and retrogenes complemented by endogenous retroviral transfer between cells.

Fig 1: Diversity of retrogene structures and composite evolutionary events:
(a) Generation of viral genes through retroviral recombination (Varmus 1982).
(b) Evolution of the human amylase family showing inserted γ-actin retropseudogene, retroviral and LTR regulators (Samuelson et al. 1990).
(c) The amylase gene region showing the compound genes, a pseudogene and L1 elements.
(d) λ-immunoglobulin processing and its relation to the pseudogene λψ1.
(e) A possible tree for the rat α-globin pseudogenes (Vanin 1984).
(f) A typical processed retropseudogene, rat α-tubulin, showing bounding direct repeats, 3' poly-A, removed introns and an inserted SINE (Lemischka & Sharp 1982).
(g) Target site repeats.
(h) The Cε3 pseudogene with LTRs, 5' region and exons flanked by short direct repeats (Ueda et al. 1982).

The theme of this paper is that two forms of feedback between mobile and cellular protocols lead to a complex ecological relationship between the forms of transposable element and recombination processes in the cellular genome. The first of these is the feedback between phenotypic and element survival, which may require, for example, negative regulation of element transposition. The second is the feedback cellular recombination processes have on the transposable element ecology, as a result of both element and organism adaptation and survival.

It is proposed that the evolution of major phyla is accompanied by particular structural relationships between the resident mobile families, constituting a stochastic dynamical system, which is characteristic both of the individual protocols of the elements and of cellular recombinational and structural features. In mammals in particular, a global linkage is proposed between five types of transposable structure, LINEs, SINEs, retrogenes, endogenous LTR-retroelements and cellular conversion processes, which constitutes a stochastic process phased to the reproduction cycle, providing a higher-level event space structure, enhancing survival through adaptability.

Excerpt:
Integral Evolution:

After the chiasmata of crossing over become apparent, the diplotene phase of meiosis occurs. Lampbrush chromosomes with 100kb loops producing massive transcripts proceeding far past their normal 3' ends into various repetitive sequences are notable in many species. Although part of the function of these, particularly in large oocytes of rapidly dividing embryos such as Xenopus, includes maternal mRNA precursors, they have also been cited as a recombinational phase providing for gene conversion processes. Rates of putative spooling motion of DNA around the loops allow time for transient sequential expression of the entire genome during oogenesis in Xenopus (Callan 1963, 1969, Wolfe 1972). The diplotene lampbrush phase in mammals is very prolonged. In humans it lasts from the fifth month of gestation to menopause 40 years later. Transcription over this period cannot be necessary for oocyte mRNA production. The relation of these transcripts to Davidson & Posakony's (1982) embryonic dsRNA remains to be determined. The female plays a predominant role, both because of the length of diplotene and the fact that the lampbrush phase occurs in both autosomes and the X chromosomes (King 1978). By contrast, lampbrush expression occurs only transiently and on the Y-chromosome in spermatogenesis (Watson et al. 1987).

The prospective dependence of SINEs such as Alu and processed retrogenes on the L1 LINE reverse transcriptase suggests that the three structures of LINEs, SINEs and processed retrogenes may all be mobilized in a form of concerted transposition phased with meiosis and driven by the L1 elements in a manner similar to I-R dysgenesis (Weiner et al. 1986). It is noticeable that in several genes with testis-specific variants, including the mouse zinc-finger Zfx and cytochrome c, the retrogenes formed arise from the non-testis version of the gene, supporting oogenic transposition. Although retrogenes are predominantly spliced, suggesting a cytoplasmic origin, the exceptions such as ψα4 (both introns), rat preproinsulin 1 (one intron), partially processed U2 and the predominantly nuclear expression of the Alu family allow for nuclear retrogenes. The lampbrush chromosome phase is bypassed in Drosophila, supporting the different mode of action of its mobile element ecology.

Meiotic transcript expression could also explain how genes other than the most ubiquitous housekeeping genes become retrogenes. If transcription is hyper-regulated in the prolonged diplotene e.g. by high polymerase content to produce transcripts reflecting genomic information rather than metabolic enzymes, the transcript population may reflect a more balanced distribution, in which rare genes are much more heavily expressed than usual. Variation may also occur over time, during the extended diplotene, and may involve unusual patterns of transcription, for example involving long pol III transcripts.

The wide difference of ~30% divergence in single-copy DNA between rat and mouse, compared with only 2% between man and chimp, is a function of the time of divergence, the longer generation time (~15 yrs) and hence lower real-time tolerable mutational load in higher apes, and more stringent DNA repair. However, the more than comparable differences in phenotype of the higher primates, particularly in brain structure, suggest that a more elaborate explanation is required. Given the long generation times, these differences have to be explained either by increasingly improbable mutations of a few key genes, or by a recombinational mechanism that allows the germ line to keep pace with the changing somatic phenotype.


Fig 2: Possible linked structures in the integral model of transposition-based evolution (a), include violations of Weismann's doctrine (b).

Mammalian transpositional evolution is thus dominated by the complementary action of two reverse transcriptases - the retroviral and LINE RTs. While the LINE RT has catalysed the emergence of the LINEs themselves, the ubiquitous SINEs and the processed retrogenes, the retroviral RT acts as a vector for passive and recombinant cell-to-cell transfer of genes and mutational retroviral and solo LTR inserts. The integral evolution model combines the action of these two retroelement RTs into a complementary action between the meiotic expression of LINEs, SINEs and retrogenes and the developmental expression of retroviruses:

(1) Meiosis includes a local recombinational editing process, involving coordinated mobilization of SINEs and retrogenes by LINEs, and possibly cellular conversion processes. This process is mediated principally through the female extended diplotene and is pivotal to the evolution of mammalian species with long generation times, particularly the great apes and Homo sapiens (King 1978).

(2) Retroviral transfer to the germ-line, both in early development and adult life, contributes an additional source of mutations containing effective somatic information derived by clonal selection of somatic cells during development.

The possibility that retroviral transfer from the soma to the germ-line can stochastically violate Weismann's doctrine has been a controversial theme (Gorczynski & Steele 1980, Steele 1984) which nevertheless has received repeated interest and comment (Reanney 1974, 75, 76, Bernstein et al. 1983, Vanin 1984, King 1985, Pollard 1987). Somatic information occasionally penetrating the germ line in the form of transcribed RNAs could subsequently recombine with germ-line genes in the diplotene phase. Certain aspects of the retrovirus life cycle could form a basis for such soma-germ transposition.

Differentiation-specific expression is common to endogenous retroviruses and is consistent with selective inhibition of transcription in germ line cells, but with capacity for activation of at least some retroviruses in other cell types. Antisera to simian sarcoma virus demonstrated viral antigens in 24/24 human placentas examined (Deinhardt 1980). Such somatic expression would allow proviral germ integration, since the block is at transcription and the integrase is packaged within the viral capsid. The

Recombinant retroviruses carrying cellular genes would be particularly effective at both having a specific regulatory effect and carrying the same pattern of regulation subsequently back into the germ line. Such somatic information would already be selected as fit by the regulatory competition and controlled cell death that is a prominent feature of mammalian neurogenesis (Blakemore 1991) and immunogenesis. The existence of recombinant oncoviruses, and recombination rates per cycle of 10-30%, demonstrate that this process is able to act generally enough to apply to a variety of cellular regulatory genes of length < ~10 kb, as illustrated by Cε3. A selectively male retrovirus has been noted (Phillips et al. 1982). A variety of retroviruses carry sexually selective steroid enhancers. This raises the possibility of sex-specific activation of retroviruses in gametogenesis and development.

Such a stochastic process would thus only permit fixation of acquired characteristics on an occasional mutational basis; however, it would resolve the conceptual problems of describing an organism as complex as Homo sapiens as a mere causal offshoot, whose only function in evolution is to facilitate fertilization of a single-celled germ line.

Modulation of the process would limit the occurrence of deleterious mutations to a tolerable load per generation. An estimate of 1% of retrovirus-related sequences in human DNA, based on an average size of 1.5kb (5 solo LTRs per 5kb retroviral unit), would constitute 20,000 copies; however, this number is accounted for by the ~40,000 human THE1 LTR-retroelement family inserts alone. 20 smaller families could lift this figure to 60,000-80,000, similar to the ~90,000 L1 copy number and ~50,000 cellular gene number, but less than the 18/1 ratio of retrogenes to cellular genes predicted by Zuckerkandl et al. (1989). Given 35 myrs at 20 yrs/generation we have 1.7 x 10^6 generations, representing a germ-line entry rate of 0.045/generation, somewhat less than tolerable loads. Several opposing factors may make this figure inaccurate, including mutational assimilation of solo LTRs, and post-insertional family amplification. Small numbers of mutational events may be consistent with quantum non-convergence (King 1989).
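The arithmetic behind this estimate can be retraced in a short sketch. The haploid genome size of 3 x 10^9 bp is an assumption not stated in the text (it is the value implied by 1% of the DNA corresponding to 20,000 copies of 1.5kb), and the entry rate below uses the upper family estimate of 80,000 inserts; all other figures are those quoted above. The function names are illustrative only.

```python
# Back-of-envelope retracing of the retroelement load estimate.

GENOME_BP = 3e9            # ASSUMPTION: haploid human genome size, not given in the text
RETRO_FRACTION = 0.01      # 1% retrovirus-related sequence (from the text)
MEAN_INSERT_BP = 1500      # average insert size of 1.5kb (from the text)
TIME_YEARS = 35e6          # 35 myrs (from the text)
YEARS_PER_GENERATION = 20  # 20 yrs/generation (from the text)
TOTAL_INSERTS = 80_000     # upper end of the 60,000-80,000 range (from the text)


def copies_from_fraction(genome_bp, fraction, mean_insert_bp):
    """Number of inserts implied by a genome fraction and a mean insert size."""
    return genome_bp * fraction / mean_insert_bp


def generations(time_years, years_per_generation):
    """Number of generations elapsed over a time span."""
    return time_years / years_per_generation


def entry_rate(inserts, n_generations):
    """Mean number of fixed germ-line inserts per generation."""
    return inserts / n_generations


copies = copies_from_fraction(GENOME_BP, RETRO_FRACTION, MEAN_INSERT_BP)
n_gen = generations(TIME_YEARS, YEARS_PER_GENERATION)
rate = entry_rate(TOTAL_INSERTS, n_gen)

print(f"copies implied by 1% at 1.5kb: {copies:,.0f}")
print(f"generations in 35 myrs:        {n_gen:.2e}")
print(f"germ-line entry rate:          {rate:.3f} per generation")
```

Under these assumptions the sketch gives about 20,000 copies, about 1.75 x 10^6 generations, and an entry rate close to the 0.045 per generation quoted above; taking the lower figure of 60,000 inserts reduces the rate proportionally.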

The above processes could be linked to a second set of transpositional processes occurring in embryogenic development and adult life. A particular strain of recombinant virus generated during developmental activation, including for example a mutant gene affecting neurogenesis, could trigger a regulatory change in much the same way the better-known leukaemia viruses cause tumorigenic transformation. Expression of such elements could result in mutations to new regulatory schemes, such as a repeated cycle of mitotic development, or a new pattern of cytotactic growth. The higher levels of growth factor transcripts in early embryogenesis would promote the formation of retrogenes for these genes at this time.

A further factor which is likely to intervene in this process is genomic stress. If a given cell type is under a form of stress, this may stimulate concerted transposition, activating germ-line transfer. The elongated diplotene could be phased to pick up retrotranscribed stress information through recombination. A variety of circumstances could cause stress, including repeated neuronal stimulation in learning.

The development of the 10^15 synapses in the human brain requires the action of only 30,000 genes, about 60% of the total human complement, representing a particularly challenging problem in parallel regulation. Recent work indicates that development of the cerebral cortex may proceed on general principles of coordinated growth, tissue layer organization and cell migration in which specific sensory structures are only later established, partly through stimulation from the developing senses (Blakemore 1991), including chaotic excitation (King 1991). Regulatory competition and controlled cell death are prominent features. Diversity of both developmental proteins and interactive molecules such as neurotransmitter receptor proteins can play a significant role in brain organization. The receptors involved in long-term potentiation in the hippocampus also appear to be derived from genes involved in embryonic differentiation, making a further link between these. Notably, the G-protein linked receptors form an extensive intronless gene family consistent with the coordinated family recombination model, fig 1.

Males Drive Evolution in the raw mutation race: The evolutionary rates of sperm DNA have been shown to be generally faster than ovum DNA. This does not contradict the model because it is a basic rule concerning relative mutation rates and error-correcting capacity.

Fig 3: Diverse mutational and regulatory potential of SINEs: (a) Gene or exon duplication, (b,c) Inverted and direct Alu repeats permit chromosomal rearrangements, (d) anti-sense pol III transcript forming dsRNA, (e) negative enhancer, (f) spliced Alu intron interacts with Alu in the promoter, (g) altered poly-A sites, translation or stability by 3' Alu insertion, (h) generation of a retrogene with 5' pol II promoter by upstream transcription from a pol III Alu site.

Appendix : SINEs as Cellular Regulatory Elements

Although copy numbers vary widely between species, arguing against a fundamental regulatory role, a variety of cellular functions have been suggested for Alu elements, fig 3, including origins of replication, transcription control, RNA processing, promoting recombination, transposition or inversion of sections of the cellular genome bounded by repeated elements, and limiting gene conversion by insertion into one of a family.

Some SINE transcripts show evidence of tissue-specific expression. The rat ID sequences (Sutcliffe et al. 1984a,b), and a primate 200bp RNA homologous to the left Alu monomer, are expressed selectively in brain tissue (Watson & Sutcliffe 1987). Although the dissimilarity of the rat and primate sequences suggests they may not perform a conserved function, their pattern of expression is conserved. The possibility that pol III transcripts, possibly including anti-sense RNA, function in gene regulation has further support (Lassar et al. 1983, Carlson & Ross 1983, Weiner et al. 1986). Manley & Colozzo (1982) have also proposed an Alu transcriptional control model based on pol III transcription from Alu. The frequency of Alu could allow internal binding of their sequences in introns, or binding between pol III transcripts and hnRNA, to play a role in post-transcriptional regulation similar to Davidson and Britten's (1979) model, or Alu in the 3' end to act at translation (Yamamoto et al. 1984). It has been noted that differing sections of Alu bind C-factor, T-antigen and other proteins. Cis-acting regulatory function would allow differential divergence from consensus to provide differential regulation. Saffer and Thurston (1989) have discovered a monkey Alu element containing a 2-5 fold modulating 38 bp negative enhancer - the protein reducing sequence, which is followed by (GT)n instead of An, and which acts on a variety of promoters of both pol II and pol III and may inhibit Alu itself. The motif could be common to a (GT)n-containing subfamily, providing for coordinated regulation. The concept of functionality and selection may have more varied constraints, arising from RNA secondary structure, than those on coding sequences. Zuckerkandl et al. (1989) have coined the term cheap genes for near-neutrally evolving forms of function.

Confirmatory Evidence:

Alu sequences may be a factor in Primate Evolution New Scientist 25 Sept 95

THE human genome is littered with "junk" DNA that everyone used to think had no real function. But now one of the most common types of genetic junk turns out to contain a working copy of a genetic switch that activates other genes. The junk sequence, known as Alu, may have played an important role in the evolution of primates, says Wanda Reynolds of the Sidney Kimmel Cancer Center in San Diego. Alu is a 283-nucleotide sequence that acts as a "jumping gene". From time to time, it inserts copies of itself randomly into the genome. Over the past 30 to 60 million years these insertions have occurred repeatedly, leaving roughly a million copies of Alu scattered through the human genome and making up almost 10 per cent of all the DNA in each cell. During this time, the sequences of the various Alus have begun to diverge, so that four distinct subfamilies of Alu can now be recognised. While studying one of these subfamilies, Reynolds noticed a short stretch of DNA, only 14 bases long, that looked familiar. Elsewhere in the genome, there are nearly identical sequences that function as anchor points for proteins that bind to hormones and which therefore provide a way for hormones to turn genes on and off. Reynolds and her colleague Gordon Vansant learned that the Alu sequence also binds to a hormone receptor, in this case the receptor for a hormone called retinoic acid, which activates genes at the proper times during development (Proceedings of the National Academy of Sciences, vol 92, p 8229). Vansant and Reynolds then turned their attention to a naturally occurring Alu that sits close to the human gene for keratin, a protein found in our skin, hair and nails. They looked at cells in which they had replaced the keratin gene with a "marker" gene whose activity could be easily measured. When the researchers then deleted the Alu sequence, they found that the marker gene became 35 times less active.
Since submitting their paper for publication, they have found functional binding sites for the retinoic acid receptor in a second subfamily of Alus. This subfamily also contains sequences that bind to thyroid hormone receptors, says Reynolds, "so the story is going to get even more interesting". A few Alus have previously been shown to affect the activity of nearby genes, but the new study is the first to show how. The results also provide the first clear evidence that most Alus could have the potential to regulate human genes. Other researchers have been hunting for similar effects but without success. "I've been looking for mobile elements carrying out significant regulatory roles, and I've made little progress," says Roy Britten of the California Institute of Technology in Pasadena. Reynolds believes that most Alus have little effect on nearby genes, perhaps because they are bundled deep within folds of DNA. But she says that with a million Alus strewn randomly through the genome during the course of primate evolution, at least a few are likely to have landed where they could regulate a nearby gene. When this occurred, she suggests, the effect would be equivalent to randomly twisting a knob on an instrument panel. Usually the effect would be harmful, but once in a while it might produce an interesting and beneficial genetic novelty. "We can't prove it," she says, "but it seems that over the last 30 to 50 million years, it would provide good evolutionary fodder." Bob Holmes, Santa Cruz

 
