Junk DNA: A Journey Through the Dark Matter of the Genome
Author: Nessa Carey
The most critical component in each cluster is a region of junk DNA that controls the expression of the protein-coding genes. This critical component is called the imprinting control element, or ICE. It’s a little like lighting a room with twelve light bulbs.
If you want to adjust the level of light in the room, you could use a range of bulbs with different luminosities, and you could have a separate switch for each. But that’s a fairly labour-intensive way of controlling the overall light level. Much better to have all twelve bulbs on one circuit and control them simultaneously with either an on/off switch or a dimmer switch if you want a bit more flexibility.
The ICE acts as the central dimmer switch, but there’s a slight complication compared with our electrical analogy. The ICE is important because it drives the expression of a large non-coding RNA molecule. This long non-coding RNA can switch off the expression of the genes in the surrounding cluster. So, essentially, imprinting is critically dependent on two types of junk DNA – ICE regions on the genome, and the long non-coding RNAs the ICEs control. If the long non-coding RNA at a specific cluster is switched on, it switches off expression of the protein-coding genes in that cluster. On the other hand, if the long non-coding RNA driven by the ICE is repressed, the protein-coding genes in the cluster can be activated.
Imprinting critically depends on the junk DNA and its crosstalk with the epigenetic system. The ICE can be epigenetically modified. Expression of the long non-coding RNA is dependent on whether or not the DNA at its ICE is methylated. If the ICE DNA is methylated, this prevents expression of the long non-coding RNA. If the ICE has escaped methylation, the long non-coding RNA will be expressed. Essentially there is a reciprocal arrangement. If the long non-coding RNA is expressed, the genes in the cluster on the same chromosome will be switched off. If the long non-coding RNA is not expressed, the genes in the cluster on the same chromosome will be switched on. The long non-coding RNAs in the imprinted regions are sometimes extraordinarily long, the biggest being a staggering 1 million bases in length.⁵
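The reciprocal arrangement described above can be written as a pair of yes/no rules. The following is only a minimal sketch: the class and method names are hypothetical, and real regulation is a matter of degree (more like the dimmer switch than a strict on/off), so treat the booleans as a simplification.

```python
from dataclasses import dataclass


@dataclass
class ImprintedCluster:
    """One chromosome's copy of an imprinted cluster (illustrative model only)."""

    ice_methylated: bool

    def lncrna_expressed(self) -> bool:
        # Methylation of the ICE prevents expression of the long non-coding RNA.
        return not self.ice_methylated

    def cluster_genes_permitted(self) -> bool:
        # When the long non-coding RNA is expressed, it switches off the
        # protein-coding genes in the cluster on the same chromosome.
        return not self.lncrna_expressed()


for methylated in (True, False):
    cluster = ImprintedCluster(ice_methylated=methylated)
    print(
        f"ICE methylated: {methylated} | "
        f"lncRNA expressed: {cluster.lncrna_expressed()} | "
        f"cluster genes can be on: {cluster.cluster_genes_permitted()}"
    )
```

The two negations cancel out: in this simplified model a methylated ICE goes with a silent long non-coding RNA and an active gene cluster, and an unmethylated ICE goes with the opposite.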
Unfortunately, we are still a bit sketchy in our understanding of the exact mechanisms that the long non-coding RNAs use to repress the expression of the nearby gene cluster. It certainly seems to involve the epigenetic system again, resulting in the deposition of repressive epigenetic modifications on the protein-coding genes. If key epigenetic genes such as the major repressor that we met in Chapter 9 are knocked out in developing embryos, some of the imprinted genes that would normally be switched off are expressed.⁶ It’s not just restricted to the major repressor either, as knockout of other epigenetic genes that establish repressive histone modifications has similar effects.⁷,⁸ This demonstrates the importance of the epigenetic system in carrying out the instructions of the long non-coding RNA. It’s likely this is because the long non-coding RNA attracts these enzymes to the imprinted cluster, thereby targeting the histone modifications to the protein-coding genes.
Epigenetic modifications are also present at the ICE itself. As we would expect, if the ICE DNA is methylated, the histone modifications are ones which are associated with switching genes off. If the ICE is unmethylated, the histone modifications are those which are associated with switching genes on. The pattern of epigenetic modifications on the ICE is completely consistent across the DNA and histone proteins.⁹
In the imprinting process, the critical determinant is whether or not the junk DNA forming the ICE is methylated. There have been suggestions that the methylation of the ICEs evolved when silencing of nearby parasitic elements such as those we met in Chapter 4 spread into neighbouring regions. This may have conferred a fitness advantage, and been selected for in subsequent generations.¹⁰ It’s intriguing that in the most primitive mammals, the egg-laying monotremes such as the duck-billed platypus and the echidna, there are uncharacteristically few parasitic elements near the regions where we would expect to find the ICEs in higher mammals.¹¹
Resetting the imprint
But how does the methylation pattern become established at the ICE in modern mammals and passed on, given that it is not dependent on differences in DNA sequences between the maternally and paternally derived genomes? How does it get set properly? A woman will inherit imprinted regions from her father in which the ICE is methylated/non-methylated to signify she received this region from her dad. But if she passes this same imprinted region on to her child, this paternal imprint must be removed and replaced with one showing it came from mother.
This seems full of internal contradictions, but it becomes a little easier to understand if we once again visit the world of the musical. Not Oscar Hammerstein this time but Hal David, who was the lyricist who worked for a long time with Burt Bacharach. They wrote the songs for the 1973 flop film musical Lost Horizon. One of the songs from this became famous and contains a quite useful concept for us: ‘The world is a circle without a beginning and nobody knows where it really ends.’ Developmental processes become much easier to visualise if we think of them as never-ending circles rather than in straight lines. The ‘put it on–take it off–put it on’ cycle in the generation of the imprinted ICE is shown in Figure 10.2. This shows how eggs always pass on a maternal pattern of ICE methylation. A similar process allows sperm always to pass on the reciprocal paternal pattern.
Of course, one of the questions this schema raises is how the developing eggs and sperm identify ICE regions and how they ‘know’ which should be methylated and which unmethylated. This is an area of very active research and it may be different for each ICE, and between male and female germ cells. Some of it is frankly still a mystery but there are certain features that have been elucidated. We know that in the maternal germline, i.e. the cells that give rise to eggs, the process is critically dependent on the enzymes that can add DNA methylation to previously unmethylated CpG motifs.ᵃ,¹² After that, the pattern is actively sustained by an enzyme whose job it is to maintain existing methylation patterns.ᵇ,¹³ Other proteins are also likely to be involved in establishing the correct methylation patterns, some of which are likely to be selectively expressed in the developing germ cells.
Figure 10.2: Cycles of methylation and demethylation ensure that chromosomes are passed on to children with the correct modifications to indicate parent of origin.
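The ‘put it on–take it off–put it on’ cycle can also be sketched in a few lines of code. This is a toy model under loud assumptions: the labels "maternal" and "paternal" stand in for whatever methylation state a given ICE actually carries, and all class and function names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Chromosome:
    inherited_from: str      # descriptive label, for illustration only
    imprint: Optional[str]   # "maternal", "paternal", or None once erased


@dataclass(frozen=True)
class Individual:
    sex: str                 # "female" or "male"
    from_mother: Chromosome
    from_father: Chromosome


def erase_imprint(chromosome: Chromosome) -> Chromosome:
    # Take it off: the inherited parent-of-origin mark is removed in the germline.
    return Chromosome(chromosome.inherited_from, None)


def establish_imprint(chromosome: Chromosome, parent_sex: str) -> Chromosome:
    # Put it on: the new mark reflects the sex of the parent making the gamete,
    # not where that parent originally got the chromosome from.
    mark = "maternal" if parent_sex == "female" else "paternal"
    return Chromosome(chromosome.inherited_from, mark)


def possible_gametes(parent: Individual) -> List[Chromosome]:
    # Either inherited chromosome may be passed on; both are reset first.
    gametes = []
    for chromosome in (parent.from_mother, parent.from_father):
        blank = erase_imprint(chromosome)
        gametes.append(establish_imprint(blank, parent.sex))
    return gametes


woman = Individual(
    sex="female",
    from_mother=Chromosome("her mother", "maternal"),
    from_father=Chromosome("her father", "paternal"),
)
for egg in possible_gametes(woman):
    print(egg.inherited_from, "->", egg.imprint)
# her mother -> maternal
# her father -> maternal
```

Even the chromosome the woman received from her father leaves her eggs carrying a maternal mark, which is the point of the cycle: the mark records the parent who passes the chromosome on, not its earlier history.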
How do the enzymes in the germ cells recognise the ICE regions among all the other genomic DNA? Again there are gaps in our knowledge, although it has been suggested that certain repeated sequences in these special junk DNA regions may play a role.¹⁴ These are quite poorly conserved at the sequence level between species, but may look more similar when we consider their three-dimensional structures. The cell may have a way of recognising them through their shape, rather than their sequence.¹⁵ This is similar to the findings for long non-coding RNAs we saw in Chapter 8.
Although there are obviously plenty of questions that remain about imprinting, we are confident that this is absolutely the reason why both sexes have to contribute to the offspring. In a complex set of breeding experiments using genetically modified mice, researchers showed in 2007 that they could generate viable mice by inserting two egg nuclei into one fertilised egg. The reason they were able to do this was that they artificially altered the pattern of imprinting at two regions in the mouse genome. In one of the egg nuclei, they had created a methylation pattern that looked like the normal paternal pattern, not the maternal one. This fooled the developmental pathways into believing that the genetic material was from a male rather than a female. This demonstrated a particularly strong role for these two imprinted regions in controlling development. It also showed that the only real block to bi-maternal reproduction is the DNA methylation pattern at key genes. It disproved a previous hypothesis that sperm were required because the sperm themselves carried certain necessary accessory factors such as particular proteins or RNA molecules required to kick-start development properly.¹⁶
Going back to Figure 10.2 we can see that imprinting patterns may change during development. Imprinted control of gene expression seems to be particularly important during development. In mice, for example, most of the 140 or so imprinted genes are only imprinted in the placenta. In adult tissues both or neither copy of the genes may be expressed. This confirms that control of growth during early development was probably the major reason why imprinting evolved. There seems to be almost a geographical reason for this. In the imprinting clusters, the genes nearest the ICE may remain imprinted in all tissues but the ones further from the control centre may only be imprinted in the placenta. Selected cell types in the brain seem to be particularly likely to retain imprinting, although there is no clear consensus on why this would be favoured evolutionarily in most cases. There have been suggestions that the long non-coding RNA produced from the ICE attracts DNA methylation to the nearest genes but attracts histone modifications to the more distant genes in the cluster.¹⁷ Because histone modifications can be more easily altered than DNA methylation, this may provide a mechanism for releasing more distant genes from imprinting as tissues mature.
So, imprinting occurs, and we have insights into at least some of the mechanisms by which this happens. In light of the theory that imprinting has evolved to balance out the competing evolutionary drives of the mother and foetus (and thus indirectly the father), it’s not surprising that the majority of protein-coding genes controlled by imprinting are ones involved in foetal growth and infant suckling, along with metabolism.¹⁸ It’s also not surprising that when imprinting goes wrong, defects in growth are the commonest symptoms.
When imprinting goes bad
Studies of imprinting disorders really took off in the 1980s, when it was first becoming possible to identify genes associated with inherited diseases. The techniques involved finding families with more than one individual affected by a condition, and then analysing these families to narrow down the region on a chromosome that caused the disease. We can do this pretty easily now because we have the sequence of the normal human genome and access to very cheap sequencing technologies. But back in the 1980s, finding a mutation which caused a disease was akin to being asked to find a specific broken light bulb when all you knew was that it was in a house in America. It took years of work by large teams of scientists to identify the mutations underlying a condition.
A number of groups were looking into a disease called Prader-Willi syndrome. Babies born with Prader-Willi syndrome have a low birth weight and poor suckling responses. Their muscle tone doesn’t develop properly until after weaning, so the babies are quite floppy. As the children get older, their appetite becomes completely insatiable and as a consequence they develop early and extreme obesity. The children also suffer from mild mental disability.¹⁹
A completely different set of researchers was working on a condition with very different symptoms. This is called Angelman syndrome. Children suffering from this condition have small, under-developed heads, severe learning disabilities and are very late at moving on to solid food. The children are prone to outbursts of laughter for no reason, but thankfully the previous appallingly insensitive description of these patients as ‘happy puppets’ is falling into disuse.²⁰
Imagine building a railway across a continent, where one set of workers starts in the east and builds westwards, and the other starts in the west and builds eastwards. At first the workers are in completely different territories, but as time goes on they begin to get closer and closer to each other, and eventually there is a point (assuming all has gone well) where they meet, drive in the last spike, shake hands and have a drink. This is pretty much what happened to the researchers investigating Prader-Willi syndrome and Angelman syndrome. The difference, of course, compared with our railway analogy is that the scientists never expected to meet. They thought they were building independent railways, to completely different cities, and yet they each ended up in exactly the same spot as the other.