Gödel, Escher, Bach: An Eternal Golden Braid (93 page)

Authors: Douglas R. Hofstadter

Amino Acids

Proteins are composed of sequences of amino acids, which come in twenty primary varieties, each with a three-letter abbreviation:

ala   alanine
arg   arginine
asn   asparagine
asp   aspartic acid
cys   cysteine
gln   glutamine
glu   glutamic acid
gly   glycine
his   histidine
ile   isoleucine
leu   leucine
lys   lysine
met   methionine
phe   phenylalanine
pro   proline
ser   serine
thr   threonine
trp   tryptophan
tyr   tyrosine
val   valine

Notice the slight numerical discrepancy with Typogenetics, where we had only fifteen "amino acids" composing enzymes. An amino acid is a small molecule of roughly the same complexity as a nucleotide; hence the building blocks of proteins and of nucleic acids (DNA, RNA) are roughly of the same size. However, proteins are composed of much shorter sequences of components: typically, about three hundred amino acids make a complete protein, whereas a strand of DNA can consist of hundreds of thousands or millions of nucleotides.

Ribosomes and Tape Recorders

Now when a strand of mRNA, after its escape into the cytoplasm, encounters a ribosome, a very intricate and beautiful process called translation takes place. It could be said that this process of translation is at the very heart of all of life, and there are many mysteries connected with it. But in essence it is easy to describe. Let us first give a picturesque image, and then render it more precise. Imagine the mRNA to be like a long piece of magnetic recording tape, and the ribosome to be like a tape recorder. As the tape passes through the playing head of the recorder, it is "read" and converted into music, or other sounds. Thus magnetic markings are "translated" into notes.

Similarly, when a "tape" of mRNA passes through the "playing head" of a ribosome, the "notes" which are produced are amino acids, and the "pieces of music" which they make up are proteins. This is what translation is all about: it is shown in Figure 96.

The Genetic Code

But how can a ribosome produce a chain of amino acids when it is reading a chain of nucleotides? This mystery was solved in the early 1960's by the efforts of a large number of people, and at the core of the answer lies the Genetic Code-a mapping from triplets of nucleotides into amino acids (see Fig. 94). This is in spirit extremely similar to the Typogenetic Code, except that here, three consecutive bases (or nucleotides) form a codon, whereas there, only two were needed. Thus there must be 4x4x4 (equals 64) different entries in the table, instead of sixteen. A ribosome clicks down a strand of RNA three nucleotides at a time-which is to say, one codon at a time-and each time it does so, it appends a single new amino acid to the protein it is presently manufacturing. Thus, a protein comes out of the ribosome amino acid by amino acid.

[Figure: CUA GAU (above); CU AG AU (below). A typical segment of mRNA read first as two triplets (above), and second as three duplets (below): an example of hemiolia in biochemistry.]
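The counting argument and the two ways of reading the strand CUAGAU can be checked with a few lines of Python. This is only an illustrative sketch: the names `BASES` and `chunk` are made up here and do not come from the text.

```python
from itertools import product

BASES = "UCAG"  # the four RNA bases

def chunk(strand, size):
    """Split a strand into consecutive groups of `size` bases,
    dropping any incomplete group at the end."""
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]

# Every possible triplet (codon) and every possible duplet.
codons = ["".join(p) for p in product(BASES, repeat=3)]
duplets = ["".join(p) for p in product(BASES, repeat=2)]

print(len(codons), len(duplets))  # 64 16
print(chunk("CUAGAU", 3))         # ['CUA', 'GAU']
print(chunk("CUAGAU", 2))         # ['CU', 'AG', 'AU']
```

The same six bases yield two "words" when read in threes and three when read in twos, which is why the size of the reading unit fixes the size of the code table.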

Tertiary Structure

However, as a protein emerges from a ribosome, it is not only getting longer and longer, but it is also continually folding itself up into an extraordinary three-dimensional shape, very much in the way that those funny little Fourth-of-July fireworks called "snakes" simultaneously grow longer and curl up, when they are lit. This fancy shape is called the protein's tertiary structure (Fig. 95), while the amino acid sequence per se is called the primary structure of the protein. The tertiary structure is implicit in the primary structure, just as in Typogenetics. However, the recipe for deriving the tertiary structure, if you know only the primary structure, is by far more complex than that given in Typogenetics. In fact, it is one of the outstanding problems of contemporary molecular biology to figure out some rules by which the tertiary structure of a protein can be predicted if only its primary structure is known.

The Genetic Code.

(First base at left, second base across the top, third base at right.)

         U       C       A       G

U        phe     ser     tyr     cys      U
         phe     ser     tyr     cys      C
         leu     ser     punc.   punc.    A
         leu     ser     punc.   trp      G

C        leu     pro     his     arg      U
         leu     pro     his     arg      C
         leu     pro     gln     arg      A
         leu     pro     gln     arg      G

A        ile     thr     asn     ser      U
         ile     thr     asn     ser      C
         ile     thr     lys     arg      A
         met     thr     lys     arg      G

G        val     ala     asp     gly      U
         val     ala     asp     gly      C
         val     ala     glu     gly      A
         val     ala     glu     gly      G

FIGURE 94. The Genetic Code, by which each triplet in a strand of messenger RNA codes for one of twenty amino acids (or a punctuation mark).
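As a rough illustration of how the table in Figure 94 is used, here is a minimal Python sketch that stores it as a dictionary and reads a strand one codon at a time. The names `TABLE`, `GENETIC_CODE`, and `translate` are invented for this example. Treating a punctuation codon as "stop reading" is an assumption of the sketch; the caption says only that such triplets code for a punctuation mark.

```python
BASES = "UCAG"

# Rows follow Figure 94: for each first base there is one row per third base
# (U, C, A, G), and within a row one entry per second base (U, C, A, G).
TABLE = {
    "U": ["phe ser tyr cys", "phe ser tyr cys", "leu ser punc punc", "leu ser punc trp"],
    "C": ["leu pro his arg", "leu pro his arg", "leu pro gln arg", "leu pro gln arg"],
    "A": ["ile thr asn ser", "ile thr asn ser", "ile thr lys arg", "met thr lys arg"],
    "G": ["val ala asp gly", "val ala asp gly", "val ala glu gly", "val ala glu gly"],
}

# Flatten the table into a codon -> amino acid mapping with 64 entries.
GENETIC_CODE = {
    first + second + third: amino
    for first, rows in TABLE.items()
    for third, row in zip(BASES, rows)
    for second, amino in zip(BASES, row.split())
}

def translate(mrna):
    """Read the strand three bases at a time and collect amino acids.
    Assumption: a punctuation codon ends the protein."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = GENETIC_CODE[mrna[i:i + 3]]
        if amino == "punc":
            break
        protein.append(amino)
    return protein

print(len(GENETIC_CODE))    # 64
print(translate("CUAGAU"))  # ['leu', 'asp']
```

The strand CUAGAU, read as the two triplets CUA and GAU, comes out as leucine followed by aspartic acid.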

Reductionistic Explanation of Protein Function

Another discrepancy between Typogenetics and true genetics-and this is probably the most serious one of all-is this: whereas in Typogenetics, each component amino acid of an enzyme is responsible for some specific "piece of the action", in real enzymes, individual amino acids cannot be assigned such clear roles. It is the tertiary structure as a whole which determines the mode in which an enzyme will function; there is no way one can say, "This amino acid's presence means that such-and-such an operation will get performed". In other words, in real genetics, an individual amino acid's contribution to the enzyme's overall function is not "context-free". However, this fact should not be construed in any way as ammunition for an antireductionist argument to the effect that "the whole [enzyme] cannot be explained as the sum of its parts". That would be wholly unjustified. What is justified is rejection of the simpler claim that "each amino acid contributes to the sum in a manner which is independent of the other amino acids present". In other words, the function of a protein cannot be considered to be built up from context-free functions of its parts; rather, one must consider how the parts interact. It is still possible in principle to write a computer program which takes as input the primary structure of a protein, and firstly determines its tertiary structure, and secondly determines the function of the enzyme. This would be a completely reductionistic explanation of the workings of proteins, but the determination of the "sum" of the parts would require a highly complex algorithm.

FIGURE 95. The structure of myoglobin, deduced from high-resolution X-ray data. The large-scale "twisted pipe" appearance is the tertiary structure; the finer helix inside-the "alpha helix"-is the secondary structure. [From A. Lehninger, Biochemistry]

The elucidation of the function of an enzyme, given its primary, or even its tertiary, structure, is another great problem of contemporary molecular biology.

Perhaps, in the last analysis, the function of the whole enzyme can be considered to be built up from functions of parts in a context-free manner, but where the parts are now considered to be individual particles, such as electrons and protons, rather than "chunks", such as amino acids. This exemplifies the "Reductionist's Dilemma": In order to explain everything in terms of context-free sums, one has to go down to the level of physics; but then the number of particles is so huge as to make it only a theoretical "in-principle" kind of thing.

So, one has to settle for a context-dependent sum, which has two disadvantages. The first is that the parts are much larger units, whose behavior is describable only on a high level, and therefore indeterminately. The second is that the word "sum" carries the connotation that each part can be assigned a simple function and that the function of the whole is just a context-free sum of those individual functions. This just cannot be done when one tries to explain a whole enzyme's function, given its amino acids as parts. But for better or for worse, this is a general phenomenon which arises in the explanations of complex systems. In order to acquire an intuitive and manageable understanding of how parts interact-in short, in order to proceed-one often has to sacrifice the exactness yielded by a microscopic, context-free picture, simply because of its unmanageability. But one does not sacrifice at that time the faith that such an explanation exists in principle.

Transfer RNA and Ribosomes

Returning, then, to ribosomes and RNA and proteins, we have stated that a protein is manufactured by a ribosome according to the blueprint carried from the DNA's "royal chambers" by its messenger, RNA. This seems to imply that the ribosome can translate from the language of codons into the language of amino acids, which amounts to saying that the ribosome "knows" the Genetic Code. However, that amount of information is simply not present in a ribosome. So how does it do it? Where is the Genetic Code stored? The curious fact is that the Genetic Code is stored-where else?-in the DNA itself. This certainly calls for some explanation.

Let us back off from a total explanation for a moment, and give a partial explanation.

There are, floating about in the cytoplasm at any given moment, large numbers of four-leaf-clover-shaped molecules; loosely fastened (i.e., hydrogen-bonded) to one leaf is an amino acid, and on the opposite leaf there is a triplet of nucleotides called an anticodon. For our purposes, the other two leaves are irrelevant. Here is how these "clovers" are used by the ribosomes in their production of proteins. When a new codon of mRNA clicks into position in the ribosome's "playing head", the ribosome reaches out into the cytoplasm and latches onto a clover whose anticodon is complementary to the mRNA codon. Then it pulls the clover into such a position that it can rip off the clover's amino acid, and stick it covalently onto the growing protein. (Incidentally, the bond between an amino acid and its neighbor in a protein is a very strong covalent bond, called a "peptide bond". For this reason, proteins are sometimes called "polypeptides".) Of course it is no accident that the "clovers" carry the proper amino acids, for they have all been manufactured according to precise instructions emanating from the "throne room".

FIGURE 96. A section of mRNA passing through a ribosome. Floating nearby are tRNA molecules, carrying amino acids which are stripped off by the ribosome and appended to the growing protein. The Genetic Code is contained in the tRNA molecules, collectively. Note how the base-pairing (A-U, C-G) is represented by interlocking letter-forms in the diagram. [Drawing by Scott E. Kim]
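The clover mechanism can be caricatured in a few lines of Python, in which the ribosome holds no code table at all and simply looks for a clover whose anticodon pairs with the current codon. This is a loose sketch under stated assumptions: the three-clover pool `CLOVERS` is a tiny hypothetical sample, the names `complement` and `build_protein` are invented here, and pairing is done position by position (A-U, C-G), ignoring strand direction and every other chemical detail.

```python
# Base-pairing rule as in Figure 96: A pairs with U, C pairs with G.
PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}

def complement(triplet):
    """Return the triplet that pairs with `triplet`, position by position."""
    return "".join(PAIR[base] for base in triplet)

# Hypothetical pool of "clovers": each carries an anticodon and an amino acid.
CLOVERS = [
    {"anticodon": "UAC", "amino": "met"},
    {"anticodon": "GAU", "amino": "leu"},
    {"anticodon": "CUA", "amino": "asp"},
]

def build_protein(mrna, clovers):
    """For each codon, find a clover with the complementary anticodon and
    append its amino acid. Assumption: stop when no clover matches."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        match = next(
            (c for c in clovers if c["anticodon"] == complement(codon)), None
        )
        if match is None:
            break
        protein.append(match["amino"])
    return protein

print(build_protein("AUGCUAGAU", CLOVERS))  # ['met', 'leu', 'asp']
```

In this picture the mapping from codons to amino acids lives entirely in the pool of clovers, which is the sense in which the Genetic Code is contained in the tRNA molecules collectively.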
