Suppose now that machines A and B are combined by way of a control device C (call this combination A + B + C automaton D) that, when provided with d(A), passes it to A for constructing a copy of A, passes d(A) to B to produce a copy of d(A), and inserts the copy of d(A) into the new automaton A. So D can reproduce A along with the description d(A) of A.
Last, consider the machine D provided with its own description d(D). Call this combination automaton E. What E can do is produce another identical automaton E. E is thus self-reproducing because it can produce an E that can produce another E, and so on.
As von Neumann put it, d(D) “is roughly effecting the function of a gene” whereas the copying automaton B “performs the fundamental act of reproduction, the duplicating of the genetic material, which is clearly the fundamental operation in the multiplication of living cells.”[42]
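The logic of this construction is compact enough to sketch in modern code. The Python fragment below is a loose illustration only, with every name invented for the purpose: construct plays the role of automaton A, copy_description of B, and control of C, so that control as a whole corresponds to D; feeding control its own description mirrors E.

    # A loose modern illustration of von Neumann's construction; all
    # names here are invented for this sketch.
    def construct(description):          # automaton A: build X from d(X)
        return {"machine": description["name"]}

    def copy_description(description):   # automaton B: duplicate d(X)
        return dict(description)

    def control(description):            # automaton C; the whole is D = A + B + C
        offspring = construct(description)                         # A builds the new machine
        offspring["description"] = copy_description(description)   # B copies d(X)
        return offspring                                           # insert the copy into it

    # E is D supplied with its own description d(D): the offspring again
    # carries d(D), so it can be fed back to control, and so on indefinitely.
    d_of_D = {"name": "D"}
    print(control(d_of_D))   # {'machine': 'D', 'description': {'name': 'D'}}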
Von Neumann's “general theory of automata” was, first, a comparative study of biological and artificial neuron systems; second, it explored, in a highly speculative way, the idea of self-reproducing automata, thereby positing a general theory that could mimic the self-reproducing capacity of biological cells. It embraced not only the computing automaton à la Turing, but also something startlingly new: a self-constructing automaton. In fact, this would be the beginning of an entirely different branch of automata theory that came to be called cellular automata theory, which considered how arrays of individual automata (“cells”) could work to transmit information between cells, perform computation, and construct various computational organs.[43]
New scientific paradigms, new artistic and literary styles, and new technological movements are created by a select few. The birth of new creative impulses belongs primarily to the realm of intellectual history rather than to social or cultural history. It is only when that impulse or movement is recognized as a paradigm that it spreads to the larger population. Revolutionary science, art, and design then become normal science, art, and design. This is why we find the birthing years of computer science dominated by a small number of people, some of whom appear, disappear, and reappear during these early years. It is as if, having carved out a space of their own (indeed, created a space of their own), having been led to a virgin land, they invent for themselves a space within that land before other newcomers have the chance to come upon it. Von Neumann was one such person, as we have seen. Turing was surely another (and he will appear once more very soon). Shannon was a third such person.
Shannon, as we have also seen, in 1938 connected the technological art of switching circuit design with the abstract, symbolic logic of Boolean algebra. It is, perhaps, for this reason that the design of circuits that input, process, and output the Boolean values (1 and 0, or TRUE and FALSE) is called logic design.[44]
In fact, during the late 1940s, Shannon did much more. In 1948, then a mathematician at the Bell Telephone Laboratories in Murray Hill, New Jersey, Shannon published an article on a mathematical theory of communication.[45] A year later, with Warren Weaver (1894–1978), a mathematician and preeminent science administrator at the Rockefeller Foundation, Shannon published a book that developed this theory more fully.[46]
The mathematical theory of communication is otherwise and more succinctly called information theory, which forms the theoretical foundation of telecommunications engineering and has also influenced the study of human communication.[47]
The word information is used in information theory in a specific sort of way; it has nothing to do with how “information” is understood in everyday language. We usually think of information as being about something. In common parlance, information has meaning; there is a semantic aspect to it. In information theory, however, information is devoid of meaning. It is simply the commodity that is transmitted across communication “channels,” whether between human beings, along telegraph wires, or across telephone lines. The unit of information in information theory is called the bit (short for binary digit, a term coined by another distinguished Bell Laboratories mathematician and Shannon's colleague, John W. Tukey [1915–2000][48]). It means nothing but itself, just as a unit of money refers to nothing but itself.
Shannon is commonly referred to as the Father of Information Theory but, like all such glib journalistic appellations, this, too, must be viewed quizzically, if only because the history of the origins of information theory began well before Shannon.[49] What is beyond dispute is that he has a preeminent place in the creation of information theory.
Insofar as the transmission and storage of information bits are prominent aspects of the design of computer systems, Shannon's contribution to information theory has an obvious place in the history of computing. But this is not why he appears in this part of our story. Shannon was one of those individuals who, during the 1940s, crossed interdisciplinary boundaries with total insouciance, who ignored the narrow domestic walls the poet Tagore had dreamed of demolishing. He was, after all, a contemporary of Wiener, and it is this trait that ushers him into this chapter.
For Shannon, consummate applied mathematician and engineering theorist though he was, was also intrigued with what computers could do. He was interested in the computer as far more than an instrument of numeric computation. He envisioned machines that would design circuits, route telephone calls, perform symbolic (not numeric) mathematical computations, translate from one language to another, make strategic military decisions, reason logically, orchestrate melodies.[50] In other words, he was interested in the computer as an intelligent being, in the most eclectic sense of “intelligence.” He believed that such computers were not only theoretically possible but also economically viable; that is, they were feasible from an engineering point of view.[51]
Shannon ruminated on these ideas in 1950. Notice that these possibilities formed the other face of the natural–artificial wall. Craik and von Neumann had dwelled on the possibility of regarding human neural activity and thought in computational terms. Shannon was envisioning computers as capable of the kind of neural processing that produces human intelligent thought and action. If Craik and von Neumann contemplated human cognition in mechanical terms, Shannon was imagining machines in human cognitive terms. But both perspectives converged on a common theme: computing (in some broad sense) and mind–brain processing were related.
The great French philosopher René Descartes (1596–1650) had famously uttered the dictum cogito ergo sum, “I think, therefore I am,” meaning, basically, that the very act of thinking is proof of one's mental existence. Craik, von Neumann, and Shannon were among the first who were claiming that I compute, therefore I am. The “I” here could as well be the computer as a human being.
However, Shannon was not merely speculating on the possibility of machine intelligence. He was an applied mathematician above all else and, as such, he was interested in specific problems and their solutions. Thus, in November 1949, the editor of the venerable British Philosophical Magazine (with a pedigree reaching back to the 18th century) received a manuscript authored by Shannon titled “Programming a Computer for Playing Chess.”[52] The article was published in March 1950.[53] Less than a year earlier, the EDSAC and the Manchester Mark I had run their first programs.
Chess is, of course, among the most sophisticated of board games. As Shannon pointed out, the “problem” of chess playing is, on the one hand, extremely well defined in terms of the rules that determine legal chess moves and the goal of the game (to checkmate the opponent); yet achieving the goal is neither too trivial nor too difficult. To play well demands considerable thought.
These very characteristics had prompted others, long before Shannon, to try to design chess-playing machines. One of these was Leonardo Torres y Quevedo, the Spanish engineer-inventor whose Essays in Automatics (1914) we encountered in Chapter 3, Section IX. Evidently, among the “thinking automata” Torres y Quevedo had envisioned was a chess-playing machine he designed in 1914 for playing an end game of king and rook against king. Torres y Quevedo's machine played the side with king and rook, and could checkmate the human opponent in a few moves regardless of how the latter played.[54]
Shannon would have liked to design a special-purpose chess computer. Rather wistfully, he rejected the idea because of cost. So he began with the assumption that he would have at his disposal a stored-program digital computer along the lines available at the time. The challenge was to simulate a chess-playing machine on what was an “arithmetic computer.” Or, in present-centered language, to design a virtual chess machine that could be implemented on a typical stored-program computer.
Chess, unlike many other board or card games, has no chance element in it. Moreover, it is a game of “perfect information” in that each player can see all the pieces on the board at all times. Shannon refers to the monumental and seminal book on the interdisciplinary field of game theory invented in 1944 by the ever-fertile von Neumann and the German-American economist Oskar Morgenstern (1902–1977).[55] Along with cybernetics, information theory, and the computer, game theory was yet another of the extraordinary intellectual creations of the mid-1940s. Referring to the von Neumann–Morgenstern book, Shannon noted that any particular position (that is, configuration of the pieces on a chess board) at any time would lead to one of three possibilities: White winning, Black winning, or a draw. However, there is no algorithm that can determine, in a practical way, which of these situations prevails in any given position. Indeed, as Shannon noted, if that were the case, chess would not be at all interesting as a game.[56]
In principle, in any position, the following algorithm would work. The machine considers all possible next moves for itself. For each such next move, it then considers all possible moves by its opponent. Then, for each of those, it considers all possible moves for itself, and so on until the end is reached. The outcome in each end situation will be a win, loss, or draw. By working backward from the end, the machine would determine whether the current position would force a win, loss, or draw.

In current language, this strategy is known as exhaustive search or brute-force search.[57] And, to repeat, this strategy would work in principle. In practice, however, even using a high-speed electronic computer, the amount of computation required would be unimaginably large.
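In modern terms, this is backward induction over the entire game tree. The sketch below is a minimal illustration, not anything from Shannon's paper, and it substitutes a deliberately tiny two-player game for chess (players alternately remove one or two tokens; whoever takes the last token wins), because a full chess model would bury the few lines of logic:

    # Exhaustive ("brute-force") search by backward induction, shown on a
    # tiny token game rather than chess. +1 = forced win for the machine,
    # -1 = forced loss, from the given position.
    def exhaustive_value(tokens, machine_to_move):
        if tokens == 0:
            # The player who just moved took the last token and won.
            return -1 if machine_to_move else +1
        values = [exhaustive_value(tokens - take, not machine_to_move)
                  for take in (1, 2) if take <= tokens]
        # The machine picks its best continuation; the opponent its worst.
        return max(values) if machine_to_move else min(values)

    print(exhaustive_value(3, True))   # -1: three tokens is a forced loss
    print(exhaustive_value(5, True))   # +1: five tokens is a forced win

The same recursion, applied to chess, is what Shannon showed to be hopelessly impractical.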
Shannon, referring to the pioneering experimental studies of chess play conducted by the Dutch psychologist and chess master Adriaan de Groot (1914–2006) in the mid-1940s, noted that, in typical positions, there are about 30 possible legal moves.[58] Shannon estimated that, assuming a typical game lasts about 40 moves to resignation of one of the players, something like 10^120 alternatives would have to be considered from the initial position. Assuming the computer could ascertain one alternate position each microsecond, he calculated that something like 10^90 years would be required to compute an optimal move.[59]
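A back-of-envelope check of the first figure, under Shannon's own assumptions (roughly 30 legal moves in a typical position, and roughly 80 half-moves in a game of 40 moves per side), runs as follows:

    \[ 30^{80} = 10^{80 \log_{10} 30} \approx 10^{118}, \]

which is the order of magnitude of Shannon's 10^120.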
So exhaustive search was ruled out. More practical strategies were required. Rather than dream of a “perfect chess” machine, the aim should be to produce a machine that could perform at the level of a “good” human chess player.[60]
This would involve a strategy that evaluated the “promise” of a position P using some appropriate “evaluation function” f(P), with the “promise” depending on such considerations as the overall positions of the pieces on the board at any particular stage of the game, the number of Black and White pieces on the board, and so on.[61]
Shannon gave an example of an evaluation function f(P) for a position P that assigns different “weights” (measures of importance) to the various types of chess pieces on the board. Assuming that the machine explores only one move deep (that is, it explores only its own next move), a possible strategy would be the following: suppose that M1, M2, …, Mn are the possible moves that can be made by the machine in a position P. If M1P, M2P, and so forth, signify the resulting positions when M1, M2, and so on, are made, then choose the move Mq that maximizes the evaluation function f(MqP).[62]
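Shannon's one-move-deep rule translates directly into code. The sketch below is illustrative only: the position representation and apply_move are invented for the example, and the simple material-count f(P) uses the conventional piece weights (queen 9, rook 5, bishop and knight 3, pawn 1), which Shannon also took as a starting point.

    # An illustrative one-move-deep strategy; the data layout and the
    # moves are invented for this example.
    WEIGHTS = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}

    def f(position):
        """f(P): the machine's material minus its opponent's."""
        return (sum(WEIGHTS[p] for p in position["machine_pieces"])
                - sum(WEIGHTS[p] for p in position["opponent_pieces"]))

    def best_move_one_ply(position, legal_moves, apply_move):
        """Among moves M1..Mn, choose the Mq that maximizes f(Mq P)."""
        return max(legal_moves, key=lambda m: f(apply_move(m, position)))

    # Toy usage: of two hypothetical moves, one wins an opponent rook.
    pos = {"machine_pieces": ["Q", "R", "P"],
           "opponent_pieces": ["Q", "R", "P"]}

    def apply_move(move, position):
        new = {k: list(v) for k, v in position.items()}
        if move == "capture_rook":
            new["opponent_pieces"].remove("R")
        return new

    print(best_move_one_ply(pos, ["quiet", "capture_rook"], apply_move))
    # -> capture_rook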
A “deeper” strategy would consider the opponent's response; that is, the strategy would “look ahead” to the opponent's possible moves. However, if the machine is playing White, Black's reply to a White move would endeavor to minimize f(P). So if White plays Mi, Black would choose the move Mij such that f(MijMiP) is a minimum. White should therefore choose his first move such that f is a maximum after Black chooses his minimizing (that is, best) reply.
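In the same invented notation as the previous sketch (f and apply_move carried over, and opponent_moves a hypothetical function listing Black's legal replies in a position), the two-move-deep rule is a nested maximization and minimization:

    def best_move_two_ply(position, machine_moves, opponent_moves, apply_move):
        """White picks the Mi that maximizes f after Black's minimizing reply:
        choose Mi to maximize the minimum over Mij of f(Mij Mi P)."""
        def value_after_best_reply(mi):
            after_mi = apply_move(mi, position)
            # Black chooses Mij so that f(Mij Mi P) is a minimum.
            return min(f(apply_move(mij, after_mi))
                       for mij in opponent_moves(after_mi))
        # White chooses Mi so that f is a maximum after Black's best reply.
        return max(machine_moves, key=value_after_best_reply)

This nested max-min is, of course, the two-ply case of what is now called the minimax rule.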