Understanding Computation (48 page)

Read Understanding Computation Online

Authors: Tom Stuart

Tags: #COMPUTERS / Programming / General

BOOK: Understanding Computation
3.08Mb size Format: txt, pdf, ePub

[
53
]
“Hardware” means the read/write head, the tape, and the rulebook. They’re not literally
hardware since a Turing machine is usually a thought experiment rather than a physical
object, but they’re “hard” in the sense that they’re a fixed part of the system, as opposed
to the ever-changing “soft” information that exists as characters written on the
tape.

[
54
]
The implementation of
TAPE_MOVE_HEAD_LEFT
is similar, although it
requires some extra list-manipulation functions that didn’t get
defined in
Lists
.

[
55
]
The term
Turing complete
is often used to
describe a system or programming language that can simulate any Turing
machine.

[
56
]
Because subtraction is the
inverse
of addition,
(x + (y − 1)) + 1 = (x + (y + −1)) + 1
. Because addition is
associative
,
(x + (y + −1)) + 1 = (x + y) + (−1 + 1)
. And because
−1 + 1 = 0
, which is the
identity element
for addition,
(x + y) + (−1 + 1) = x + y
.

[
57
]
Of course the underlying implementation of
#recurse
itself uses a recursive method definition, but that’s allowed,
because we’re treating
#recurse
as one of the four
built-in primitives of the system, not a user-defined method.

[
58
]
This second condition prevents us from ever getting into a situation where we need to
delete more characters than the string contains.

[
59
]
Doubling a number shifts all the digits in its binary
representation one place to the left, and halving it shifts them
all one place to the right.

[
60
]
A cyclic tag system rule doesn’t need to say “when the string begins with
1
, append the characters
011
,” because the first part is assumed—just “append the characters
011
” is enough.

[
61
]
Unlike a normal tag system, a cyclic tag system keeps going when no rule applies,
otherwise it would never get anywhere. The only way for a cyclic tag system to stop
running is for its current string to become empty; this always happens when the initial
string consists entirely of
0
characters, for
example.

[
62
]
The resulting sequence of
0
s
and
1
s is
not
meant to represent a binary number. It’s just a string of
0
characters with a
1
character marking a particular
position.

[
63
]
Out of a possible 512: nine cells are involved, and each cell
can be in one of two states, so there are 2 × 2 × 2 × 2 × 2 × 2 × 2 ×
2 × 2 = 512 different possibilities.

Chapter 8. Impossible Programs

The most merciful thing in the world, I think, is the inability of
the human mind to correlate all its contents.


H. P. Lovecraft

In this book, we’ve
explored different models of computers and programming languages, including several
kinds of abstract machine. Some of those machines are more powerful than others, and two
varieties in particular come with pretty obvious limitations: finite automata can’t solve
problems that involve unrestricted counting, like deciding whether a string of brackets is
correctly balanced, and pushdown automata can’t handle any problem where information needs to be
reused in more than one place, like deciding whether a string contains the same number of
a
,
b
, and
c
characters.

But the most advanced device we’ve seen, the Turing machine, seems to have everything that
we need: it’s got unlimited storage that can be accessed in any order, arbitrary loops,
conditionals, and subroutines. The extremely minimal programming language from
Chapter 6
, the lambda calculus, turned out to be surprisingly
powerful too: with a little ingenuity it allows us to represent simple values and complex data
structures as pure code, as well as implement operations that manipulate those representations.
And in
Chapter 7
, we saw many other simple systems that, like
the lambda calculus, have the same universal power as Turing machines.

How much further can we push this progression of increasingly powerful
systems? Perhaps not indefinitely: our attempts to make Turing machines more
powerful by adding features didn’t get us anywhere, which suggests there may
be a hard limit on computational power. So what are computers and
programming languages fundamentally capable of, and is there anything that
they can’t do? Are there any impossible programs?

The Facts of Life

These are pretty deep questions, so before we try to tackle them,
let’s take a tour of some fundamental facts from the world of computation.
Some of these facts are obvious, others less so, but they’re all important
prerequisites for thinking about the capabilities and limitations of
computing machines.

Universal Systems Can Perform Algorithms

What, generally
speaking, can we do with universal systems like Turing machines, the lambda
calculus, and partial recursive functions? If we can properly understand the capabilities of
these systems, then we’ll be able to investigate their limitations.

The practical purpose of a computing machine is to
perform
algorithms
. An algorithm is
a list of instructions describing some process for turning an input
value into an output value, as long as those instructions fulfill
certain criteria:

Finiteness

There are a finite number of instructions.

Simplicity

Each instruction is simple enough that it can be performed
by a person with a pencil and paper without using any
ingenuity.

Termination

A person following the instructions will finish within a
finite number of steps for any input.

Correctness

A person following the instructions will produce the right
answer for any input.

For example, one of the oldest known
algorithms is Euclid’s algorithm, which dates from around
300 BC. It takes two positive integers and returns the largest integer
that will divide them both exactly—their
greatest common
divisor
.
Here are its instructions:

  1. Give the two numbers the names
    x
    and
    y
    .

  2. Decide which of
    x
    or
    y
    is the larger number.

  3. Subtract the smaller number from the larger. (If
    x
    is larger, subtract
    y
    from it and make this the new value of
    x
    ; vice versa if
    y
    is larger.)

  4. Repeat steps 2 and 3 until
    x
    and
    y
    are equal.

  5. When
    x
    and
    y
    become equal, their value is the
    greatest common divisor of the original two numbers.

We’re happy to recognize this as an algorithm, because it appears to meet the basic
criteria. It only contains a few instructions, and they all seem simple enough to be
performed with pencil and paper by someone who doesn’t have any special insight into the
overall problem. With a bit of thought, we can also see that it must finish within a finite
number of steps for any input: every repetition of step 3 causes one of the two numbers to
get smaller, so they must eventually reach the same value
[
64
]
and cause the algorithm to finish. It’s not quite so obvious that it’ll always
give the correct answer, but a few lines of elementary algebra are enough to show that the
result must always be the greatest common divisor of the original numbers.

So Euclid’s algorithm is worthy of its name, but like any algorithm, it’s just a
collection of ideas expressed as human-readable words and symbols. If we want to do
something useful with it—maybe we’d like to explore its mathematical properties, or design a
machine that performs it automatically—we need to translate the algorithm into a stricter,
less ambiguous form that’s suitable for mathematical analysis or mechanical
execution.

We already have several models of computation that we could use
for this: we could try to write down Euclid’s algorithm as a Turing
machine rulebook, or a lambda calculus expression, or a partial
recursive function definition, but all of those would involve a lot of
housekeeping and other uninteresting detail. For the moment,
let’s just translate it into unrestricted Ruby:
[
65
]

def
euclid
(
x
,
y
)
until
x
==
y
if
x
>
y
x
=
x
-
y
else
y
=
y
-
x
end
end
x
end

This
#euclid
method contains essentially the same
instructions as the natural language description of Euclid’s algorithm, but this time,
they’re written in a way that has a strictly defined meaning (according to the operational
semantics of Ruby) and therefore can be interpreted by a machine:

>>
euclid
(
18
,
12
)
=> 6
>>
euclid
(
867
,
5309
)
=> 1

In this specific case, it’s been easy to take an informal, human-readable description of
an algorithm and turn it into unambiguous instructions for a machine to follow. Having
Euclid’s algorithm in a machine-readable form is very convenient; now we can perform it
quickly, repeatedly, and reliably without having to employ manual labor.

Note

Hopefully it’s clear that we could just as well have implemented
this algorithm with the lambda calculus by using similar techniques to
the ones we saw in
Numeric Operations
, or as a
partial recursive function built from the operations in
Partial Recursive Functions
, or as a collection of Turing
machine rules like the ones used for simple arithmetic in
Rules
.

This raises an important question: can
any
algorithm be turned into
instructions suitable for execution by a machine? Superficially that seems like a trivial
thing to ask—it was pretty obvious how to turn Euclid’s algorithm into a program, and as
programmers, we have a natural tendency to think of the two things as interchangeable—but
there’s a real difference between the abstract, intuitive idea of an algorithm and the
concrete, logical implementation of that algorithm within a computational system. Could
there ever be an algorithm so large, complex, and unusual that its essence can’t be captured
by an unthinking mechanical process?

Ultimately there can be no rigorous answer, because the question is philosophical rather
than scientific. The instructions of an algorithm must be “simple” and “without ingenuity”
so that it “can be performed by a person,” but those are imprecise ideas about human
intuition and capability, not mathematical assertions of the kind that can be used to prove
or disprove a hypothesis.

We can still collect evidence one way or the other by coming up with lots of algorithms
and seeing whether our computing system of choice—Turing machines, or lambda calculus, or
partial recursive functions, or Ruby—can implement them. Mathematicians and computer
scientists have been doing exactly that since the 1930s, and so far, nobody has managed to
devise a reasonable algorithm that can’t be performed by these systems, so we can be pretty
confident about our empirical hunch: it certainly
looks
as though a
machine can perform any algorithm.

Another strong piece of evidence is the fact that most of these
systems were developed independently as attempts to capture and analyze
the informal idea of an algorithm, and were only later found to be
exactly equivalent to each other. Every historical attempt to model the
idea of an algorithm has produced a system whose capabilities are
identical to those of a Turing machine, and that’s a pretty good hint
that a Turing machine adequately represents what an algorithm can
do.

The idea that any algorithm can be performed by a
machine—specifically a deterministic
Turing machine—is called the
Church–Turing
thesis
, and although it’s just a conjecture rather than a
proven fact, it has enough evidence in its favor to be generally
accepted as true.

Note

“Turing machines can perform any algorithm” is a philosophical
claim about the relationship between the intuitive idea of algorithms
and the formal systems that we use to implement them. What it actually
means is a matter of interpretation: we could see it as a statement
about what can and cannot be computed, or just as a firmer definition
of the word “algorithm.”

Either way, it’s called the “Church–Turing thesis,” not the
“Church–Turing theorem,” because it’s an informal claim rather than a
provable mathematical assertion—it can’t be expressed in purely
mathematical language, so there’s no way to construct a mathematical
proof. It’s widely believed to be true because it matches our
intuition about the nature of computation and the evidence of what
algorithms are capable of, but we still call it a “thesis” to remind
ourselves that it has a different status from provable ideas like
Pythagoras’ theorem.

The Church–Turing thesis implies that Turing machines, despite
their simplicity, have all the power required to perform any computation
that can in principle be carried out by a person following simple
instructions. Many people go further than this and claim that, since all
attempts to codify algorithms have led to universal systems that are
equivalent in power to Turing machines, it’s just not possible to do any
better: any real-world computer or programming language can only ever do
as much as a Turing machine can do, and no more. Whether it’s ultimately
possible to build a machine that’s more powerful than a Turing
machine—that can use exotic laws of physics to perform tasks beyond what
we think of as “algorithms”—is not definitively known, but it’s
definitely true that we don’t currently know how to
do it.

Programs Can Stand In for Turing Machines

As we saw in
Chapter 5
, the Turing machine’s simplicity makes it cumbersome
to design a rulebook for a particular task. To avoid our investigation of computability
being overshadowed by the fiddly details of Turing machine programming, we’ll use Ruby
programs as a substitute, just as we did for Euclid’s algorithm.

This sleight of hand is justified by universality: in principle, we can translate any
Ruby program into an equivalent Turing machine and vice versa, so a Ruby program is no more
or less powerful than a Turing machine, and anything we can discover about the limitations
of Ruby’s capabilities should apply equally to Turing machines.

A sensible objection is that Ruby has lots of practical
functionality that Turing machines don’t. A Ruby program can access the
filesystem, send and receive messages across the network, accept user
input, draw graphics on a bitmapped display, and so on, whereas even the
most sophisticated set of Turing machine rules can only ever read and
write characters on a tape. But that isn’t a fundamental problem,
because all of this extra functionality can be
simulated
with a Turing machine: if necessary, we
can designate certain parts of the tape as representing “the filesystem”
or “the network” or “the display” or whatever, and treat reading and
writing to those tape regions as though it was genuine interaction with
the outside world. None of these enhancements changes the underlying
computational power of a Turing machine; they just provide higher-level
interpretations of its activity on the tape.

In practice, we can sidestep the objection completely by restricting ourselves to simple
Ruby programs that avoid any controversial language features. For the rest of this chapter,
we’ll stick to writing programs that read a string from standard input, do some computation,
and then write a string to standard output when they’re finished; the input string is
analogous to the initial contents of a Turing machine’s tape, and the output string is like
the final tape contents.

Code Is Data

Programs live a double life.
As well as being instructions to control a particular
system, we can also think of a program as pure data: a tree of
expressions, a raw string of characters, or even a single large number.
This duality is usually taken for granted by us as programmers, but for
general-purpose computers it’s vitally important that programs can be
represented as data so that they can be used as input to other programs;
it’s the unification of code and data that makes software possible in
the first place.

We’ve already seen programs-as-data in the case of the universal
Turing machine, which expects another Turing machine’s rulebook to be
written on its tape as a sequence of characters. In
fancy
homoiconic
programming
languages like Lisp
[
66
]
and XSLT,
programs are explicitly written as data structures that
the language itself can manipulate: every Lisp program is a nested list
called an
s-expression
, and every
XSLT stylesheet is an
XML document.

In Ruby, only the interpreter (which, at least in the case of MRI,
is not itself written in Ruby) usually gets to see a structured
representation of the program, but the code-as-data principle still
applies. Consider this simple Ruby program:

puts
'hello world'

To an observer who understands the syntax and semantics of Ruby, this is a program that
sends a
puts
message to the
main
object with the string
'hello world'
,
which results in
the
Kernel#puts
method printing
hello world
to standard output. But on a lower level, it’s just
a sequence of characters, and because characters are represented as bytes, ultimately that
sequence can be viewed as a large number:

>>
program
=
"puts 'hello world'"
=> "puts 'hello world'"
>>
bytes_in_binary
=
program
.
bytes
.
map
{
|
byte
|
byte
.
to_s
(
2
)
.
rjust
(
8
,
'0'
)
}
=> ["01110000", "01110101", "01110100", "01110011", "00100000", "00100111",
"01101000", "01100101", "01101100", "01101100", "01101111", "00100000",
"01110111", "01101111", "01110010", "01101100", "01100100", "00100111"]
>>
number
=
bytes_in_binary
.
join
.
to_i
(
2
)
=> 9796543849500706521102980495717740021834791

In a sense,
puts 'hello world'
is Ruby program number
9796543849500706521102980495717740021834791.
[
67
]
Conversely, if someone tells us the number of a Ruby
program, we can easily turn it back into the program itself and run
it:

>>
number
=
9796543849500706521102980495717740021834791
=> 9796543849500706521102980495717740021834791
>>
bytes_in_binary
=
number
.
to_s
(
2
)
.
scan
(
/.+?(?=.{8}*\z)/
)
=> ["1110000", "01110101", "01110100", "01110011", "00100000", "00100111",
"01101000", "01100101", "01101100", "01101100", "01101111", "00100000",
"01110111", "01101111", "01110010", "01101100", "01100100", "00100111"]
>>
program
=
bytes_in_binary
.
map
{
|
string
|
string
.
to_i
(
2
)
.
chr
}
.
join
=> "puts 'hello world'"
>>
eval
program
hello world
=> nil

Of course, this scheme of encoding a program as a large number is
what makes it possible to to store it on disk, send it over the
Internet, and feed it to a Ruby interpreter (which is itself just a
large number on disk!) to make a particular computation happen.

Note

Since every Ruby program has a unique number, we can automatically generate all
possible programs: start by generating program number 1, then generate program number 2,
and so on.
[
68
]
If we did this for long enough, we’d eventually generate the next hot
asynchronous web development framework and retire to a life of leisure.

Other books

Dead Lucky by M.R. Forbes
Starting Over by Penny Jordan
Past Perfect by Leila Sales
Locked by Morgan, Eva
The Lake of Dreams by Kim Edwards
The Magykal Papers by Angie Sage