Read Structure and Interpretation of Computer Programs Online

Authors: Harold Abelson and Gerald Jay Sussman with Julie Sussman

Structure and Interpretation of Computer Programs (104 page)

BOOK: Structure and Interpretation of Computer Programs
12.58Mb size Format: txt, pdf, ePub
ads

produces the sequence

<
s
e
q
1
>
<
s
e
q
2
>

Whenever registers might need to be saved, the compiler's code generators use
preserving
, which is a more subtle method for combining
instruction sequences.
Preserving
takes three arguments: a set
of registers and two instruction sequences that are to be executed
sequentially. It appends the sequences in such a way that the
contents of each register in the set is preserved over the execution
of the first sequence, if this is needed for the execution of the
second sequence. That is, if the first sequence modifies the register
and the second sequence actually needs the register's original
contents, then
preserving
wraps a
save
and a
restore
of the register around the first sequence before appending the
sequences. Otherwise,
preserving
simply returns the appended
instruction sequences. Thus, for example,

(preserving (list <
r
e
g
1
> <
r
e
g
2
>) <
s
e
q
1
> <
s
e
q
2
>)

produces one of the following four sequences of instructions, depending on how
<
s
e
q
1
> and <
s
e
q
2
> use <
r
e
g
1
> and <
r
e
g
2
>:

By using
preserving
to combine instruction sequences the
compiler avoids unnecessary stack operations. This also isolates the
details of whether or not to generate
save
and
restore
instructions within the
preserving
procedure, separating them
from the concerns that arise in writing each of the individual code
generators.
In fact no
save
or
restore
instructions are explicitly
produced by the code generators.

In principle, we could represent an instruction sequence simply as a
list of instructions.
Append-instruction-sequences
could then
combine instruction sequences by performing an ordinary list
append
. However,
preserving
would then be a complex operation,
because it would have to analyze each instruction sequence to
determine how the sequence uses its registers.
Preserving
would be inefficient as well as complex, because it would have to
analyze each of its instruction sequence arguments, even though these
sequences might themselves have been constructed by calls to
preserving
, in which case their parts would have already been
analyzed. To avoid such repetitious analysis we will associate with each
instruction sequence some information about its register use.
When we construct a basic instruction sequence we
will provide this information explicitly,
and the procedures that combine instruction sequences will derive
register-use information for the combined sequence from the
information associated with the component sequences.

An instruction sequence will contain three pieces of information:

  • the set of registers that must be initialized before the
    instructions in the sequence are executed (these registers are said to
    be
    needed
    by the sequence),
  • the set of registers whose values are modified by the
    instructions in the sequence, and
  • the actual instructions (also called
    statements
    ) in
    the sequence.

We will represent an instruction sequence as a list of its three
parts. The constructor for instruction sequences is thus

(define (make-instruction-sequence needs modifies statements)
  (list needs modifies statements))

For example, the two-instruction sequence that looks up the value of
the variable
x
in the current environment, assigns the result
to
val
, and then returns, requires registers
env
and
continue
to have been initialized, and modifies register
val
.
This sequence would therefore be constructed as

(make-instruction-sequence '(env continue) '(val)
 '((assign val
           (op lookup-variable-value) (const x) (reg env))
   (goto (reg continue))))

We sometimes need to construct an instruction sequence with no statements:

(define (empty-instruction-sequence)
  (make-instruction-sequence '() '() '()))

The procedures for combining instruction sequences are shown in
section 
5.5.4
.

Exercise 5.31.
  
In evaluating a procedure application, the explicit-control evaluator
always saves and restores
the
env
register around the evaluation of the operator, saves and
restores
env
around the evaluation of each operand (except the
final one), saves and restores
argl
around the evaluation of each
operand, and saves and restores
proc
around the evaluation of the
operand sequence. For each of the following combinations, say which
of these
save
and
restore
operations are superfluous and
thus could be eliminated by the compiler's
preserving
mechanism:

(f 'x 'y)
((f) 'x 'y)
(f (g 'x) y)
(f (g 'x) 'y)

Exercise 5.32.
  
Using the
preserving
mechanism, the compiler will avoid saving
and restoring
env
around the evaluation of the operator of a
combination in the case where the operator is a symbol. We could also
build such optimizations into the evaluator. Indeed, the
explicit-control evaluator of section 
5.4
already
performs a similar optimization, by treating combinations with no
operands as a special case.

a. Extend the explicit-control evaluator to recognize as a separate class
of expressions combinations whose operator is a symbol, and to take
advantage of this fact in evaluating such expressions.

b. Alyssa P. Hacker suggests that by extending the evaluator to recognize
more and more special cases we could incorporate all the compiler's
optimizations, and that this would eliminate the advantage of compilation
altogether. What do you think of this idea?

5.5.2  Compiling Expressions

In this section and the next we implement the code generators to which the
compile
procedure dispatches.

Compiling linkage code

In general, the output of each code generator will end with
instructions – generated by the procedure
compile-linkage
– that
implement the required linkage. If the linkage is
return
then
we must generate the instruction
(goto (reg continue))
. This
needs the
continue
register and does not modify any registers.
If the linkage is
next
, then we needn't include any additional
instructions. Otherwise, the linkage is a label, and we generate a
goto
to that label, an instruction that does not need or modify
any registers.
36

(define (compile-linkage linkage)
  (cond ((eq? linkage 'return)
         (make-instruction-sequence '(continue) '()
          '((goto (reg continue)))))
        ((eq? linkage 'next)
         (empty-instruction-sequence))
        (else
         (make-instruction-sequence '() '()
          `((goto (label ,linkage)))))))

The linkage code is appended to an instruction sequence by
preserving
the
continue
register, since a
return
linkage will
require the
continue
register:
If the given instruction sequence modifies
continue
and the
linkage code needs it,
continue
will be saved and restored.

(define (end-with-linkage linkage instruction-sequence)
  (preserving '(continue)
   instruction-sequence
   (compile-linkage linkage)))

Compiling simple expressions

The code generators for self-evaluating expressions,
quotations, and variables construct instruction
sequences that assign the required value to the target register
and then proceed as specified by the linkage descriptor.

(define (compile-self-evaluating exp target linkage)
  (end-with-linkage linkage
   (make-instruction-sequence '() (list target)
    `((assign ,target (const ,exp))))))
(define (compile-quoted exp target linkage)
  (end-with-linkage linkage
   (make-instruction-sequence '() (list target)
    `((assign ,target (const ,(text-of-quotation exp)))))))
(define (compile-variable exp target linkage)
  (end-with-linkage linkage
   (make-instruction-sequence '(env) (list target)
    `((assign ,target
              (op lookup-variable-value)
              (const ,exp)
              (reg env))))))

All these assignment instructions modify the target register,
and the one that looks up a variable needs the
env
register.

Assignments and definitions are handled much as they are in the
interpreter. We recursively generate code that computes the value to
be assigned to the variable, and append to it a two-instruction
sequence that actually sets or defines the variable and assigns the
value of the whole expression (the symbol
ok
) to the target
register. The recursive compilation has target
val
and linkage
next
so that the code will put its result into
val
and
continue with the code that is appended after it. The appending is
done preserving
env
, since the environment is needed for setting
or defining the variable and the code for the variable value could be
the compilation of a complex expression that might modify the
registers in arbitrary ways.

(define (compile-assignment exp target linkage)
  (let ((var (assignment-variable exp))
        (get-value-code
         (compile (assignment-value exp) 'val 'next)))
    (end-with-linkage linkage
     (preserving '(env)
      get-value-code
      (make-instruction-sequence '(env val) (list target)
       `((perform (op set-variable-value!)
                  (const ,var)
                  (reg val)
                  (reg env))
         (assign ,target (const ok))))))))
(define (compile-definition exp target linkage)
  (let ((var (definition-variable exp))
        (get-value-code
         (compile (definition-value exp) 'val 'next)))
    (end-with-linkage linkage
     (preserving '(env)
      get-value-code
      (make-instruction-sequence '(env val) (list target)
       `((perform (op define-variable!)
                  (const ,var)
                  (reg val)
                  (reg env))
         (assign ,target (const ok))))))))

The appended two-instruction sequence requires
env
and
val
and modifies the target. Note that although we preserve
env
for
this sequence, we do not preserve
val
, because the
get-value-code
is designed to explicitly place its result in
val
for use by this sequence.
(In fact, if we did preserve
val
, we would
have a bug, because this would cause the previous contents of
val
to be restored right after the
get-value-code
is run.)

Compiling conditional expressions

The code for an
if
expression
compiled with a given target and linkage has the form

 <
compilation of predicate, target 
val
, linkage 
next
>
 (test (op false?) (reg val))
 (branch (label false-branch))
true-branch
 <
compilation of consequent with given target and given linkage or 
after-if
>
false-branch
 <
compilation of alternative with given target and linkage
>
after-if

To generate this code, we compile the predicate, consequent,
and alternative, and combine the resulting code with instructions
to test the predicate result and with newly generated labels
to mark the true and false branches and the end of the conditional.
37
In this arrangement of code, we must branch around the true branch
if the test is false. The only slight complication is in how the
linkage for the true branch should be handled. If the linkage for the
conditional is
return
or a label, then the true and false
branches will both use this same linkage. If the linkage is
next
, the true branch ends with a jump around the code for the false
branch to the label at the end of the conditional.

BOOK: Structure and Interpretation of Computer Programs
12.58Mb size Format: txt, pdf, ePub
ads

Other books

Electric Barracuda by Tim Dorsey
The Survival Game by Stavro Yianni
Forever and Ever by Patricia Gaffney
More to Us by Allie Everhart
A Slow-Burning Dance by Ravenna Tate
The Irish Bride by Cynthia Bailey Pratt
The Instant When Everything is Perfect by Jessica Barksdale Inclan