Self-modifying code

David Keppel pardo at june.cs.washington.edu
Tue Jul 12 09:16:55 AEST 1988


nather at ut-sally.UUCP (Ed Nather) writes:
>pardo at june.cs.washington.edu (David Keppel) writes:
>>[ Out of context quote ]
>[ Funny retort. ]
Sorry.  I do like your rejoinder, though.   Could be about
self-modifying code, even   :-)

> Sorry, no.  I've heard LOTS of arguments against programs that
> generate their own code, and all of them -- so far as I am aware
> -- assume the "proper" answer in generating arguments to show the
> answer is proper.  Be honest: suppose you *had* to do it this way,
> but needed a formal discipline to keep it clean and understandable,
> and *fast.*  Couldn't you find a reasonable way?

Let me rephrase my position.  There is nothing architecturally weird
about programs that generate their own code.  Doing so does not cause
any problems on any machine that I am aware of, although few
OPERATING SYSTEMS support it.
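
(For concreteness, here is roughly what "a program that generates its
own code" looks like in C on a UNIX-flavored system.  The mmap()/
mprotect() calls and the x86-64 instruction bytes are my own
assumptions for the sake of illustration -- they are exactly the sort
of operating-system cooperation I mean above.)

    /* Sketch: build a tiny function in memory at run time and call it.
     * Assumes a POSIX system with mmap()/mprotect() and an x86-64 CPU;
     * the encoded bytes below are for that ISA only. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* x86-64 encoding of:  mov eax, 42 ; ret */
        unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

        /* Ask the OS for a page we can write now and execute later. */
        void *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
            return 1;

        memcpy(buf, code, sizeof code);

        /* Flip the page to executable; this is where OS support matters. */
        if (mprotect(buf, 4096, PROT_READ | PROT_EXEC) != 0)
            return 1;

        int (*fn)(void) = (int (*)(void))buf;
        printf("generated code returned %d\n", fn());

        munmap(buf, 4096);
        return 0;
    }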

There is a *major* problem with modifying existing code.  Consider a
machine with separate I-cache and D-cache.  If you execute an
instruction at location 0x500, then one at 0x504 that branches back
to the one at 0x500, and the one at 0x500 modifies itself, then either:

  + The cache can be built so that the I-cache notices data writes
    that change code it is holding.  This requires a "snoopy" cache,
    which is more complicated, less dense, and slower.  If you are
    constrained by price or die space, you may be forced to use a
    slower, smaller (worse hit-ratio) cache.
  + You can execute out of the D-cache.  This is effectively the
    same as having a snoopy I-cache; if you have BOTH an I-cache and
    a D-cache, you are left with coherency problems whose fix is,
    again, a snoopy I-cache.
  + You can define this operation to be illegal, build a smaller,
    faster, non-snoopy cache, and pay the price when you need to run
    code that could be expressed "naturally" as self-modifying code.
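
(To make the 0x500/0x504 scenario concrete, here is a rough C sketch
of a store into the instruction stream followed by a jump back into
it.  The __builtin___clear_cache() call is a GCC-style extension that
I am assuming purely for illustration; the point is that on a
non-snoopy split-cache machine, *some* such explicit step is needed or
the CPU may keep executing the stale instructions still sitting in the
I-cache.)

    /* Sketch: overwrite 'len' bytes of code at 'code_addr' with
     * 'newbytes', then execute the patched code.  Hazard: the stores
     * go through the D-cache, but the fetches come from the I-cache. */
    void patch_and_run(unsigned char *code_addr,
                       const unsigned char *newbytes, unsigned long len)
    {
        unsigned long i;

        for (i = 0; i < len; i++)       /* the "self-modifying" stores */
            code_addr[i] = newbytes[i];

        /* Make the I-cache consistent with what we just wrote.  On a
         * snoopy cache this is free; on a non-snoopy cache, skipping
         * it means the old instructions may still be executed. */
        __builtin___clear_cache((char *)code_addr,
                                (char *)(code_addr + len));

        ((void (*)(void))code_addr)();  /* branch back into patched code */
    }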

Assume that you have an algorithm that is, say, 1000 instructions and
that each instruction takes 4 cycles to execute (must be a RISC :-).
One particular part of it, 10 instructions, is written as
self-modifying code.  It can also be written as 100 instructions of
non-self-modifying (static) code.  Further assume that the effect of
having a non-snoopy cache is to make the computer run 10% faster
(e.g., the cache is faster AND has a better hit rate).  Then the
running time of the program on the machine with the snoopy I-cache
is:

    T = 990 * 4.0 + 10 * 4.0 = 4000 cycles

On the machine without a snoopy I-cache (10% faster, so each
instruction effectively costs 3.6 cycles instead of 4.0),

    T = 990 * 3.6 + 100 * 3.6 = 3924 cycles

so the version that does NOT use self-modifying code is marginally
FASTER, even though it executes an additional 90 instructions.
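
(If you want to check the arithmetic, a couple of lines of C reproduce
the two totals; the instruction counts and per-instruction cycle times
are just the ones assumed above.)

    /* Reproduce the back-of-the-envelope timing above. */
    #include <stdio.h>

    int main(void)
    {
        double snoopy = 990 * 4.0 + 10 * 4.0;    /* self-modifying, snoopy I-cache */
        double plain  = 990 * 3.6 + 100 * 3.6;   /* static code, 10% faster machine */

        printf("snoopy I-cache, self-modifying code: %.0f cycles\n", snoopy);
        printf("non-snoopy cache, static code:       %.0f cycles\n", plain);
        return 0;
    }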


	;-D on  ( Cycles to burn reading flames )  Pardo


