Crash a RISC machine from user-mode code:
James C Burley
burley at world.std.com
Sat Aug 11 16:43:56 AEST 1990
NOTE: -LONG- POSTING, look at the summary at the bottom first if you don't
want to read a single long posting on crashing systems!! I'd boil it down,
but I've already spent too much access $$ just composing the thing, so I
apologize to everyone for the length and to those who knew me as a software
tech writer long ago (I've always been an overly verbose engineer :-)!
--------
Hmm, this discussion was at first very interesting to me but seems to have
gotten off the track I was hoping for...let me explain:
1) As I recall, the original posting talked about somebody wondering if the
new RISC machines were bullet-proof in user mode (essentially, based on
their wording -- something about "register paths" and such), and
proposed running a program that jumped to random data. The result of
such a program is the execution of random defined AND UNDEFINED
instructions.
2) This kind of program should ONLY be run under so-called "user-mode"
protection, i.e. under operating systems like UNIX, OS/2 (I think), VMS,
A/UX, and so on, and only on CPUs where those systems offer (and have
enabled) memory protection, fault catching, preemptive scheduling, and
such like. Thus it is NOT USEFUL to run the program on systems like
IBM PCs running DOS, or Macintoshes running Apple's non-UNIX OS (A/UX,
being a true UNIX, belongs in the protected category above). Why? Because
no matter what it does (short of reducing oil prices), any hand-written
program could have done the same -- including (caution!) erasing your hard disk!
Doubt I've lost anyone so far....
3) There's no doubt that jumping to random junk produces no useful productive
work in the normal sense; nobody is suggesting this is a good way to use
any kind of computer. BUT, by running random junk, one may increase the
likelihood of discovering a "hole" in the system (hardware or kernel,
usually) compared to running regular code generated by a compiler or even
regular assembly code written by users. It may even have a better chance
than examining the instruction set architecture and trying to purposely
write code that breaks the machine.
4) If such a program does anything that any normal user mode program may
conceivably do, then it should not be considered worth noting. This is
especially the case (even for weird things like deleting files) if the
program is run after some other useful program has been run and still has
parts of it sitting around in memory; the random program could easily jump
to it. Other things included in this "not interesting behavior", IMHO:
a) Putting the process into an infinite loop (but the system as a whole
still works to the same extent it would if one actually ran a
hand-coded infinite loop).
b) Spewing junk to the terminal screen, or hanging for input from the
terminal.
c) Signaling conditions caught by the OS.
d) Logging out, playing with files, network connections, or other things
like that.
e) Thrashing the swapper or pager (again, assuming any user program can
do it).
5) However, if the random program manages to do things clearly out of the
accepted realm of "user program", and assuming it (and thus the user
or "wetware") cannot invoke "superuser" or some other "give me direct access
to the kernel" function, such as "poke the kernel's memory" or "write
to raw disk sector", then one may conclude that either the operating system
in control has a security hole, or perhaps the hardware itself has a
security hole. THIS IS PART OF HOW ROBERT MORRIS'S WORM TRASHED THE
NET: he knew that feeding overlong input to a certain network daemon
would overrun a buffer unchecked by normal defensive programming (since
there wasn't any in that particular case), and allow his program to
insinuate part of itself (data/instructions) into the daemon's memory
and then be executed with the daemon's privileges, not his own.
I highlight this issue because it IS important: if your operating system
provides a "hole" through which any user (who can write and execute raw
machine code if even only via BASIC POKE instructions, but certainly via
use of assembler/loader) can do something not normally allowed in user
mode, then your operating system has a security hole. (I'm not talking
about the non-user-mode systems like MAC OS, PC/MS-DOS; I mean Unix, VMS,
PRIMOS, and so on.) Very likely, the hole can be found and fixed (though
the fix is painful if the "bug" is really a convenient "entry point" for
utilities needing special features; I've dealt with fixing this kind of
thing many times, usually involving timesharing systems' batch and printer
queue utilities).
But if the problem is that the underlying CPU allows a user mode program
to somehow circumvent documented user mode protection, then the problem
cannot be fixed without either switching to another kind of CPU (not easy;
porting is a problem) or preventing users from writing machine code (the
acceptable answer if you are providing only pure end-user services; for
example, the Prodigy on-line service allows no programming, so conceptually
could be implemented entirely on Apple IIs without having any architectural
exposure from a security perspective -- of course, performance is another
issue :-).
IF a system is "hackable" from the hardware perspective, the manufacturer
of that CPU had better find and fix the problem fast, and perhaps even
provide inexpensive replacements to their customers. Otherwise their machine
becomes a "target" of evil hackers, and administrators will learn to avoid
any system based on that CPU, especially when it comes to attaching such
a system to any network or putting any sensitive data on it.
SAMPLE WAYS TO TELL if your system has a "hole" like this, based on the
behavior of the "random-jumping" program:
a) Running the program crashes the entire system, but there is no known
way of so doing with a hand-written user program. (The culprit may
be the OS or the CPU, but check the OS carefully first.)
b) The program manages to rewrite part of the (protected) kernel
without crashing the system or calling any "may I write the kernel"
function. (Likely to be a CPU bug.)
c) The program somehow causes Iraq to unilaterally disarm.
Again, if the random program does something you cannot imagine ANY user
mode program doing (not just a correct or "well-written" one), then you
might well be looking at a security hole. ("Security" meaning either
a user can access things he/she shouldn't be able to, or is able to
trash or crash things he/she shouldn't have access to, like the CPU
itself.)
6) If you think the random-jumping program has exposed a hole in your system,
be it RISC or CISC, first determine (by reading the documentation or
asking an expert on your configuration of CPU and OS) whether your system
even ATTEMPTS to catch all possible user-mode violations. If your CPU
allows, for example, I/O instructions in user mode, then although it
wouldn't fit MY definition of "user mode", it would mean a user mode
program could do almost anything (including rewriting a swapping/paging
kernel or other kernel-mode programs right out from under themselves by
rewriting their disk images), so the random-jumping program would simply
be something to avoid running any more!
But, if you are running, say, under VAX/VMS, or on a 68030 running a
memory-protecting UNIX, or some such thing, and the random-jumper does
something out of bounds, then perhaps you've discovered one or more "holes"
in the system. If you can reproduce the problem reliably (if the program
always creates the same random data each time, for example), then you might
be able to step through it and find the actual instruction or instruction
sequence that causes the crash. (HINT: if the crash takes many instruction
steps to reach, and your system provides a user-mode n-stepper, first find
a large value of "n" for which stepping the program still results in the
crash, then binary-search downward on "n" until you have the largest value
that falls just short of the crash.)
Once you've narrowed down the problem to a few instructions, if they're
user mode and don't involve a kernel call, you might have a true-blue
CPU bug: document the problem and discuss with another expert on that CPU
(especially, try and reproduce on other chips in case yours just has a
local flaw, and on other slightly different models of the same CPU, e.g.
a 486 if the failure is on a 386, a 68030 if on a 68040, etc), then if
you still think it's a hardware problem, report it to the manufacturer.
However, it's likely that a supposed CPU problem is really an OS problem
if the offending instructions cause a valid trap to the OS that the OS
mishandles or fails to handle, so make sure the offending instructions
aren't trapping to kernel mode at all.
If you find the problem's in the OS, for example a call to an OS function
with absurd arguments that don't get "noticed" until it's too late, then
(again, after checking with experts and other copies and different versions
of the OS) let the OS writer know. And, depending on your own sense of
ethics, perhaps let everyone else (via a newsgroup) know as well, so they
can plan their own defenses if they are using that OS.
(I wouldn't personally recommed advertising a CPU hole; if you're wrong
about somebody's OS, it's fairly easy for them to prove assuming you've
narrowed the problem down adequately, and in any case people can actually
defend their systems fairly rapidly via patches, but if you're wrong about
someone's CPU, they can't show everyone the schematics to prove it and
you may have tarnished the manufacturer's image permanently, and meanwhile
there isn't much most people can do about it quickly. Wait for the
manufacturer to verify/refute the problem and take their own steps, IMHO.)
7) Remember that even if you're the ONLY USER of a system with a "hole", you've
still got a security problem unless you're also the ONLY PROGRAMMER of
every new (or recent) program running in user mode on your system. If
someone else knows of a hole in a particular OS/CPU combination, they might
use that knowledge to write a trojan horse program that pretends to be one
thing but, when it detects a system for which it has a "kernel access"
code to exploit a security hole, does bad things like attaching viruses
to other programs or erasing disks. (IMHO, the best protection against
this kind of situation is to only use "free software" that comes with source
code and never use the binaries, but always do the rebuilds yourself, and
only after inspecting the source code via a quick perusal: it is much
harder to hide a code missile in source code than in binary code. Be
suspicious of any data tables without adequate explanation, especially if
they can get "jumped" to. Unfortunately, scanning assembler code can
be much harder than scanning HLL code like C, Pascal, or (best of all due
to lack of pointers and such) Fortran and Cobol.)
I know this has been a long posting, but I've tried to explain what I think
are the important issues about a random-jumping program. Again, please don't
get excited if such a program goes into an infinite loop, or signals
conditions that your OS catches -- any user program can do those things.
DON'T run such a program on ANY system that doesn't offer full user mode
protections like memory, I/O, scheduling, and others; you might just end up
trashing your hard disk or some such thing. Finally, if you DO run the
program on an "interesting" (i.e. protecting) system and it produces
"interesting" (i.e. not-normally-allowed-in-user-mode) results, PLEASE look
into it further and, if possible, involve an expert -- you may have taken
a step towards preventing the next major virus or trojan horse infiltration!
(I mean, if YOU can find the problem, so can someone else who wants publicity
for being a mediocre, obnoxious hacker!)
James Craig Burley, Software Craftsperson burley at world.std.com