Help on deciphering crash

Ulf Tropp tropp at cthct.UUCP
Thu Jan 8 20:22:16 AEST 1987


In article <4914 at mimsy.UUCP> chris at mimsy.UUCP (Chris Torek) writes:
>>In article <3645 at sdcrdcf.UUCP> davem at sdcrdcf.UUCP (David Melman) writes:
>>>Our Vax 750 running 4.2BSD has occasionally been crashing with:
>>>machine check 2: cp tbuf par fault
>>>	va 80039728 errpc 8000394e mdr a smr 8 rdtimo 0 tbgpar 0 cacherr 5
>>>	busserr 6 mcesr 9 pc 8000394e ps1 40c0008 mcsr 80016
>
>Anyway, you could try disabling the cache:
>
>	mtpr(CADR, 1);	/* CADR is register 0x25 */
>
>but that will probably slow the machine to a crawl.  Disabling
>and reenabling the cache might well flush it, though.  If
>
>	mtpr(CADR, 1);
>	mtpr(CADR, 0);
>
>does not clear the problem, perhaps reenabling it after a long
>delay will.

We had a lousy cache once that would cause a mchk approximately
once an hour. Since DEC couldn't supply a new board in a week,
I had plenty of time to test recovery code. What I did was essentially:

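		mtpr(CADR,1);			/* disable the cache */
		if(mcf->mc5_cacherr&0xe){	/* cache error bits set in mchk frame */
			mtpr(CAER,0xf);		/* presumably clears latched error bits */
			/* fetch offending byte w/o cache */
			if(mcf->mc5_va&0x80000000)	/* system space: read directly */
				i = *((char *)mcf->mc5_va);
			else				/* user space: go through fubyte() */
				i = fubyte(mcf->mc5_va);
			if(mfpr(CAER)&0xe){	/* the error came back */
				return; /* run without cache */
			}
			printf("Cache reenabled\n");
			mtpr(CADR,0);		/* reenable the cache */
		}
		return;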

Probably not entirely correct, but it did seem to work:
the system would mostly return in an orderly way to the aborted
instruction, sometimes going straight into a new mchk a couple of times.
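
For anyone who wants to follow the control flow without a 750 at hand,
here is a minimal user-space sketch of the same sequence. It is a
simulation only: the register array, the refetch() helper, the
cache_still_bad knob and the recover() wrapper are all hypothetical
stand-ins, and the assumption that writing CAER clears the latched
error bits is mine, not something taken from the hardware manual.

```c
/* Hypothetical stand-ins for the privileged registers; on the real
 * machine these are reached with mtpr()/mfpr(), not plain variables. */
enum { CADR, CAER, NREGS };
unsigned long regs[NREGS];

/* Simulation knob: does the uncached refetch latch an error again? */
int cache_still_bad;

void mtpr(int reg, unsigned long val)
{
	/* Assumption: any write to CAER clears the latched error bits. */
	regs[reg] = (reg == CAER) ? 0 : val;
}

unsigned long mfpr(int reg)
{
	return regs[reg];
}

/* Stand-in for fetching the offending byte with the cache disabled. */
void refetch(unsigned long va)
{
	(void)va;
	if (cache_still_bad)
		regs[CAER] = 0x4;	/* pretend an error bit was latched again */
}

/* Same order of steps as the fragment above.
 * Returns 1 if the cache ends up enabled, 0 if it is left disabled. */
int recover(unsigned long cacherr, unsigned long va)
{
	mtpr(CADR, 1);			/* cache off */
	if (cacherr & 0xe) {
		mtpr(CAER, 0xf);
		refetch(va);
		if (mfpr(CAER) & 0xe)
			return 0;	/* keep running without cache */
		mtpr(CADR, 0);		/* cache back on */
		return 1;
	}
	return 0;			/* as in the fragment, cache stays off */
}
```

With the values from the quoted crash (cacherr 5, va 80039728),
5 & 0xe is 4, so the branch is taken, and the va has the high bit set,
so the real code would read it directly rather than through fubyte().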

Anyway, does anybody know which instructions can be restarted?
Shouldn't any instruction that can generate a page fault be restartable?

BTW, a comment in the 4.2 tbuf recovery code says "Should we use
pc or errpc.." (when looking at the instruction to return to).
Clearly it must be pc, since that is what we are returning to,
so I changed the 4.2 code.

In-Real-Life: 	Ulf Tropp
		Systems Administrator
		Dept. of Computer Engineering
		Chalmers Univ. of Technology
		S-412 96 Gothenburg
		Sweden

UUCP:	..mcvax!enea!chalmers!cthct!tropp
ARPA:	tropp%cthct.uucp at seismo.CSS.GOV (?)


