Making A request to IBM

Robin Wilson robin at pensoft.UUCP
Thu Mar 21 02:08:40 AEST 1991


OK here it is... the complete (almost) description of how IBM software 
support works.

The customer is supposed to call the Systems Engineer (SE) for ANY problems
with their system.  The SE is responsible for Problem Determination (PD)
and Problem Source Identification (PSI).  Often times the SE is experienced
in systems other than the IBM AIX systems, so this PD/PSI is sometimes not
very complete when done by a "less experienced" SE.  This is one area where
IBM sometimes has trouble.  If the SE is either not experienced enough, or
is unable to resolve your problem, he is supposed to determine whether the 
problem is a "DEFECT" or "HOW-TO".  Unfortunately, someone with little
experience usually doesn't know enough to determine if it is a "DEFECT" or
"HOW-TO".  So they call the channel that provides the fastest response:
AIX Software DEFECT Support Level 2.  For a second, let's assume they really
did know that the problem was "HOW-TO" and they followed the proper channel
to resolve the problem.  HOW-TO problems will go through the following chain
of people.  The SE contacts the Area Specialist (AS).  If the AS cannot solve the
problem, he directs the SE to the National Technical Support Center.  The
SE can contact this center through electronic mail ONLY, so this method
usually takes several days to resolve a problem.  (IBM is working on the
possibility of making the Tech Support group accessible by phone, but right
now that is not a possibility.)  If the NTSC cannot resolve the problem,
they will contact Level 2 DEFECT support (to find out if "Anybody has 
heard of this problem"), and if Level 2 can't help, NTSC will contact
Level 3 (change team), and finally NTSC will contact development.  NOTE:
there is a distinct difference between Level 3 (CT) and Development.
Development writes the NEW code for the "next" or "future" releases, and
CT maintains the existing release(s).
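
As a rough illustration only, the HOW-TO chain above can be written down as
an ordered list of (who asks, who is asked) pairs.  This is a minimal Python
sketch; the names HOWTO_CHAIN and escalate are hypothetical labels made up
for this example and do not correspond to any IBM tool.

```python
# Hypothetical model of the HOW-TO escalation chain described in the text.
# Each entry is (who makes the contact, who is contacted), in order.
HOWTO_CHAIN = [
    ("Customer", "Systems Engineer"),
    ("Systems Engineer", "Area Specialist"),
    ("Systems Engineer", "National Technical Support Center"),
    ("National Technical Support Center", "Level 2 DEFECT support"),
    ("National Technical Support Center", "Level 3 (Change Team)"),
    ("National Technical Support Center", "Development"),
]


def escalate(step):
    """Return (asker, asked) for a 0-based step, or None past the end."""
    if 0 <= step < len(HOWTO_CHAIN):
        return HOWTO_CHAIN[step]
    return None
```

Calling escalate(0), escalate(1), and so on walks the chain in order, and
returns None once there is nobody left to ask.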

Now here's what happens when the problem goes from SE to Level 2 DEFECT
support.  (This does not assume that the problem is a HOW-TO, but either
way it is started the same at Level 2.)  IBM Level 2 Software Defect 
Support (L2) is just what the name implies; DEFECT support.  They are 
contacted by calling the 1-800-237-5511 number.  The people answering the
phone are at one of several regional support centers.  They are Level 1
support.  And they receive calls for ALL IBM systems and OS's.  They ask
the caller a few questions: "customer number", "type of machine", "operating
system", etc.  Then the level 1 person routes the information (entered into
a database system that is distributed across the nation) to the proper
L2 group for the product (well, usually they do... sometimes they
accidentally route calls to the wrong group, but usually a callback solves
that).  If your product happens to be the RS/6000 and AIX V.3, they route
you to Austin, Texas, and live transfer your call to a Level 2 representative
here in Austin.  (RT and AIX 370, and AIX PS/2 also get routed to Austin, but
their calls are not live transferred; the Level 2 person must call the
customer back.)  The person answering the call is a FULL Level 2 support
person, who has been assigned to answer incoming calls at that particular
time (or who just saw the light blinking on the wall -- that indicates an
incoming call has not yet been answered -- and decided to grab the call).
NOTE: Just for clarification, L2 for the RS/6000 takes over 250 calls every
day, so sometimes it takes several minutes to get to a specific call during a
busy time.  IBM is committed to providing instant response to all calls, but
sometimes the system they use hits a glitch (when this happens they usually
are quick to make adjustments).  Anyway, when the L2 person takes the live
call, they have no way of knowing what the person who is calling is 
having trouble with, so they start by asking questions.  Sometimes, you get
lucky and happen to have your call answered by someone knowledgeable in
your specific problem, but more often the first person you talk to will 
take down basic information and queue your problem over to someone who works
in the group that best understands your problem.  These people will review
the basic information provided by the call taker, and then proceed to resolve
the problem.  This usually requires contacting the customer for more
information, running testcases, attempting to re-create your problem locally,
etc.  If the problem is a "HOW-TO" question, Level 2 is required to send the
customer back to the SE.  When I left L2 (about 2 months ago) we averaged
about a 60-40 HOW-TO to DEFECT ratio.  So you can see that DEFECT support
often spends significant amounts of time determining if a problem
is HOW-TO, and then contacting the customer to have them call the SE back.

When L2 has sufficiently tested a problem to determine that "there appears
to be a defect", they will pass the problem on to the Change Team (CT).
The CT (also called Level 3 (L3)) will then read through the problem record
and evaluate the problem as a possible code defect.  Try to remember that 
by-and-large the L2 person is less knowledgeable (although not in all cases)
than the CT person, so some of the problems are rejected by CT as "User
Errors" (which is functionally the same as "HOW-TO").  Basically the CT
can close a problem in the following ways:

	USE: User error.  The customer is not using the program as it was 
	     intended/documented.

	IDD: Documentation error (IDD is IBM document group).  This is sometimes
	     used instead of a USE when the documentation is unclear, but the
	     code was not intended to be used like the customer was attempting
	     to use it.

	PER: Programming Error (in the IBM supplied code).  This indicates that
	     a software DEFECT was found, and corrected.

	PRS: Permanent Restriction.  There is a software DEFECT (or code error)
	     but it is not reasonable to fix it at this time.  (It may be fixed
	     in a later release.)

	SUG: Suggestion.  The code is working as designed, but the requested 
	     change is being evaluated for a possible future enhancement.

	MCH: Machine Error.  The provided debug information indicates a hardware
	     error.  This can either be a hardware design defect, or a specific
	     defective piece of hardware.

	UR5: Unreproducible at the described level.  Basically this is only used
	     when the problem is clear (i.e. this program should do this but
	     instead it does this...)

There may be a few more, but these are the most widely used closing codes.
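
If you want to keep track of these codes in a script, a lookup table is
about all it takes.  The following is a minimal Python sketch; the names
CLOSING_CODES, SOFTWARE_DEFECT_CODES, describe and is_software_defect are
hypothetical and made up for this example.  Since the list above is not
complete, unknown codes are reported rather than treated as errors.

```python
# Closing codes as listed in the text; the list is known to be incomplete.
CLOSING_CODES = {
    "USE": "User error",
    "IDD": "Documentation error",
    "PER": "Programming error",
    "PRS": "Permanent restriction",
    "SUG": "Suggestion",
    "MCH": "Machine error",
    "UR5": "Unreproducible at the described level",
}

# Codes whose description says a software defect exists in the IBM code
# (PER is fixed; PRS is acknowledged but not fixed at this time).
SOFTWARE_DEFECT_CODES = {"PER", "PRS"}


def describe(code):
    """Return a one-line description of a closing code."""
    key = code.strip().upper()
    meaning = CLOSING_CODES.get(key)
    if meaning is None:
        # Not every closing code is listed, so this is not an error.
        return f"{key}: unknown closing code"
    return f"{key}: {meaning}"


def is_software_defect(code):
    """True if the code indicates a defect in the IBM-supplied code."""
    return code.strip().upper() in SOFTWARE_DEFECT_CODES
```

For example, describe("per") gives "PER: Programming error", and
is_software_defect("USE") is False.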

When a DEFECT is corrected, CT reviews the code change, and then builds the
code change into an update.  The update is then tested by the Regression Test
Lab, to see if the update has any defects.  Of course the regression test is
not perfect, so problems sometimes slip through... Then the update is tested
by the CT people who made code fixes.  Each person tests their own fixes.  
Some programs require special equipment, and cannot be tested in the lab 
environment at IBM, so they are sent off to the customer to test before the
Regression Tests begin (just the specific program that was failing).  The 
customer then tests the new version, and the CT person merely verifies that
the code the customer tested is the same as the "REAL" update.

Some other procedural thingys...

When Level 1 takes your call, they create a PMR (Problem Management Record),
and then queue that PMR to L2.  Level 2 is responsible for the PMR until it
is resolved.  When Level 2 decides that the problem is a possible DEFECT,
they create an APAR (Authorized Program Analysis Report).  The APAR is sent
to CT, and a copy of the PMR is sent to the CT person who will work the
APAR.  CT is responsible for the resolution of the APAR.
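
To make the ownership split concrete, here is a minimal Python sketch of the
two records.  It is an illustration under assumptions: the class and field
names (PMR, APAR, summary, pmr_copy, open_apar) are hypothetical, and the
real records certainly carry far more information than this.

```python
from dataclasses import dataclass, replace


@dataclass
class PMR:
    summary: str
    priority: int             # 1-4, set by the customer
    owner: str = "Level 2"    # L2 is responsible until the PMR is resolved


@dataclass
class APAR:
    pmr_copy: PMR             # CT works from a copy of the PMR
    severity: int             # 1-4, set by the L2 person
    owner: str = "Change Team"


def open_apar(pmr, severity):
    """L2 decides the problem is a possible DEFECT and creates an APAR."""
    # replace() with no changes makes a separate copy of the PMR, so the
    # original stays with Level 2 while CT gets its own copy.
    return APAR(pmr_copy=replace(pmr), severity=severity)
```

Note that opening the APAR does not change the owner of the PMR; Level 2
keeps the PMR while CT owns the APAR.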

When a PMR is created, it is given a PRIORITY.  This indicates the desired
responsiveness that the customer requires on the PMR FROM LEVEL 2.  This 
priority is set by the customer.  It is a number from 1-4, and indicates the
following:

	1) - 1 hour contact required... 

	2) - 2 hour contact required...

	3) - 1 day contact required...

	4) - 1 week contact required...

NOTE: there is no requirement that the problem be serious for the customer,
only that the customer be contacted within the specified time period.  On 
the live transfers, this number is irrelevant until the problem is queued up
to the Subject Matter group for the problem.  Then this number is used to 
determine the priority the call takes in getting a Level 2 response.
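
The priority numbers translate directly into contact deadlines.  Below is a
minimal Python sketch of that arithmetic.  Two things in it are assumptions
and not from IBM: the windows are treated as plain clock time (not business
hours), and the queue is ordered by priority number first and opening time
second.  The names CONTACT_WINDOW, contact_deadline and queue_order are
hypothetical.

```python
from datetime import datetime, timedelta

# Contact window for each PMR priority, as listed in the text.
# Assumption: plain elapsed time, not business hours.
CONTACT_WINDOW = {
    1: timedelta(hours=1),
    2: timedelta(hours=2),
    3: timedelta(days=1),
    4: timedelta(weeks=1),
}


def contact_deadline(opened, priority):
    """Latest time by which Level 2 should have contacted the customer."""
    return opened + CONTACT_WINDOW[priority]


def queue_order(pmrs):
    """Order (pmr_id, priority, opened) tuples for a Level 2 response.

    Assumption: lower priority number first, then oldest first.
    """
    ranked = sorted(pmrs, key=lambda p: (p[1], p[2]))
    return [pmr_id for pmr_id, _, _ in ranked]
```

So a priority 1 PMR opened at 02:00 should get a contact by 03:00 the same
day, while a priority 4 PMR opened at the same moment has until 02:00 a week
later.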

When an APAR is created, it is given a Severity.  This is also a number from
1-4, but it is not determined by the customer.  This number is determined by
the level 2 person who creates the APAR, and is based on several criteria:

	1) - The customer's machine is not operational.  This requires a
	     24-hours-a-day response.  CT must work round-the-clock to provide a
	     workaround solution to get the customer minimally operational.

	2) - The customer's operations are seriously impacted.  The CT must
	     provide a "code fix" (for DEFECT problems) within 10 days.

	3) - The customer is not seriously impacted, but the problem is 
	     affecting customer operations minimally.  The CT must provide
	     a "code fix" within 26 days.

	4) - Reserved for DOCUMENTATION errors.  The IDD group must accept the
	     problem and agree to a documentation change within 40 days.

The L2 person will attempt to work with the customer to get the proper 
severity set for a problem, but usually these are the criteria that he/she
must meet in order for CT management to work the problem.  NOTE: the SEV1
24 hour response is intended to be for a "workaround" to the problem.  That
means that CT will spend their efforts trying to find the FASTEST method 
to get the customer operational (minimally).  Sometimes this will involve 
the actual code fix, but often it does not.  Once the customer is operating
again, the problem should be lowered to Sev2.
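
The severity rules can be summed up in a small table.  The following is a
minimal Python sketch; the names SEVERITY and lower_after_workaround are
hypothetical, and the table only restates the numbers given above.  Sev1 has
no day count because the text describes it as round-the-clock work toward a
workaround, not as a deadline for a code fix.

```python
# APAR severities as described in the text.
# "days" is the limit for the deliverable; None means no day count is
# given (Sev1 is worked round-the-clock until a workaround exists).
SEVERITY = {
    1: {"group": "CT", "deliverable": "workaround", "days": None},
    2: {"group": "CT", "deliverable": "code fix", "days": 10},
    3: {"group": "CT", "deliverable": "code fix", "days": 26},
    4: {"group": "IDD", "deliverable": "agreed documentation change",
        "days": 40},
}


def lower_after_workaround(severity):
    """Once a Sev1 customer is operating again, the APAR drops to Sev2."""
    return 2 if severity == 1 else severity
```

For example, lower_after_workaround(1) gives 2, after which the 10 day limit
for a code fix in SEVERITY[2] applies.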

Sorry for the length of this posting, but I hope this clears up some of the
mystery behind IBM software support.

+-----------------------------------------------------------------------------+
|The views expressed herein, are the sole responsibility of the typist at hand|
+-----------------------------------------------------------------------------+
|UUCP:     pensoft!robin                                                      |
|USNail:   701 Canyon Bend Dr.                                                |
|          Pflugerville, TX  78660                                            |
|          Home: (512)251-6889      Work: (512)343-1111                       |
+-----------------------------------------------------------------------------+


