domain query subroutine res_search

Mark Whetzel markw at airgun.wg.waii.com
Tue Apr 16 05:51:12 AEST 1991


I am working with another programmer on porting the IDA sendmail
to the IBM RT running AIX 2.2.1.  (Yes, it's yucky IBM, but it works and
it's paid for :-)

So far so good on making it work, but we may have found a bug in AIX
at the latest maintenance level: code that works on one RT (2705 level) won't
work on another RT (1773 level) at a higher maintenance level.

The problem code deals with domain server queries in domain.c; in
particular, it uses the res_search system subroutine.  I can't find any
documentation for this routine.  I checked AIX V3 (RS6000), SUNOS (4.0.1)
and CONVEXOS, and none of these systems documents it either.  I can find
res_init, res_mkquery, and res_send, but not res_search.  What is happening
is that res_search, called to look up MX records, returns -1 with h_errno
set to TRY_AGAIN (value 2) rather than NO_DATA (value 4).
The NO_DATA value indicates that the host record is valid, but no records
of the requested type could be found.
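
As an aside for anyone reproducing this, a small helper that prints the
h_errno value by name makes the debug output easier to compare between
machines.  The following is only a sketch: herr_name is a hypothetical
name, and the fallback #defines assume the usual BSD resolver values
(consistent with the 2 and 4 quoted above) in case <netdb.h> does not
expose them.

```c
#include <netdb.h>

/* Fallbacks, assuming the usual BSD resolver values, in case the
   local <netdb.h> does not define these macros. */
#ifndef HOST_NOT_FOUND
# define HOST_NOT_FOUND 1
#endif
#ifndef TRY_AGAIN
# define TRY_AGAIN 2
#endif
#ifndef NO_RECOVERY
# define NO_RECOVERY 3
#endif
#ifndef NO_DATA
# define NO_DATA 4
#endif

/* Hypothetical helper (not part of any resolver library):
   map an h_errno value to its symbolic name. */
const char *herr_name(int herr)
{
    switch (herr) {
    case HOST_NOT_FOUND: return "HOST_NOT_FOUND";
    case TRY_AGAIN:      return "TRY_AGAIN";
    case NO_RECOVERY:    return "NO_RECOVERY";
    case NO_DATA:        return "NO_DATA";
    default:             return "unknown";
    }
}
```

The debug printf could then print herr_name(h_errno) next to the raw
number, so a TRY_AGAIN on one machine and a NO_DATA on another are
obvious at a glance.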

The nameserver is reachable, and a piece of test code that
queries with type = T_A works properly and returns a valid query record, but
queries of type T_MX fail with this TRY_AGAIN error.  This causes sendmail to
defer the mail, waiting for a positive response from the nameserver.
We currently do not have many MX records in our nameserver,
and the system name being queried does not have any MX records on
file.

Here is the code fragment from domain.c from the IDA sendmail:
        [some code deleted]
typedef union {
        HEADER qb1;
        char qb2[PACKETSZ];
} querybuf;
extern int h_errno;
querybuf answer;
        [some code deleted]
        errno = 0;
        n = res_search(host, C_IN, T_MX, (char *)&answer, sizeof(answer));
        if (n < 0)
        {
          if (tTd(8, 1))
              printf("getmxrr: res_search failed (errno=%d, h_errno=%d)\n",
                            errno, h_errno);
           switch (h_errno)
                {
# ifndef NO_DATA
#  define NO_DATA       NO_ADDRESS
# endif /* NO_DATA */
                  case NO_DATA:
                  case NO_RECOVERY:
                        /* no MX data on this host */
                        goto punt;

                  case HOST_NOT_FOUND:
                        /* the host just doesn't exist */
                        *rcode = EX_NOHOST;
                        break;

                  case TRY_AGAIN:
                        /* couldn't connect to the name server */
                        if (!UseNameServer && errno == ECONNREFUSED)
                                goto punt;

                        /* it might come up later; better queue it up */
                        *rcode = EX_TEMPFAIL;
                        break;
                }
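
To make the consequence of the wrong h_errno explicit, the decision made
by the switch above can be restated as a small stand-alone function.
This is a sketch for illustration only, not code from IDA sendmail:
mx_failure_action and the mx_action names are hypothetical, and the
fallback #defines assume the usual BSD resolver values.

```c
#include <errno.h>
#include <netdb.h>

/* Fallbacks, assuming the usual BSD resolver values. */
#ifndef HOST_NOT_FOUND
# define HOST_NOT_FOUND 1
#endif
#ifndef TRY_AGAIN
# define TRY_AGAIN 2
#endif
#ifndef NO_RECOVERY
# define NO_RECOVERY 3
#endif
#ifndef NO_DATA
# define NO_DATA 4
#endif

/* Hypothetical labels for the outcomes of the switch in the fragment. */
enum mx_action {
    MX_PUNT,      /* the "goto punt" path */
    MX_NOHOST,    /* *rcode = EX_NOHOST */
    MX_TEMPFAIL,  /* *rcode = EX_TEMPFAIL, mail is queued */
    MX_OTHER      /* an h_errno value the switch does not handle */
};

/* Mirror of the switch: herr is h_errno, err is errno, and
   use_name_server stands in for the UseNameServer flag. */
enum mx_action mx_failure_action(int herr, int err, int use_name_server)
{
    switch (herr) {
    case NO_DATA:
    case NO_RECOVERY:
        return MX_PUNT;
    case HOST_NOT_FOUND:
        return MX_NOHOST;
    case TRY_AGAIN:
        if (!use_name_server && err == ECONNREFUSED)
            return MX_PUNT;
        return MX_TEMPFAIL;
    default:
        return MX_OTHER;
    }
}
```

With NO_DATA the function returns MX_PUNT, but with TRY_AGAIN (and no
ECONNREFUSED) it returns MX_TEMPFAIL, which is the deferral described
above.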

Any pointers on what may be wrong? Where is this routine discussed, or is
the documentation on all these systems lacking? I am going to report this
to IBM, but with an undocumented routine, it may be tricky.  As I indicated,
on a different system everything works OK.

PS. I have checked that /etc/resolv.conf has the proper contents;
it is identical to the file on other systems at our site, and other hostname
lookups work correctly (telnet, rlogin, host, etc.).
I have tested this on the RS6000 and also get h_errno=4 there, just like on
the 2705 level RT.

Thanks for any light you can shed on this funny routine and its origins.
Markw
-- 
Mark Whetzel     My comments are my own, not my company's.
Western Geophysical - A division of Western Atlas International,
A Litton/Dresser Company           DOMAIN addr: markw at airgun.wg.waii.com
				   UUNET address:  uunet!airgun!markw



More information about the Comp.unix.aix mailing list