Help needed with System V message queues

George Bogatko bogatko at lzga.ATT.COM
Thu Aug 9 02:36:56 AEST 1990


HI:

A while ago, I got sick of wondering what the tunables for messages meant.
The result was a writeup which was begun, interrupted, and never
finished.  This is what remains of it.  It is 500+ lines long, so you
may want to skip all this if message queues leave you cold.  After it,
in a second posting, is a 'pic' file of how messages are stored in memory.

This may answer some questions about message queues, and how they work.

If I see enough interest, I may finish the writeup.

************

OVERVIEW

In /etc/master.d there is a file called msg that contains
the tunable parameters for the device driver that handles
the UNIX(TM) message queue system.  This file contains lines
similar to the following:

    MSGMAP  = 100
    MSGMAX  = 2048
    MSGMNB  = 4096
    MSGMNI  = 50
    MSGSSZ  = 8
    MSGTQL  = 40
    MSGSEG  = 1024

The meaning of these parameters, as stated in the manual
Operations and Administration Series: Performance Management
under the chapter Tunable Parameters Definitions is:

MSGMAP	 specifies the size of the memory control map
	      used to manage message segments.  If this
	      value is insufficient to handle the message
	      type facilities, a warning message is sent to
	      the console.

MSGMAX	 specifies in bytes the maximum size of a
	      message sent.  When receiving a message, a
	      value larger than this parameter can be used
	      to ensure that the whole message is received
	      and not truncated.

MSGMNB	 specifies the maximum length, in bytes of a
	      message queue.  The owner of a facility can
	      lower this value, but only the superuser can
	      raise it.

MSGSEG	 Specifies the number of message segments in
	      the system.  MSGSEG * MSGSSZ should be less
	      than 131,072 bytes (128 kilobytes).

MSGSSZ	 specifies the size in bytes of a message
	      segment as stored in memory.  Each message is
	      stored in a contiguous message segment.  The
	      larger the segments are, the greater the
	      chance of having wasted memory at the end of
	      each message.  MSGSSZ * MSGSEG should be less
	      than 131,072 bytes (128 kilobytes).

MSGTQL	 specifies the number of message queue headers
	      on all message queues system-wide, and thus,
	      the number of outstanding messages.


These values are stored in a structure called "struct msginfo"
which looks like this:

struct msginfo {
	int	msgmap,
		msgmax,
		msgmnb,
		msgmni,
		msgssz,
		msgtql;
	ushort	msgseg;
};

This structure is used when the MSG driver is initialized.

Notice that the values are stored as INTS.  This bites you
in the butt later on when the value of 'msgmnb' is put
into 'msg_qbytes'.

All the structures described here are found in '/usr/include/sys/msg.h'

There are four data structures used in the message queue
universe.

struct msgbuf
    This is actually a template used by the user to hold
    the message; it is assumed (incorrectly) that the user
    knows enough to re-write this template to suit their
    needs.  The structure given in the header file msg.h is
    never used.  The only thing that must be here is the
    member "long mtype", which must always be the first
    member.  The template looks like this:


	 struct msgbuf {
		 long    mtype;	  /* message type */
		 char    mtext[1];       /* message text */
	 };


    Notice that the size of mtext is one (1). This is not
    enough size to hold anything useful.

    Newcomers to UNIX(TM) messages almost always bark their
    shins on this one.

struct msg
    This is the structure that points to the address of the
    actual message in the message pool.  It looks like
    this:

	 struct msg {
		 struct msg      *msg_next;
		 long	    msg_type;
		 short	   msg_ts;
		 short	   msg_spot;
	 };

    The message queue driver keeps these structures, called
    message headers, in an array whose size is determined
    by the tunable parameter MSGTQL.  Each outstanding
    message, i.e. a message that has been sent but not yet
    received, has one of these headers associated with it.
    Thus the number of outstanding messages handled by the
    driver is determined by the setting of MSGTQL.

    The members of the structure are used as:

    msg_next
	 Even though these headers are in an array, they
	 are handled like a linked list.  This member
	 points to the next header, which is located
	 somewhere in the array.

    msg_type
	 This corresponds to the long mtype member of the
	 msgbuf structure shown above.

    msg_ts
	 This is the precise length in bytes of the message
	 that is stored in the message pool.

    msg_spot
	 This is the location in the message pool of the
	 message.  Notice that it is not a pointer.  It is
	 really an offset.  

struct msqid_ds
    This is the structure that keeps vital statistics about
    a particular message queue that is being serviced by
    the driver.  It looks like this:

	 struct msqid_ds {
		 struct ipc_perm msg_perm;
		 struct msg      *msg_first;
		 struct msg      *msg_last;
		 ushort	  msg_cbytes;
		 ushort	  msg_qnum;
		 ushort	  msg_qbytes;
		 ushort	  msg_lspid;
		 ushort	  msg_lrpid;
		 time_t	  msg_stime;
		 time_t	  msg_rtime;
		 time_t	  msg_ctime;
	 };

    The driver keeps these structures in an array, whose
    size is determined by the tunable parameter MSGMNI.
    Thus the maximum number of message queues handled by
    the driver is determined by the setting of MSGMNI.

    The members of the structure are used as:

    msg_perm
	 This is a structure located in ipc.h that contains
	 various permissions and ids.  It looks like this:

	      struct ipc_perm {
		      ushort  uid;    /* owner's user id */
		      ushort  gid;    /* owner's group id */
		      ushort  cuid;   /* creator's user id */
		      ushort  cgid;   /* creator's group id */
		      ushort  mode;   /* access modes */
		      ushort  seq;    /* slot usage sequence number */
		      key_t   key;    /* key */
	      };

	 This structure is used by all the drivers in the
	 IPC system.  The meaning of these variables is
	 clear from their names, and the comments; except
	 for seq.

	 This variable holds a sequence number that is used
	 to determine the msqid that is returned from the
	 msgget() call.  By constantly incrementing this
	 number, one can be sure that the same message
	 queue header will not have the same msqid returned
	 when it is re-allocated.

    msg_first
	 This is a pointer to the first struct msg member
	 in the linked list of struct msg message headers.

    msg_last
	 This is a pointer to the last struct msg member in
	 the linked list of struct msg message headers.

    msg_cbytes
	 This is the total number of bytes currently on the
	 queue.  It represents the accumulated total of all
	 the values of the struct msg member msg_ts in the
	 linked list of outstanding message headers for
	 that particular queue.

    msg_qnum
	 This is how many messages are outstanding on the
	 queue; and thus how many struct msg message
	 headers are linked to this queue header.

    msg_qbytes
	 This is an upper limit of how many bytes can be
	 outstanding on the queue.  This is an arbitrary
	 value, which is set from the value of the tunable
	 parameter MSGMNB.  This means that raising or
	 lowering this value does not alter how many total
	 messages you can have handled by the driver.  That
	 is determined by the size of the message pool.  This
	 value can be altered by either the owner/creator of
	 the message queue, or the super-user, without having
	 to change the value of MSGMNB.

    msg_lspid
	 The process id of the last process to send a
	 message.

    msg_lrpid
	 The process id of the last process to receive a
	 message.

    msg_stime
	 The last time a message was put on this queue.

    msg_rtime
	 The last time a message was taken off this queue.

    msg_ctime
	 The last time anything at all happened regarding
	 this header.
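
The relationship between the queue header and its chain of message
headers can be sketched in user-space C.  This is only an illustration:
the struct names msg_hdr and queue_hdr and both functions are
hypothetical, cut-down stand-ins for the kernel structures above, not
driver code.  It shows that msg_cbytes is the sum of msg_ts over the
chain and msg_qnum is the number of headers on it.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct msg. */
struct msg_hdr {
    struct msg_hdr *msg_next;  /* next header on this queue's chain */
    long            msg_type;  /* copy of the sender's mtype */
    short           msg_ts;    /* exact length of the stored text */
    short           msg_spot;  /* location in the message pool */
};

/* Hypothetical, simplified stand-in for struct msqid_ds. */
struct queue_hdr {
    struct msg_hdr *msg_first;
    struct msg_hdr *msg_last;
};

/* Sum of msg_ts over the chain: the value msg_cbytes should hold. */
unsigned long chain_bytes(const struct queue_hdr *q)
{
    unsigned long total = 0;
    const struct msg_hdr *m;

    for (m = q->msg_first; m != NULL; m = m->msg_next)
        total += (unsigned long)m->msg_ts;
    return total;
}

/* Number of headers on the chain: the value msg_qnum should hold. */
unsigned long chain_count(const struct queue_hdr *q)
{
    unsigned long n = 0;
    const struct msg_hdr *m;

    for (m = q->msg_first; m != NULL; m = m->msg_next)
        n++;
    return n;
}
```

For a queue holding a 10 byte and a 28 byte message, chain_bytes()
gives 38 and chain_count() gives 2.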

The message pool
    There is no formal name for the pool in the msg.h
    structure.  The message pool is an amorphous blob of
    memory.  It is obtained by a call to the kernel function
    kseg() which returns the base address of a segment of kernel
    memory.  This address is cast to type paddr_t (physical
    address type) which on the 3B2 line is a long.  It is
    treated as a contiguous array of single bytes.  Think of it
    as a char array.

    The size of this blob is determined by multiplying the
    values in the two tunable parameters MSGSEG and MSGSSZ.
    The result of the multiplication is then rounded up to
    the nearest page size.  This size is passed as a
    parameter to kseg().

    The argument to kseg() is a request for pages of
    memory, in the range 1 - 64.  64 pages (128K) is the
    upper limit of memory that will be returned by kseg().
    This is why the tunable parameters guide mentioned
    above says:

    "MSGSSZ * MSGSEG should be less than 131,072 bytes (128 kilobytes)."

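The arithmetic behind that limit can be sketched as follows.  This is
an illustration only: the function names are made up, the 2048 byte
page is the 3B2 figure, and the 64 page cap is the kseg() limit
described above.

```c
#define PAGE_BYTES   2048L  /* 3B2 page size; an assumption elsewhere */
#define KSEG_MAX_PGS 64L    /* most pages kseg() will hand back */

/* Pages needed for the pool: MSGSEG * MSGSSZ bytes, rounded up to
 * a whole number of pages. */
long pool_pages(long msgseg, long msgssz)
{
    long bytes = msgseg * msgssz;

    return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* Nonzero if the request is within the 1 - 64 page range. */
int pool_fits_kseg(long msgseg, long msgssz)
{
    long pages = pool_pages(msgseg, msgssz);

    return pages >= 1 && pages <= KSEG_MAX_PGS;
}
```

With the sample values MSGSEG = 1024 and MSGSSZ = 8, the pool is 8192
bytes, which is 4 pages and well inside the limit.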

SENDING A MESSAGE

Before diving in to how the driver handles messages, it
might be best to present a general picture of how
outstanding messages are stored.

Recall from part 1 that there are four data structures
involved in the process:  struct msgbuf, struct msg, struct
msqid_ds, and the memory pool.  Briefly, the msqid_ds
structure points to the first instance of a message header,
which is a msg structure.  Each message header points to the
next message header in the linked list of outstanding
messages.  Each message header contains the offset in the
memory pool where the actual message is being stored.

The memory pool is logically divided into segments (of size
MSGSSZ), and the total number of these segments is MSGSEG.
When a message is actually stored, it will occupy
as much space as necessary, rounded up to the nearest
segment size.  From this it can be seen that unless the
message size aligns with the MSGSSZ segment size, there will
be some wasted bytes associated with each stored message.

The enclosed diagram Mapping a Message Queue ID to a 28 Byte
Message in Kernel Memory displays the association of all
these data structures in the job of holding a 28 byte
message in memory.

****  SEE FOLLOWING POSTING FOR PIC FILE ****

3.1  Conversion of a user supplied message queue ID (msqid)
    to a pointer to a struct msqid_ds queue header

This is a fundamental algorithm in the whole process, and is
called from many places in the driver:

algorithm 'msgconv'

   convert msqid to msqid_ds structure
   {
1.      pointer = address of array,
		 offset by msqid
		 modulo MSGMNI

2.      lock the reference to the structure

3.      if((the structure is not in use) ||
	  (the sequence number doesn't equal
	  the msqid divided by MSGMNI))
	      return EINVAL

4.      return the pointer found in step 1.
   }

step 1. convert msqid to pointer
    Remember that the msqid_ds structures are held in an
    array of size MSGMNI.  Assuming that the call msgget()
    works correctly (it does),  the msqid that is returned
    from that call will always be the result of an
    incrementing sequence number (remember struct
    ipc_perm.ushort seq?)  times the value of MSGMNI, plus
    the offset into the msqid_ds array of the assigned
    message header.  Thus if MSGMNI is 100, the sequence
    number 3, and the offset 30, the message queue id
    returned by msgget() would be 330.

    The line that converts the msqid back to the offset is
    simply:

	    qp = &msgque[id % msginfo.msgmni];

    Which is to say that the offset of the array (here
    msgque) is msqid modulo MSGMNI.  In our example, this
    would be 30, which is indeed the offset of the array.

step 2. Lock the reference to the structure
    A parallel char array, of size MSGMNI is kept.  Once
    the offset is found, it is used to find the associated
    lock value in this lock array. As long as this value is
    1, the process sleeps (lines 1 and 2 below).

	    1  while (*lockp)
	    2       sleep(lockp, PMSG);
	    3  *lockp = 1;

    When another process, which is using this message
    header, is done, it sets the value of this lock to 0,
    and issues a wakeup().  Our process then wakes up and
    now finds the value to be 0. It then stops going to
    sleep, and locks the value (line 3).

    This is what allows message queues to act "atomically".

step 3. check the msqid value for validity
    The value of qp->msg_perm.seq is checked against the
    value of msqid divided by MSGMNI and if they don't
    match, then errno is set to EINVAL and the system call
    returns with an error (-1).  In our example, if the
    msqid is 330, then 330/100 (in integer arithmetic)
    yields 3, which is indeed the sequence number.  Thus
    the msqid is successfully converted to a valid and
    active message header.

step 4. return the queue pointer
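
Steps 1, 3 and 4 can be sketched in user-space C as below.  This is
not the driver source: struct slot, make_msqid() and msgconv_sketch()
are hypothetical names, MSGMNI is fixed at the example value of 100,
and the sleep/wakeup lock of step 2 is left out entirely.

```c
#include <stddef.h>

#define MSGMNI 100   /* example value from the text, not a default */

/* Hypothetical, cut-down queue header: only what the check needs. */
struct slot {
    int            in_use;  /* nonzero once msgget() has allocated it */
    unsigned short seq;     /* stands in for msg_perm.seq */
};

struct slot msgque[MSGMNI];

/* How an id is built, per the text: seq * MSGMNI + array offset. */
int make_msqid(unsigned short seq, int offset)
{
    return (int)seq * MSGMNI + offset;
}

/* Returns the queue header for id, or NULL where the driver would
 * fail the call with EINVAL. */
struct slot *msgconv_sketch(int id)
{
    struct slot *qp;

    if (id < 0)
        return NULL;
    qp = &msgque[id % MSGMNI];                  /* step 1 */
    if (!qp->in_use || qp->seq != id / MSGMNI)  /* step 3 */
        return NULL;
    return qp;                                  /* step 4 */
}
```

With slot 30 in use at sequence number 3, id 330 resolves to
&msgque[30], while a stale id such as 230 (same slot, older sequence
number) is rejected.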

3.2  Sending a message

Sending a message consists of receiving a buffer from the
user, and putting it into the message pool, with proper
labeling so that a receive request can copy that message out
of the message pool.

algorithm 'msgsnd'

   send a message
   {
1.      convert msqid to msqid_ds pointer (algorithm msgconv)

2.      if(access denied by incorrect permissions)
	   return EACCES

3.      if(byte count <= 0 || byte count > MSGMAX)
	   return EINVAL

4.      copy the message type from the user area
	   return EFAULT on error

5.      if(message type <= 0)
	   return EINVAL

   GETRES:

6.      if(queue has been removed or changed)
	   return EIDRM

7.      if( (total bytes in queue > MSGMNB) ||
	   (no free msg headers available [MSGTQL]) )

       {
8.	  if(IPC_NOWAIT set)
	       return EAGAIN

9.	  sleep

10.	 if(sleep was interrupted)
	       return EINTR

11.	 goto GETRES
       }

12.     call 'malloc()' to find free slot in
       message pool.

13.     if(no free space in message pool)
       {
	   if(IPC_NOWAIT set)
	       return EAGAIN

	   sleep

	   if(sleep was interrupted)
	       return EINTR

	   goto GETRES
       }

14.     assuming all is OK, copy from user to
       message pool.

15.     if(system error during copy)
       {
	   call 'mfree' to mark slot as free
	   return EFAULT
       }

16.     update 'msqid_ds' header

17.     initialize 'msg' header

18.     link 'msg' header into chain of
       related 'msg' headers.

19.     return 0
   }
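
The size and resource checks (steps 3, 5 and 7 through 9) can be
sketched as a pure function.  Everything here is illustrative: struct
send_state, send_precheck() and the SEND_SLEEP marker are hypothetical,
and step 7 is read as "bytes already queued plus this message would
exceed the per-queue limit", which is an assumption about what the
driver compares.

```c
#include <errno.h>

#define SEND_OK    0
#define SEND_SLEEP (-1)  /* hypothetical: caller would sleep, then retry */

/* Hypothetical snapshot of the values the checks consult. */
struct send_state {
    long msgmax;         /* MSGMAX */
    long qbytes;         /* msg_qbytes of the target queue */
    long cbytes;         /* msg_cbytes of the target queue */
    int  free_headers;   /* unused struct msg headers (of MSGTQL) */
};

int send_precheck(const struct send_state *s, long nbytes,
                  long mtype, int nowait)
{
    if (nbytes <= 0 || nbytes > s->msgmax)   /* step 3 */
        return EINVAL;
    if (mtype <= 0)                          /* step 5 */
        return EINVAL;
    if (s->cbytes + nbytes > s->qbytes ||    /* step 7 */
        s->free_headers == 0)
        return nowait ? EAGAIN : SEND_SLEEP; /* steps 8 and 9 */
    return SEND_OK;
}
```

A return of SEND_SLEEP corresponds to the "sleep ... goto GETRES" path;
the real driver re-runs the checks after each wakeup.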

*****************

At this point I was interrupted by real work, and never returned to the
writeup. The "Bach Book" has a good writeup on this stuff.

A few points however:

MSGMAX is not MSGMNB.  MSGMAX is the largest message you can send.  You
can't find out the value of MSGMAX from the msqid_ds structure.

MSGMNB can be found out from "msg_qbytes".  Recall from the above, that
you can reset this to a higher value if you are super-user
and always to a lower value if you are the owner.
What is important to note however is that this number has nothing to
do with capacity of the driver.  It is just a number that is compared against.

MSGMNB is stored in "struct msginfo" as an INT, but when the driver is
initialized, it is transferred to "struct msqid_ds" in member "msg_qbytes"
which is a "ushort".  Thus while some documentation may say that you
can have a high message queue maximum, you can only have the upper
limits of a ushort (65535).  If you set MSGMNB to a higher value
than this, it will wrap, and you will wind up with a lower value.
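
The wrap is ordinary integer truncation and can be shown in a few
lines.  The function name is made up, and uint16_t is used to stand in
for the 16-bit ushort of the machines discussed here.

```c
#include <stdint.h>

/* What msg_qbytes ends up holding when an int MSGMNB is stored in a
 * 16-bit unsigned field: only the low 16 bits survive. */
uint16_t qbytes_from_msgmnb(int msgmnb)
{
    return (uint16_t)msgmnb;
}
```

Setting MSGMNB to 70000 therefore leaves a queue limit of 4464
(70000 - 65536), and setting it to exactly 65536 leaves 0.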

If you want to increase the size of the message pool, increase
MSGSEG.  NEVER INCREASE MSGSSZ.  Recall from  above:

       "The memory pool is logically divided into segments (of size
       MSGSSZ), and the total number of these segments is MSGSEG.
       When a message is actually stored, it will occupy
       as much space as necessary, rounded up to the nearest
       segment size.  From this it can be seen that unless the
       message size aligns with the MSGSSZ segment size, there will
       be some wasted bytes associated with each stored message."

This means that if MSGSSZ is 50 bytes, and you have a 10 byte message,
you will occupy one slot, and have 40 bytes of wasted storage hanging
around.  But if you keep the value of 8, then you will occupy 2 slots
and have 6 bytes of wasted storage hanging around.
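
The rounding can be written out as a small sketch (the function names
are illustrative only):

```c
/* Segments occupied by a message of len bytes when each segment is
 * ssz bytes: len divided by ssz, rounded up. */
long segs_needed(long len, long ssz)
{
    return (len + ssz - 1) / ssz;
}

/* Bytes left unused at the end of the last segment. */
long wasted_bytes(long len, long ssz)
{
    return segs_needed(len, ssz) * ssz - len;
}
```

For the 10 byte message this gives 1 slot and 40 wasted bytes at
MSGSSZ = 50, and 2 slots and 6 wasted bytes at MSGSSZ = 8.  The
28 byte message in the diagram takes 4 slots of 8 with 4 bytes wasted.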

See the following posting for a pic file of how this storage works.

In the msgsnd description above is:

       12.     call 'malloc()' to find free slot in
	       message pool.

       13.     if(no free space in message pool)
	       {
		   if(IPC_NOWAIT set)
		       return EAGAIN

		   sleep

		   if(sleep was interrupted)
		       return EINTR

		   goto GETRES
	       }
Notice that there is no escape from this if the size of the message you are
trying to send is greater than the total amount of memory in the message
pool.  Thus you can hang forever waiting for enough room to become
available.  This will be allowed if MSGMNB is set to be greater than
( MSGSSZ * MSGSEG * (sizeof page on your machine (2048 on 3B's) ) ).
Be careful when setting the tunables!!!
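
One way to sanity-check a set of tunables before installing them is
sketched below.  It is a rough reading of the problem, not a rule from
the manual: it assumes the pool holds MSGSSZ * MSGSEG bytes, as
described in the message pool section, and that a message gets as far
as step 12 whenever it is no bigger than both MSGMAX and MSGMNB.  The
function name is made up.

```c
/* Nonzero if some message could pass the size checks and still be
 * too big to ever fit in the message pool. */
int send_can_hang(long msgmax, long msgmnb, long msgssz, long msgseg)
{
    long pool    = msgssz * msgseg;                    /* pool bytes */
    long largest = msgmax < msgmnb ? msgmax : msgmnb;  /* biggest
                                                          admitted */
    return largest > pool;
}
```

The sample values (MSGMAX 2048, MSGMNB 4096, MSGSSZ 8, MSGSEG 1024)
are safe under this check, since the pool is 8192 bytes.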

When sending messages, the third parameter must be the size of
the actual message, not the size of the 'struct msgbuf', which
will be sizeof(long) bytes bigger.  If you don't pay attention
to this, you will get message trashing.   I can't recall now how,
when, or why this happens, but I guarantee that it will be mysterious,
and usually fatal.

		 struct msgbuf {
			 long    mtype;	  /* message type */
			 char    mtext[1];       /* message text */
		 };

So the safe way to use 'msgsnd' is:

typedef struct {
	long mtype;
	struct {
		char buf[10];
		int xxx;
		double yyy;
		/* etc... */
	} mtext;
} MSG;

MSG message;

	1. msgsnd( msqid, &message, sizeof(message.mtext), 0);
OR
	2. msgsnd( msqid, &message, sizeof(message)-sizeof(long), 0);

I prefer #1.
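
A compilable sketch of form #1 follows.  The struct record layout,
struct my_msg, payload_size() and send_record() are all made up for
illustration; only msgsnd() itself is the real call.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical payload; any layout will do. */
struct record {
    char   buf[10];
    int    xxx;
    double yyy;
};

struct my_msg {
    long          mtype;  /* must come first, and be > 0 when sent */
    struct record mtext;
};

/* The size to hand to msgsnd(): the payload only, never the mtype. */
size_t payload_size(void)
{
    return sizeof(((struct my_msg *)0)->mtext);
}

/* Send one record, blocking if there is no room (flag 0).
 * Returns whatever msgsnd() returns: 0, or -1 with errno set. */
int send_record(int msqid, long type, const struct record *r)
{
    struct my_msg m;

    memset(&m, 0, sizeof m);
    m.mtype = type;
    m.mtext = *r;
    return msgsnd(msqid, &m, payload_size(), 0);
}
```

A caller would obtain msqid from msgget() and then call
send_record(msqid, 1, &rec).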

I also prefer to send message with the fourth parameter as 0.  This will
make the message block if there is no room at the time.  It is more normal
for the message to block in a heavily loaded system than not, so you
will probably NOT want to die if you can't send the message because
there's no room.  If you are worried about deadlock, put in an
alarm call so you can time out.

When receiving messages, check for EINTR if you get a -1.
You will usually want to just wrap around and try again if you get
an interrupt (usually from an alarm call, or some other friendly signal).
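
A minimal sketch of that retry loop, assuming a blocking receive with
no flags (recv_retry is a made-up name):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Blocking receive that starts over when a signal interrupts it.
 * Returns the number of bytes of message text, or -1 with errno set
 * to something other than EINTR. */
long recv_retry(int msqid, void *msgp, size_t maxbytes, long type)
{
    long n;

    do {
        n = (long)msgrcv(msqid, msgp, maxbytes, type, 0);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

If the interrupt comes from an alarm you set as a timeout, a blind
retry like this defeats the timeout; in that case have the signal
handler set a flag and test it before going around again.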


Hope this (long-winded) yakking helps somebody.


GB


