dynamically allocating array of struct

Chris Torek chris at mimsy.UUCP
Thu Apr 6 03:52:47 AEST 1989


In article <3658 at uhccux.uhcc.hawaii.edu> cs411s03 at uhccux.uhcc.hawaii.edu
(Cs411s03) writes:
>I am having difficulty writing a program which dynamically allocates
>an array of structs via malloc() and casting. ...

>struct ttst (*tptr)[];

% cdecl
explain struct ttst (*tptr)[]
declare tptr as pointer to array of struct ttst
declare tptr as pointer to array of struct ttst
Warning: Unsupported in C -- Pointer to array of unspecified dimension
struct ttst (*tptr)[]
%

C arrays *must* have a size.  What you really want, given the rest
of your example, is `struct ttst *tptr'.  Remember that a pointer to
an object that is part of an array can be used to access the entire
array.
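
A minimal sketch of that advice follows.  The members of struct ttst
are not shown in the quoted article, so the two used here are made up,
and make_ttst_array is a hypothetical helper name:

```c
#include <stdlib.h>

/* Hypothetical members; the real struct ttst is not shown in the post. */
struct ttst {
	int	id;
	char	tag[8];
};

/*
 * Allocate n structs and return a pointer at the first one.
 * Returns NULL if malloc fails.
 */
struct ttst *make_ttst_array(size_t n) {
	struct ttst *tptr = malloc(n * sizeof *tptr);
	size_t i;

	if (tptr == NULL)
		return NULL;
	for (i = 0; i < n; i++) {
		tptr[i].id = (int)i;	/* ordinary subscripting works */
		tptr[i].tag[0] = '\0';
	}
	return tptr;
}
```

A caller would write `struct ttst *t = make_ttst_array(10);', use t[0]
through t[9], and free(t) when done.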

Time for some replays.

From: chris at mimsy.UUCP (Chris Torek)
Subject: Re: pointers to arrays
Date: 18 Feb 89 04:32:47 GMT

If you think you want a pointer to an array allocated with malloc(),
you are probably wrong.  You really want a pointer that points *at*
(not `to') a block of memory (`array') containing a series of `char *'
objects each pointing at a block of memory containing a series of
`char's.  The type of such a pointer is `char **'.

You might ask, `what is the difference between a pointer that points
``at'' a block of memory and one that points ``to'' an array?'  The
distinction is somewhat artificial (and I made up the words for some
netnews posting in the past).  Given a pointer to array pa:

	int a[5];
	int (*pa)[5] = &a;	/* pANS C semantics for &a */

I can get a pointer that points `at' the array instead:

	int *p = &a[0];

The latter is the more `natural' C version of the former: typically
a pointer points at the first element of a group (here 5).  The rest
of the group can be reached via pointer arithmetic: *(p+3), aka p[3],
refers to the same location as a[3].

The pointer need not point at the first element, as long as it points
somewhere into the object:

	p = &a[2];

Now p[1] refers to a[3]; p[-2] refers to a[0].  To use pa to get at
a[3] one must write (*pa)[3] (or, equivalently, pa[0][3]).

The thing that is most especially confusing, but that really makes
the difference, is that *pa, aka pa[0], refers to the entire array
`a'.  *p refers only to one element of the array.  This can be seen
in the result produced by `sizeof': (sizeof *p)==(sizeof(int)), but
(sizeof *pa)==(sizeof(int[5]))==(5 * sizeof(int)).
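
The difference can be checked with a small sketch.  The array contents
and the function names are invented for illustration only:

```c
#include <stddef.h>

static int a[5] = { 10, 11, 12, 13, 14 };

/* p points `at' one element; its neighbours are reached by arithmetic. */
int elem3_via_p(void) {
	int *p = &a[2];
	return p[1];		/* same object as a[3] */
}

int elem0_via_p(void) {
	int *p = &a[2];
	return p[-2];		/* same object as a[0] */
}

/* pa points `to' the whole array; *pa must be applied first. */
int elem3_via_pa(void) {
	int (*pa)[5] = &a;
	return (*pa)[3];
}

/* sizeof shows what each pointer thinks it points to. */
size_t size_of_star_p(void) {
	int *p = &a[0];
	return sizeof *p;	/* one int */
}

size_t size_of_star_pa(void) {
	int (*pa)[5] = &a;
	return sizeof *pa;	/* five ints */
}
```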

Pointers to entire arrays are not particularly useful unless there
are several arrays:

	int twodim[3][5];

Now we can use pa to point to (not at) any of the three array-5-of-int
elements of twodim:

	pa = &twodim[1];	/* or pa = twodim + 1, in Classic C */

and now (*pa)[3] (or pa[0][3]) is an alias for twodim[1][3].  Note
especially that since pa[0] names the *entire* array-5-of-int at
twodim[1], pa[-1] names the entire array-5-of-int at twodim[0].
\bold{Pointer arithmetic moves by whole elements, even if those
elements are aggregates.}  Thus pa[-1][2] is an alias for twodim[0][2].

This is merely a convenience, for we can do the same with p:

	p = &twodim[1][0];

Now p points to the 0'th element of the 1'th element of twodim---the
same place that pa[0][0] names.  p[3] is an alias for twodim[1][3].  To
get at twodim[0][2], take p[(-1 * 5) + 2], or p[-3].  Arrays are
stored in row-major order with the columns concatenated without gaps;
they can be `flattened' (viewed as linear, one-dimensional) with
impunity.  (The flattening concept extends to arbitrarily deep
matrices, so that a six-dimensional array can be viewed as a string of
five-D arrays, each of which can be viewed as a string of four-D
arrays, and so forth, all the way down to a string of simple values.%)

Once you understand this, and see why C guarantees that p[-3],
pa[-1][2], and twodim[0][2] are all the same, you are well on your way
to understanding C's memory model (not `paradigm': that means
`example').  You will also see why pa can only point to objects of type
`array 5 of int', not `array 17 of int', and why the size of the array
is required.
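
Here is a hedged sketch of the row-major layout described above.  The
fill values and function names are assumptions made for the example.
It reaches twodim[0][2] once through pa and once through a byte offset
from the start of twodim; it deliberately avoids indexing an `int *'
backwards out of one row, since later C standards read that case more
strictly than this article does.

```c
#include <stddef.h>

static int twodim[3][5];

/* Give every element a distinct value: row * 10 + column. */
void fill_twodim(void) {
	int r, c;

	for (r = 0; r < 3; r++)
		for (c = 0; c < 5; c++)
			twodim[r][c] = r * 10 + c;
}

/* pa steps by whole rows, so pa[-1] is the row before twodim[1]. */
int via_pa(void) {
	int (*pa)[5] = &twodim[1];

	return pa[-1][2];
}

/* Row-major layout: element [r][c] lies r * 5 + c ints from the start. */
int via_flat_offset(int r, int c) {
	unsigned char *base = (unsigned char *)twodim;
	int *q = (int *)(base + (size_t)(r * 5 + c) * sizeof(int));

	return *q;
}
```

After fill_twodim(), both via_pa() and via_flat_offset(0, 2) yield the
value stored in twodim[0][2].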

-----
% For fun: the six-D array `char big[2][3][5][4][6][10]' occupies
  7200 bytes (assuming one byte is one char).  If the first byte is at
  byte address 0xc400, find the byte address of big[1][0][3][1][5][5].
  I hid my answer as a message-ID in the references line.
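
  For those who would rather compute than count: a small sketch of the
  row-major offset rule, with a hypothetical function name.  It returns
  the offset in elements; for an array of char, adding that to the
  address of the first byte gives the byte address.

```c
#include <stddef.h>

/*
 * Offset, in elements, of the element with subscripts idx[0..n-1]
 * within a row-major array whose dimensions are dims[0..n-1].
 */
size_t flat_offset(const size_t *dims, const size_t *idx, size_t n) {
	size_t off = 0, i;

	/* Each step scales by the next dimension, then adds its index. */
	for (i = 0; i < n; i++)
		off = off * dims[i] + idx[i];
	return off;
}
```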
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at mimsy.umd.edu	Path:	uunet!mimsy!chris



From: chris at mimsy.UUCP (Chris Torek)
Subject: Re: char ***pointer;
Keywords: allocating space
Date: 18 Nov 88 07:40:26 GMT

	char *p;

declares an object p which has type `pointer to char' and no specific
value.  (If p is static or external, it is initialised to (char *)NULL;
if it is automatic, it is full of garbage.)  Similarly,

	char **p;

declares an object p which has type `pointer to pointer to char' and
no specific value.  We can keep this up for days :-) and write

	char *******p;

which declares an object p which has type `pointer to pointer ... to char'
and no specific value.  But we will stop with

	char ***pppc;

which declares `pppc' as type `pointer to pointer to pointer to char',
and leaves its value unspecified.  None of these pointers point *to*
anything, but if I say, e.g.,

	char c = '!';
	char *pc = &c;
	char **ppc = &pc;
	char ***pppc = &ppc;

then I have each pointer pointing to something.  pppc points to ppc;
ppc points to pc; pc points to c; and hence, ***pppc is the character
'!'.

Now, there is a peculiar status for pointers in C: they point not only
to the object immediately at *ptr, but also to any other objects in
an array named by *(ptr+offset).  (The latter can also be written as
ptr[offset].)  So I could say:

	int i, j, k;
	char c[NPPC][NPC][NC];
	char *pc[NPPC][NPC];
	char **ppc[NPPC];
	char ***pppc;

	pppc = ppc;
	for (i = 0; i < NPPC; i++) {
		ppc[i] = pc[i];
		for (j = 0; j < NPC; j++) {
			pc[i][j] = c[i][j];
			for (k = 0; k < NC; k++)
				c[i][j][k] = '!';
		}
	}

What this means is perhaps not immediately clear%.  There is a two-
dimensional array of pointers to characters pc[i][j], each of which
points to a number of characters, namely those in c[i][j][0] through
c[i][j][NC-1].  A one-dimensional array ppc[i] contains pointers to
pointers to characters; each ppc[i] points to a number of pointers to
characters, namely those in pc[i][0] through pc[i][NPC-1].  Finally,
pppc points to a number of pointers to pointers to characters, namely
those in ppc[0] through ppc[NPPC-1].
-----
% :-)
-----

The important thing to note is that each variable points to one or
more objects whose type is the type derived from removing one `*'
from the declaration of that variable.  (Clear? :-)  Maybe we should
try it this way:)  Since pppc is `char ***pppc', what pppc points to
(*pppc) is of type `char **'---one fewer `*'s.  pppc points to zero
or more objects of this type; here, it points to the first of NPPC
objects.
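
The static arrangement above can be tried out as follows.  This is only
a sketch: the sizes chosen for NPPC, NPC and NC and the function name
build_table are assumptions, not part of the original example.

```c
#define NPPC	2	/* assumed sizes, for illustration only */
#define NPC	3
#define NC	4

static char c[NPPC][NPC][NC];
static char *pc[NPPC][NPC];
static char **ppc[NPPC];

/* Wire the three levels together and return the top-level pointer. */
char ***build_table(void) {
	int i, j, k;

	for (i = 0; i < NPPC; i++) {
		ppc[i] = pc[i];			/* char ** at a row of char * */
		for (j = 0; j < NPC; j++) {
			pc[i][j] = c[i][j];	/* char * at a row of char */
			for (k = 0; k < NC; k++)
				c[i][j][k] = '!';
		}
	}
	return ppc;	/* converts to &ppc[0], of type char *** */
}
```

With `char ***pppc = build_table();', the expression pppc[i][j][k]
names the same character as c[i][j][k], and each subscript removes one
`*' from the type.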

As to malloc: malloc obtains a blob of memory of unspecified shape.
The cast you put in front of malloc determines the shape of the blob.
The argument to malloc determines its size.  These should agree, or you
will get into trouble later.  So the first thing we need to do is
this:

	pointer = (char ***)malloc(N * sizeof(char **));
	if (pointer == NULL) quit("out of memory... goodbye");

Pointer will then point to N objects, each of which is a `char **'.
None of those `char **'s will have any particular value (i.e., they
do not point anywhere at all; they are garbage).  If we make them
point somewhere---to some object(s) of type `char *'---and make
those objects point somewhere, then we will have something useful.
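
One way to keep the shape and the size in agreement is to take the size
from the pointer itself.  The sketch below does that, and also sets each
`char **' to NULL so that none is left as garbage; alloc_outer is a
hypothetical name, and the implicit conversion from malloc assumes a
compiler in which malloc returns `void *'.

```c
#include <stdlib.h>

/*
 * Allocate n objects of type `char **' and return a pointer at the
 * first.  Returns NULL if malloc fails.
 */
char ***alloc_outer(size_t n) {
	/* sizeof *pointer is sizeof(char **), whatever pointer's type. */
	char ***pointer = malloc(n * sizeof *pointer);
	size_t i;

	if (pointer == NULL)
		return NULL;
	for (i = 0; i < n; i++)
		pointer[i] = NULL;	/* a known value instead of garbage */
	return pointer;
}
```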

Suppose we have done the one malloc above.  Then if we use:

	pointer[0] = (char **)malloc(N1 * sizeof(char *));
	if (pointer[0] == NULL) quit("out of memory");

we will have a value to which pointer[0] points, which can point to
N1 objects, each of type `char *'.  So we can then say, e.g.,

	i = 0;
	while (i < N1 && fgets(buf, sizeof(buf), input) != NULL)
		pointer[0][i++] = strdup(buf);

(strdup is a function that calls malloc to allocate space for a copy
of its string argument, and then copies the string to that space and
returns the new pointer.  If malloc fails, strdup() returns NULL.)
We could write instead

	i = 0;
	while (i < N1 && fgets(buf, sizeof(buf), input) != NULL) {
		*(*pointer)++ = strdup(buf);
		i++;	/* count the strings so we can back up later */
	}

Note that

		**pointer++ = strdup(buf);

sets **pointer (equivalently, pointer[0][0]), then increments the
value in `pointer', not that in pointer[0].  But using *(*pointer)++
means that we will later have to write

	pointer[0] -= i;

to adjust pointer[0] backwards by the number of strings read in and
strdup()ed, or else use negative subscripts to locate the strings.
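
A sketch of the increment-then-rewind approach, under some assumptions:
the strings come from a NULL-terminated array rather than from fgets,
copy_string is a hypothetical stand-in for strdup, and a failed copy is
simply stored as NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup; returns NULL if malloc fails. */
static char *copy_string(const char *s) {
	char *t = malloc(strlen(s) + 1);

	if (t != NULL)
		strcpy(t, s);
	return t;
}

/*
 * Store copies of up to n1 strings from src through *(*pointer)++,
 * then back pointer[0] up to where it started.  Returns the count.
 */
size_t fill_and_rewind(char ***pointer, const char *const *src, size_t n1) {
	size_t i = 0;

	while (i < n1 && src[i] != NULL) {
		*(*pointer)++ = copy_string(src[i]);	/* advances pointer[0] */
		i++;
	}
	pointer[0] -= i;	/* undo the i increments */
	return i;
}
```

After the call, pointer[0][0] through pointer[0][i-1] name the copies
with ordinary non-negative subscripts.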

Probably all of this will be somewhat clearer with a more realistic
example.  The following code creates an array of arrays of lines.

/* begin code (untested) */
/* this assumes prototypes are available */

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>	/* exit, malloc, realloc */
#include <string.h>

static char nomem[] = "out of memory, exiting";

void quit(char *msg) {
	(void) fprintf(stderr, "%s\n", msg);
	exit(1);
	/* NOTREACHED */
}

/*
 * Read an input string from a file.
 * Return a pointer to dynamically allocated space.
 */
char *readstr(FILE *f) {
	register char *s = NULL, *p;
	int more = 1, curlen = 0, l;
	char inbuf[BUFSIZ];

	/*
	 * The following loop is not terribly efficient if you have
	 * many long input lines.
	 */
	while (fgets(inbuf, sizeof(inbuf), f) != NULL) {
		p = strchr(inbuf, '\n');
		if (p != NULL) {	/* got it all */
			*p = 0;
			l = p - inbuf;
			more = 0;	/* signal stop */
		} else
			l = strlen(inbuf);

		/*
		 * N.B. dpANS says realloc((void *)NULL, n) => malloc(n);
		 * if your realloc does not work that way, you will
		 * have to fix this.
		 */
		s = realloc(s, curlen + l + 1);
		if (s == NULL)
			quit(nomem);
		strcpy(s + curlen, inbuf);
		if (more == 0)		/* done; stop */
			break;
		curlen += l;
	}
	/* should check for input error, actually */
	return (s);
}

/*
 * Read an array of strings into a vector.
 * Return a pointer to dynamically allocated space.
 * There are n+1 vectors, the last one being NULL.
 */
char **readfile(FILE *f) {
	register char **vec, *s;
	register int veclen;

	/*
	 * This is terribly inefficient, but it should be correct.
	 *
	 * malloc below is implicitly cast to (char **), but this
	 * depends on it returning (void *); old compilers need the
	 * cast, since malloc() returns (char *).  The same applies
	 * to realloc() below.
	 */
	vec = malloc(sizeof(char *));
	if (vec == NULL)
		quit(nomem);
	veclen = 0;
	while ((s = readstr(f)) != NULL) {
		vec = realloc(vec, (veclen + 2) * sizeof(char *));
		if (vec == NULL)
			quit(nomem);
		vec[veclen++] = s;
	}
	vec[veclen] = NULL;
	return (vec);
}

/*
 * Read a list of files specified in an argv.
 * Each file's list of lines is stored as a vector at p[i].
 * The end of the list of files is indicated by p[i] being NULL.
 *
 * It would probably be more useful, if less appropriate
 * for this example, to return a list of (filename, contents) pairs.
 */
char ***readlots(register char **names) {
	register char ***p;
	register int nread;
	register FILE *f;
	char **vp;
	extern int errno;

	p = malloc(sizeof(char **));
	if (p == NULL)
		quit(nomem);
	for (nread = 0; *names != NULL; names++) {
		if ((f = fopen(*names, "r")) == NULL) {
			(void) fprintf(stderr, "ThisProg: cannot read %s: %s\n",
				*names, strerror(errno));
			continue;
		}
		vp = readfile(f);
		(void) fclose(f);
		p = realloc(p, (nread + 2) * sizeof(char **));
		if (p == NULL)
			quit(nomem);
		p[nread++] = vp;
	}
	p[nread] = NULL;
	return (p);
}

/* e.g., instead:
struct file_data {
	char	*fd_name;
	char	**fd_text;
};
struct file_data *readlots(register char **names) {
	register struct file_data *p;
	register int nread;
	register FILE *f;
	char **vp;
	extern int errno;

	p = malloc(sizeof(*p));
	if (p == NULL)
		quit(nomem);
	for (nread = 0; *names != NULL; names++) {
		<...same file-reading code as above...>
		p = realloc(p, (nread + 2) * sizeof(*p));
		if (p == NULL)
			quit(nomem);
		p[nread].fd_name = *names;
		p[nread].fd_text = vp;
		nread++;
	}
	p[nread].fd_name = NULL;
	p[nread].fd_text = NULL;
	return (p);
}
*/
/* end of code */
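
A caller of the `char ***' version of readlots would eventually want to
walk the result and give the memory back.  The following is only a
sketch of how that might look; count_lines and free_lots are
hypothetical names, and both assume the NULL-terminated layout that
readlots and readfile build above.

```c
#include <stdlib.h>

/* Count every line in a NULL-terminated list of NULL-terminated vectors. */
size_t count_lines(char ***p) {
	size_t n = 0;
	char ***fp;
	char **lp;

	for (fp = p; *fp != NULL; fp++)		/* each file's vector */
		for (lp = *fp; *lp != NULL; lp++)	/* each line */
			n++;
	return n;
}

/* Free each line, then each vector, then the top-level list. */
void free_lots(char ***p) {
	char ***fp;
	char **lp;

	for (fp = p; *fp != NULL; fp++) {
		for (lp = *fp; *lp != NULL; lp++)
			free(*lp);
		free(*fp);
	}
	free(p);
}
```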
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at mimsy.umd.edu	Path:	uunet!mimsy!chris