Computer Science 010

Lecture Notes 5

Strings & Introduction to Dynamic Memory Management

Arrays of Strings

Suppose you wanted to have an array of strings. You could declare one as:

char *strArray[] = {"cat", "dog", "mouse"};

Here's some code to show how we might manipulate arrays of strings:

  /* Some simple strings */
  char *str1 = "abc";
  char *str2 = "aeiou";

  /* An array of 10 strings */
  char *strArray [10];

  /* A pointer to the first element of strArray, so it can be used
     like the array itself */
  char **strings = strArray;

  /* Put strings in the first array.  This copies the address of the
     strings into our array. */
  strArray[0] = str1;
  strArray[1] = str2;

  /* Demonstrate that we can index through the strings pointer using a
     pair of subscripts */
  printf ("%c\n", strings[0][1]);
  printf ("%c\n", strings[1][1]);

  /* Demonstrate that we can walk the array of strings with pointer 
     manipulation */
  printf ("%s\n", *strings);
  strings++;
  printf ("%s\n", *strings);

So, if "char *" is equivalent to a char array, it would seem that we should be able to do the following:

  /* Some simple strings */
  char *str1 = "abc";
  char *str2 = "aeiou";

  /* An array of 10 strings, each of length 20 */
  char strArray [10][20];

  /* An array of strings, equate to first string array */
  char **strings = strArray;

  /* Put strings in the first array */
  strArray[0] = str1;
  strArray[1] = str2;

  /* Demonstrate that we can access the second array using a pair of subscripts */
  printf ("%c\n", strings[0][1]);
  printf ("%c\n", strings[1][1]);

  /* Demonstrate that we can walk the array of strings with pointer manipulation */
  printf ("%s\n", *strings);
  strings++;
  printf ("%s\n", *strings);

This code looks tantalizingly similar to what we had above, but it does not compile. The compiler rejects both the initialization of strings and the assignments to the elements of strArray (some compilers report the initialization only as an incompatible pointer type warning, but the assignments are always errors). Here's the difference. In the first case, when we declared strArray, we declared it as an array of 10 elements, each of which was a pointer. As a result, memory was allocated to hold 10 pointers. The type of strArray[0] is char *, thus I can assign a string to strArray[0].

In the second version, we have declared strArray to be an array with 10 entries. Each entry in the array is an array of 20 characters. This declaration has allocated 10 * 20 = 200 bytes. The type of strArray[0] is an array of 20 characters, not a pointer. Arrays cannot be assigned to, thus the assignments to strArray[0] and strArray[1] are illegal.
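The difference in what the two declarations allocate can be checked with sizeof. The following is a minimal sketch, not part of the original notes; the function names are hypothetical and exist only so the three sizes can be inspected separately.

```c
#include <stddef.h>

/* Size in bytes of an array of 10 pointers to char (the first version).
   The result depends on the pointer size of the machine. */
size_t pointer_table_size(void)
{
    char *strArray[10];
    return sizeof strArray;       /* 10 * sizeof (char *) */
}

/* Size in bytes of 10 rows of 20 characters each (the second version). */
size_t char_table_size(void)
{
    char strArray[10][20];
    return sizeof strArray;       /* 10 * 20 bytes */
}

/* Size in bytes of one element of the second array: a whole row. */
size_t char_row_size(void)
{
    char strArray[10][20];
    return sizeof strArray[0];    /* 20 bytes, not the size of a pointer */
}
```

Calling char_table_size() yields 200 and char_row_size() yields 20 on any conforming compiler, while pointer_table_size() yields ten times the size of a char * on the machine in use.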

Let's consider another variant.

  /* Some simple strings */
  char *str1 = "abc";
  char *str2 = "aeiou";

  /* An array of 10 strings, each of length 20 */
  char strArray [10][20];

  /* A pointer to an array of 20 characters, set to the first row of
     strArray */
  char (*strings)[20] = strArray;

  /* Put strings in the first array */
  strcpy (strArray[0], str1);
  strcpy (strArray[1], str2);

  /* Demonstrate that we can access the second array using a pair of subscripts */
  printf ("%c\n", strings[0][1]);
  printf ("%c\n", strings[1][1]);

  /* Demonstrate that we can walk the array of strings with pointer manipulation */
  printf ("%s\n", *strings);
  strings++;
  printf ("%s\n", *strings);

Look at the declaration of strings here. That very peculiar syntax indicates that it is a pointer to an array containing 20 characters. Now, I can assign strArray to it, because in this context strArray is converted to a pointer to its first element, which is an array of 20 characters! Also, note the way to fix the assignment of a string to an element of this array. I can't assign to the array as a whole, but I can modify the contents of the array. The strcpy function calls do that, copying str1 and str2 into the first two elements of the array. (Using strcpy requires including string.h, and each string, including its terminating null character, must fit in the 20 characters available.) Once I've gotten everything initialized, I can use the strings variable as before to index into a 2D array and to use pointer arithmetic to walk through the array of strings.

When working with multidimensional arrays, only the first array index can be replaced with a pointer. The sizes of all the other indexes must be statically known so that the compiler knows how large each element of the array is, and therefore how far to move when we index or increment the pointer.
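As an illustration of this rule, here is a small sketch of functions that receive a table of strings through a pointer to its first row. The names ROW_LEN, total_length and row_stride are hypothetical and do not come from the notes; the row length of 20 matches the examples above.

```c
#include <stddef.h>
#include <string.h>

#define ROW_LEN 20   /* hypothetical name for the fixed row length */

/* Adds up the lengths of the first `count` strings in the table.
   The first dimension is replaced by a pointer, but the row length
   must still appear in the parameter type. */
size_t total_length(char (*rows)[ROW_LEN], size_t count)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        total += strlen(*rows);   /* *rows is one row of ROW_LEN chars */
        rows++;                   /* moves ahead by one whole row */
    }
    return total;
}

/* Distance in bytes between one row and the next. */
size_t row_stride(char (*rows)[ROW_LEN])
{
    return (size_t)((char *)(rows + 1) - (char *)rows);
}
```

If strArray is declared as char strArray[10][20] and holds "abc" and "aeiou" in its first two rows, then total_length (strArray, 2) adds 3 and 5, and row_stride (strArray) reports that each increment of the pointer moves 20 bytes.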

Memory Management

Memory management involves the allocation of memory to hold data and the deallocation of memory when it is no longer used. Java programmers are familiar with memory allocation through the use of constructors and the new keyword. This allocates memory to hold a new object. In Java, memory deallocation is done automatically by the runtime system using a mechanism called garbage collection. The runtime system keeps track of where an object is referenced and when there are no more references to an object, it frees the memory associated with the object.

C also provides a mechanism for memory allocation. It does not provide an automatic mechanism for memory deallocation but instead requires the programmer to deallocate memory explicitly. As a result, the programmer must be very aware of what their pointers are pointing to so that they can deallocate memory appropriately. Incorrect memory deallocation is the source of many C programming errors, and it is a type of error that can be extremely difficult to track down.

Memory Allocation

Recall that types in C can either be pointer types or non-pointer types. Non-pointer types can be simple values like ints or complex values like structs. Pointer types contain * in the type specification. Memory is automatically allocated when non-pointer types are used. You can immediately assign to variables with non-pointer types:

int i;
typedef struct {
    int month;
    int day;
    int year;
} date;
date today;
   
i = 0;
today.month = 1;
today.day = 14;
today.year = 2003;

Variables declared with a pointer type have enough memory to hold a pointer, but not enough memory to hold the thing pointed to. Instead a pointer variable can take on a value in one of 3 ways:

We have already seen the first two of these so we'll focus on the third here. First, we declare a variable with a pointer type. The next line allocates memory to hold a value. malloc stands for memory allocation. We tell malloc how much memory to allocate and it returns a pointer to the newly allocated memory. The amount of memory we need is enough to hold an instance of the type we want to store at that memory. To find out how big an instance is, we use the sizeof operator, giving it the name of the type we plan to store there. sizeof yields the number of bytes needed, and we pass this value to malloc.

malloc can be used in lots of contexts, so it doesn't know what type of pointer we need. As a result, the return type of malloc is void *. This simply means it returns a pointer without being specific as to what type of thing is being pointed to. To make the intended pointer type explicit, we precede the malloc call with the pointer type in parentheses. (C converts void * to other object pointer types automatically on assignment, so this cast is optional in C, although it is required in C++.) Also, note that we need to include stdlib.h, which is where the prototype for malloc is declared.
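The steps just described might look like the following sketch. It is not the original listing from these notes: it assumes the date type defined earlier, and new_date is a hypothetical helper name chosen for illustration. It also checks for a NULL result, which malloc returns when the allocation fails.

```c
#include <stdlib.h>

typedef struct {
    int month;
    int day;
    int year;
} date;

/* new_date is a hypothetical helper: it allocates one date and fills
   it in, returning NULL if no memory could be allocated. */
date *new_date(int month, int day, int year)
{
    date *d;                              /* room for a pointer only */

    d = (date *)malloc(sizeof (date));    /* room for one date */
    if (d == NULL) {
        return NULL;                      /* allocation failed */
    }

    d->month = month;                     /* d->month means (*d).month */
    d->day = day;
    d->year = year;
    return d;
}
```

A caller could write date *today = new_date (1, 14, 2003); and would then be responsible for eventually releasing the memory with the standard library function free.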

Pointers and Arrays Revisited

Recall that last time we said that pointers and arrays are very similar. When we declare an array variable, C allocates enough memory to hold the array elements themselves. No separate pointer is stored; instead, in most expressions the array's name is converted to a pointer to its first element. C can allocate the elements because we must give the size of the array when we declare it:

char name[20];
char school[] = "Williams";

In the first case, we tell it how big to make the array with a constant. In the second case we tell it what value we want to give the array and it allocates just enough memory to hold that value and the terminating null character.
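To see how much memory each declaration reserves, sizeof can be applied to the arrays. This is a small sketch with hypothetical function names, added only to illustrate the two cases.

```c
#include <stddef.h>

/* Bytes reserved for an array whose size is given by a constant. */
size_t name_size(void)
{
    char name[20];
    return sizeof name;      /* 20 */
}

/* Bytes reserved for an array sized from its initializer:
   the 8 letters of "Williams" plus the terminating null character. */
size_t school_size(void)
{
    char school[] = "Williams";
    return sizeof school;    /* 9 */
}
```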

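Here is a similar way of doing the above using pointers. It is not exactly equivalent: the memory obtained from malloc stays allocated until the programmer deallocates it, and the characters of the string literal that school points to must not be modified: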

char *name = (char *)malloc (20);
char *school = "Williams";

If we don't know how big we want the array initially, we could do the following:

char *school;
...
school = (char *)malloc (strlen ("Williams") + 1);
strcpy (school, "Williams");

First we declare a string pointer but do not allocate memory to hold the characters in the string. We use malloc later to allocate the memory. We use strlen to determine how long the string is and then add 1 for the terminating null character. Finally, we copy the string value in using strcpy. The null character is automatically copied for us.
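The same steps can be packaged into a function. The sketch below uses a hypothetical helper name, copy_string, and adds a check for malloc returning NULL, which the fragment above omits.

```c
#include <stdlib.h>
#include <string.h>

/* copy_string is a hypothetical helper: it returns a newly allocated
   copy of source, or NULL if memory could not be allocated. */
char *copy_string(const char *source)
{
    /* strlen does not count the terminating null character, so add 1 */
    char *copy = (char *)malloc(strlen(source) + 1);

    if (copy != NULL) {
        strcpy(copy, source);   /* copies the null character as well */
    }
    return copy;
}
```

A caller might write school = copy_string ("Williams"); and, once the string is no longer needed, call free (school); to deallocate it. free is the standard library counterpart to malloc, also declared in stdlib.h.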

