C Programming/Arrays

From Wikiversity

Objective

  • Learn about arrays and how to use them.
    • Declaring arrays.
    • Initializing arrays.
    • Passing arrays to functions.

Lesson

Introduction

An array is a series of contiguously allocated variables of the same type. Let us start with an analogy. Suppose that you have a cupboard in your house and would like to keep books on its shelves. It would be convenient to keep books on one subject on one shelf and books on another subject on another shelf. A shelf is like an array: it holds books of the same subject, just as an array stores data of the same type. When you declare an array, you must specify the type of data it will hold, and every element of the array must be of that type.

The syntax for declaring an array is:

data_type array_name[size];

For example:

int x[5];

is an array of integers, with size 5, called x. Similarly,

float y[6];

is an array of floating point numbers, with size 6, called y.

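To access the elements of an array, you must use array notation. The first element of an array is at position 0, the second at position 1, etc. The last element of an array is at position size - 1. C does not check that a position lies within this range, and accessing an element outside it is undefined behaviour. For example, the following stores 10 in the second element of x: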

x[1] = 10;

As you can see, once you have selected an element using array notation, you can manipulate that element as you would any variable. Here is a more complex example:

int z[2];
z[0] = 10;
z[1] = 50;
z[0] = z[0] + z[1]; /* z[0] now contains the integer 60 */

Array Initialization I

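When an array is declared without an initializer, its elements do not hold meaningful values. For an array declared inside a function, the contents are indeterminate until you assign to them (arrays with static storage duration, such as those declared outside any function, start out filled with zeros). We can initialise an array as follows: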

int x[5] = {5, 7, 2, 3, 8};

{5, 7, 2, 3, 8} is called an array-initialization block (the C standard calls it an initializer list). When we use one, we do not need to specify the size of the array, because the compiler counts the values. The following is equivalent:

int x[] = {5, 7, 2, 3, 8};

We have created an array with 5 elements. The first element is 5, the second is 7, etc.

Array Initialization II

Let us look at a small code snippet that prints the number of days per month.

#include <stdio.h>
#define MONTHS 12

int main(void)
{
     int days[MONTHS] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
     int i;

     for(i = 0; i < MONTHS; i++)
     {
          printf("Month %d has %d days.\n", i + 1, days[i]);
     }

     return 0;
}

OUTPUT

Month 1 has 31 days.
Month 2 has 28 days.
Month 3 has 31 days.
Month 4 has 30 days.
Month 5 has 31 days.
Month 6 has 30 days.
Month 7 has 31 days.
Month 8 has 31 days.
Month 9 has 30 days.
Month 10 has 31 days.
Month 11 has 30 days.
Month 12 has 31 days.

(You may not have seen #define before. It is a preprocessor directive that defines a macro called MONTHS: before the program is compiled, every occurrence of MONTHS in the code is replaced with 12.)

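If you would rather not count the elements yourself, you can let the compiler work out the number of elements with sizeof: divide the size of the whole array by the size of one element. Just replace

for(i = 0; i < MONTHS; i++)

with

for (i = 0; i < sizeof days / sizeof days[0]; i++)

Writing sizeof days[0] instead of sizeof (int) keeps the expression correct even if the element type of days is changed later.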

Variable-length array

A variable-length array (VLA), also called a variable-sized or runtime-sized array, is an array whose length is determined at run time instead of at compile time. VLAs were introduced as a mandatory feature in C99 and became optional in C11. GCC and Clang support them; Microsoft's MSVC compiler does not.

#include <stdio.h>

void array(int n)
{
    int value[n];   /* the length is taken from the argument at run time */
    for (int i = 0; i < n; ++i) value[i] = i;
    for (int i = 0; i < n; ++i) printf("%d ", value[i]);
    printf("\n");
}

int main(void)
{
    array(10);
    return 0;
}

Assigning Array Values

We can assign values to array elements by using an array index.

int x[5];
int i;

for(i = 0; i < 5; i++)
{  
     x[i] = i + 1;
}

Thus the array x will contain the elements 1, 2, 3, 4, and 5.

Passing Arrays to a Function

Arrays cannot be passed to functions by value. If an array name is used as an argument in a function call, the address of the first element of the array is passed instead. The function can access the array through that address.

Suppose we want to write a function that returns the sum of elements of the array.

#include <stdio.h>

/* prototype */
int sum(int a[]);

int main(void) 
{
     int marbles[] = {5, 6, 2, 3, 7};
     int ans;

     ans = sum(marbles);
     printf("The total number of marbles: %d\n", ans);

     return 0;
}

int sum(int arr[])
{
     int i, s = 0;
     for(i = 0; i < 5; i++)
     {
          s = s + arr[i];
     }

     return s;
}

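Note that the array name marbles is used as an argument to the function. The parameter is written as int arr[], which suggests an array, but in a parameter list this is just another way of writing int *arr: what is actually passed is the address of marbles[0]. The function can then access the elements using array notation. Because only an address is passed, the function cannot find out how many elements the array has, which is why the loop limit 5 is written into sum itself.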

How are arrays stored in memory?

When we declare an array, space is reserved in the memory of the computer for the array. The elements of the array are stored in these memory locations. The important thing about arrays is that array elements are always stored in consecutive memory locations. We can verify this fact by printing the memory addresses of the elements. (Just like every person has a street address, every location in the memory has a memory address, usually a number, by which it can be uniquely identified.)

#include<stdio.h>

int main(void)
{
     int a[] = {1,2,3,4,5,6,7,8,9,10};
     int i;

     /* Print the Addresses */
     for (i = 0; i < 10; i++)
     {
          printf("\nAddress of a[%d] : %p", i, (void *) &a[i]);
     }

     return 0;
}

OUTPUT

Address of a[0] : ffe2
Address of a[1] : ffe4
Address of a[2] : ffe6
Address of a[3] : ffe8
Address of a[4] : ffea
Address of a[5] : ffec
Address of a[6] : ffee
Address of a[7] : fff0
Address of a[8] : fff2
Address of a[9] : fff4

As we can see from the output, the elements are stored at ffe2, ffe4, ffe6, etc. You might be wondering why the numbers are not consecutive. The reason is simple: each int occupies more than one byte. An int in C must be at least 16 bits wide, which is 2 bytes on typical machines; the exact size depends on the implementation. In this example it is 2 bytes wide, so a[0] is stored at ffe2 and ffe3, a[1] at ffe4 and ffe5, a[2] at ffe6 and ffe7, and so on. On most current desktop systems an int is 4 bytes wide, so the addresses you see will typically differ by 4.

If instead we declared an array of floats (which usually take 4 bytes each), we would find a[0] stored at ffe2, ffe3, ffe4, ffe5, a[1] stored at ffe6, ffe7, ffe8, ffe9, and so on. Note that you may see entirely different numbers for the addresses.

As you can see, the elements are stored consecutively.

Arrays are laid out so that the name of the array refers to the address of its lowest element in memory, and higher elements are reached by adding to that address, as may be intuitive.

A potentially confusing detail is that on many systems, including x86, the stack grows downwards. The stack pointer (sp, esp or rsp, for 16-, 32- and 64-bit code respectively) points to the top of the stack, which initially lies near the high end of the memory allocated to the program's stack, unless you use special settings (which you shouldn't) or are on an unusual system. Variables are allocated going downwards: when you push (add) a value or variable onto the stack, the stack pointer decreases, and when you pop (remove) a value or variable, the stack pointer increases again to point at the memory above the value you popped.

When an array is allocated on the stack, the stack pointer is decreased by the size of the array, and the compiler refers to the array by its offset from the stack frame. So although the stack grows backwards, the array is allocated all at once and its elements are still accessed upwards.

The usual explanation for this direction of growth is dynamic memory: the heap traditionally grows upwards, so letting the two regions grow towards each other means that a program using much more of one than the other can still make use of the free memory between them. A fixed split would penalise programs that rely heavily on either the stack or the heap.

Note: In terms of actual, physical addresses, things get more complicated due to paging, segmentation and, on x86, a structure called the Global Descriptor Table (GDT). However, these details matter mainly when writing an operating system or device drivers.

Assignments
