Monday, October 6, 2008

Combination of Arrays and pointers

Array elements are just like other variables: they have addresses.
int ar[20], *ip;

ip = &ar[5];
*ip = 0;        /* equivalent to ar[5] = 0; */
The address of ar[5] is put into ip, then the place pointed to has zero assigned to it. By itself, this isn't particularly exciting. What isinteresting is the way that pointer arithmetic works. Although it's simple, it's one of the cornerstones of C.
Adding an integral value to a pointer results in another pointer of the same type. Adding n gives a pointer which points n elements further along an array than the original pointer did. (Since n can be negative, subtraction is obviously possible too.) In the example above, a statement of the form
*(ip+1) = 0;
would set ar[6] to zero, and so on. Again, this is not obviously any improvement on ‘ordinary’ ways of accessing an array, but the following is.
int ar[20], *ip;

for(ip = &ar[0]; ip < &ar[20]; ip++)      *ip = 0; That example is a classic fragment of C. A pointer is set to point to the start of an array, then, while it still points inside the array, array elements are accessed one by one, the pointer incrementing between each one. The Standard endorses existing practice by guaranteeing that it's permissible to use the address of ar[20] even though no such element exists. This allows you to use it for checks in loops like the one above. The guarantee only extends to one element beyond the end of an array and no further. Why is the example better than indexing? Well, most arrays are accessed sequentially. Very few programming examples actually make use of the ‘random access’ feature of arrays. If you do just want sequential access, using a pointer can give a worthwhile improvement in speed. In terms of the underlying address arithmetic, on most architectures it takes one multiplication and one addition to access a one-dimensional array through a subscript. Pointers require no arithmetic at all—they nearly always hold the store address of the object that they refer to. In the example above, the only arithmetic that has to be done is in the for loop, where one comparison and one addition are done each time round the loop. The equivalent, using indexes, would be this: int ar[20], i;  for(i = 0; i <>
The same amount of arithmetic occurs in the loop statement, but an extra address calculation has to be performed for every array access.
Efficiency is not normally an important issue, but here it can be. Loops often get traversed a substantial number of times, and every microsecond saved in a big loop can matter. It isn't always easy for even a smart compiler to recognize that this is the sort of code that could be ‘pointerized’ behind the scenes, and to convert from indexing (what the programmer wrote) to actually use a pointer in the generated code.
If you have found things easy so far, read on. If not, it's a good idea to skiped. What follows, while interesting, isn't essential. It has been known to frighten even experienced C programmers.
To be honest, C doesn't really ‘understand’ array indexing, except in declarations. As far as the compiler is concerned, an expression like x[n] is translated into *(x+n) and use made of the fact that an array name is converted into a pointer to the array's first element whenever the name occurs in an expression. That's why, amongst other things, array elements count from zero: if x is an array name, then in an expression, x is equivalent to &x[0], i.e. a pointer to the first element of the array. So, since *(&x[0]) uses the pointer to get to x[0], *(&x[0] + 5) is the same as *(x + 5) which is the same as x[5]. A curiosity springs out of all this. If x[5] is translated into *(x + 5), and the expression x + 5 gives the same result as 5 + x (it does), then 5[x] should give the identical result to x[5]! If you don't believe that, here is a program that compiles and runs successfully:
#include
#include
#define ARSZ 20
main(){
   int ar[ARSZ], i;
   for(i = 0; i < now =" %d\n">

No comments: