linuxcandyclone: Week 4 – Introduction to generic functions in C

Previously we saw an implementation of the swap function, which was specific to the int data
type. Today we make a few points on that and introduce ourselves to generic functions in C. Generic
functions, which we'll be writing shortly, are analogous to the templates that we see in C++. We'll also
chalk out the major differences, and the pros and cons, of generic functions.

We are quite familiar with the following code:

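void swap(int* aptr, int* bptr)
{
    int tmp;
    tmp = *aptr;      /* save the value aptr points to */
    *aptr = *bptr;    /* copy b's value into a */
    *bptr = tmp;      /* copy the saved value into b */
}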

Needless to say, the function swap() swaps the contents of the two variables passed. This is done by
passing the addresses of the two variables and exchanging their values with the usual swapping
algorithm, using the temporary variable 'tmp'.

So, to attempt a generic swap(), we need to understand this: swap() must work for any
type of data we pass. For this purpose, we use void pointers.
Void pointers (void*) are pointers that can hold the address of an object of any data type.

Thus,

int a;
float b;
double c;
void *vptr = &a;
vptr = &b;
vptr = &c;

... are all perfectly legal statements.
The versatility of void* makes it useful in writing generic functions.
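
As a minimal sketch of that versatility: a void* happily stores the address of an int, a float or a double, but it carries no type information, so the code that reads the value back has to be told which type is really there. The helper read_as() and its one-letter tag are hypothetical names used only for this illustration.

```c
/* Hypothetical helper: tag says which type vptr really points to
   ('i' = int, 'f' = float, 'd' = double). */
double read_as(const void *vptr, char tag)
{
    switch (tag) {
    case 'i': return *(const int *)vptr;     /* cast back to int* before reading */
    case 'f': return *(const float *)vptr;   /* cast back to float* */
    case 'd': return *(const double *)vptr;  /* cast back to double* */
    default:  return 0.0;                    /* unknown tag */
    }
}
```

Calling read_as(&a, 'i'), read_as(&b, 'f') and read_as(&c, 'd') all go through the same void* parameter; only the cast inside differs.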
So coming back to writing the generic swap() function...

Attempting to write the generic swap(), a natural tendency would be...

void gswap(void *aptr, void *bptr)
{
    void tmp;
    tmp = *aptr;
    *aptr = *bptr;
    *bptr = tmp;
}

However, this code simply wouldn't compile, for two reasons: one obvious and one subtle.

The obvious reason is that no variable can be declared as void.
The other, more subtle, reason is that void pointers cannot be dereferenced!

Let's go one by one.

Our first problem, storing data of an unknown type: To solve this, we ask the caller for the size of the data type.
Knowing the size, we can ask for that many bytes of memory from the heap. So the question is: which data type do we use to store the unknown data? The best answer is char. This is because a char takes exactly one byte in memory, so pointer arithmetic on a char* is simpler than with any other data type. So how do we store the unknown variable using char, which can hold only one byte of data? We don't use just one char variable but an array of char. The number of elements in the array depends on the size of the unknown data.

Our second problem, dereferencing the void pointer: Well, in the first place, we won't be doing the usual assignment using '='. This is because the tmp variable, as we discussed, would be an array, and we simply cannot assign to an array. Hence, we use the memcpy() function from string.h.

memcpy() prototype: void *memcpy(void *destination, const void *source, size_t num);

Sticking to our topic, it is enough to know that the function takes two void* and a byte count. (size_t is an unsigned integer type, so an int can be passed in its place.) Note that memcpy() doesn't look for any special character that would mark the end of the array (like strcpy() does). It simply copies the number of bytes specified in num from
source to destination.
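
The two ideas above (a char array as untyped storage, and memcpy() as the copy operation) can be sketched together as follows. The function name round_trip() is made up for this example; it copies the bytes of a double into a char buffer and back out again.

```c
#include <string.h>

/* Copy the raw bytes of a double into a char buffer and back again. */
double round_trip(double value)
{
    char buf[sizeof(double)];            /* one char per byte of the double */
    double copy;

    memcpy(buf, &value, sizeof value);   /* object -> byte buffer */
    memcpy(&copy, buf, sizeof copy);     /* byte buffer -> object */
    return copy;
}
```

Note that this works even for 0.0, whose bytes are all zero on common platforms: memcpy() copies exactly the requested number of bytes, whereas a strcpy()-style copy would stop at the first zero byte.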

Hence our generic swap() looks like this (complete code included):

//------------------------------------- Code --------------------------------------------------- //
#include <stdio.h>
#include <stdlib.h>   /* malloc(), free() */
#include <string.h>   /* memcpy() */

void swap(void *a, void *b, int sz);

int main()
{
    int a = 44;
    int b = 5;

    int *aptr = &a;
    int *bptr = &b;

    printf("Before swapping...\na = %d\nb = %d", a, b);
    swap(aptr, bptr, sizeof(int));
    printf("\nAfter swapping...\na = %d\nb = %d\n", a, b);

    return 0;
}

/* Swap sz bytes between the objects that a and b point to. */
void swap(void *a, void *b, int sz)
{
    char *buf = (char*)malloc(sz);   /* temporary buffer of sz bytes */
    if (buf == NULL)
        return;                      /* allocation failed: leave a and b untouched */
    memcpy(buf, a, sz);
    memcpy(a, b, sz);
    memcpy(b, buf, sz);
    free(buf);                       /* release the buffer to avoid a leak */
}

//---------------------------------------------------------------------------------------------------------//
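
To see that nothing in this swap() is tied to int, here is a small sketch that reuses it for a double and for a structure. The swap() body is repeated (with the buffer freed) so the example stands on its own; struct point and demo() are hypothetical names introduced only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Generic swap: exchange sz bytes between *a and *b. */
void swap(void *a, void *b, int sz)
{
    char *buf = (char *)malloc(sz);
    if (buf == NULL)
        return;
    memcpy(buf, a, sz);
    memcpy(a, b, sz);
    memcpy(b, buf, sz);
    free(buf);
}

/* Hypothetical structure, only for illustration. */
struct point { int x; int y; };

/* The same function handles both types; only the size argument changes. */
void demo(double *d1, double *d2, struct point *p1, struct point *p2)
{
    swap(d1, d2, sizeof(double));
    swap(p1, p2, sizeof(struct point));
}
```

The caller is responsible for passing the right size each time; the compiler will not check it.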

The advantage generic C functions have over C++ templates is seen during compilation of the code.
In C++, when templates are compiled, a separate copy of the function is created for each data type that uses the template. Each copy differs from the others only in how it handles the data; the code more or less remains the same.

To put it simply, suppose we used a template to handle an int, a float and a double in our C++ code. The compiler would create one copy of the function for each of int, float and double. The copy handling int would differ only in how it handles the data; the rest of the code doesn't change.

This wouldn't matter much when it's just a matter of a few data types. However, in a huge code base with 30-40 data types (structures, classes, etc.) the issue can get serious, as the size of the executable grows with every instantiation. The generic C function, by contrast, is compiled only once.

The shortcoming of generic functions is that they are prone to run-time errors. The compiler cannot check the types behind a void*, so the code may compile without any errors and still produce the wrong output. Generic functions can lead to buggy code if care is not taken.
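
A minimal sketch of such a silent bug, assuming the generic swap() discussed earlier (repeated here so the example is self-contained): passing the wrong size compiles cleanly but exchanges only part of each value. The function swapped_ok() and the test values are invented for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Generic swap: exchange sz bytes between *a and *b. */
void swap(void *a, void *b, int sz)
{
    char *buf = (char *)malloc(sz);
    if (buf == NULL)
        return;
    memcpy(buf, a, sz);
    memcpy(a, b, sz);
    memcpy(b, buf, sz);
    free(buf);
}

/* Returns 1 if the two values were fully exchanged, 0 otherwise. */
int swapped_ok(int sz)
{
    long long a = 0x1111111122222222LL;
    long long b = 0x3333333344444444LL;

    swap(&a, &b, sz);   /* the compiler cannot tell whether sz is right */
    return a == 0x3333333344444444LL && b == 0x1111111122222222LL;
}
```

With sz equal to sizeof(long long) the swap is complete; with half that size (a typical slip such as writing sizeof(int)) only half of the bytes move, and no warning is issued.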

Now consider the following code:

#include <stdio.h>
#include <stdlib.h>   /* malloc(), free() */
#include <string.h>   /* memcpy(), strdup() */

void swap(void *a, void *b, int sz)
{
    char *buf = (char*)malloc(sz);
    if (buf == NULL)
        return;
    memcpy(buf, a, sz);
    memcpy(a, b, sz);
    memcpy(b, buf, sz);
    free(buf);
}

int main()
{
    char *husband = strdup("Fred");
    char *wife = strdup("Wilma");

    printf("Before swapping...\n");
    printf("Husband : %s\n", husband);
    printf("Wife : %s\n", wife);
    swap(&husband, &wife, sizeof(char*));   /* swap the two pointers */
    printf("After swapping...\n");
    printf("Husband : %s\n", husband);
    printf("Wife : %s\n", wife);
    free(husband);
    free(wife);
    return 0;
}

The question here is about the call to swap(): do we send husband, wife, sizeof(char*) or &husband, &wife, sizeof(char*), given that 'husband' and 'wife' are themselves pointers? The answer lies in swap() itself.

In swap(), we copy the contents of the variable pointed to by a into a temporary buffer 'buf' (the char array) using memcpy(). memcpy(), as we know, simply copies the memory contents from the address, irrespective of the data type. Passing '&husband', '&wife' lets us swap the contents of 'husband' and 'wife', which are the addresses of the strings 'Fred' and 'Wilma' respectively.

It's to be noted that when we swap two strings, we swap their respective pointers and not the individual characters. This is because strings, as we know, are handled through a pointer to the base address of the string residing in memory. Thus swapping the addresses is sufficient (and convenient too!).

So what would happen if we pass husband,wife instead of &husband,&wife?

Well, on a system where sizeof(char*) is 4, the output after swapping would show the husband as 'Wilm' and the wife as 'Freda'.

Explanation:

When we simply pass husband, wife to swap(), 'husband' and 'wife', which contain the base addresses of the strings (the address of the first character in each string), are passed on to memcpy(). memcpy() simply copies four bytes (because sizeof(char*) is 4 on a 32-bit system), thus swapping only four bytes of data between the two strings. Hence four bytes from husband ('F','r','e','d') and four from wife ('W','i','l','m') are swapped, leaving the trailing 'a' of 'Wilma' as it is. (Where sizeof(char*) is 8, the copy would run past the end of both strings, which is undefined behavior.)
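
The four-byte exchange described above can be reproduced safely with a small sketch that takes the byte count explicitly instead of relying on sizeof(char*). The function swap_bytes() and its fixed 16-byte buffer are assumptions made for this illustration, not part of the original program.

```c
#include <string.h>

/* Exchange the first n bytes of two buffers (n may be at most 16 here). */
void swap_bytes(char *a, char *b, size_t n)
{
    char buf[16];

    if (n > sizeof buf)
        return;            /* too large for the temporary buffer: do nothing */
    memcpy(buf, a, n);
    memcpy(a, b, n);
    memcpy(b, buf, n);
}
```

With char husband[] = "Fred" and char wife[] = "Wilma", calling swap_bytes(husband, wife, 4) leaves husband holding "Wilm" and wife holding "Freda": the characters were exchanged, not the strings.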

Consider the following function call:

swap(husband,&wife,sizeof(char*));

What happens here is that a pointer to the first character of the string 'Fred' and a pointer to the base pointer of 'Wilma' (the base pointer, I repeat, is the pointer to the first character in 'Wilma') are passed to the function. Thus 4 bytes of data are swapped between these two memory locations. The four bytes of the husband string (namely 'F','r','e','d') are placed in the pointer variable wife (which is supposed to hold the base address of 'Wilma'), and the address stored in wife is copied over the characters of the husband string. As a result, illegal values get swapped: an address gets copied where a string is supposed to reside, and vice versa.

Since the pointer wife, which is supposed to contain the address where the string 'Wilma' is stored, now holds the 4 bytes 'F', 'r', 'e', 'd', the later printf() treats these 4 bytes as a memory address (one that most likely points to a location that is not accessible). This leads to the segmentation fault.

Finally, we discuss a generic linear search function.

//--------------------------------------- Code -----------------------------------//
#include <stdio.h>
#include <stdlib.h>   /* malloc(), free() */
#include <string.h>   /* memcmp() */

/* Return the index of the first element equal to *ele, or -1 if none. */
int lsearch(void *arr, int n, void *ele, int Dsz)
{
    int i;
    for (i = 0; i < n; ++i)
    {
        void *addr = (char*)arr + (i * Dsz);   /* address of the i-th element */
        if (memcmp(addr, ele, Dsz) == 0)
        {
            return i;
        }
    }

    return -1;
}

int main()
{
    int no, i, pos;
    float *Arr = NULL, key;

    printf("Enter the number of elements you want : ");
    scanf("%d", &no);
    Arr = (float*) malloc(sizeof(float) * no);
    printf("Enter the %d elements : \n", no);
    for (i = 0; i < no; ++i)
    {
        printf("Element %d : ", i);
        scanf("%f", &Arr[i]);
    }
    printf("Enter the element to be searched : ");
    scanf("%f", &key);

    pos = lsearch(Arr, no, &key, sizeof(float));

    if (pos != -1)
        printf("Element %f found at position (starting from zero) %d", key, pos);
    else
        printf("Element not found!");

    printf("\n");
    free(Arr);
    return 0;
}

The primitive linear search algorithm used to look like this:

int lsearch(int *arr, int n, int ele)
{
    int i;
    for (i = 0; i < n; ++i)
    {
        if (arr[i] == ele)
            return i;
    }

    return -1;
}

Obviously, in the generic case, arr[i] does not make any sense, as 'arr' is a void*. The '==' isn't much help either. Hence we use memcmp(), again from string.h.

int memcmp ( const void * ptr1, const void * ptr2, size_t num );

memcmp() compares the two memory blocks pointed to by ptr1 and ptr2 and returns 0 if the two hold identical data, else returns a non-zero value. Note that the size of the memory blocks under consideration has to be specified in the last parameter.
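
As a small sketch of how memcmp() stands in for '==', the hypothetical helper same_bytes() below wraps the call and reports whether two objects of the given size hold identical bytes.

```c
#include <string.h>

/* Returns 1 when the two blocks of size bytes hold identical data, else 0. */
int same_bytes(const void *p1, const void *p2, size_t size)
{
    return memcmp(p1, p2, size) == 0;
}
```

For two ints that both hold 42, same_bytes(&x, &y, sizeof(int)) returns 1. Keep in mind that this compares bytes, not values: on IEEE 754 platforms, for instance, 0.0 and -0.0 are equal under '==' but have different byte patterns.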

Next, we need to find an alternative to arr[i]. This is all about pointer arithmetic, and hence we again have to choose a data type, as we know no pointer arithmetic is possible with void pointers. We choose char for the simplicity and flexibility of a one-byte data type.

All we do is calculate the i'th memory block's address. For this, we need to realize that each memory block's size is known to us (the int Dsz parameter!). We need to jump that many bytes per block. Hence we add Dsz to (char*)arr to move one block, 2*Dsz to (char*)arr to move two, and so on.

Hence

addr = (char*)arr + Dsz*i;
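
A quick way to convince ourselves that this formula matches ordinary indexing is to wrap it in a function and compare its result with &arr[i] for a concrete type. The name element_addr() is hypothetical; the arithmetic is the one shown above.

```c
/* Address of the i-th element in an array whose elements are Dsz bytes each. */
void *element_addr(void *arr, int i, int Dsz)
{
    return (char *)arr + i * Dsz;   /* step i blocks of Dsz bytes from the base */
}
```

For an array double values[4], element_addr(values, 2, sizeof(double)) yields the same address as &values[2].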

The rest is self-explanatory.
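
To round things off, here is a sketch of the same lsearch() reused for two other element types. The wrapper functions find_int() and find_double() are hypothetical conveniences added for this example; they only supply the element size.

```c
#include <string.h>

/* Generic linear search: index of the first element equal to *ele, or -1. */
int lsearch(void *arr, int n, void *ele, int Dsz)
{
    int i;
    for (i = 0; i < n; ++i) {
        void *addr = (char *)arr + (i * Dsz);
        if (memcmp(addr, ele, Dsz) == 0)
            return i;
    }
    return -1;
}

/* Hypothetical wrappers: each one just passes the right element size. */
int find_int(int *arr, int n, int key)
{
    return lsearch(arr, n, &key, sizeof(int));
}

int find_double(double *arr, int n, double key)
{
    return lsearch(arr, n, &key, sizeof(double));
}
```

Thin typed wrappers like these are one way to get back some of the type checking that the void* interface gives up.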
