linuxcandyclone: c


Week 4 – Introduction to generic functions in C

Previously we saw an implementation of the swap function that was specific to the int data
type. Today we make a few points on that and introduce ourselves to generic functions in C. Generic
functions, which we'll be writing shortly, are analogous to the templates that we see in C++. We'll also
chalk out the major differences, and the pros and cons, of generic functions.

We are quite familiar with the following code:

void swap(int *aptr, int *bptr)
{
    int tmp;
    tmp = *aptr;
    *aptr = *bptr;
    *bptr = tmp;
}

Needless to say, the function swap() swaps the contents of the two variables passed. This is done by
passing the addresses of the two variables and exchanging their values with the usual swapping
algorithm, using the temporary variable 'tmp'.

So, to make an attempt at writing a generic swap(), we need to understand this: swap() must work for any
type of data we pass. For this purpose, we use void pointers.
Void pointers (void*) are pointers that can hold the address of an object of any data type.

Thus,

int a;
float b;
double c;
void *vptr = &a;
vptr = &b;
vptr = &c;

... are all perfectly legal statements.
The versatility of void* makes it useful in writing generic functions.
So coming back to writing the generic swap() function...

Attempting to write the generic swap(), a natural tendency would be...

void gswap(void *aptr, void *bptr)
{
    void tmp;
    tmp = *aptr;
    *aptr = *bptr;
    *bptr = tmp;
}

However, this code simply wouldn't work for two reasons – one obvious and one subtle.

The obvious reason would be that no variable can be declared as void.
The other, more subtle, reason is that void pointers cannot be dereferenced!

Let's go one by one.

Our first problem, storing data of an unknown type: to solve this, we ask for the size of the data type. Knowing the size, we can ask for that many bytes of memory from the heap. So which data type do we use to store the unknown data? The best answer is char. This is because a char takes exactly one byte in memory, so its pointer arithmetic is simpler than that of any other data type.

So how do we store the unknown variable using char, which can hold only one byte of data? We don't use just one char variable but an array of char. The number of elements in the array depends on the size of the unknown data.

Our second problem, dereferencing the void pointer: well, in the first place, we won't be doing the usual assignment using '='. This is because the tmp variable, as we discussed, is an array, and we simply cannot assign to an array. Hence, we use the memcpy() function from string.h.

memcpy() prototype : void *memcpy(void *destination, const void *source, size_t num);

Sticking to our topic, it is enough to know that the function takes two void* and an integer. (size_t is an unsigned integer type, so an int can be passed in its place as long as it is not negative.) Note that memcpy() doesn't look for any special character that would mark the end of the array (like strcpy() does). It simply copies the number of bytes specified in num from source to destination.

Hence our generic swap() looks like this (including the complete code) :

//------------------------------------- Code --------------------------------------------------- //
#include <stdio.h>
#include <stdlib.h>   /* malloc(), free() */
#include <string.h>   /* memcpy() */

void swap(void *a, void *b, int sz);

int main()
{
    int a = 44;
    int b = 5;

    int *aptr = &a;
    int *bptr = &b;

    printf("Before swapping...\na = %d\nb = %d", a, b);
    swap(aptr, bptr, sizeof(int));
    printf("\nAfter swapping...\na = %d\nb = %d\n", a, b);

    return 0;
}

void swap(void *a, void *b, int sz)
{
    char *buf = (char*)malloc(sz);   /* temporary buffer of sz bytes */
    if (buf == NULL)
        return;                      /* out of memory: leave a and b untouched */
    memcpy(buf, a, sz);
    memcpy(a, b, sz);
    memcpy(b, buf, sz);
    free(buf);                       /* give the buffer back to the heap */
}

//---------------------------------------------------------------------------------------------------------//

The advantage generic C functions have over C++ templates is seen during compilation of the code.
In C++, when templates are compiled, a separate copy of the function is created for each data type that uses the template. Each copy differs from the others only in how it handles the data; the rest of the code more or less remains the same.

To put it simply, suppose we used a template to handle an int, a float and a double in our C++ code. The compiler would create one copy of the function each for int, float and double. The copy handling int would differ only in how it handles the data; the rest of the code doesn't change.

This wouldn't matter much when it's just a matter of a few data types. However, in a huge code-base having 30-40 data types (structures, classes etc.) the issue can get serious, as the size of the executable grows with every copy.

The shortcoming of generic functions is that they are prone to run-time errors. It is possible that the code compiles without any errors but the output isn't the desired result. Generic functions can lead to buggy code if care is not taken.

Now consider the following code:

#include <stdio.h>
#include <stdlib.h>   /* malloc(), free() */
#include <string.h>   /* memcpy(), strdup() */

void swap(void *a, void *b, int sz)
{
    char *buf = (char*)malloc(sz);
    memcpy(buf, a, sz);
    memcpy(a, b, sz);
    memcpy(b, buf, sz);
    free(buf);
}

int main()
{
    char *husband = strdup("Fred");
    char *wife = strdup("Wilma");

    printf("Before swapping...\n");
    printf("Husband : %s\n", husband);
    printf("Wife : %s\n", wife);
    swap(&husband, &wife, sizeof(char*));
    printf("After swapping...\n");
    printf("Husband : %s\n", husband);
    printf("Wife : %s\n", wife);
    return 0;
}

Well, the question here is about the call to swap(): do we send husband, wife, sizeof(char*) or &husband, &wife, sizeof(char*), given that 'husband' and 'wife' are themselves pointers? The answer to this question lies in swap() itself.

In swap(), we copy the contents of the variable pointed to by a into a temporary buffer 'buf' (the char array) using memcpy(). memcpy(), as we know, simply copies the memory contents from the address, irrespective of the data type. Passing '&husband' and '&wife' lets us swap the contents of 'husband' and 'wife', which are the addresses of the strings 'Fred' and 'Wilma' respectively.

It's to be noted that when we swap two strings, we swap their respective pointers and not the individual characters. This is because strings, as we know, are handled using pointers to the base address of the string residing in memory. Thus swapping the addresses is sufficient (and convenient too!).

So what would happen if we pass husband, wife instead of &husband, &wife?

Well, on a system where sizeof(char*) is 4, the output would look like this :

Before swapping...
Husband : Fred
Wife : Wilma
After swapping...
Husband : Wilm
Wife : Freda

Explanation:

When we simply pass husband, wife to swap(), 'husband' and 'wife', which contain the base addresses of the strings (the address of the first character in each string), are passed on to memcpy(). memcpy() simply copies four bytes (because sizeof(char*) is 4 here), thus swapping only four bytes of data between the two strings. Hence four bytes from husband ('F','r','e','d') and four from wife ('W','i','l','m') are swapped, leaving the trailing 'a' from 'Wilma' as it is. (With 8-byte pointers the same call would copy 8 bytes and run past the end of both strings.)

Consider the following function call:

swap(husband,&wife,sizeof(char*));

What happens here is that the pointer to the first character of the string 'Fred' and a pointer to the base pointer of 'Wilma' (the base pointer, I repeat, is the pointer pointing to the first character in 'Wilma') are passed to the function. Thus 4 bytes of data are swapped between these two memory locations. Hence the four bytes of husband's string (namely 'F','r','e','d') are placed in the pointer variable wife, and the address that wife held is copied over the characters of husband's string. As a result, illegal values get swapped: an address gets copied where a string is supposed to reside, and vice versa.

Since the variable wife (reached through void *b), which is supposed to contain the address where the string 'Wilma' is stored, is overwritten with the 4 bytes 'F', 'r', 'e', 'd', printf() later treats these 4 bytes as a memory address (one that most likely points to a location which is not accessible). This leads to the segmentation fault.

Finally, we discuss the generic linear search function.

//---------------------------------------Code -----------------------------------//
#include <stdio.h>
#include <stdlib.h>   /* malloc(), free() */
#include <string.h>   /* memcmp() */

int lsearch(void *arr, int n, void *ele, int Dsz)
{
    int i;
    for (i = 0; i < n; ++i)
    {
        /* address of the i'th element: step Dsz bytes at a time */
        void *addr = (char*)arr + (i * Dsz);
        if (memcmp(addr, ele, Dsz) == 0)
        {
            return i;
        }
    }

    return -1;
}

int main()
{
    int no, i, pos;
    float *Arr = NULL, key;

    printf("Enter the number of elements you want : ");
    scanf("%d", &no);
    Arr = (float*) malloc(sizeof(float) * no);
    printf("Enter the %d elements : \n", no);
    for (i = 0; i < no; ++i)
    {
        printf("Element %d : ", i);
        scanf("%f", &Arr[i]);
    }
    printf("Enter the element to be searched : ");
    scanf("%f", &key);

    pos = lsearch(Arr, no, &key, sizeof(float));

    if (pos != -1)
        printf("Element %f found at position (starting from zero) %d", key, pos);
    else
        printf("Element not found!");

    printf("\n");
    free(Arr);
    return 0;
}

The primitive linear search algorithm used to look like this:

int lsearch(int *arr, int n, int ele)
{
    int i;
    for (i = 0; i < n; ++i)
    {
        if (arr[i] == ele)
            return i;
    }

    return -1;
}

Obviously, in the generic case, arr[i] does not make any sense as 'arr' is a void*. The '==' isn't much help either. Hence we use memcmp(), again from string.h

int memcmp ( const void * ptr1, const void * ptr2, size_t num );

memcmp() compares the two memory blocks pointed to by ptr1 and ptr2 and returns 0 if the two hold identical data, else it returns a non-zero value. Note that the size of the memory blocks under consideration has to be specified in the last parameter.

Next, we need to find the alternative to arr[i]. This is all about pointer arithmetic, and hence we again have to choose a data type, as we know no pointer arithmetic is possible with void pointers. We choose char for the sake of simplicity and the flexibility of a one-byte data type.

All we do is calculate the i'th memory block's address. For this, we need to realize that each memory block's size is known to us (the int Dsz parameter!). We need to jump that many bytes per block. Hence we have to add Dsz to (char*)arr to move one block, 2*Dsz to (char*)arr to move two, and so on.

Hence

addr = (char*)arr + i*Dsz;

The rest is self-explanatory.

[...]

Week - 3 Memory Mechanics (Structures and Array of structures)

Last week we saw the memory mechanics of floating point numbers, typecasting of pointers, and all those asterisk-ampersand tricks and their consequences. We have covered, more or less, how primitive data types are handled at the lower levels. All of this eventually makes us better programmers and designers. Today we discuss the memory mechanics of structures and arrays of structures.

Starting off, consider the following structure;

struct fraction
{
int num;
int denom;
};

struct fraction pi;

In memory, 'pi' would look like this: two adjacent ints, num followed by denom.

pi.num = 22;
pi.denom = 7;

These just assign values. Now,

((struct fraction*)&(pi.denom)) -> num = 12;

Confused?! :D

Well, what happens is as follows:

• The address of pi.denom is sought.
• &pi.denom, which is an integer pointer, is type-cast to a fraction pointer.
• Now the num field of the structure this pointer refers to is assigned 12. The question is: where is this num field?

To explain this, we need to recollect what happened when a float pointer was converted to an integer pointer, which is what we did last week. The contents of the memory remain the same, but the compiler thinks of the memory location as an integer. The same thing happens here. The compiler thinks of &pi.denom as the starting address of a structure (even though it really isn't). Since num is the first field of struct fraction, it sits at that very address, so the assignment lands in pi.denom.

Hence if we display pi.denom we get 12.

// ------------------------------- Code ------------------------------------ //

#include <stdio.h>

struct fraction
{
    int num;
    int denom;
};

int main()
{
    struct fraction pi;
    pi.num = 22;
    pi.denom = 7;
    printf("Numerator : %d", pi.num);
    printf("\nDenominator : %d", pi.denom);
    /* treat &pi.denom as the start of a struct fraction; its num overlaps pi.denom */
    ((struct fraction*)&(pi.denom))->num = 12;
    printf("\nAfter manipulating with memory...\n");
    printf("Numerator : %d", pi.num);
    printf("\nDenominator : %d\n", pi.denom);
    return 0;
}

Moving on to the array of structures, consider the following structure:

struct Student
{
    char *name;
    char SUID[8];
    int no_of_units;
};

The memory reserved when a variable of this type is declared would look something like this :

Note that we read the contents in this sort of diagram from left to right, bottom to top.

An array of such structures would look like the following in memory:

struct Student pupils[4];

Consider this:

pupils[0].no_of_units = 21;

This simply puts the value 21 in the 'no_of_units' field of pupils[0].

Likewise,

pupils[2].name = strdup("Adam");

Here strdup() is a function declared in string.h. It dynamically allocates the required memory for the string and returns the pointer to the location where the memory was allocated. This pointer is stored in the name field of pupils[2].

Things get tricky now:

pupils[3].name = pupils[0].SUID + 6;

This makes pupils[3].name point to the memory location 6 bytes past the start of pupils[0].SUID. Thus,

strcpy(pupils[0].SUID,"1234567");
pupils[0].no_of_units = 0;
printf("%s",pupils[3].name);

Output : 7

Note that strcpy() does not do any kind of bounds-checking: it goes on copying bytes until it encounters a '\0' character. Hence,

strcpy(pupils[0].SUID,"12345678");
pupils[0].no_of_units = 0;
printf("%s",pupils[3].name);

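... simply displays 78. The string "12345678" needs 9 bytes including its '\0', so the copy spills one byte past SUID into pupils[0].no_of_units, which sits right after it in memory. Note that pupils[0].no_of_units is assigned 0 so that it acts as the '\0' character ('\0' has an ASCII code of 0). If that field held a value with no zero byte in it, printf() would run on past it, displaying all sorts of junk values on the screen or even causing a segmentation fault on some systems!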

Extending this further,

strcpy(pupils[0].SUID,"12345678");
pupils[0].no_of_units = 65;
printf("%s",pupils[3].name);

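would surprisingly display 78A. This is because 65 is the ASCII code of 'A', and on a little-endian machine that value occupies the first byte of no_of_units while the remaining bytes are 0.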

// ---------------------------------- Complete Code ------------------------------------ //

#include <stdio.h>
#include <string.h>   /* strcpy() */

struct Student
{
    char *name;
    char SUID[8];
    int no_of_units;
};

int main()
{
    struct Student pupils[4];
    pupils[3].name = pupils[0].SUID + 6;
    strcpy(pupils[0].SUID, "12345678");   /* deliberately overflows SUID by one byte */
    pupils[0].no_of_units = 65;
    printf("%s\n", pupils[3].name);
    return 0;
}

// -------------------------------------------------------------------------------------- //

It has become apt to make a small point about strings in C.

Consider the following code:

#include <stdio.h>

int main()
{
    char name[] = {"Free and Open Source Software"};
    printf("%s\n", &name[5]);   /* start printing at index 5, the letter 'a' */
    return 0;
}

This prints "and Open Source Software". Thus the printf() function was led into believing that the string started from the letter 'a'. This code makes one point clear: all the functions that deal with strings treat the given char pointer as the address of the first character of the string (regardless of the bytes before the specified address) and treat the rest of the memory up to the null character '\0' as the string. This '\0' character is actually what distinguishes a string from a plain array of characters.

Having discussed all the memory mechanics of the various data types, we are now in a position to write generic functions. Generic functions are functions that can handle any data type (just like a template does in C++). A detailed account of generic functions will be taken care of next week.

We start off with the swap() function to swap two integers. The code (as most of us would know) is as below:

void swap(int *aptr,int *bptr)
{
int temp;
temp = *aptr;
*aptr = *bptr;
*bptr = temp;
}

We see that this function deals with the integer data type (thus being int specific). Our aim is to write a generic swap() function: a function that can swap the values of two memory locations regardless of what the data types are. First of all, let's see what happens in this function.

This function basically swaps the memory contents using their addresses. This is mandatory as there is no other way out (at least at this level of programming!). Using the memory addresses, we manipulate the individual variables. Swapping thus happens in the following 3 steps:

• The contents of the memory location pointed to by aptr (which contains the address of the first variable passed) are copied to a temporary variable temp.

• The contents of the memory location pointed to by aptr are overwritten with the contents of the memory location pointed to by bptr.

• The contents of the memory location pointed to by bptr are replaced with the value of temp (which originally had the contents of the first memory location).

Thus using the memory addresses we swap the contents of the memory.

Using all this as the background, next week we will see how a generic version of this int-specific swap() function can be written. And not just this: we will also grab some fundamentals, the dos and don'ts of generic functions. So stay tuned!

[...]

Week – 2 Low Level Memory Mechanics – Part 2

Last week we saw how data is stored in memory. If you've missed it, read it here. The data we dealt with was mostly of integral type: mainly integers, char, short and long. Today we'll see how floating point data is handled at the low level. This is done by considering the float data type.

We know that the integer 5 is stored in a 32-bit integer as :

00000000 00000000 00000000 00000101

Consider a floating point number, say, 5.5. We'd have to develop a mechanism to store the fractional part as well as the integral part. As we know in binary representation,

5.5 = (101.1)₂

Thus we need to split our memory block into two parts (as far as we understand right now): one block for the integral part and one for the fractional part. Hence, the following could be done :

|< ------------ Integral Part (3 bytes) ------------ >|< - Fractional Part (1 byte) - >|

Thus all we have to do is convert the number into its binary representation and fill the integral and fractional parts into their respective blocks.

More examples, using the same layout :

7.5 = (111.1)₂ : the integral part holds 111, the fractional part holds .1

3.5 = (11.1)₂ : the integral part holds 11, the fractional part holds .1

However, it is necessary to note that just one byte cannot represent the fractional part accurately enough. Hence the IEEE came up with the following format for a 32-bit float :

| sign: 1 or 0 (1 bit) |< ------ exp (8 bits) ------ >|< ------ 1.xxxxxx (23 bits for xxxxxx) ------ >|

In this format we take the binary equivalent of the number and then write it in the standard scientific form using base 2. An example would make this clear.

Ex: 15.5₁₀ = (1111.1)₂

1111.1 should be written as 1.1111 x 2³

Once this is done, we ought to remember that

exp = (power of 2) + 127.

In this case, exp = 3 + 127 = 130 = (10000010)₂

And the 1.1111 we got is equivalent to 1.xxxxx...

Thus xxxxxx... = 111100.... (filling the rest with zeros)

| 0 |< ------ 10000010 ------ >|< ------ 11110000000000000000000 ------ >|

Here is some C code to demonstrate this :

#include <stdio.h>

/* Writes the lowest nbits bits of d into b as '0'/'1' characters,
   with a space after every 8 bits. */
void DecimalToBinary(char *b, unsigned long d, int nbits)
{
    unsigned long testbit = 1UL << (nbits - 1);   /* mask for the highest bit */
    int i = 0, ctr = 0;

    while (i < (nbits + nbits / 8))
    {
        if (d & testbit)
        {
            b[i++] = '1';
        }
        else
        {
            b[i++] = '0';
        }

        ctr++;
        if (ctr == 8)
        {
            ctr = 0;
            b[i++] = ' ';
        }
        d = d << 1;
    }

    b[i] = '\0';
}

int main()
{
    char bin[100];
    long int *lptr;
    float num = 15.5;
    float *fptr = &num;
    lptr = (long*) fptr;   /* assumes long is 4 bytes, like float */
    DecimalToBinary(bin, *lptr, 32);
    printf("Float point number : %f\n", num);
    printf("IEEE format : %s\n", bin);
    return 0;
}

The function DecimalToBinary() simply converts an integer to binary form and stores it in a string. Additionally, it separates the output into blocks of 1 byte.

The lines that are relevant to our discussion are in main().

long int *lptr;
float num = 15.5;
float *fptr = & num;

lptr = (long*) fptr;

What we do here is typecast the address of the float variable 'num' to long*. Here, only the pointer is cast and not the data itself. Thus the format of the float is preserved. By typecasting we instruct the compiler to treat the same memory location like a long int (long because it is 4 bytes on the 32-bit system used here; where long is 8 bytes, a 4-byte integer type is needed instead). We send this number to our DecimalToBinary().

Lastly, we need to make things clear about what happens when an int variable is assigned to a float.

int i = 5;
float f = i;
printf("%f", f);

Here we had better not confuse ourselves with all the formatting we learnt. The output would still be 5 (printed as 5.000000). What happens at line 2 is that the value 5 is converted to the float format and then stored in memory.

Consider another piece of code.

int i = 37;
float f = *(float *)&i;

In these lines, again, we don't alter the data: we just typecast the address, read the memory as a float and assign it to f. Now whenever 'f' is used, the compiler treats it like a float. This also implies that whatever data is there will be interpreted according to the float format.

Int format :

00000000 00000000 00000000 00100101

Float format (the same bits, split into sign, exp and xxxxxx) :

| 0 |< ------ 00000000 ------ >|< ------ 00000000000000000100101 ------ >|

Thus when evaluated in this form, the value of 'f' is very small, often shown as 0.

[...]
