Lesson 8 - Strings in the C language


So far in this course, we've avoided working with text and used only numbers and single characters. Most real-world applications need text, though. In programming, texts are referred to as strings (a string of characters). We've put this topic off because C, as a low-level language, has no dedicated data type for storing text and offers only minimal built-in support for it. We can of course work with text in the C language, but it's a bit more complicated.

Char array

There are several ways to work with strings in the C language. We're going to introduce the simplest approach in this lesson - a static string, which is an array of chars. Suppose we want to store the string "ICT.social". We would need to create the following array of the char type in memory:

'I' 'C' 'T' '.' 's' 'o' 'c' 'i' 'a' 'l' '\0'

Each "box" stores a single character. Notice the extra box at the end containing \0, the null character. All strings must end with it. Although the C language itself has no string type, its standard library provides functions for working with strings, and those functions expect this layout. This means that an array representing a string must always be 1 item longer than the text we're storing!

Note: Although it's beyond the scope of today's lesson, let's mention that the null character marks where the string ends. Besides static arrays, we can also store strings using pointers, as memory blocks of any length, which wouldn't be possible without this little aid. We'll cover all of that later in the course. An alternative way to specify a string's length is to store it as a number before the first character. This system was used by the Pascal language; however, the null character is a much more common solution.

Let's create a simple example. We'll store some text into a variable and print it to the console:

char text[5] = {'m', 'o', 'o', 'n', '\0'};
printf("%s", text);

The result:

Console application
moon

The good news is that the C language allows us to enter text in quotes which it then converts to a so-called string constant (a char array terminated by the \0 character). The code above can be rewritten to the following form:

char text[5] = "moon";
printf("%s", text);

Notice that the array has to be 5 items long even though the word "moon" has only 4 letters. We can even leave it up to the compiler to determine the length:

char text[] = "moon";
printf("%s", text);

Unfortunately, we're not able to assign a string constant to an already existing array:

char text[5];
text = "moon"; // This line causes an error
printf("%s", text);

This is because C doesn't allow assigning one array to another. However, nothing is stopping us from assigning it character by character using a loop or from using functions for copying strings (more on that later on).

Working with single characters

We can work with strings in the same manner as with arrays (because they actually are arrays). :) Therefore, we are able to change the first character or shorten the string:

char text[] = "moon";
text[0] = 'f';
text[3] = '\0';
printf("%s", text);

The result:

Console application
foo

Changing the 4th character to \0 made the string end just before that position. Always keep the null character in mind when editing strings. If you forget to write it, the program won't know where the string ends and it'll access memory which doesn't belong to it.

Reading/writing strings

We can read or print strings as we're used to with other data types (we'll use the %s modifier). We'll create a string variable as a char array and specify a maximal length, e.g. 50 characters (which is 51 items). We omit the & characters when scanning variables using the %s modifier because we're already passing an address when passing arrays.

The following program will let you enter your name and greet you:

printf("Enter your name: ");
char name[51];
scanf("%50s", name);
printf("Hi %s, welcome!", name);

Notice how the maximum length is specified in the format string of the scanf() function. If we didn't specify it and encountered an exotic or just mean user, the extra characters would be written past the end of the array and break the program.

Unfortunately, scanf() with %s stops reading at the first whitespace character. If we wanted to read something like "John Smith" into a single variable, we'd need to change the format string so that it stops only at the end of the line. Modify your scanning line to the following (the space at the beginning is really important: it skips any whitespace left in the input buffer, such as the newline from a previous entry):

scanf(" %50[^\n]", name);

You may also encounter the functions gets() or fgets() used for reading text from the console. Avoid gets(): it doesn't allow us to limit the length of the text being entered and was removed from the language in the C11 standard. fgets() does limit the length, but we have to pass it the standard input (stdin) explicitly and it keeps the trailing newline in the text. For this lesson, we'll get along using scanf() just fine.

Standard string functions

The C standard library provides many functions for working with strings which will make our programs simpler. To be able to use them, we need to include the string.h header file at the beginning of our file:

#include <string.h>

Note: since functions are named using abbreviations, I'll mention the original name for you as well to help you remember them better.

strlen() - STRing LENgth

We can determine the string's length using strlen(). It's the length of the visible part excluding the \0. The result is of the size_t type, which we print using the %zu modifier.

printf("%zu", strlen("moon")); // prints 4

strcat() - STRing conCATenate

We're able to concatenate 2 strings into a single one using the strcat() function. Keep in mind that the first array has to be large enough to hold the whole result, including the \0.

char text[20] = "moon";
strcat(text, " is in the sky"); // stores the string "moon is in the sky" to the text variable
printf("%s", text);

strcpy() - STRing CoPY

Since whole arrays cannot simply be copied, there is a function to clone a string into another variable.

char text[5];
strcpy(text, "moon");
printf("%s", text);

strchr() - STRing CHaR

We can search for a character in a string. The string is searched from the beginning to the end and a pointer to the character is returned if it's found. Although we can't work with pointers yet, it's enough for us now to know that if we subtract the string from the pointer, we'll get the position of the character we're looking for. If the text doesn't contain the character, we'll get a NULL value, so we have to check for it before subtracting.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char** argv) {
        char text[] = "Mr. X strikes again.";
        char *p = strchr(text, 'X'); // stores a pointer to the 'X' character in the string
        if (p != NULL)
        {
                int position = (int)(p - text); // computed only when the character was found
                printf("Found at the position %d", position);
        }
        else
        {
                printf("Not found");
        }
        return (EXIT_SUCCESS);
}

I guess you won't be surprised that the position is zero-based.

strstr() - STRing subSTRing

We can also search for a string (substring) in a string in the same manner as we would search for a single character. The function for it is called strstr() and is used in the exact same way.

strcmp() - STRing CoMPare

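Compares 2 strings character by character according to their character codes (so uppercase letters usually come before lowercase ones) and returns a negative number if the first string is before the second one, 0 if they're equal, and a positive number if the first one is after the second one.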

printf("%d", strcmp("alpha", "bravo")); // returns a negative number

We can also find other versions of the mentioned functions. If we wanted the C language to work with a string from the end (e.g. search it starting from the end), the function for it contains the letter r in its name (as in reverse). For searching for a character, this function is named strrchr(). We can also limit the number of characters being processed: such functions contain the letter n (as in number) and take said number as an additional parameter, e.g. strncpy(), strncat() and strncmp(). Beware, if the source string is longer than the limit, strncpy() copies only the given number of characters and doesn't add the \0 character, so we have to write it ourselves. strncat(), on the other hand, always adds the terminating \0.

In the next lesson, Strings in The C language - Working with single characters, we'll continue working with strings in the C language and make several example applications.


 

 

Article has been written for you by David Capka