Lesson 9 - Strings in The C language - Working with single characters


In the last lesson, Strings in the C language, we learned that strings (texts) in the C language are just char arrays terminated by the null character. In today's lesson, we'll work with individual string characters, learn to use ASCII values, and create a sentence analyzer and a cipher program.

Printing text character by character

First of all, let's test that we can really treat text as a char array. We'll start by printing a string, character by character:

int i;
char sentence[] = "Hello ICT.social";
for (i = 0; sentence[i] != '\0'; i++)
        printf("%c ", sentence[i]);

The output:

Console application
H e l l o   I C T . s o c i a l

The loop iterates over the string's characters until it encounters the null character at the end of the string. As a result, all of the characters are printed to the console. I've added a space after each character to make the result more illustrative.

The ASCII value

Maybe you've already heard of the ASCII table. In the MS-DOS era, there was practically no other way to store text. Individual characters were stored as single bytes, i.e. as numbers in the range from 0 to 255. Standard ASCII itself defines only 128 characters (codes 0 to 127). The system's code page filled the remaining 128 values, which gave a table of 256 characters where each numeric code was assigned to one character.

Perhaps you understand why this method is no longer as relevant. The table simply could not contain all the characters of all international alphabets. Now we use Unicode (typically in the UTF-8 encoding), where characters are represented in a different way. However, a char in C still holds a single byte, so by default we work with ASCII values. If we wanted to work with characters outside ASCII, we'd have to use so-called wide characters or handle multi-byte UTF-8 sequences ourselves. The key advantage of using plain ASCII codes to represent characters is that the letters are stored in the table next to each other, alphabetically. For example, at position 97 we'd find "a", at 98 "b", etc. It is the same with the digits "0" to "9". Unfortunately, accented characters are not part of ASCII and are scattered across the various code pages.

Now, let's convert a character into its ASCII value, and then create the character according to its ASCII value:

char c; // character
int i; // ordinal (ASCII) value of a character
// conversion from text to ASCII value
c = 'a';
i = (int)c;
printf("The character %c was converted to its ASCII value of %d\n", c, i);
// conversion from an ASCII value to text
i = 98;
c = (char)i;
printf("The ASCII value of %d was converted to its textual value of %c\n", i, c);

Console application
The character a was converted to its ASCII value of 97
The ASCII value of 98 was converted to its textual value of b

Character occurrence in a sentence analysis

Let's write a simple program that analyzes a given sentence for us. We'll search for the number of vowels, consonants, digits, and other characters (e.g. spaces or punctuation marks).

We'll hard-code the input string into our code, so we won't have to type it again on every run. Once the program is complete, we'll replace the string with scanf(). We'll iterate over the characters using a loop. I should start out by saying that we won't focus much on program speed here. We'll choose practical and simple solutions instead.

First, let's define vowels, consonants, and digits. We don't have to count the other characters, since their number is the string length minus the number of vowels, consonants, and digits. Let's set up variables for the individual counters. Since this code is a bit more complex, we'll also add comments.

// Counters initialization
int vowels_count = 0;
int consonants_count = 0;
int digits_count = 0;

// the string that we want to analyze
char s[] = "A programmer gets stuck in the shower because the instructions on the shampoo were: Lather, Wash, and Repeat.";

// definition of character groups
char vowels[] = "aeiouyAEIOUY";
char consonants[] = "bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ";
char digits[] = "0123456789";

// indexes
int i;

printf("The original message: %s\n", s);

// the main loop iterating over characters until it gets to the end
for (i = 0; s[i] != '\0'; i++)
{

}

First of all, we initialize the counters to zero. For the definition of the character groups, we only need ordinary char arrays. The main loop iterates over each character in the char array s.

Now, let's increment the counters. For simplicity's sake, I'll focus on the loop instead of rewriting the code:

// the main loop iterating over characters until it gets to the end
for (i = 0; s[i] != '\0'; i++)
{
        if (contains_character(s[i], vowels) == 1)
                vowels_count++;
        else if (contains_character(s[i], consonants) == 1)
                consonants_count++;
        else if (contains_character(s[i], digits) == 1)
                digits_count++;
}

Notice that we use the contains_character() function, which determines whether a string contains a given character. We'll get to functions like that at the end of this course. However, we'll skip ahead a little bit here and add the contains_character() function to make our program a bit more interesting.

Insert the following code block above the main() function. If you have any problems doing so, just download the attached source code at the end of the article.

int contains_character(char c, char s[])
{
        int i;
        for (i = 0; s[i] != '\0'; i++)
                if (s[i] == c)
                        return 1;
        return 0;
}

We won't describe the function in detail now. In short, it walks through the string s and returns 1 as soon as it finds the character c, or 0 if it reaches the end without finding it. Let's get back to our code in the main() function. We look for the current character of the sentence in the vowels first and, if we find it there, increase the vowel counter. If we don't find it in the vowels, we look in the consonants and increase their counter if it's there. We do the same with the digits.

Now, all we're missing is the printing part at the end, i.e. displaying text:

printf("Vowels: %d\n", vowels_count);
printf("Consonants: %d\n", consonants_count);
printf("Digits: %d\n", digits_count);
// strlen() requires #include <string.h>; it returns size_t, so we cast it to int for %d
printf("Other characters: %d\n", (int)strlen(s) - vowels_count - consonants_count - digits_count);

Console application
A programmer gets stuck in the shower because the instructions on the shampoo were: Lather, Wash, and Repeat.
Vowels: 33
Consonants: 55
Digits: 0
Other characters: 21

That's it, we're done!

The Caesar cipher

Let's create a simple program that encrypts text. If you've ever heard of the Caesar cipher, then you already know exactly what we're going to program. This form of text encryption is based on shifting characters in the alphabet by a certain fixed number of characters. For example, if we shift the word "hello" by 1 character forwards, we'd get "ifmmp". The user will be allowed to select the number of character shifts.

Let's get right into it! We need variables for the original text, the encrypted message, and the shift. Then, we need a loop iterating over each character and printing an encrypted message. We'll hard-code the message again, so we won't have to type it over and over during the testing phase. After we finish the program, we'll replace the contents of the variable with input read by the scanf() function. The cipher doesn't work with accented characters, spaces, or punctuation marks. We'll just assume the user will not enter them. To keep things simple, we'll also assume the user will enter lowercase letters only. Ideally, we should remove accents before encrypting, drop anything other than letters, and convert all letters to lowercase.

// variable initialization
char s[] = "blackholesarewheregoddividedbyzero";
int shift = 1;
int i;

printf("Original message: %s\n", s);

// loop iterating over characters
for (i = 0; s[i] != '\0'; i++)
{

}

// printing
printf("Encrypted message: %s", s);

Now, let's move to the loop. We'll increase the value of the current character by the value of shift.

        s[i] = s[i] + shift;

Console application
Original message: blackholesarewheregoddividedbyzero
Encrypted message: cmbdlipmftbsfxifsfhpeejwjefecz{fsp

Let's try it out! The result looks pretty good. However, we can see that the characters after "z" overflow to the ASCII values of other characters ("{" in the output above). Therefore, the message no longer consists of letters only, but contains other nasty characters as well. Let's treat the alphabet as a cycle, so the shifting flows smoothly from "z" back to "a" and so on. We'll get by with a simple condition that decreases the ASCII value by the length of the alphabet, so we end up back at "a".

// loop iterating over characters
for (i = 0; s[i] != '\0'; i++)
{
        s[i] = s[i] + shift;
        if (s[i] > 'z') // overflow control
                s[i] = s[i] - 26;
}

If the shifted character s[i] exceeds the ASCII value of 'z', we reduce it by 26 characters (the number of characters in the English alphabet). It's simple and our program is now operational. Notice that we don't use direct character codes anywhere. There's a 'z' in the condition even though we could write 122 there directly. I set it up this way so that our program doesn't depend on explicit ASCII values, which also makes it clearer how it works. Try to code a decryption program as practice for yourself.

In the next lesson, Multidimensional arrays in the C language, we'll see that there are still a couple of things strings can do that we haven't touched on yet. Spoiler: We'll learn how to decode Morse code.


 

 

Article has been written for you by David Capka