Lesson 9 - Strings in Java - Split


In the previous tutorial, Strings in Java - Working with single characters, we saw that a Java String behaves much like an array of characters. In today's lesson, we're going to explain further String methods that I have intentionally held back until now, because we didn't yet know that strings are similar to arrays :)

When we declare a String variable and type a dot after its name, NetBeans shows us all of the methods and variables that we can call on it (we'll go deeper into this in the OOP course). Let's try it out:

Java String methods in NetBeans IDE

The same suggestions can also be brought up by pressing CTRL + Spacebar when the text cursor is right after the dot. Of course, this applies to all variables and classes (we'll use it further along the way, as well). The methods are ordered alphabetically and we can browse them using the arrow keys. NetBeans also shows us a description of each method: what it does and what parameters it needs.

Let's talk about the following methods and demonstrate them on simple examples:

Additional String methods

substring()

Returns a substring from the given start position (inclusive) to the given end position (exclusive). Positions are counted from zero.

System.out.println("I would not banish all of these Internets.".substring(2, 7));

The output:

Console application
would
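
To make the two positions easier to picture, here is a minimal sketch of how substring() treats them. The class name SubstringDemo and the variable names are only illustrative.

```java
class SubstringDemo {
    public static void main(String[] args) {
        String text = "I would not banish all of these Internets.";

        // Start index 2 is included, end index 7 is excluded (indices 2 to 6).
        String word = text.substring(2, 7);
        System.out.println(word); // would

        // The length of the result is always end - start.
        System.out.println(word.length()); // 5

        // With a single argument, substring() runs to the end of the string.
        System.out.println(text.substring(32)); // Internets.
    }
}
```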

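compareTo()

It allows us to compare two strings lexicographically, i.e. by their character codes, which matches alphabetical order for letters of the same case. It returns a negative number if the first string comes before the string in the parameter, 0 if they are equal, and a positive number if it comes after it. The result isn't limited to -1 and 1; in the example below we get -1 only because 'a' and 'b' are neighboring characters: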

System.out.println("alpha".compareTo("bravo"));

The output:

Console application
-1
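
Since only the sign of the result is meaningful, it's safer to compare it with zero than with -1 or 1. A short sketch with a few more word pairs (the class name CompareToDemo is just for illustration):

```java
class CompareToDemo {
    public static void main(String[] args) {
        // 'a' and 'b' are adjacent characters, so the difference is exactly -1.
        System.out.println("alpha".compareTo("bravo")); // -1

        // 'a' and 'c' are two apart, so the result is -2, not -1.
        System.out.println("alpha".compareTo("charlie")); // -2

        // Equal strings give 0.
        System.out.println("alpha".compareTo("alpha")); // 0

        // Only the sign is reliable, so we compare the result with 0.
        if ("bravo".compareTo("alpha") > 0) {
            System.out.println("bravo comes after alpha");
        }

        // Uppercase letters have lower character codes than lowercase ones,
        // so "Zulu" sorts before "alpha".
        System.out.println("Zulu".compareTo("alpha") < 0); // true
    }
}
```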

Now let's look at one more, very useful, String method.

split()

From the previous tutorial, we know that parsing strings character by character can be rather complicated, even though our example was fairly simple. Of course, we'll encounter strings all the time, both in user inputs, e.g. from the console or from input fields in form applications, and in TXT and XML files. Very often, we're given one long string, a line in a file or in the console, in which there are multiple values separated by separators, e.g. commas. In that case, we're talking about the CSV format (Comma-Separated Values). To be sure that we all know what we're talking about, let's look at some sample strings:

Jessie,Brown,Wall Street 10,New York,130 00
.. ... .-.. .- -. -.. ... --- ..-. -
(1,2,3;4,5,6;7,8,9)

The first string represents a user. We could, for example, store users into a CSV file (one per line).
The second string is Morse code characters and uses a space character as a separator.
The third string is a matrix of 3 columns and 3 rows. The column separator is a comma, whereas the row separator is a semicolon.

The split() method, called on a String, takes a separator and returns an array of the substrings found between the separators. This greatly simplifies extracting values from strings. Note that the separator is interpreted as a regular expression, so characters with a special meaning there, such as a dot or a vertical bar, have to be escaped; a comma, a semicolon, or a space can be used as they are.

Right then, let's see what we've got up until now. We still don't know how to declare objects, users, or even work with multidimensional arrays, i.e. matrices. Nevertheless, we want to make something cool, so we'll settle for making a Morse code message decoder.
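
As a quick illustration, the three sample strings above could be taken apart as follows. This is only a sketch: the class name SplitDemo and the variable names are made up, and the parentheses around the matrix are stripped with substring() before splitting.

```java
import java.util.Arrays;

class SplitDemo {
    public static void main(String[] args) {
        // A CSV line: the comma is the separator.
        String user = "Jessie,Brown,Wall Street 10,New York,130 00";
        String[] fields = user.split(",");
        System.out.println(fields.length); // 5
        System.out.println(fields[2]);     // Wall Street 10

        // Morse code: the space is the separator.
        String morse = ".. ... .-.. .- -. -.. ... --- ..-. -";
        String[] symbols = morse.split(" ");
        System.out.println(symbols.length); // 10

        // A matrix: first drop the parentheses, then split rows by semicolons
        // and the cells of each row by commas.
        String matrix = "(1,2,3;4,5,6;7,8,9)";
        String inner = matrix.substring(1, matrix.length() - 1);
        for (String row : inner.split(";")) {
            String[] cells = row.split(",");
            System.out.println(Arrays.toString(cells));
        }
    }
}
```

The loop prints one row per line, e.g. [1, 2, 3] for the first row.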

Morse code decoder

We'll start out by preparing the structure of the program, as always. We need two strings for the messages: one holding the message in Morse code, and another, empty for now, where we'll store the result of our efforts. Next, we need letter definitions (as we had with vowels), along with their Morse code counterparts. Letters can be stored in a single String since each of them consists of one character. Morse code characters consist of multiple characters, so we have to store them in an array.

The structure of our program should now look something like this:

// a string which we want to decode
String s = ".. -.-. - ... --- -.-. .. .- .-..";
System.out.println("The original message: " + s);
// a string with a decoded message
String message = "";

// array definitions
String alphabetChars = "abcdefghijklmnopqrstuvwxyz";
String[] morseChars = {".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....",
"..", ".---", "-.-", ".-..", "--", "-.", "---", ".--.", "--.-", ".-.", "...", "-", "..-",
"...-", ".--", "-..-", "-.--", "--.."};

We could also add other Morse characters such as numbers and punctuation marks, but we won't worry about them for now. We'll split the String s with the split() method into an array of substrings containing the Morse characters, using the space character as the separator. Then we'll iterate over the array using a foreach loop:

// splitting a string into Morse characters
String[] characters = s.split(" ");

// iteration over Morse characters
for (String morseChar : characters) {

}

Ideally, we should somehow deal with cases when the user enters e.g. multiple spaces between characters (users often do things of the sort). In this case, split() puts an empty substring into the array for each extra space. We should then detect it in the loop and ignore it, but we won't deal with that in this lesson.

In the loop, we'll attempt to find the current Morse character in the morseChars array. We'll be interested in its index because when we look at that same index in the alphabetChars string, we'll find the corresponding letter. This works because the array and the string hold the same letters in the same alphabetical order. Let's place the following code into the loop body:
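
For the curious, here is a minimal sketch of two possible ways to cope with extra spaces. Neither is part of the lesson's program; the class name ExtraSpacesDemo and the sample input are made up for illustration.

```java
import java.util.Arrays;

class ExtraSpacesDemo {
    public static void main(String[] args) {
        // Two spaces between the first and the second Morse character.
        String s = "..  -.-. -";

        // Splitting by a single space leaves an empty string in the array.
        String[] naive = s.split(" ");
        System.out.println(Arrays.toString(naive)); // [.., , -.-., -]

        // Option 1: skip empty substrings inside the loop.
        for (String morseChar : naive) {
            if (morseChar.isEmpty()) {
                continue;
            }
            System.out.println(morseChar);
        }

        // Option 2: split by the regular expression "one or more whitespace
        // characters"; trim() removes spaces at both ends of the string first.
        String[] robust = s.trim().split("\\s+");
        System.out.println(Arrays.toString(robust)); // [.., -.-., -]
    }
}
```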

char alphabetChar = '?';
int index = -1;
for (int i = 0; i < morseChars.length; i++) {
        if (morseChars[i].equals(morseChar)) {
                index = i;
        }
}
if (index >= 0) { // character was found
        alphabetChar = alphabetChars.charAt(index);
}
message += alphabetChar;

First, the alphabetical character is set to '?' since it may very well be that we don't have it defined in our array. Then we try to determine its index. Java arrays unfortunately have no indexOf() method and I don't want to bother you with advanced data structures now, so we'll write the search ourselves. It's quite simple.

We start by setting the index to -1 since we can't be sure if the array even contains a given String (Morse character) at all. Then we iterate over the array items and compare each String with our String (character). We already know that we have to use the equals() method for that. If the Strings are equal, we store the current index.

If we found the character (index >= 0), we read the letter at that index from alphabetChars and assign it to alphabetChar. Finally, we add the character to the message. The += operator works the same as message = message + alphabetChar.

Now, we'll print the decoded message:

System.out.println("The decoded message: " + message);

Console application
The original message: .. -.-. - ... --- -.-. .. .- .-..
The decoded message: ictsocial
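
Put together, the fragments above could form a program like the following. This is a sketch with two liberties taken: the decoding is wrapped in a helper method named decode() (methods are a topic for later lessons, and the name is made up), and the search loop ends with break as soon as a match is found.

```java
class MorseDecoder {
    static final String ALPHABET_CHARS = "abcdefghijklmnopqrstuvwxyz";
    static final String[] MORSE_CHARS = {".-", "-...", "-.-.", "-..", ".", "..-.",
        "--.", "....", "..", ".---", "-.-", ".-..", "--", "-.", "---", ".--.",
        "--.-", ".-.", "...", "-", "..-", "...-", ".--", "-..-", "-.--", "--.."};

    // Decodes a message whose Morse characters are separated by single spaces.
    static String decode(String s) {
        String message = "";
        for (String morseChar : s.split(" ")) {
            char alphabetChar = '?'; // used when the Morse character is unknown
            int index = -1;
            for (int i = 0; i < MORSE_CHARS.length; i++) {
                if (MORSE_CHARS[i].equals(morseChar)) {
                    index = i;
                    break; // no need to search any further
                }
            }
            if (index >= 0) { // character was found
                alphabetChar = ALPHABET_CHARS.charAt(index);
            }
            message += alphabetChar;
        }
        return message;
    }

    public static void main(String[] args) {
        String s = ".. -.-. - ... --- -.-. .. .- .-..";
        System.out.println("The original message: " + s);
        System.out.println("The decoded message: " + decode(s));
    }
}
```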

Done! If you want to practice some more, you can create a program which encodes a string into Morse code. The code would be very similar. We'll use the split() method several more times throughout our courses.
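
If you get stuck on that exercise, one possible shape of the encoder is sketched below. It's only an illustration under a few assumptions: the helper method encode() and the class name are made up, the input is converted to lowercase, and characters without a Morse definition (including spaces) are simply skipped.

```java
class MorseEncoder {
    // Encodes the letters a-z of the given text, separating Morse characters by spaces.
    static String encode(String text) {
        String alphabetChars = "abcdefghijklmnopqrstuvwxyz";
        String[] morseChars = {".-", "-...", "-.-.", "-..", ".", "..-.", "--.",
            "....", "..", ".---", "-.-", ".-..", "--", "-.", "---", ".--.", "--.-",
            ".-.", "...", "-", "..-", "...-", ".--", "-..-", "-.--", "--.."};

        String result = "";
        for (char c : text.toLowerCase().toCharArray()) {
            // A String does have indexOf(), so no manual search is needed here.
            int index = alphabetChars.indexOf(c);
            if (index >= 0) {
                if (!result.isEmpty()) {
                    result += " "; // separator between Morse characters
                }
                result += morseChars[index];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode("ictsocial"));
    }
}
```

Running it prints the same Morse message that the decoder started with.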

Special characters and escaping

Strings can contain special characters which are prefixed with a backslash "\". The most common ones are \n, which causes a line break anywhere in the text, and \t, which is the tab character.

Let's test them out:

System.out.println("First line\nSecond line");

The "\" character indicates a special character sequence in a string. It can also be used e.g. to write Unicode characters as "\uxxxx", where xxxx is the hexadecimal character code.

A problem arises when we want to write "\" itself. In this case, we have to escape it by writing one more "\":

System.out.println("This is a backslash: \\");

We can escape a quotation mark in the same way, so Java wouldn't misinterpret it as the end of the string:

System.out.println("This is a quotation mark: \"");
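
The escape sequences mentioned above can be combined freely. A small sketch that tries them all at once (the class name EscapeDemo and the sample texts are made up):

```java
class EscapeDemo {
    public static void main(String[] args) {
        // A tab between the columns and a line break between the rows.
        System.out.println("Name\tCity\nJessie\tNew York");

        // An escape sequence takes two characters in the source code,
        // but it is a single character in the resulting string.
        System.out.println("\n".length()); // 1
        System.out.println("\\".length()); // 1

        // A Unicode escape: the hexadecimal code 0041 is the letter A.
        System.out.println("\u0041"); // A

        // Escaped quotation marks and backslashes together: prints a
        // Windows-style file path wrapped in quotation marks.
        System.out.println("\"C:\\data\\notes.txt\"");
    }
}
```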

Text coming from the console or from input fields in form applications is taken literally, character by character, so users never have to deal with escaping: a backslash they type is just a backslash, and a line break they enter is already a real line-break character. Escape sequences only matter in strings written directly in the source code, so it's us programmers who have to keep escaping in mind.

Today we basically finished the on-line course on the basic constructs of the Java language. In the next lesson, Multidimensional arrays in Java, we'll look at a bonus episode about multidimensional arrays and we'll briefly talk about the Math class. Nothing among the basic language constructs will surprise you anymore :) In fact, you could potentially start working with objects now, but I'd suggest reading the next few lessons first. You all still have a long way to go, but your future looks bright!


 


 

 

Article has been written for you by David Capka
The author is a programmer, who likes web technologies and being the lead/chief article writer at ICT.social. He shares his knowledge with the community and is always looking to improve. He believes that anyone can do what they set their mind to.

 

 

Comments

Sajjad Sajjad Khan (6. September 23:46):
i did not understand how dots convert into characters ??

David Capka, ICT.social team (9. September 10:32), replying to Sajjad Sajjad Khan:
The whole message is split using spaces which results in an array of Morse code characters. These are then converted to characters based on the Morse character definitions, see the attached source code.

Sajjad Sajjad Khan (10. September 5:27), replying to David Capka:
thank you sir