



How to split a string in Java

String Split with multiple characters using Regex

public class StringSplitTest {
     public static void main(String args[]) {
        String s = " ;String; String; String; String, String; String;;String;String; String; String; ;String;String;String;String";
        //String[] strs = s.split("[,\\s\\;]");
        String[] strs = s.split("[,\\;]");
        System.out.println("Substrings length:"+strs.length);
        for (int i=0; i < strs.length; i++) {
            System.out.println("Str["+i+"]:"+strs[i]);
        }
     }
  }

Output:

Substrings length:17
Str[0]:
Str[1]:String
Str[2]: String
Str[3]: String
Str[4]: String
Str[5]: String
Str[6]: String
Str[7]:
Str[8]:String
Str[9]:String
Str[10]: String
Str[11]: String
Str[12]:
Str[13]:String
Str[14]:String
Str[15]:String
Str[16]:String

But do not expect the same output across all JDK versions. I have seen a bug in some JDK versions where the first empty string is ignored. The bug is not present in the latest JDK version, but it exists in some versions between late JDK 1.7 releases and early 1.8 releases.

I have a string, "004-034556", that I want to split into two strings:

string1="004";
string2="034556";

That means the first string will contain the characters before '-', and the second string will contain the characters after '-'. I also want to check if the string has '-' in it. If not, I will throw an exception. How can I do this?
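A minimal sketch of the most direct approach, assuming the first '-' is the separator and that IllegalArgumentException is an acceptable exception type (the question specifies neither). The class name DashSplitSketch and the method name splitOnDash are hypothetical.

```java
class DashSplitSketch {

    // Splits s at the first '-'; throws if s contains no '-'.
    static String[] splitOnDash(String s) {
        if (!s.contains("-")) {
            throw new IllegalArgumentException("String " + s + " does not contain -");
        }
        // A limit of 2 yields at most two parts and keeps a trailing empty part.
        return s.split("-", 2);
    }

    public static void main(String[] args) {
        String[] parts = splitOnDash("004-034556");
        System.out.println(parts[0]); // 004
        System.out.println(parts[1]); // 034556
    }
}
```

Whether a string such as "004-" (empty second part) should also be rejected is a separate decision that this sketch does not make.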


An alternative to processing the string directly would be to use a regular expression with capturing groups. This has the advantage that it makes it straightforward to impose more sophisticated constraints on the input. For example, the following splits the string into two parts, and ensures that both consist only of digits:

import java.util.regex.Pattern;
import java.util.regex.Matcher;

class SplitExample
{
    private static Pattern twopart = Pattern.compile("(\\d+)-(\\d+)");

    public static void checkString(String s)
    {
        Matcher m = twopart.matcher(s);
        if (m.matches()) {
            System.out.println(s + " matches; first part is " + m.group(1) +
                               ", second part is " + m.group(2) + ".");
        } else {
            System.out.println(s + " does not match.");
        }
    }

    public static void main(String[] args) {
        checkString("123-4567");
        checkString("foo-bar");
        checkString("123-");
        checkString("-4567");
        checkString("123-4567-890");
    }
}

As the pattern is fixed in this instance, it can be compiled in advance and stored as a static member (initialised at class load time in the example). The regular expression is:

(\d+)-(\d+)

The parentheses denote the capturing groups; the string that matched that part of the regexp can be accessed by the Matcher.group() method, as shown. The \d matches a single decimal digit, and the + means "match one or more of the previous expression". The - has no special meaning, so just matches that character in the input. Note that you need to double-escape the backslashes when writing this as a Java string. Some other examples:

([A-Z]+)-([A-Z]+)          // Each part consists of only capital letters 
([^-]+)-([^-]+)            // Each part consists of characters other than -
([A-Z]{2})-(\d+)           // The first part is exactly two capital letters,
                           // the second consists of digits
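As a rough illustration, the last pattern above could be used like this. The class name PrefixedIdSketch, the method name parse, and the choice to return null on a mismatch are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefixedIdSketch {

    // Exactly two capital letters, a dash, then one or more digits.
    private static final Pattern PREFIXED = Pattern.compile("([A-Z]{2})-(\\d+)");

    // Returns {letters, digits}, or null when the whole input does not match.
    static String[] parse(String s) {
        Matcher m = PREFIXED.matcher(s);
        return m.matches() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("AB-1234"))); // [AB, 1234]
        System.out.println(Arrays.toString(parse("abc-1234"))); // null
    }
}
```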


For simple use cases String.split() should do the job. If you use Guava, there is also a Splitter class which allows chaining of different string operations and supports CharMatcher:

Splitter.on('-')
       .trimResults()
       .omitEmptyStrings()
       .split(string);

Here are two ways to achieve it.

WAY 1: As you have to split two numbers by a special character you can use regex

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrialClass
{
    public static void main(String[] args)
    {
        Pattern p = Pattern.compile("[0-9]+");
        Matcher m = p.matcher("004-034556");

        while(m.find())
        {
            System.out.println(m.group());
        }
    }
}

WAY 2: Using the string split method

public class TrialClass
{
    public static void main(String[] args)
    {
        String temp = "004-034556";
        String [] arrString = temp.split("-");
        for(String splitString:arrString)
        {
            System.out.println(splitString);
        }
    }
}

I just wanted to write an algorithm instead of using Java built-in functions:

public static List<String> split(String str, char c){
    List<String> list = new ArrayList<>();
    StringBuilder sb = new StringBuilder();

    for (int i = 0; i < str.length(); i++){
        if(str.charAt(i) != c){
            sb.append(str.charAt(i));
        }
        else{
            if(sb.length() > 0){
                list.add(sb.toString());
                sb = new StringBuilder();
            }
        }
    }

    if(sb.length() >0){
        list.add(sb.toString());
    }
    return list;
}

One way to do this is to split on the required character and then iterate over the resulting array with a for-each loop.

public class StringSplitTest {

    public static void main(String[] arg){
        String str = "004-034556";
        String split[] = str.split("-");
        System.out.println("The split parts of the String are");
        for(String s:split)
        System.out.println(s);
    }
}

Output:

The split parts of the String are
004
034556

Please don't use StringTokenizer class as it is a legacy class that is retained for compatibility reasons, and its use is discouraged in new code. And we can make use of the split method as suggested by others as well.

String[] sampleTokens = "004-034556".split("-");
System.out.println(Arrays.toString(sampleTokens));

And as expected it will print:

[004, 034556]

In this answer I also want to point out one change to the split method in Java 8. String#split() makes use of Pattern.split, which no longer produces an empty leading string when there is a zero-width match at the start of the input. Notice this change in the documentation for Java 8:

When there is a positive-width match at the beginning of the input sequence then an empty leading substring is included at the beginning of the resulting array. A zero-width match at the beginning however never produces such empty leading substring.

It means for the following example:

String[] sampleTokensAgain = "004".split("");
System.out.println(Arrays.toString(sampleTokensAgain));

we will get three strings: [0, 0, 4], and not four as was the case in Java 7 and before.
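The two cases from the quoted documentation can be compared side by side. This is a small sketch that assumes Java 8 or later; the class name LeadingEmptySketch and the helper show are hypothetical.

```java
import java.util.Arrays;

class LeadingEmptySketch {

    // Formats the result of input.split(regex) for printing.
    static String show(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // Zero-width match at the start: no empty leading string.
        System.out.println(show("004", ""));   // [0, 0, 4]
        // Positive-width match at the start: the empty leading string is kept.
        System.out.println(show("-004", "-")); // [, 004]
    }
}
```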


The requirements left room for interpretation. I recommend writing a method,

public final static String[] mySplit(final String s)

which encapsulates this function. Of course you can use String.split(..) as mentioned in the other answers for the implementation.

You should write some unit-tests for input strings and the desired results and behaviour.

Good test candidates should include:

 - "0022-3333"
 - "-"
 - "5555-"
 - "-333"
 - "3344-"
 - "--"
 - ""
 - "553535"
 - "333-333-33"
 - "222--222"
 - "222--"
 - "--4555"

By defining the expected result for each test, you specify the behaviour.

For example, should "-333" return [, 333], or is it an error? Can "333-333-33" be separated into [333, 333-33] or [333-333, 33], or is it an error? And so on.
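One possible reading of those requirements, sketched under a deliberately strict policy: exactly one '-' with a non-empty part on each side, and IllegalArgumentException for everything else. The policy, the class name MySplitSketch, and the helper rejects are assumptions; a different policy would simply change the expected test results.

```java
class MySplitSketch {

    // Assumed policy: exactly one '-', with a non-empty part on each side.
    public final static String[] mySplit(final String s) {
        final int first = s.indexOf('-');
        if (first <= 0 || first != s.lastIndexOf('-') || first == s.length() - 1) {
            throw new IllegalArgumentException("Expected <part>-<part>, got: " + s);
        }
        return new String[] { s.substring(0, first), s.substring(first + 1) };
    }

    // Returns true when mySplit rejects the input.
    static boolean rejects(String s) {
        try {
            mySplit(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] candidates = { "0022-3333", "-", "5555-", "-333", "--", "",
                                "553535", "333-333-33", "222--222" };
        for (String c : candidates) {
            System.out.println("\"" + c + "\" -> " + (rejects(c) ? "rejected" : "accepted"));
        }
    }
}
```

Under this policy only "0022-3333" from the list is accepted.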


To split a string, use String.split(regex):

String phone = "004-034556";
String[] output = phone.split("-");
System.out.println(output[0]);
System.out.println(output[1]);

Output:

004
034556

To summarize: there are at least five ways to split a string in Java:

  1. String.split():

    String[] parts ="10,20".split(",");
    
  2. Pattern.compile(regexp).splitAsStream(input):

    List<String> strings = Pattern.compile("\\|")
          .splitAsStream("010|020202")
          .collect(Collectors.toList());
    
  3. StringTokenizer (legacy class):

    StringTokenizer strings = new StringTokenizer("Welcome to EXPLAINJAVA.COM!", ".");
    while(strings.hasMoreTokens()){
        String substring = strings.nextToken();
        System.out.println(substring);
    }
    
  4. Google Guava Splitter:

    Iterable<String> result = Splitter.on(",").split("1,2,3,4");
    
  5. Apache Commons StringUtils:

    String[] strings = StringUtils.split("1,2,3,4", ",");
    

So you can choose the best option for you depending on what you need, e.g. return type (array, list, or iterable).


Use org.apache.commons.lang.StringUtils' split method which can split strings based on the character or string you want to split.

Method signature:

public static String[] split(String str, char separatorChar);

In your case, you want to split a string when there is a "-".

You can simply do as follows:

String str = "004-034556";

String split[] = StringUtils.split(str, '-');

Output:

004
034556

If - does not exist in your string, split returns an array containing only the given string, and you will not get any exception.


You can also use StringTokenizer to split a string into two or more parts, whatever the delimiter is:

StringTokenizer st = new StringTokenizer("004-034556", "-");
while(st.hasMoreTokens())
{
    System.out.println(st.nextToken());
}

You can split a string by a line break by using the following statement:

String textStr[] = yourString.split("\\r?\\n");

You can split a string by a hyphen/character by using the following statement:

String textStr[] = yourString.split("-");

You can use split():

public class Splitting
{

    public static void main(String args[])
    {
        String str = "004-034556";
        // Split on the hyphen: index 0 holds "004", index 1 holds "034556"
        String[] splitToArray = str.split("-");
        String string1 = splitToArray[0];
        String string2 = splitToArray[1];
    }
}

Else, you can use StringTokenizer:

import java.util.*;
public class Splitting
{
    public static void main(String[] args)
    {
        StringTokenizer Str = new StringTokenizer("004-034556");
        String string1 = Str.nextToken("-");
        String string2 = Str.nextToken("-");
    }
}

 String string = "004^034556-34";
 String[] parts = string.split(Pattern.quote("^"));

If the delimiter is a regex special character, you can use Pattern.quote. If it is simply a dash (-), you can shorten the code:

 String string = "004-34";
 String[] parts = string.split("-");

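If you split on another special character such as a caret (^) without quoting it, the regex engine treats it as a metacharacter, so the string may not be split where you expect, and accessing parts[1] can throw an ArrayIndexOutOfBoundsException. To avoid this, use Pattern.quote.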
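The difference is easy to see with a dot, which as a regex matches any character. A minimal sketch follows; the class name QuoteSketch and the helper splitLiteral are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class QuoteSketch {

    // Treats the delimiter as literal text rather than as a regex.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Unquoted "." matches every character, so only empty strings remain
        // and split() drops them all.
        System.out.println("a.b.c".split(".").length); // 0

        System.out.println(Arrays.toString(splitLiteral("a.b.c", ".")));         // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("004^034556-34", "^"))); // [004, 034556-34]
    }
}
```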


String s="004-034556";
for(int i=0;i<s.length();i++)
{
    if(s.charAt(i)=='-')
    {
        System.out.println(s.substring(0,i));
        System.out.println(s.substring(i+1));
    }
}

As mentioned by everyone, split() is the best option which may be used in your case. An alternative method can be using substring().
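The substring() idea can also be written without an explicit loop by locating the delimiter with indexOf(). This sketch splits only at the first occurrence and returns the whole string when the delimiter is absent; both behaviours, and the names SubstringSketch and splitOnce, are assumptions for the example.

```java
import java.util.Arrays;

class SubstringSketch {

    // Splits s at the first occurrence of delimiter, if there is one.
    static String[] splitOnce(String s, char delimiter) {
        int idx = s.indexOf(delimiter);
        if (idx < 0) {
            return new String[] { s };
        }
        return new String[] { s.substring(0, idx), s.substring(idx + 1) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnce("004-034556", '-'))); // [004, 034556]
        System.out.println(Arrays.toString(splitOnce("a-b-c", '-')));      // [a, b-c]
        System.out.println(Arrays.toString(splitOnce("004034556", '-')));  // [004034556]
    }
}
```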


String str = "004-034556";
String[] sTemp = str.split("-"); // '-' is the delimiter

String string1 = sTemp[0]; // "004"
String string2 = sTemp[1]; // "034556"

String[] result = yourString.split("-");
if (result.length != 2) 
     throw new IllegalArgumentException("String not in correct format");

This will split your string into 2 parts. The first element in the array will be the part containing the stuff before the -, and the 2nd element in the array will contain the part of your string after the -.

If the array length is not 2, then the string was not in the format: string-string.

Check out the split() method in the String class.

https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#split-java.lang.String-int-
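The two-argument overload linked above takes a limit, which matters when validating the array length: with the default behaviour trailing empty strings are dropped, so "004-" produces a one-element array. A short sketch of the three kinds of limit follows; the class name LimitSketch and the helper show are hypothetical.

```java
import java.util.Arrays;

class LimitSketch {

    // Formats the result of input.split("-", limit) for printing.
    static String show(String input, int limit) {
        return Arrays.toString(input.split("-", limit));
    }

    public static void main(String[] args) {
        // limit 0 behaves like split("-"): trailing empty strings are removed.
        System.out.println(show("004-", 0));          // [004]
        // A negative limit keeps trailing empty strings.
        System.out.println(show("004-", -1));         // [004, ]
        // A positive limit caps the number of parts.
        System.out.println(show("004-034556-12", 2)); // [004, 034556-12]
    }
}
```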


public class BreakString {

  public static void main(String args[]) {

    String string = "004-034556-1234-2341";
    String[] parts = string.split("-");

    for(int i=0;i<parts.length;i++) {
      System.out.println(parts[i]);
    }
  }
}



