How do I replace a character in a string in Java?


Answers

The simple answer is:

token = token.replace("&", "&");

Despite the name as compared to replaceAll, replace does do a replaceAll, it just doesn't use a regular expression, which seems to be in order here (both from a performance and a good practice perspective - don't use regular expressions by accident as they have special character requirements which you won't be paying attention to).

Sean Bright's answer is probably as good as is worth thinking about from a performance perspective absent some further target requirement on performance and performance testing, if you already know this code is a hot spot for performance, if that is where your question is coming from. It certainly doesn't deserve the downvotes. Just use StringBuilder instead of StringBuffer unless you need the synchronization.

That being said, there is a somewhat deeper potential problem here. Escaping characters is a known problem which lots of libraries out there address. You may want to consider wrapping the data in a CDATA section in the XML, or you may prefer to use an XML library (including the one that comes with the JDK now) to actually generate the XML properly (so that it will handle the encoding).

Apache also has an escaping library as part of Commons Lang.

Question

Using Java, I want to go through the lines of a text and replace all ampersand symbols (&) with the XML entity reference &.

I scan the lines of the text and then each word in the text with the Scanner class. Then I use the CharacterIterator to iterate over each characters of the word. However, how can I replace the character? First, Strings are immutable objects. Second, I want to replace a character (&) with several characters(amp&;). How should I approach this?

CharacterIterator it = new StringCharacterIterator(token);
for(char ch = it.first(); ch != CharacterIterator.DONE; ch = it.next()) {
       if(ch == '&') {

       }
}



//I think this will work, you don't have to replace on the even, it's just an example. 

 public void emphasize(String phrase, char ch)
    {
        char phraseArray[] = phrase.toCharArray(); 
        for(int i=0; i< phrase.length(); i++)
        {
            if(i%2==0)// even number
            {
                String value = Character.toString(phraseArray[i]); 
                value = value.replace(value,"*"); 
                phraseArray[i] = value.charAt(0);
            }
        }
    }



You may also want to check to make sure your not replacing an occurrence that has already been replaced. You can use a regular expression with negative lookahead to do this.

For example:

String str = "sdasdasa&amp;adas&dasdasa";
str = str.replaceAll("&(?!amp;)", "&amp;");

This would result in the string "sdasdasa&adas&dasdasa".

The regex pattern "&(?!amp;)" basically says: Match any occurrence of '&' that is not followed by 'amp;'.




If you're using Spring you can simply call HtmlUtils.htmlEscape(String input) which will handle the '&' to '&' translation.




Escaping strings can be tricky - especially if you want to take unicode into account. I suppose XML is one of the simpler formats/languages to escape but still. I would recommend taking a look at the StringEscapeUtils class in Apache Commons Lang, and its handy escapeXml method.




Links