Compact Strings in Java 9

Java 9 brought back the concept of compact Strings. Whenever a String is created, if all of its characters can each be represented in a single byte (the LATIN-1 representation), a byte array is used internally, with one byte per character.

Before Java 9, a String was internally represented by a char array containing the characters of the string. Since Java internally uses UTF-16, every char occupies two bytes, even when the character could be represented in a single byte (LATIN-1 representation). For strings made up only of such characters, half of the storage is wasted, so there is potential to improve both performance and memory consumption.

Java 9 introduced the concept of Compact Strings:

Whenever we create a String, if all its characters can be represented using a single byte (LATIN-1), a byte array with one byte per character is used internally, which halves the space required for the character data.

This means that if even one character requires more than 8 bits for its representation, Java stores the whole string as UTF-16, using two bytes per character. The backing store is still a byte array in that case, not a char array.
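The rule above can be illustrated with a small sketch. It does not inspect the JDK's internal fields; it only applies the same test (does every char fit in 8 bits?) to estimate how many bytes the character data would need. The class name Latin1Check and its two helper methods are hypothetical names chosen for this example.

```java
// Hypothetical helper class, for illustration only; not part of the JDK.
class Latin1Check {

    // Returns true when every char of s is in the LATIN-1 range (0x00 to 0xFF),
    // i.e. when the string qualifies for the one-byte-per-character form.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false; // a single wide char forces UTF-16 for the whole string
            }
        }
        return true;
    }

    // Estimated size of the character data in bytes under the compact scheme
    // (object header and other fields are deliberately ignored).
    static int estimatedPayloadBytes(String s) {
        return fitsLatin1(s) ? s.length() : 2 * s.length();
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String accented = "h\u00e9llo";   // e with acute accent, still LATIN-1
        String withEuro = "hello\u20ac";  // the euro sign is outside LATIN-1

        System.out.println(fitsLatin1(ascii) + " " + estimatedPayloadBytes(ascii));
        System.out.println(fitsLatin1(accented) + " " + estimatedPayloadBytes(accented));
        System.out.println(fitsLatin1(withEuro) + " " + estimatedPayloadBytes(withEuro));
    }
}
```

Note how the last string pays two bytes for every character, including the five that would fit in one byte on their own.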

String implementation

Before Java 9 (that is, up to and including Java 8), a String was stored as a char array, as follows:

private final char[] value;

From Java 9 onward, a String is represented by a byte array together with a coder field that records which encoding the bytes use, as follows:

private final byte[] value;
/* encoding of the bytes in value: LATIN1 = 0 or UTF16 = 1 */
private final byte coder;

Most String operations must check the coder value and dispatch to the encoding-specific implementation. The change does not affect the public interface of String or of any related class. Many of the classes that work with Strings (such as StringBuffer and StringBuilder) were updated to support the new String representation.
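To make the dispatch on the coder value concrete, here is a minimal sketch of a string-like class that uses the same two fields. It is a simplified illustration, not the JDK source: the class name CompactText, its constructor, and the storedBytes() accessor are hypothetical, and the big-endian byte order used for the UTF-16 case is an assumption made for the example.

```java
// Simplified, hypothetical model of a compact string; not the real java.lang.String.
final class CompactText {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    private final byte[] value; // character data, 1 or 2 bytes per char
    private final byte coder;   // LATIN1 or UTF16

    CompactText(char[] chars) {
        // Decide the encoding once, at construction time.
        boolean latin1 = true;
        for (char c : chars) {
            if (c > 0xFF) {
                latin1 = false;
                break;
            }
        }
        if (latin1) {
            value = new byte[chars.length];
            for (int i = 0; i < chars.length; i++) {
                value[i] = (byte) chars[i];
            }
            coder = LATIN1;
        } else {
            // Two bytes per char; big-endian order is an assumption of this sketch.
            value = new byte[chars.length * 2];
            for (int i = 0; i < chars.length; i++) {
                value[2 * i] = (byte) (chars[i] >> 8);
                value[2 * i + 1] = (byte) chars[i];
            }
            coder = UTF16;
        }
    }

    // Shifting by the coder divides by 1 (LATIN1) or by 2 (UTF16).
    int length() {
        return value.length >> coder;
    }

    // Each operation checks the coder and dispatches to the matching decoding.
    char charAt(int index) {
        if (coder == LATIN1) {
            return (char) (value[index] & 0xFF);
        }
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    byte coder() {
        return coder;
    }

    int storedBytes() {
        return value.length;
    }
}
```

Callers of length() and charAt() never see which branch ran, which mirrors how the real change could stay invisible to the public String interface.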

