Description
.contentToString is used in NettyResponse.getResponseBody(charset) to convert a chunked HTTP response into a string.
It first converts each chunk into a string with the given character set and then concatenates those strings.
This fails when the boundary between two chunks falls in the middle of a multibyte UTF-8 encoded character. The character's bytes are then split across two chunks, each part is an invalid UTF-8 sequence on its own, and each gets decoded into the Unicode REPLACEMENT CHARACTER (U+FFFD).
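A minimal sketch that reproduces the effect outside the library, assuming the chunks are plain byte arrays. The class SplitChunkDemo and the method decodePerChunk are hypothetical names used only for illustration and are not part of the client's API; the method mimics the decode-then-concatenate behaviour described above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class SplitChunkDemo {
    // Hypothetical stand-in for the reported behaviour:
    // decode every chunk separately, then concatenate the resulting strings.
    static String decodePerChunk(List<byte[]> chunks) {
        StringBuilder sb = new StringBuilder();
        for (byte[] chunk : chunks) {
            sb.append(new String(chunk, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is encoded in UTF-8 as the two bytes 0xC3 0xA9.
        String original = "\u00e9";
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        // Simulate a chunk boundary that falls between the two bytes.
        List<byte[]> chunks = Arrays.asList(
                Arrays.copyOfRange(body, 0, 1),
                Arrays.copyOfRange(body, 1, 2));

        String decoded = decodePerChunk(chunks);
        System.out.println("matches original: " + decoded.equals(original));
        System.out.println("contains U+FFFD:  " + (decoded.indexOf('\uFFFD') >= 0));
    }
}
```

With the boundary placed between the two bytes, the decoded string no longer equals the original and contains the replacement character instead.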
A correct implementation would first concatenate all the chunks and then create a string:
    public final static String contentToString(Collection bodyParts, String charset) throws UnsupportedEncodingException {
        return new String(contentToByte(bodyParts), charset);
    }
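The same idea can be sketched independently of the library, again assuming the chunks are plain byte arrays. ConcatThenDecode and decodeWhole are hypothetical names, and the ByteArrayOutputStream loop only stands in for whatever contentToByte does internally. Charset.forName is used here instead of the String-based constructor so that the sketch has no checked exception.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.List;

class ConcatThenDecode {
    // Hypothetical helper: join the raw bytes of all chunks first,
    // then decode the complete byte sequence exactly once.
    static String decodeWhole(List<byte[]> chunks, String charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return new String(out.toByteArray(), Charset.forName(charset));
    }

    public static void main(String[] args) {
        // The two UTF-8 bytes of U+00E9, delivered in separate chunks.
        List<byte[]> chunks = Arrays.asList(
                new byte[] {(byte) 0xC3},
                new byte[] {(byte) 0xA9});

        String decoded = decodeWhole(chunks, "UTF-8");
        System.out.println("matches original: " + decoded.equals("\u00e9"));
    }
}
```

Because decoding happens only after all bytes are joined, the position of the chunk boundaries no longer affects the result.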
We found this when using version 1.6.4, where the .contentToString method is part of com.ning.http.client.providers.netty.NettyResponse. In 1.7.0 this method has been moved to com.ning.http.util.AsyncHttpProviderUtils without any change to its implementation, so we are pretty sure that the bug still exists in 1.7.0.