HttpUtils method urlEncodeFormParams Charset #1444

keyhunter · 2017-07-31T09:39:59Z

When the charset is not UTF-8, this code (org.asynchttpclient.util.HttpUtils) produces the wrong result:

public static ByteBuffer urlEncodeFormParams(List<Param> params, Charset charset) { return StringUtils.charSequence2ByteBuffer(urlEncodeFormParams0(params), charset); }

This code always percent-encodes with UTF-8, and that cannot be changed, so the result is wrong for other encodings (GBK, etc.).


slandelle · 2017-07-31T11:17:51Z

@keyhunter Could you provide a failing test case please?

keyhunter · 2017-07-31T12:13:55Z

Of course, @slandelle .
`
String beEncodeValue = "中文";
List<Param> params = new ArrayList<>();
params.add(new Param("language", beEncodeValue));
ByteBuffer result1 = HttpUtils.urlEncodeFormParams(params, Charset.forName("GBK"));
ByteBuffer result2 = HttpUtils.urlEncodeFormParams(params, Charset.forName("UTF-8"));
`

Both calls invoke the method urlEncodeFormParams0, which uses Utf8UrlEncoder, so they produce the same result.

But the expected results are:
URLEncoder.encode(beEncodeValue, "GBK"); // should be "%D6%D0%CE%C4"
URLEncoder.encode(beEncodeValue, "UTF-8"); // should be "%E4%B8%AD%E6%96%87"
They are different.
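For anyone who wants to reproduce the expected values without AsyncHttpClient on the classpath, here is a minimal sketch using only the JDK. The class name `CharsetEncodingDemo` and the helper `encode` are made up for illustration, and it assumes the running JDK ships the GBK charset (it lives in the extended charsets).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class, not part of AsyncHttpClient.
final class CharsetEncodingDemo {

    // Percent-encodes a value with the named charset using the JDK's URLEncoder.
    static String encode(String value, String charsetName) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // "\u4e2d\u6587" is "中文", written with escapes so the source file encoding does not matter.
        String value = "\u4e2d\u6587";
        System.out.println(encode(value, "GBK"));   // %D6%D0%CE%C4
        System.out.println(encode(value, "UTF-8")); // %E4%B8%AD%E6%96%87
    }
}
```

The two outputs differ because each character is first turned into bytes by the chosen charset (2 bytes per character in GBK, 3 in UTF-8) and only then escaped.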

slandelle · 2017-07-31T14:36:49Z

@keyhunter Should be fixed (even though there's room for perf improvement). Could you please check on your side?

Motivation: form URL-encoding doesn't properly honor the charset. It uses the charset to convert the already-escaped string into bytes, even though by then every character is supposed to be in the US-ASCII range. The charset should instead be used first, to encode the text into bytes, which should then be escaped.

Modifications: Use the current optimized code for UTF-8 and fall back to URLEncoder for other charsets.

Results: Proper encoding when the charset is different from UTF-8, e.g. GBK.
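The order of operations described in the commit message can be sketched roughly as follows. This is not the code from the actual commit: `FormEncodingSketch` is a hypothetical class, it takes a plain `Map` instead of `List<Param>`, and it goes through `URLEncoder` for every charset rather than keeping an optimized UTF-8 path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical illustration only; not the AsyncHttpClient implementation.
final class FormEncodingSketch {

    // Builds "k1=v1&k2=v2" where keys and values are escaped using the given charset.
    static ByteBuffer urlEncodeFormParams(Map<String, String> params, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey(), charset))
              .append('=')
              .append(encode(e.getValue(), charset));
        }
        // The charset was applied before escaping, so the result is pure US-ASCII here.
        return ByteBuffer.wrap(sb.toString().getBytes(StandardCharsets.US_ASCII));
    }

    // The charset decides the bytes; URLEncoder then escapes those bytes.
    private static String encode(String s, Charset charset) {
        try {
            return URLEncoder.encode(s, charset.name());
        } catch (UnsupportedEncodingException ex) {
            // Should not happen: the name comes from an existing Charset instance.
            throw new IllegalArgumentException(ex);
        }
    }
}
```

With a single parameter `language` set to "中文", this sketch yields `language=%D6%D0%CE%C4` for GBK and `language=%E4%B8%AD%E6%96%87` for UTF-8, matching the values from the test case earlier in the thread.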

keyhunter · 2017-07-31T14:58:02Z

OK, thanks.
👍

slandelle closed this as completed in 1e1de6d Jul 31, 2017

keyhunter mentioned this issue Jul 31, 2017

How can i create Request with other UrlEncoding(like GBK), Not UTF-8 #1443

Closed
