Non-English characters not deserialized successfully #908

@tronurg

When some non-English characters are serialized and then deserialized, the output does not match the input: the affected characters are lost and replaced with generic symbols. This does not happen with every combination of msgpack and Java versions.
The following piece of code behaves differently across jackson-dataformat-msgpack and Java versions.

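import com.fasterxml.jackson.databind.ObjectMapper;
import org.msgpack.jackson.dataformat.MessagePackFactory;

// Round-trip a string of Turkish characters through MessagePack.
ObjectMapper objectMapper = new ObjectMapper(new MessagePackFactory());
String inputStr = "çÇğĞıİöÖşŞüÜ";
byte[] arr = objectMapper.writeValueAsBytes(inputStr);
String outputStr = objectMapper.readValue(arr, String.class);
System.out.println(outputStr); // expected to be identical to inputStr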

The case was tested with Turkish characters in Java 8 and Java 21, with jackson-dataformat-msgpack versions 0.9.8 to 0.9.10.
Here are the results:

  • In Java 8, jackson-dataformat-msgpack versions 0.9.8, 0.9.9 and 0.9.10 all give the correct result, i.e. outputStr=inputStr (çÇğĞıİöÖşŞüÜ)
  • In Java 21, jackson-dataformat-msgpack version 0.9.8 gives the correct result, i.e. outputStr=inputStr (çÇğĞıİöÖşŞüÜ)
  • In Java 21, jackson-dataformat-msgpack versions 0.9.9 and 0.9.10 give the wrong result, i.e. outputStr is something like çÇğÄ?ıİöÖşÅ?üÜ
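
Since System.out.println depends on the console encoding, the results above can be cross-checked without printing the string itself. The following is a minimal sketch using only the JDK; the class name RoundTripCheck and the helpers utf8Hex and codePoints are hypothetical names introduced for illustration, and a plain UTF-8 encode/decode stands in for the MessagePack round trip, which would need to be substituted to reproduce the issue.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripCheck {
    // Hex dump of the UTF-8 encoding of s (hypothetical helper).
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Code points of s rendered as U+XXXX (hypothetical helper).
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same characters as inputStr in the report, written as escapes so
        // the source file encoding cannot affect the test.
        String inputStr = "\u00e7\u00c7\u011f\u011e\u0131\u0130"
                        + "\u00f6\u00d6\u015f\u015e\u00fc\u00dc";

        // Assumption: stand-in for the MessagePack round trip. Replace with
        // objectMapper.writeValueAsBytes / readValue to test the library.
        byte[] arr = inputStr.getBytes(StandardCharsets.UTF_8);
        String outputStr = new String(arr, StandardCharsets.UTF_8);

        // ASCII-only diagnostics, safe on any console encoding.
        System.out.println("equal:  " + inputStr.equals(outputStr));
        System.out.println("input:  " + codePoints(inputStr));
        System.out.println("output: " + codePoints(outputStr));
        System.out.println("bytes:  " + utf8Hex(inputStr));
    }
}
```

If equals reports true while the printed string still looks wrong, the console is the likely culprit; if it reports false, the code point dump shows exactly which characters changed. For reference, the two characters that come back wrong above, Ğ and Ş, encode in UTF-8 to c4 9e and c5 9e respectively.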

Am I missing something here, such as a new setting, or is this a bug?
