standard UTF8 encoding for MediaWiki databases #373

@Tigerfell

I would like to ask you to change the character encoding of the MySQL databases for the MediaWiki installations to utf8mb4. They currently appear to use utf8, which stores at most three bytes per character. This is not full UTF-8: characters that need four bytes cannot be stored. As a result, wiki pages get truncated when someone enters an unsupported character [1]. There are currently two use cases that require "standard" UTF-8 encoding:

  • A user wants to write their OSM user name in the wiki. [1]
  • A MediaWiki gadget is used and the internal UTF8 test fails, which renders it useless. [2]

Additionally, the current encoding is deprecated according to the MySQL 8.0 documentation [3] and is expected to be removed in a future release. The wiki currently uses MySQL 5.7.29.
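
To illustrate the three-byte limit described above, here is a minimal Python sketch that lists the characters in a string that lie outside the Basic Multilingual Plane, i.e. those that need four bytes in UTF-8 and therefore do not fit into the three-byte utf8 character set. The function names and sample strings are hypothetical and only for illustration; this is not how MediaWiki or MySQL validate input.

```python
def supplementary_chars(text: str) -> list:
    """Return the characters in `text` with a code point above U+FFFF.

    These are the characters that take four bytes in UTF-8, so a
    three-byte-per-character column cannot hold them.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


def fits_three_byte_utf8(text: str) -> bool:
    """True if every character in `text` needs at most three UTF-8 bytes."""
    return not supplementary_chars(text)


if __name__ == "__main__":
    # Hypothetical sample strings, not taken from the wiki.
    for sample in ["Tigerfell", "Zoë €", "mapper \U0001F5FA"]:
        print(repr(sample), fits_three_byte_utf8(sample))
```

A check like this could be used to see in advance whether a user name or page text contains characters that the current encoding would cut off.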

[1] https://wiki.openstreetmap.org/w/index.php?title=Bot&diff=prev&oldid=1784135
[2] https://wiki.openstreetmap.org/wiki/MediaWiki:Gadget-HotCat.js
[3] https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html

Labels: service:wiki (The project wiki on wiki.openstreetmap.org)
