Re: PHP6 wiki page
On 02/14/2014 11:39 AM, Rowan Collins wrote:
> Lester Caine wrote (on 14/02/2014):
>> But more fundamentally, I don't think there was agreement on whether we
>> simply standardise on Unicode in the core or allow a single-byte
>> mode. Eight years on, given the amount of UTF-8 material that is
>> floating around, I feel the easiest route IS Unicode only.
>
> The question is not whether to be "Unicode only", it's *how* to
> implement Unicode. It's not just a case of making all your strings
> wider; every function that manipulates a string in any way has to be
> thought through, and every input and output has to be converted to/from
> whatever encoding is chosen as the internal implementation.
>
> While updating the Wikipedia article [1] I came across this slide set
> [2], which has a fairly decent explanation of the issues and why the
> previous implementation was abandoned.
>
> If somebody comes up with an implementation proposal of Unicode strings,
> whether to have a mode that doesn't use it can be discussed, but right
> now there doesn't seem to be such a live proposal.
What we really need is an awesome small and fast Unicode library that
does everything ICU does but faster and in less code while using UTF-8
as its internal storage so we don't have to convert on each and every
operation. There are a ton of non-obvious things beyond simple string
manipulation. String collation alone is massively complicated, for example.
-Rasmus
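
A minimal sketch of why these operations are non-obvious, written in Python rather than PHP and using only the standard unicodedata module. The sample strings and the helper names is_valid_utf8 and crude_sort_key are illustrative assumptions. The accent-stripping sort key is only a rough stand-in for real collation, which ICU implements through the Unicode Collation Algorithm with locale-specific tailoring.

```python
import unicodedata

# 1. Code points and UTF-8 bytes are different units of length.
word = "na\u00efve"                  # "naive" with U+00EF (i with diaeresis)
utf8 = word.encode("utf-8")
code_point_count = len(word)         # counts code points
byte_count = len(utf8)               # counts UTF-8 bytes


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 2. A byte-oriented substring can cut a multi-byte sequence in half.
truncated = utf8[:3]                 # ends in the middle of U+00EF

# 3. The same visible text can have more than one encoding.
composed = "\u00e9"                  # precomposed e with acute accent
decomposed = "e\u0301"               # "e" followed by a combining acute accent
equal_raw = composed == decomposed
equal_nfc = unicodedata.normalize("NFC", decomposed) == composed


def crude_sort_key(text: str) -> str:
    """Hypothetical helper: strip combining marks and case-fold.

    This only approximates collation; it ignores locale rules entirely.
    """
    nfd = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in nfd if not unicodedata.combining(ch))
    return stripped.casefold()


# 4. Sorting by code point is not sorting the way a reader expects.
words = ["zebra", "\u00c4pfel", "apple"]
by_code_point = sorted(words)
by_crude_key = sorted(words, key=crude_sort_key)

if __name__ == "__main__":
    print(code_point_count, byte_count)
    print(is_valid_utf8(utf8), is_valid_utf8(truncated))
    print(equal_raw, equal_nfc)
    print(by_code_point)
    print(by_crude_key)
```

Even this small example touches four separate concerns (length, substring boundaries, normalization, and ordering), and each would need a deliberate decision in every string function of a Unicode-aware core.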