Re: [php6] Unicode support, options?

From: Date: Thu, 27 Feb 2014 06:13:38 +0000
Subject: Re: [php6] Unicode support, options?
References: 1  Groups: php.internals 
On Thu, Feb 20, 2014 at 6:54 AM, Pierre Joye <[email protected]> wrote:

> * ICU:
> U_CHARSET_IS_UTF8 forces ICU to use UTF-8 by default. It is an
> ICU compile-time setting. It is not possible to set it at PHP
> configure time, which means that users would have to create their
> own ICU build. Alternatively we could bundle ICU, but this would be
> awkward and a maintenance nightmare for both PHP and the distros.
>
> Alternatively, UText can be used to create UTF-8 strings. APIs accepting
> UText allow almost everything we need. However, the drawback is that
> a UTF-8 UText is read-only. Any operation altering its content will
> require duplication, clones or conversions. That may kill all the gains
> we get from using UTF-8 only.
>
> U_CHARSET_IS_UTF8 is very appealing, but having to bundle ICU is
> actually a show stopper. Asking users to custom-build ICU is not an
> option either. I do not know if the distros will be ready to provide
> two different builds of ICU; it may cause a lot of issues for all
> projects using ICU.

Here is a first reply from ICU:

http://sourceforge.net/p/icu/mailman/message/32031609/

It sounds like this flag could be a good option for PHP's Unicode support.

Btw, I created a sub page for Unicode support:

https://wiki.php.net/ideas/php6/unicode

> Thoughts, comments or ideas?

I found another C++ library to do the basic UTF-8 operations, easl:

https://code.google.com/p/easl/

It could be a nice one to use in combination with ICU: it is small and
fast, at least in my first tests.
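
To give an idea of what "basic UTF-8 operations" can look like, here is a
rough sketch in plain C. It is not easl's API and not ICU's either; the
function name utf8_count_codepoints is made up for illustration, and the
code assumes the input is already valid UTF-8 (no validation is done). It
only counts code points by skipping continuation bytes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of ICU or easl.
 * Counts the code points in a UTF-8 buffer of len bytes.
 * Assumption: the buffer holds valid UTF-8; malformed input is not detected. */
size_t utf8_count_codepoints(const char *s, size_t len)
{
    size_t count = 0;
    size_t i;

    assert(s != NULL);

    for (i = 0; i < len; i++) {
        /* Continuation bytes have the form 10xxxxxx. Every other byte
         * starts a new code point (ASCII or a multi-byte lead byte). */
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

Something like utf8_count_codepoints("h\xC3\xA9llo", 6) would count five
code points for six bytes. Anything beyond this kind of operation
(collation, normalization, case mapping and so on) is where ICU would
still be needed.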


Cheers,
-- 
Pierre

@pierrejoye | http://www.libgd.org

