On Thu, Feb 20, 2014 at 6:54 AM, Pierre Joye <[email protected]> wrote:
> * ICU:
> U_CHARSET_IS_UTF8 forces ICU to use UTF-8 as its default charset. It
> is an ICU compile-time setting, so it cannot be set at PHP configure
> time. That means users would have to create their own ICU build.
> Alternatively we could bundle ICU, but that would be awkward and a
> maintenance nightmare for both PHP and the distros.
>
> Alternatively, UText can be used to create UTF-8 strings. The APIs
> accepting UText cover almost everything we need. The downside is that
> a UTF-8 UText is read-only: any operation altering its content
> requires duplication, clones or conversions. That may kill all the
> gains we get from using UTF-8 only.
>
> U_CHARSET_IS_UTF8 is very appealing, but having to bundle ICU is a
> showstopper. Asking users to custom-build ICU is not an option
> either. I do not know whether the distros would be ready to provide
> two different builds of ICU; that could cause a lot of issues for
> all projects using ICU.

Here is a first reply from ICU:
http://sourceforge.net/p/icu/mailman/message/32031609/
It sounds like this flag could be a good option for PHP's Unicode support.
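
To make the compile-time nature of the flag concrete, here is a minimal
sketch in C of how a build could check which value the installed ICU
headers carry. It assumes the macro is visible after including
<unicode/utypes.h>. The helper name php_icu_charset_is_utf8 is
hypothetical (neither a PHP nor an ICU API), and treating a missing
definition as 0 is my own assumption so that the sketch also compiles
without ICU installed:

```c
/* Pull in the ICU headers only when they are installed, so that this
 * sketch still compiles on a machine without ICU. */
#if defined(__has_include)
#  if __has_include(<unicode/utypes.h>)
#    include <unicode/utypes.h>
#  endif
#endif

/* Assumption for this sketch: a missing definition is treated as 0. */
#ifndef U_CHARSET_IS_UTF8
#  define U_CHARSET_IS_UTF8 0
#endif

/* Hypothetical helper: returns 1 if the ICU headers seen at compile
 * time have U_CHARSET_IS_UTF8 enabled, 0 otherwise. */
int php_icu_charset_is_utf8(void)
{
    return U_CHARSET_IS_UTF8 ? 1 : 0;
}
```

Since the value is fixed when ICU itself is built, a check like this can
only report what the installed ICU was compiled with; it cannot change it.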
Btw, I created a sub page for Unicode support:
https://wiki.php.net/ideas/php6/unicode
> Thoughts, comments or ideas?

I found another C++ library for basic UTF-8 operations, easl:
https://code.google.com/p/easl/
It could be a nice one to use in combination with ICU: it is small and
fast, at least in my first tests.
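
To illustrate what I mean by "basic UTF-8 operations", here is a minimal
sketch in plain C that counts the code points in a UTF-8 buffer. This is
not easl's API (the function name utf8_count_codepoints is hypothetical),
and it assumes the input is already valid UTF-8:

```c
#include <stddef.h>

/* Count the code points in a UTF-8 buffer of len bytes.
 * Every code point has exactly one lead byte; continuation bytes have
 * the bit pattern 10xxxxxx, so we count every byte that is not one.
 * Assumes valid UTF-8; no validation is done here. */
size_t utf8_count_codepoints(const char *s, size_t len)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

Operations of this kind work directly on the UTF-8 bytes, without any
conversion step.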
Cheers,
--
Pierre
@pierrejoye | http://www.libgd.org