Re: [php6] Unicode support, options?

From: Ivan Enderlin
Date: Thu, 20 Feb 2014 16:10:25 +0000
Subject: Re: [php6] Unicode support, options?
References: 1
Groups: php.internals
On 20/02/2014 06:54, Pierre Joye wrote:
> hi,

Hello :-),

> Unicode still remains one of the top requested features in PHP. However,
> as Rasmus and others stated earlier, it is not a trivial job. Some of the
> key points we need to take care of are:
>
> - UTF-8 storage
> - UTF-8 support for almost all (if not all) existing string APIs
> - Performance
>
> As of today, I have not found any library covering at least two of these
> key points.
>
> [snip]
>
> I would like to begin discussing our options now. I am not asking to get
> into all the implementation details from a userland point of view (like
> u"some text", or whether or not to add new APIs), but only to see what we
> can do internally to work with UTF-8 strings.

Just a little note: using a u"foobar" syntax would help to switch internally between a light and a heavy implementation, and thus it would help to cover at least two of the key points described above.

I would mention the Rust implementation of UTF-8 strings [1, 2]. It's fast, it's safe, and it has a nice, large API. I'm not saying I want to see PHP use Rust. I think it would be hard to do (even if it would certainly benefit PHP), but the algorithms they use can be a source of inspiration for us. Maybe we should consider it if we decide to have our own implementation instead of using a third-party library.

Cheers.

[1] https://github.com/mozilla/rust/blob/master/src/libstd/str.rs
[2] http://static.rust-lang.org/doc/master/std/str/index.html

-- 
Ivan Enderlin
Developer of Hoa
http://hoa-project.net/

PhD. student at DISC/Femto-ST (Vesontio) and INRIA (Cassis)
http://disc.univ-fcomte.fr/ and http://www.inria.fr/

Member of HTML and WebApps Working Group of W3C
http://w3.org/
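
To make the "UTF-8 storage" and "string API" points a bit more concrete, here is a minimal sketch of how a UTF-8-backed string type behaves, written against Rust's current standard library (which may differ from the 2014 sources linked above). The three helper functions are hypothetical names chosen for illustration; they are not part of Rust's API and say nothing about how PHP would implement this.

```rust
// Illustration only: helper names are hypothetical, std calls are real.

/// Returns (bytes of UTF-8 storage, number of Unicode scalar values).
/// `len()` is O(1) because it counts bytes; `chars().count()` must decode.
fn byte_len_and_char_count(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Returns the prefix holding the first `n` characters of `s`.
/// Character positions are mapped to byte offsets first, so the slice
/// never cuts a multi-byte sequence in half.
fn char_prefix(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        Some((byte_offset, _)) => &s[..byte_offset],
        None => s, // fewer than n + 1 characters: the whole string
    }
}

/// Checks whether raw bytes form well-formed UTF-8.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}
```

For example, "héllo" occupies 6 bytes but holds 5 characters, and char_prefix("héllo", 2) yields "hé" (3 bytes). This gap between byte offsets and character positions is where the performance trade-off for existing string APIs would come from.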

