Re: Re: [php6] Unicode support, options?

From: Lester Caine
Date: Thu, 27 Feb 2014 10:51:50 +0000
Subject: Re: Re: [php6] Unicode support, options?
Groups: php.internals
Pierre Joye wrote:
That whatever is used will need to be both tailored for PHP and transparent as far as ICU is concerned is, as you have identified, a given. ICU is still built using 32-bit string lengths (I think?), which does add to the fun, but I don't see any reason not to use functions like compareUTF8() and ucasemap_utf8ToLower() from ICU, in which case the strings need to be standard ICU UTF-8 strings?

I can see the advantage of the 'fast' compare that I have been banging on about elsewhere, which looks for a simple match between two raw strings of bytes. UTF-8 only comes into that when you need to add 'rank'? But much of the core processing CAN simply ignore that, as long as the generic calls don't have dead tails which activate it?
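
To make the 'fast' compare point concrete, here is a rough sketch in Python rather than PHP's C core. The function names are made up for illustration and are not ICU or PHP APIs, and reading 'rank' as "ordering the way a reader expects" is my assumption. It shows that a raw byte match needs no knowledge of UTF-8 at all, that a raw byte ordering of well-formed UTF-8 gives code point order (not a locale's collation order), and that a raw match treats canonically equivalent strings as different.

```python
def fast_equal(a: bytes, b: bytes) -> bool:
    """Raw byte match: no decoding, no normalization, no locale rules."""
    return a == b


def byte_order(a: bytes, b: bytes) -> int:
    """Return -1, 0 or 1 from a plain byte-by-byte comparison.

    For well-formed UTF-8 this agrees with code point order. It is NOT
    a locale-aware collation, which is where a library like ICU comes in.
    """
    return (a > b) - (a < b)


zebra = "Zebra".encode("utf-8")
ecole = "\u00e9cole".encode("utf-8")  # "école", precomposed e-acute

# Matching needs nothing beyond the bytes themselves.
print(fast_equal(zebra, "Zebra".encode("utf-8")))  # True

# Byte order puts "Zebra" first, because 0x5A sorts before 0xC3.
# A dictionary-style collation would usually want "école" first.
print(byte_order(zebra, ecole))  # -1

# Same visible text, different bytes: precomposed vs. combining accent.
nfc = "\u00e9".encode("utf-8")
nfd = "e\u0301".encode("utf-8")
print(fast_equal(nfc, nfd))  # False
```

So the raw path is enough whenever the question is only "are these the same bytes?", and the slower, UTF-8 aware path is only needed once ordering or equivalence matters.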
> We may use our own functions (or another lib) to cover operations not implemented in ICU or too slow because of the conversions. That's why investigating other tools is still a good thing to do.
The bit I'm still missing here is 'operations not implemented in ICU'? As soon as conversions are required, speed is always going to be compromised, but where the platform is already UTF-8 based, which is a growing situation, all we are looking for is to handle UTF-8 strings quickly. For the best performance, conversions can simply be avoided. So I'm currently looking at conversion as a secondary problem - probably less important than case! - and just trying to identify what is missing from ICU's UTF-8 support that needs to be added? It may well be that Windows is a special case that needs its own conversion layer, but that should not form part of any core upgrade. It is not needed for many installations?

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
