Re: [RFC] Multibyte char handling

From: Yasuo Ohgaki
Date: Mon, 20 Jan 2014 08:57:48 +0000
Subject: Re: [RFC] Multibyte char handling
Groups: php.internals

Hi Pierre,

On Mon, Jan 20, 2014 at 3:38 PM, Pierre Joye <[email protected]> wrote:

> > On UNIXes, UTF-8 is a popular terminal encoding, but there
> > would be systems using other encodings such as EUC, or even SJIS, BIG5.
>
> Right, and as far as I remember UTF-8 does not suffer from this problem.
>
UTF-8 does not have this issue if the terminal handles the encoding
correctly. I think almost all terminals handle UTF-8 properly; otherwise
it would be considered a security hole :)

> > Windows uses a different terminal encoding according to locale,
> > so it's much more complex.
> >
>
> Let me provide a function to detect it, but we need something to normalize
> the names. Do we have such a thing in mbstring?
>
Yes. mbstring has an ID for each supported encoding, and there is a
normalize function to set the encoding ID.

> > This is the reason why I would use locale. However, this implementation
> > is debatable.
> >
>
> Yes :)
>
We need to decide what to do :)

Regards,

--
Yasuo Ohgaki
[email protected]

