Hi all,
On Fri, Mar 14, 2014 at 8:33 PM, Alexey Zakhlestin <[email protected]> wrote:
> > Nothing is wrong with it, PCRE has very good support for UTF-8 (including
> > character properties and extended grapheme clusters). Can we just deprecate
> > mb_ereg? It seems totally useless and just confuses people. If you want to
> > match regular expressions on non-UTF-8 just do a conversion beforehand (or
> > use a sane encoding right away, you know).
>
> Several years ago mb_ereg was slightly faster than pcre. It could have
> changed since then
Besides the fact that unneeded conversion is better avoided, we should also
consider the case where the encoding is broken somehow. A conversion then has
to either fail or replace the broken bytes, and replacing them changes the
original data.
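
To illustrate the point about broken input, here is a minimal sketch. It is
written in Python rather than PHP, so it shows the general behavior of
encoding conversion and not what mbstring or iconv do specifically. The
helper name `to_utf8` and the sample bytes are made up for this example.

```python
def to_utf8(raw: bytes, source_encoding: str, errors: str = "strict") -> bytes:
    """Hypothetical helper: decode raw bytes and re-encode them as UTF-8."""
    return raw.decode(source_encoding, errors=errors).encode("utf-8")


# Shift_JIS bytes for "あ" followed by a dangling lead byte (broken input).
broken = b"\x82\xa0\x82"

# Strict conversion refuses the whole input.
try:
    to_utf8(broken, "shift_jis")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient conversion succeeds, but substitutes U+FFFD for the broken byte,
# so the original byte 0x82 can no longer be recovered from the result.
lossy = to_utf8(broken, "shift_jis", errors="replace")
```

Either way the regex step never sees the data as it was originally stored:
the strict path yields nothing to match against, and the lenient path
matches against altered text.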
Regards,
--
Yasuo Ohgaki
[email protected]