Re: default charset confusion

Date: Mon, 12 Mar 2012 07:52:41 +0000
Subject: Re: default charset confusion
References: 1 2 3
Groups: php.internals
Hi!

> Ignoring 5.4 for a second, if you in 5.3 do this:
>
>     echo htmlspecialchars($string);
>     echo htmlspecialchars($string, NULL, "ISO-8859-1");
>     echo htmlspecialchars($string, NULL, "UTF-8");
>
> You will see that the first two output the escaped string with the GB2312 bytes intact within it and the UTF-8 calls returns false because it correctly recognizes that GB2312 is not UTF-8. We don't have any such check for 8859-1, so yes, saying UTF-8 and 8859-1 are the same for htmlspecialchars() is wrong for PHP 5.3 as well as for 5.4.
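
For reference, a minimal sketch of the comparison described in the quoted message. The sample value of $string is an assumption (the two GB2312 bytes of a single CJK character wrapped in a tag), and the flags are passed explicitly as ENT_QUOTES instead of NULL so the result does not depend on version-specific default flags. Note that the PHP manual documents an empty string, not false, as the result for input that is invalid in the given encoding.

```php
<?php
// Hypothetical sample input: GB2312 bytes 0xD6 0xD0 inside a tag.
// This byte pair is not a valid UTF-8 sequence.
$string = "<b>\xD6\xD0</b>";

// ISO-8859-1 is not validated: the markup is escaped and the raw
// bytes pass through untouched.
$latin1 = htmlspecialchars($string, ENT_QUOTES, 'ISO-8859-1');

// UTF-8 is validated: without ENT_IGNORE or ENT_SUBSTITUTE the call
// gives up on the invalid sequence and returns an empty string.
$utf8 = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

var_dump($latin1 === "&lt;b&gt;\xD6\xD0&lt;/b&gt;");
var_dump($utf8 === '');
```

Both comparisons should hold, which matches the point made above: only the UTF-8 call notices that the input is in a different encoding.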
So the difference is that ISO-8859-1 does not validate but UTF-8 does? I'm not sure what the GB2312 encoding does, but isn't it dangerous to call htmlspecialchars() with the wrong encoding? Wouldn't htmlentities() also produce a wrong result when used with the wrong encoding?

-- 
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
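
A small sketch of the htmlentities() concern raised above, assuming the same hypothetical two-byte GB2312 input. Because ISO-8859-1 assigns a character to every high byte, each byte is converted to its Latin-1 entity on its own, so the output no longer represents the original character.

```php
<?php
// Hypothetical sample input: the GB2312 bytes of one CJK character.
$gb2312 = "\xD6\xD0";

// Declared as ISO-8859-1, each byte is treated as a separate Latin-1
// character (0xD6 and 0xD0) and replaced by its named entity.
$entities = htmlentities($gb2312, ENT_QUOTES, 'ISO-8859-1');

var_dump($entities); // two Latin-1 entities instead of one CJK character
```

Unlike htmlspecialchars(), which leaves the high bytes alone in this case, htmlentities() rewrites them, so a wrong encoding argument changes the data itself.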
