Hi!
Ignoring 5.4 for a second, if you do this in 5.3:
echo htmlspecialchars($string);
echo htmlspecialchars($string, NULL, "ISO-8859-1");
echo htmlspecialchars($string, NULL, "UTF-8");
You will see that the first two output the escaped string with the
GB2312 bytes intact within it, while the UTF-8 call returns an empty
string because it correctly recognizes that the GB2312 bytes are not
valid UTF-8. We don't have any such check for 8859-1, so yes, saying
UTF-8 and 8859-1 are the same for htmlspecialchars() is wrong for
PHP 5.3 as well as for 5.4.
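For anyone who wants to reproduce this, here is a minimal sketch. It assumes $string holds the GB2312 bytes for two Chinese characters wrapped in angle brackets (my own sample value, not from the original report), and it passes ENT_COMPAT explicitly instead of NULL so the flags are the same on every PHP version.

```php
<?php
// Assumed sample input: "<" + GB2312 bytes D6 D0 CE C4 + ">".
// These bytes are not a valid UTF-8 sequence (0xD0 is not a continuation byte).
$string = "<\xD6\xD0\xCE\xC4>";

// ISO-8859-1 is a single-byte charset, so every byte is accepted as-is;
// only the markup characters are escaped.
$asLatin1 = htmlspecialchars($string, ENT_COMPAT, "ISO-8859-1");

// With UTF-8 the input is validated, and the invalid sequence makes the
// call come back empty (no ENT_IGNORE / ENT_SUBSTITUTE is set here).
$asUtf8 = htmlspecialchars($string, ENT_COMPAT, "UTF-8");

echo bin2hex($asLatin1), "\n"; // hex dump so the raw bytes are visible
var_dump($asUtf8);
```

The hex dump is only there so the untouched GB2312 bytes can be seen regardless of the terminal's encoding.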
So the difference is that ISO-8859-1 does not validate but UTF-8 does?
I'm not sure what the GB2312 encoding does here, but isn't it dangerous to call htmlspecialchars() with the wrong encoding? Wouldn't htmlentities() also produce a wrong result when used with the wrong encoding?
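To make the htmlentities() concern concrete, here is a small sketch using an input of my own choosing: the UTF-8 encoding of "é" (bytes C3 A9). It only illustrates the mismatch case and says nothing about how GB2312 input behaves.

```php
<?php
// Assumed sample input: "é" encoded as UTF-8 (two bytes, C3 A9).
$utf8 = "\xC3\xA9";

// Declared as ISO-8859-1, each byte is treated as its own character:
// 0xC3 is "Ã" and 0xA9 is "©", so two unrelated entities come out.
$wrong = htmlentities($utf8, ENT_COMPAT, "ISO-8859-1");

// Declared with its real encoding, the two bytes map to one entity.
$right = htmlentities($utf8, ENT_COMPAT, "UTF-8");

echo $wrong, "\n";
echo $right, "\n";
```

So with a mismatched charset htmlentities() does not fail, it silently emits entities for the wrong characters.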
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227