That's not a hole in the design. It was quite deliberate and
it had
little to do with Unicode at the time. It was a deliberate
effort to not
artificially limit identifiers beyond that which the language
syntax
naturally prevented. Think <space> ; , { } ( ) etc.
IMHO it was the right decision to no artificially limit
identifiers and it is a fair trade-off for case-insensitivity
without unicode (class ß{} class SS{}).
With unicode identifiers there is at least one more problem
through normalization to consider. somewhat simplified: $☀☁ and
$⛅ (=== in unicode)
Good point, but users should use NFC UTF-8 without BOM for variable/function names.
It would be documentation issue.
in the languages i know combining diacritics are not common so can't evaluate how practical it is to type those. Would it be impossible to change code with a dumb editor?
$café !== $café
0x63 0x61 0x66 0xC3 0xA9
0x63 0x61 0x66 0x65 0xCC 0x81
cryptocompress