Re: Revisiting case-sensitivity in PHP

From: Valentin Udaltsov
Date: Thu, 13 Jun 2024 22:24:48 +0000
Subject: Re: Revisiting case-sensitivity in PHP
Groups: php.internals
On Friday, 14 June 2024 at 00:04, Timo Tijhof <[email protected]> wrote:

> Would this affect unserialize()?
>
> I ask because MediaWiki's main "text" database table is an
> immutable/append-only store where we store the text of each page revision
> since ~2004. It is stored as serialised blobs of a value class. There have
> been a number of different implementations over the past twenty years of
> Wikipedia's existence (plain text, gzip-compressed, diff-compressed, etc.).
>
> When we adopted modern autoloading in MediaWiki, we quickly found that
> blobs originally serialized by PHP 4 actually encoded the class in
> lowercase, regardless of the casing in source code.
>
> From https://3v4l.org/jl0et:
>
>> class ConcatenatedGzipHistoryBlob {…}
>> print serialize($blob);
>> # PHP 4.x: O:27:"concatenatedgziphistoryblob":…
>> # PHP 5/7/8: O:27:"ConcatenatedGzipHistoryBlob":…
>
>
> It is of course the application's responsibility to load these classes,
> but, it is arguably PHP's responsibility to be able to construct what it
> serialized. I suppose anything is possible when announced as a breaking
> change for PHP 9.0. I wanted to share this as something to take into
> consideration as part of the impact. Potentially worthy of additional
> communicating, or perhaps worth supporting separately.
>
> --
> Timo Tijhof,
> Principal Engineer,
> Wikimedia Foundation.
> https://timotijhof.net/
>
>
Hi, Timo!

Thank you very much for bringing up this important case.

Here's how I see this. If PHP makes class names case-sensitive, unserialization
of classes stored under lowercase names will fail. This is because the engine
will start putting the MyClass class entry into the loaded classes table under
the key MyClass (not myclass), so unserialize() will not be able to find it as
myclass.
Even if some deprecation layer is introduced (one that puts both the myclass and
MyClass keys into the table), you will first get a ton of notices and then
eventually end up with the same problem once the transition to case
sensitivity is complete. Hence I propose no deprecation layer: it does not
really help.
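
To make the lookup problem concrete, here is a rough sketch. The class table is modelled as a plain PHP array, and registerClass() and findClass() are hypothetical helpers invented for illustration; the engine's real data structures differ. The last lines show the real behaviour of current PHP versions, where a lowercase name in the payload still resolves.

```php
<?php
// Toy model of a class table as a plain array; the engine's real structure differs.
function registerClass(array &$table, string $declaredName, bool $caseSensitive): void
{
    // Today the key is lowercased; under the proposal it would be kept as declared.
    $key = $caseSensitive ? $declaredName : strtolower($declaredName);
    $table[$key] = $declaredName;
}

function findClass(array $table, string $requestedName, bool $caseSensitive): ?string
{
    // The lookup normalizes the requested name the same way registration does.
    $key = $caseSensitive ? $requestedName : strtolower($requestedName);
    return $table[$key] ?? null;
}

// Case-insensitive table: the lowercase name from an old blob is found.
$insensitive = [];
registerClass($insensitive, 'MyClass', false);
$foundToday = findClass($insensitive, 'myclass', false);

// Case-sensitive table: the same lookup misses.
$sensitive = [];
registerClass($sensitive, 'MyClass', true);
$foundProposed = findClass($sensitive, 'myclass', true);

// Real behaviour in current PHP versions: the lowercase payload name resolves.
class MyClass {}
$object = unserialize('O:7:"myclass":0:{}');

var_dump($foundToday, $foundProposed, get_class($object));
```

In the toy model the second lookup returns null, which is the failure old PHP 4 blobs would run into.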

However, you will be able to use class_alias() to solve your issue. Once
classes are case-sensitive, class_alias(MyClass::class, 'myclass');
should work, since MyClass and myclass will no longer be the same name. And
serialization works perfectly with class aliases, see https://3v4l.org/1n1as .
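
A minimal sketch of the alias approach that runs on current PHP versions (7.4 or later). Since a lowercase alias of the same name cannot be declared while class names are case-insensitive, it uses a differently named alias instead. HistoryBlob and legacy_history_blob are made-up names for illustration, not MediaWiki's actual classes.

```php
<?php
// Hypothetical value class standing in for a real blob class.
final class HistoryBlob
{
    public string $text = '';
}

// Invented legacy name under which old payloads were stored.
$alias = 'legacy_history_blob';

// Build a serialized payload that refers to the legacy name.
$payload = sprintf(
    'O:%d:"%s":1:{s:4:"text";s:5:"hello";}',
    strlen($alias),
    $alias
);

// Without the alias (and without an autoloader) the class cannot be resolved,
// so unserialize() falls back to __PHP_Incomplete_Class.
$before = unserialize($payload);

// Register the legacy name as an alias of the real class.
class_alias(HistoryBlob::class, $alias);

// The same payload now produces a real HistoryBlob instance.
$after = unserialize($payload);

var_dump(get_class($before), get_class($after), $after->text);
```

An application with many such classes could presumably register the aliases from its autoloader or bootstrap code, but that part is left out here.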

--
Valentin Udaltsov

