Re: [Early Feedback] Pattern matching

Date: Fri, 21 Jun 2024 16:39:14 +0000
Subject: Re: [Early Feedback] Pattern matching
Groups: php.internals


> On Jun 21, 2024, at 11:42, Niels Dossche wrote:
> 
> On 21/06/2024 14:43, Robert Landers wrote:
>> On Fri, Jun 21, 2024 at 5:08 AM Andreas Hennings wrote:
>>> 
>>> E.g. should something like array<int> be added to the type system in
>>> the future, or do we leave the type system behind, and rely on the new
>>> "guards"?
>>> public array $values is array<int>
>>> OR
>>> public array<int> $values
>>> 
>>> The concern here would be if in the future we plan to extend the type
>>> system in a way that is inconsistent or incompatible with the pattern
>>> matching system.
>>> 
>>> --- Andreas
>> 
>> I'm always surprised why arrays can't keep track of their internal
>> types. Every time an item is added to the map, just chuck in the type
>> and a count, then if it is removed, decrement the counter, and if
>> zero, remove the type. Thus checking if an array is array<int>
>> should be a near O(1) operation. Memory usage might be an issue (a
>> couple bytes per type in the array), but not terrible.... but then
>> again, I've been digging into the type system quite a bit over the
>> last few months.
> 
> And every time a modification happens, directly or indirectly, you'll
> have to modify the counts too. Given how much arrays / hash tables are
> used within the PHP codebase, this will eventually add up to a lot of
> overhead. A lot of internal functions that work with arrays will need
> to be audited and updated too. Lots of potential for introducing bugs.
> It's (unfortunately) not a matter of "just" adding some counts.
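As a rough illustration of the bookkeeping discussed above, here is a minimal userland sketch. TypeCountingMap and its methods are hypothetical names invented for this example; PHP's real arrays do nothing like this internally. The type check itself is cheap, but every mutation path (insert, overwrite, removal) has to keep the counts in sync:

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper for illustration only; not how PHP arrays work.
final class TypeCountingMap
{
    /** @var array<int|string, mixed> the stored values */
    private array $items = [];

    /** @var array<string, int> type name => number of stored values of that type */
    private array $typeCounts = [];

    public function set(int|string $key, mixed $value): void
    {
        // Overwriting a key must first release the old value's type.
        if (array_key_exists($key, $this->items)) {
            $this->release($this->items[$key]);
        }
        $this->items[$key] = $value;
        $type = get_debug_type($value);
        $this->typeCounts[$type] = ($this->typeCounts[$type] ?? 0) + 1;
    }

    public function remove(int|string $key): void
    {
        if (!array_key_exists($key, $this->items)) {
            return;
        }
        $this->release($this->items[$key]);
        unset($this->items[$key]);
    }

    // True when every stored value has the given type.
    // An empty map trivially matches any type.
    public function containsOnly(string $type): bool
    {
        return $this->typeCounts === []
            || (count($this->typeCounts) === 1 && isset($this->typeCounts[$type]));
    }

    // Decrement the counter for the value's type; drop the entry at zero.
    private function release(mixed $value): void
    {
        $type = get_debug_type($value);
        if (--$this->typeCounts[$type] === 0) {
            unset($this->typeCounts[$type]);
        }
    }
}

$map = new TypeCountingMap();
$map->set('a', 1);
$map->set('b', 2);
var_dump($map->containsOnly('int')); // bool(true)
$map->set('c', 'three');
var_dump($map->containsOnly('int')); // bool(false)
$map->remove('c');
var_dump($map->containsOnly('int')); // bool(true)
```

Even in this toy version, forgetting the release step on overwrite would silently corrupt the counts, which is the kind of bug the quoted reply warns about.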


This is straying a bit from this RFC's discussion, but I'm wondering if a better approach
to generics for arrays would be to just not do generics for arrays.

Instead, have generics be a class-only thing, and add new built-in types (along the lines of the
classes/interfaces in the Data Structures extension) specifically to provide collection support.
This would accomplish several things:

* Separate object types (e.g. Array, Map, OrderedMap, Set, SparseArray, etc) rather than one
"array" type that does everything. Each could have underlying storage and accessors
optimized for one specific use-case, rather than having to be efficient with several different
use-cases.
* No BC breaks. array and all the existing array_* functions remain untouched and unchanged.
Somewhere years down the line, they can be discouraged in favor of the new interfaces.
* Being objects, these new data types would all have a fancy OOP interface, which could make
chaining operations easy.
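
To make the list above a little more concrete, here is a minimal userland sketch of what one such dedicated type could look like. IntSet and its methods (with, contains, filter, map, toArray) are hypothetical names for this example, not an existing PHP or Data Structures extension API. Since PHP has no generics today, the element type is hard-coded to int:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a built-in Set type; illustration only.
final class IntSet
{
    /** @var array<int, true> members stored as keys, so membership is a hash lookup */
    private array $members = [];

    public function __construct(int ...$values)
    {
        foreach ($values as $value) {
            $this->members[$value] = true;
        }
    }

    // Returns a new set including $value; the receiver is left unchanged.
    public function with(int $value): self
    {
        $copy = clone $this;
        $copy->members[$value] = true;
        return $copy;
    }

    public function contains(int $value): bool
    {
        return isset($this->members[$value]);
    }

    // Keeps only the members for which $predicate returns true.
    public function filter(callable $predicate): self
    {
        $kept = [];
        foreach (array_keys($this->members) as $value) {
            if ($predicate($value)) {
                $kept[] = $value;
            }
        }
        return new self(...$kept);
    }

    // Applies $fn to every member; the variadic int constructor
    // rejects any result that is not an int.
    public function map(callable $fn): self
    {
        $mapped = [];
        foreach (array_keys($this->members) as $value) {
            $mapped[] = $fn($value);
        }
        return new self(...$mapped);
    }

    public function count(): int
    {
        return count($this->members);
    }

    /** @return list<int> conversion back to a legacy array */
    public function toArray(): array
    {
        return array_keys($this->members);
    }
}

// Chained operations read left to right instead of nesting array_* calls.
$evenSquares = (new IntSet(1, 2, 3, 4))
    ->filter(fn (int $n): bool => $n % 2 === 0)
    ->map(fn (int $n): int => $n * $n);

var_dump($evenSquares->toArray()); // the list [4, 16]
```

A Set only needs membership tests and deduplication, so it can store its members as hash keys; an ordered list or sparse array would pick a different backing layout.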

The major interoperability concern in this model would be the cost of translating between the new
types and legacy array types at API boundaries for legacy code. This might limit their utility to
greenfield development. But since these would be entirely new, opt-in types, there are no direct
BC concerns, and maybe some of the typechecking perf hit when you validate inserts/updates could be
elided by the optimizer in the presence of typehints. (E.g. if you have an Array<int> and you
insert a value the compiler or optimizer can prove is an int, you don't need to do a runtime
type check.) There'd also probably have to be something done to maintain the COW semantics that
array has without requiring explicit clone operations.
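
A minimal userland sketch of the insert-time check and value-like semantics described above, under stated assumptions: TypedList, withAppended, and toArray are hypothetical names, and the element type is passed as a string because PHP has no generic syntax. The runtime comparison in withAppended is the check an optimizer could in principle skip when it can prove the argument's type. The internal clone stays cheap because PHP's arrays are already copy-on-write: the clone shares the backing array until the append separates it.

```php
<?php
declare(strict_types=1);

// Hypothetical emulation of Array<T>; illustration only.
final class TypedList
{
    /** @var list<mixed> values, all of type $this->type */
    private array $items = [];

    // $type is a name as reported by get_debug_type(), e.g. "int" or "string".
    // Simplification: objects match on exact class name only, not on
    // parent classes or interfaces.
    public function __construct(private string $type)
    {
    }

    // Returns a new list with $value appended; the receiver is unchanged,
    // so callers never have to clone explicitly.
    public function withAppended(mixed $value): self
    {
        $actual = get_debug_type($value);
        if ($actual !== $this->type) {
            throw new TypeError("Expected {$this->type}, got {$actual}");
        }
        $copy = clone $this;      // shallow copy; backing array is shared
        $copy->items[] = $value;  // the write separates the array for $copy only
        return $copy;
    }

    /** @return list<mixed> conversion back to a legacy array */
    public function toArray(): array
    {
        return $this->items;
    }
}

$a = (new TypedList('int'))->withAppended(1)->withAppended(2);
$b = $a->withAppended(3);

var_dump($a->toArray()); // the list [1, 2]; $a was not modified
var_dump($b->toArray()); // the list [1, 2, 3]

try {
    $a->withAppended('four');
} catch (TypeError $e) {
    echo $e->getMessage(), "\n"; // Expected int, got string
}
```

This still allocates a new object per modification, which a built-in type could presumably avoid; the toArray() call is where the boundary-conversion cost mentioned above would show up.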

-John

