Re: Request for opinions: bug vs feature - change in tokenization of yield from

From: Date: Sat, 20 Jul 2024 15:42:56 +0000
Subject: Re: Request for opinions: bug vs feature - change in tokenization of yield from
Groups: php.internals
On 20-7-2024 16:51, Tim Düsterhus wrote:
> Hi
>
> On 7/19/24 07:22, Juliette Reinders Folmer wrote:
>> More than anything, I find it concerning that this change sets a precedent for tokens to include comments. Just as an example: what does this mean for the PHP 8.0 nullsafe object operator? Should we now suddenly allow that to be written as `? /*comment*/ ->`? Or what about a cast token? Should that be allowed to be `(string /*for reasons*/)`?
>
> The difference between yield from and ?-> is that the former looks and feels like it would be two separate keywords, because of the *required* whitespace between the yield and the from. The fact that a yield keyword actually exists also contributes to that. ?-> on the other hand looks and feels like a single operator, just like ++, !==, <=> and others.
>
> Except for yield from, the rule for where comments may be placed is, as far as I can tell, "comments may appear where whitespace may appear", which is easy enough to explain and understand. So it makes sense to allow for comments between yield and from, but I agree that ideally those would be emitted as separate tokens.

Tim, "comments may appear where whitespace may appear"? You'd think so, except it isn't true.

I already mentioned cast tokens before. Whitespace is perfectly acceptable within the parentheses of these. Comments are not: https://3v4l.org/A6Sgj and https://3v4l.org/nE9H8

Now you may argue that cast tokens "feel like" a single operator, but that's subjective, and there's even a sniff to enforce no spacing within cast parentheses, as apparently people do pad them with spaces - and doing so is allowed in PHP.*

_* Qualifier: spaces and tabs are allowed inside cast parentheses, but new lines are not..._

Along the same lines, and I'm beginning to repeat myself, the PHP 8.0 RFC which changed the tokenization of namespaced names explicitly disallowed whitespace and comments _inside_ namespaced names tokenized as a single token, while in the previous, multi-token situation, whitespace and comments were allowed in namespaced names.

So to get back to my original point: as of PHP 8.3, yield from is the **only** token which allows for a comment to be tokenized as part of the token. There is no other token which allows that.

Whitespace is one thing, comments are a different matter. And even the whitespace is an interesting one, as I've seen bug reports in PHPCS about a sniff breaking on yield from with a new line and indentation between the keywords. PHP allows this; the sniff in question does not handle it correctly.

So, what "feels" natural (whitespace-wise) to one person may not be the same for the next, but comments _within_ tokens are a different thing and should, in my opinion, not be allowed.

Smile,
Juliette
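
For anyone who wants to check the tokenizer behaviour described above locally, the following is a minimal sketch that dumps the tokens for the cast and yield from cases. It assumes the tokenizer extension is available; `describeTokens()` is a hypothetical helper written for this illustration, and `$a` and `$gen` are placeholder variables. No expected output is given for the commented yield from case on purpose, because how it is tokenized depends on the PHP version, which is the very point under discussion.

```php
<?php
// Hypothetical helper (not part of PHP or PHPCS): returns one
// "TOKEN_NAME text" string per token found in a code snippet.
function describeTokens(string $code): array
{
    $out = [];
    foreach (token_get_all('<?php ' . $code) as $token) {
        if (is_array($token)) {
            if ($token[0] === T_OPEN_TAG) {
                continue; // skip the open tag we prepended ourselves
            }
            // json_encode() makes spaces, tabs and new lines visible.
            $out[] = token_name($token[0]) . ' '
                . json_encode($token[1], JSON_UNESCAPED_SLASHES);
        } else {
            // Single-character tokens such as "(" or ";" are plain strings.
            $out[] = 'char ' . json_encode($token);
        }
    }
    return $out;
}

$snippets = [
    'cast, padded with spaces'   => '(  string  ) $a;',
    'cast, containing a comment' => '(string /*for reasons*/) $a;',
    'yield from, new line'       => "yield\n    from \$gen;",
    'yield from, with a comment' => 'yield /* comment */ from $gen;',
];

foreach ($snippets as $label => $code) {
    echo $label, ":\n";
    foreach (describeTokens($code) as $line) {
        echo '  ', $line, "\n";
    }
}
```

The padded cast comes back as a single T_STRING_CAST token with the spaces included in its text, while the cast with a comment is not recognised as a cast at all and falls apart into a parenthesis, a T_STRING, a T_COMMENT and so on. Comparing the last two snippets on different PHP versions shows whether the comment ends up inside the yield from token.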
