Skip to content

[RFC] Allow arbitrary expressions in static variable initializer #9301

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

iluuu1994
Copy link
Member

@iluuu1994 iluuu1994 commented Aug 11, 2022

https://wiki.php.net/rfc/arbitrary_static_variable_initializers

Fixes GH-8959.

Instead of using constant expressions the initializer of static variables in this PR is compiled down to OPCodes that are conditionally JMPed over via ZEND_JMP_STATIC_DEF if a static variable has been initialized. If it has not, the initializer is executed and the result is then stored in zend_op_array.static_variables through ZEND_BIND_STATIC which now accepts OP2.

Notable implications:

  • The default values of static variables are only observable through reflection once the function has been called.
  • Multiple declarations of the same static variables are now forbidden. Previously, the last declaration would win, whereas with this change the first would win. Furthermore, problems could arise if a static variable with no initializer was shadowed by one with an initializer. To avoid these issues multiple declarations are disallowed.

/cc @nicolas-grekas

@nicolas-grekas
Copy link
Contributor

Love it thanks! Both implications make sense to me. I'm fine with them.

@nikic
Copy link
Member

nikic commented Nov 6, 2022

Might be better to have ZEND_JMP_STATIC_DEF assign the CV if already initialized, rather than jump to a BIND_STATIC that does that?

@iluuu1994
Copy link
Member Author

iluuu1994 commented Nov 6, 2022

Might be better to have ZEND_JMP_STATIC_DEF assign the CV if already initialized, rather than jump to a BIND_STATIC that does that?

@nikic Sounds reasonable. I did not consider that. This avoids both the ZEND_JMP_STATIC_DEF and the inner ZEND_JMP.

@iluuu1994
Copy link
Member Author

iluuu1994 commented Nov 6, 2022

ZEND_JMP_STATIC_DEF should probably also be called ZEND_JMP_STATIC_VAR_DEF (or INITIALIZED).

Copy link
Contributor

@TysonAndre TysonAndre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any bugs and this seems useful, but I had some minor comments

  • It might be possible to reduce the size of the compiled code for static variables (e.g. static $x = 42;) when the ast already compiles/evaluates to a regular zval, and avoid generating a branch? Or is there a test case that would fail.
  • some minor nits

@xparq
Copy link

xparq commented Dec 15, 2022

The RFC status is "Under discussion", for 8.3, but at the same time it says "This poll has been closed."... (Also with 0 results. And no dates.)

Is that just a confusing quirk of the wiki's poll module, and is it actually trying to say "This poll has not been started yet"? Thanks for the clarification!

@iluuu1994 iluuu1994 force-pushed the static-var-arbitrary-expr branch from 55207b1 to 0cf6c11 Compare March 5, 2023 20:49
@iluuu1994 iluuu1994 force-pushed the static-var-arbitrary-expr branch from 0cf6c11 to 6168fd7 Compare March 12, 2023 22:35
@dstogov
Copy link
Member

dstogov commented Mar 13, 2023

This is definitely not a bug but a new feature. I see you already wrote an RFC. (just update the PR comment).

Actually you propose a syntax sugar that may save one line of PHP code.

function old_foo() {
    static $var;
    if (!isset($var)) $var = new stdClass;
}

function new_foo() {
    static $var = new stdClass;
}
?>

The result may be clear in some cases or really not in case of recursive calls, exceptions, etc (like this https://github.com/php/php-src/pull/9301/files#diff-3a293b86b65681c12e67ccc0ad8230b59ae7edf26cbeacee56fb8a726db48dc6).

The proposed implementation should lead to performance degradation of existing scripts with static variables initialized by constants. (I didn't check this).

Despite, of the main idea, that I don't like because of complication, I like the idea of disabling "static redeclaration".

@iluuu1994
Copy link
Member Author

iluuu1994 commented Mar 13, 2023

The proposed implementation should lead to performance degradation of existing scripts with static variables initialized by constants. (I didn't check this).

Probably, as that would execute multiple opcodes instead of just one. I think that should be mostly negligible as the initializer only runs once per request. However, if you're concerned I can do some benchmarking.

Actually you propose a syntax sugar that may save one line of PHP code.

IMO this is mostly about language consistency and confusion. For most people who don't know the internals of PHP it's not obvious why some expressions work in that place and some don't.

@dstogov
Copy link
Member

dstogov commented Mar 13, 2023

Probably, as that would execute multiple opcodes instead of just one. I think that should be mostly negligible as the initializer only runs once per request. However, if you're concerned I can do some benchmarking.

Of course, it make sense to make benchmark on some real-life app and micro-benchmark, just not to pass a significant degradation.

Actually you propose a syntax sugar that may save one line of PHP code.

IMO this is mostly about language consistency and confusion. For most people who don't know the internals of PHP it's not obvious why some expressions work in that place and some don't.

I'm not completely sure. It's OK for languages to require a constant expression in initializers.
Are you going to make a next logical step? Allowing run-time evaluation of expressions in initializer for default values of arguments?

@nicolas-grekas
Copy link
Contributor

nicolas-grekas commented Mar 13, 2023

Would it be more clear if the syntax were static $foo ??= $expr;? It's quite natural syntax to me, being explicit about the "run one" semantics (if we accept that null means uninitialized).

(I'm wondering because then the code could just expand static $foo ??= $expr; into static $foo; $foo ??= $expr; and this might be trivial to implement)

@dstogov
Copy link
Member

dstogov commented Mar 13, 2023

@nicolas-grekas your idea is great for simple implementation, but not for the language consistency. It introduces the third case for expression evaluation semantic.

@iluuu1994
Copy link
Member Author

@dstogov I appreciate your comments. Note that the RFC isn't new, it was announced 4 months ago.

Of course, it make sense to make benchmark on some real-life app and micro-benchmark, just not to pass a significant degradation.

Ok, will do. I'll also think about how we could automate this process in CI and trigger on demand for PRs that look like they could affect performance.

Are you going to make a next logical step? Allowing run-time evaluation of expressions in initializer for default values of arguments?

I think most people would associate static variable initializers more closely with normal variable initializres rather than something like property or parameter default values, and so might be surprised by the restricted functionality. Generally I'd say the more places support all expressions the better. Some language constructs are missing from constant expressions not because they need to but just because nobody has bothered implementing them.

Is there something in particular you find bad about the implementation? Do you see any way it could be improved? I think it's important our goals align.

@kocsismate
Copy link
Member

Ok, will do. I'll also think about how we could automate this process in CI and trigger on demand for PRs that look like they could affect performance.

IMO, would be wonderful! I've suggested somewhat similar a while ago (https://externals.io/message/116323), but didn't get anywhere with it. I'd be happy to work on it should the Foundation sponsored the initiative. Please note that I wasn't able to make my framework work very reliable between consecutive runs.

@dstogov
Copy link
Member

dstogov commented Mar 13, 2023

Ok, will do.

I missed, that patch won't affect the most usual case static $a = 0;. Right?
It may start making slight degradation from static $a = CONST_NAME;.
So I don't expect any visible slowdown of real-life apps, and then don't care about this degradation.
The result on some micro-benchmark would be interesting anyway.

I'll also think about how we could automate this process in CI and trigger on demand for PRs that look like they could affect performance.

I had a number of apps (Wordpress, Symfony Demp, Symfony Hello, PHP-Parser) that I run under valgrind/callgrind to measure the difference between PHP builds, but I almost stopped running them.

Is there something in particular you find bad about the implementation?

The main part of implementation in zend_vm_def.h looks good.
I'm not sure about IS_STATIC_VAR_UNINITIALIZED that shares bit with IS_REFCOUNTED.
Shouldn't JMP_STATIC_DEF treat any refcounted as uninitialized.
I must be missing something...

I'm a bit afraid of a number of places where the new instruction has to be checked.

Do you see any way it could be improved?

No I don't. The implementation is not very complex and it seems shouldn't introduce problems because of side effects during expression evaluation.

@iluuu1994
Copy link
Member Author

iluuu1994 commented Mar 13, 2023

IMO, would be wonderful! I've suggested somewhat similar a while ago (https://externals.io/message/116323), but didn't get anywhere with it. I'd be happy to work on it should the Foundation sponsored the initiative. Please note that I wasn't able to make my framework work very reliable between consecutive runs.

I don't doubt the foundation would sponsor this, but you can ask Roman to be sure. I've been thinking about building something analogous to https://llvm-compile-time-tracker.com/ for a long time. There are some challenges, namely the runner needs to have consistent performance which GitHub actions likely does not. Using Valgrind (with some flags like CPU cache simulation) might be an option, although from a quick Google search I couldn't figure out if it accounts for different cycles per instruction. Maybe @dstogov could provide some guidance on how to gather the most accurate results.

I missed, that patch won't affect the most usual case static $a = 0;. Right?

Correct. If possible the constant will be stored in the static variable array directly. Otherwise (in the cases where today it would be a const AST) it will be NULL with the special IS_STATIC_VAR_UNINITIALIZED flag.

I'm not sure about IS_STATIC_VAR_UNINITIALIZED that shares bit with IS_REFCOUNTED.

IS_STATIC_VAR_UNINITIALIZED is stored in zval.u1.v.u.extra which is unused so far. The reason for that is to not break the optimized Z_TYPE_FLAGS macro:

php-src/Zend/zend_types.h

Lines 690 to 692 in a00e4a3

/* This optimized version assumes that we have a single "type_flag" */
/* IS_TYPE_COLLECTABLE may be used only with IS_TYPE_REFCOUNTED */
#define Z_REFCOUNTED(zval) (Z_TYPE_FLAGS(zval) != 0)

I also couldn't use zval.u2.extra because I'm depending on the fact that the assignment clears the flag. I can't clear it from ZEND_JMP_STATIC_DEF because if the initializer throws the initializer won't be run next time, which is not consistent with our other initializers.

@kocsismate
Copy link
Member

I've been thinking about building something analogous to https://llvm-compile-time-tracker.com/ for a long time. There are some challenges, namely the runner needs to have consistent performance which GitHub actions likely does not.

I'm not sure you saw it but I worked quite a lot to create a benchmark framework which can be automatically create benchmarks on AWS instances via Terraform (https://github.com/kocsismate/php-version-benchmarks). Unfortunately, as you also said, reliability is its major problem, even though I went to great lengths to fix the problem... I hope that I wasn't too optimistic that I can once make AWS services provide consistent results.

Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this, but I don't object.

@iluuu1994 iluuu1994 force-pushed the static-var-arbitrary-expr branch 3 times, most recently from 3227bab to 8e3812a Compare May 23, 2023 19:10
@iluuu1994 iluuu1994 force-pushed the static-var-arbitrary-expr branch from 8e3812a to 81f3065 Compare May 23, 2023 19:51
@iluuu1994 iluuu1994 closed this in 0b1d750 May 24, 2023
DanielEScherzer added a commit to DanielEScherzer/php-src that referenced this pull request Feb 21, 2025
While reviewing the existing tests in the `constexpr` directory, I found that
some of the names were not updated to reflect the contents when the contents
were changed in php#9301.

Follow-up to php#15638
Girgias pushed a commit that referenced this pull request Feb 21, 2025
…es (#17872)

While reviewing the existing tests in the `constexpr` directory, I found that
some of the names were not updated to reflect the contents when the contents
were changed in #9301.

Follow-up to #15638
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

static $var = xxx should support any expression
8 participants