Re: Revert Tokenizer behavior for 5.4

From: Date: Fri, 09 Sep 2011 08:46:17 +0000
Subject: Re: Revert Tokenizer behavior for 5.4
References: 1 2  Groups: php.internals 
Thank you, Nikita, for bringing this subject here!

On Fri, Sep 9, 2011 at 10:01, Ferenc Kovacs <[email protected]> wrote:

> don't break there but for the next ';'.


You can also just count the number of semantic tokens after T_HALT_COMPILER
(i.e. excluding whitespace and comments) and halt once you hit 3.
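
A minimal sketch of that counting rule, written in Python for brevity. It
assumes tokens arrive as hypothetical (name, text) pairs, not the real
tokenizer structures, and that whitespace and comments are the only
non-semantic tokens:

```python
# Hypothetical token names; the real tokenizer uses integer constants.
IGNORABLE = {"T_WHITESPACE", "T_COMMENT", "T_DOC_COMMENT"}


def truncate_after_halt(tokens):
    """Keep tokens up to and including the 3rd semantic token
    that follows T_HALT_COMPILER; keep everything if it never appears."""
    result = []
    remaining = None  # stays None until T_HALT_COMPILER is seen
    for name, text in tokens:
        result.append((name, text))
        if remaining is None:
            if name == "T_HALT_COMPILER":
                remaining = 3
        elif name not in IGNORABLE:
            remaining -= 1
            if remaining == 0:
                break  # no validation: any 3 semantic tokens end the scan
    return result


tokens = [
    ("T_OPEN_TAG", "<?php "),
    ("T_HALT_COMPILER", "__halt_compiler"),
    ("T_WHITESPACE", " "),
    ("(", "("),
    ("T_COMMENT", "/* x */"),
    (")", ")"),
    (";", ";"),
    ("T_INLINE_HTML", "\x00\x01raw data"),
]
kept = truncate_after_halt(tokens)
```

The sketch deliberately does not check which three tokens follow; it only
counts them, so a closing tag in place of ';' ends the scan as well.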

> less confusing solution would be to explicitly add '(', ')' and ';' to
> the result in the T_HALT_COMPILER condition before breaking out of the
> loop.
>

If you mean verifying that '(', ')' and (';' or T_CLOSE_TAG) actually
follow T_HALT_COMPILER, I think that's part of the syntax analyser's job,
not the tokenizer's.
If you're OK with this argument, then just counting 3 tokens is really the
most basic "syntax analysis" we have to do to fix the problem, don't you think?


> could there be other important tokens after the __halt_compiler()
> which should be present in the token_get_all() result?
>

Maybe the binary data itself, as one big T_INLINE_HTML token, for example?

Also, if token_get_all could be made binary safe, that would be very cool!
(no more eating of \x00-\x1F inside regular code) :)

Nicolas

