Re: New JSON parser

From: Date: Tue, 18 Mar 2014 15:48:58 +0000
Subject: Re: New JSON parser
Groups: php.internals
On Tue, Mar 18, 2014 at 9:09 AM, Ivan Enderlin @ Hoa <
[email protected]> wrote:

> On 13/03/2014 20:48, Jakub Zelenka wrote:
>
>> Hi,
>
> Hi Jakub,
>
>> I have created a new JSON parser using conditional re2c and pure pull Bison
>> parser. It's a native UTF-8 parser licensed under PHP license (it can be
>> used for Evil though :) ). The extension is available at
>>
>> https://github.com/bukka/php-jsond
>
> Very nice work, thanks!
>
>> [snip]
>>
>> I need to do more testing before creating an RFC for replacing the current
>> parser. There is still space for further improvements. If anyone has any
>> ideas, please let me know. Or if you could test it, that would be great
>> too! ;)
>
> For my PhD thesis in the automatic testing domain, I have created some
> grammar-based testing algorithms, based on our dedicated LL(k) compiler
> compiler (with its dedicated grammar description language called PP).
> Please, see the article [1] (along with the presentation [2] and all the
> details [3] about the article and the conference) and also the tool [4]
> (called Hoa\Compiler). In this article, my experiment consisted of
> generating a lot of JSON strings (based on the JSON grammar [5] written in
> PP) and comparing how the JSON parsers of Gecko and PHP handle them. Now I
> have re-run this experiment, but comparing all the generated data with
> ext/json and ext/jsond to check for potential regressions. Also, I
> tested it with a bounded exhaustive algorithm: it means we generate all
> possible JSON strings up to a given size (the unit is the number of tokens
> in a sequence, so {, true or foo are tokens). Note that we have two
> other algorithms: uniform random generation and coverage-based generation.
>
> I have created a little repository to share my work [6]. I have generated
> all sequences up to 15 tokens, which represents 356'327 data, and none of
> them failed. Congrats!
> Just for the record, a good test is a test that fails. Here, I have
> detected no regression, and because I have previously compared
> ext/json with the JSON parser from Gecko, we can consider your
> implementation as "safe".
>
> This is my little contribution of the morning :-). You can use this work
> to generate data in a static file and use them to compare the memory and
> CPU usage between ext/json and ext/jsond also.
>
>
> Best regards.
>
> [1] http://hoa-project.net/Literature/Research/Amost12.html
> [2] http://keynote.hoa-project.net/Amost12/EDGB12.pdf
> [3] http://hoa-project.net/Event/Amost12.html
> [4] https://github.com/hoaproject/Compiler
> [5] https://github.com/hoaproject/Json/blob/master/Grammar.pp
> [6] https://github.com/Hywan/jsond-test
>

Hi! That looks really nice! I've actually been looking for something like
this. It's cool that I can have a bunch of JSON sub-grammars for generating
test data for benchmarks. I need to have a proper look at what it can do, but
I think it will be really useful...

Thanks
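
The bounded exhaustive comparison described in the quoted message can be
sketched roughly as follows. This is only an illustration in Python, not the
code from the jsond-test repository and not Hoa\Compiler: the toy grammar
(four scalars, one object key) and the names values, members, generate and
disagreements are all assumptions made for the example. It enumerates every
JSON document of the toy grammar up to a given number of tokens and reports
the inputs on which two parsers behave differently.

```python
import json
from functools import lru_cache

# Toy terminals (an assumption for this sketch, far smaller than real JSON).
SCALARS = ("true", "null", "1", '"foo"')
KEYS = ('"foo"',)


def nonempty_lists(n, item):
    """Comma-separated lists using exactly n tokens (commas included).

    item(k) must return every item made of exactly k tokens.
    """
    out = list(item(n))                      # a list holding a single item
    for k in range(1, n - 1):                # head uses k tokens, then ","
        for head in item(k):
            for tail in nonempty_lists(n - k - 1, item):
                out.append(head + "," + tail)
    return out


@lru_cache(maxsize=None)
def members(n):
    """Object members `key : value` made of exactly n tokens."""
    if n < 3:
        return ()
    return tuple(k + ":" + v for k in KEYS for v in values(n - 2))


@lru_cache(maxsize=None)
def values(n):
    """Every JSON value of the toy grammar made of exactly n tokens."""
    if n < 1:
        return ()
    if n == 1:
        return SCALARS
    out = ["[]", "{}"] if n == 2 else []
    out += ["[" + body + "]" for body in nonempty_lists(n - 2, values)]
    out += ["{" + body + "}" for body in nonempty_lists(n - 2, members)]
    return tuple(out)


def generate(max_tokens):
    """All documents with 1..max_tokens tokens, shortest first."""
    return [doc for n in range(1, max_tokens + 1) for doc in values(n)]


def outcome(parser, text):
    """Decoded value on success, or a marker when the parser rejects text."""
    try:
        return ("ok", parser(text))
    except ValueError:
        return ("error", None)


def disagreements(parser_a, parser_b, max_tokens):
    """Inputs on which the two parsers do not produce the same outcome."""
    return [doc for doc in generate(max_tokens)
            if outcome(parser_a, doc) != outcome(parser_b, doc)]


if __name__ == "__main__":
    # Comparing a parser with itself must never report a difference.
    print(disagreements(json.loads, json.loads, 6))
```

In a real run, parser_a and parser_b would wrap the two implementations under
comparison (for instance by shelling out to a PHP script calling each decoder),
and an empty result for a given bound corresponds to "no regression detected"
at that bound.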

