Hi all,
On Mon, Feb 10, 2014 at 9:24 AM, Yasuo Ohgaki <[email protected]> wrote:
> On Fri, Feb 7, 2014 at 10:39 AM, Yasuo Ohgaki <[email protected]> wrote:
>
>> On Fri, Feb 7, 2014 at 8:05 AM, Yasuo Ohgaki <[email protected]> wrote:
>>
>>> I made SipHash version of str_compare() as a sample.
>>> There is timing safe php_compare(), which is stolen from BSD.
>>>
>>> https://github.com/yohgaki/php-src/compare/PHP-5.6-rfc-hash-compare
>>>
>>> [yohgaki@dev github-php-src]$ ./php-bin -r 'var_dump(str_compare("abc",
>>> "abc"));'
>>> bool(true)
>>> [yohgaki@dev github-php-src]$ ./php-bin -r
>>> 'var_dump(str_compare("asfasdf", "slkjojoeiwrj"));'
>>> bool(false)
>>>
>>> It's quick patch made less than 30 min.
>>> So it can be improved, I suppose.
>>>
>>
>> I thought it would be better to compare performance difference.
>> Added more functions to play with.
>> There are
>>
>> bool str_siphash_compare(str, str) - siphash. timing safe. (64bit)
>> bool str_xxhash32_compare(str, str) - xxhash. timing safe. (32bit)
>> bool str_md5_compare(str, str) - md5. Timing safe (128bit)
>> bool str_byte_compare(str, str) - Byte compare. Timing safe. No
>> division.
>> bool str_byte_compare2(str, str) - Byte compare. Timing safe. With
>> division. (Modulo as this RFC)
>> bool str_compare(str, str) - plain strncmp(). Not timing safe.
>>
>> I didn't took bench mark and did minimum tests.
>> I appreciate if anyone take benchmark.
>>
>
> Added yet another function to compare suggested by Lester.
>
> bool str_word_compare(str, str)
>
> This function compares data word by word rather than byte by byte.
> It supposed to be faster for large data.
>
I took a benchmark. str_compare() is not timing safe. It's there for
reference.
str_siphash_compare Elapsed: 1.389824 Iterations: 1000000 DataSize: 8
str_xxhash32_compare Elapsed: 1.241737 Iterations: 1000000 DataSize: 8
str_md5_compare Elapsed: 3.029127 Iterations: 1000000 DataSize: 8
str_byte_compare Elapsed: 1.236183 Iterations: 1000000 DataSize: 8
str_byte_compare2 Elapsed: 1.269901 Iterations: 1000000 DataSize: 8
str_word_compare Elapsed: 1.273266 Iterations: 1000000 DataSize: 8
str_compare Elapsed: 1.181425 Iterations: 1000000 DataSize: 8
str_byte_compare() is the winner for small data.
I'm a little surprised that str_xxhash32_compare() is the second.
str_word_compare() is marginally slower.
str_siphash_compare Elapsed: 2.341025 Iterations: 1000000 DataSize: 128
str_xxhash32_compare Elapsed: 1.560131 Iterations: 1000000 DataSize: 128
str_md5_compare Elapsed: 6.055007 Iterations: 1000000 DataSize: 128
str_byte_compare Elapsed: 1.799050 Iterations: 1000000 DataSize: 128
str_byte_compare2 Elapsed: 2.163229 Iterations: 1000000 DataSize: 128
str_word_compare Elapsed: 1.337508 Iterations: 1000000 DataSize: 128
str_compare Elapsed: 1.194582 Iterations: 1000000 DataSize: 128
str_word_compare() is the winner for relatively large data.
It seems str_word_compare() is the way to go.
Regards,
--
Yasuo Ohgaki
[email protected]