Hi,
On 10 February 2014 09:29, Tjerk Meesters <[email protected]> wrote:
>> str_word_compare() is the winner for relatively large data.
>>
>> It seems str_word_compare() is the way to go.
>>
>
> Quite honestly, I don't think that an absolute difference of 0.46 μs per
> function call should be used to pick the "winner" from the line-up you're
> made.
>
> That said, the implementation of str_word_compare
introduces a MIN()
> branch and the second loop should loop until MAX(b1_len, b2_len)
instead
> of b2_len
alone; imo you've just made it harder to prove that it actually
> prevents timing attacks.
Which is a significant risk. This is extremely easy to get wrong which
is why adopting an established solution that has actually withstood
examination is better than going with something that perhaps hasn’t..
There’s also Keep It Simple, Stupid ;).
> From what has been discussed so far, to me, the question that needs an
> answer is:
>
> "Should we have a function that attempts to provide a constant comparison
> for strings of different lengths?"
>
> If not, reduce the function to just a length mismatch branch and a loop;
> length information is leaked as long as both strings are of difference
> lengths only.
>
> If so, then we need to create and run a simulation program to prove that a
> particular algorithm is indeed suitable with a certain amount of confidence
> for the most common scenarios.
We do not need to protect length information because we’re not
protecting strings of arbitrary length. We’re protecting hashes which
have fixed lengths across a small range of commonly used types, i.e. a
hacker should already have a good idea of what the length is. So what
is length hiding? It’s security through obscurity.
The function docs should just state that it only accepts strings of
equal length or will throw an error. Users can then make merry,
bikeshedding workarounds on their own time, like hashing both strings
to an equal length and then comparing the hashes. That sort of thing.
We may even find that we already do that…
Paddy
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative
On 10 February 2014 09:29, Tjerk Meesters <[email protected]> wrote:
> Hi,
>
> On Mon, Feb 10, 2014 at 9:15 AM, Yasuo Ohgaki <[email protected]> wrote:
>
>> Hi all,
>>
>> On Mon, Feb 10, 2014 at 9:24 AM, Yasuo Ohgaki <[email protected]> wrote:
>>
>> > On Fri, Feb 7, 2014 at 10:39 AM, Yasuo Ohgaki <[email protected]>
>> wrote:
>> >
>> >> On Fri, Feb 7, 2014 at 8:05 AM, Yasuo Ohgaki <[email protected]>
>> wrote:
>> >>
>> >>> I made SipHash version of str_compare() as a sample.
>> >>> There is timing safe php_compare(), which is stolen from BSD.
>> >>>
>> >>> https://github.com/yohgaki/php-src/compare/PHP-5.6-rfc-hash-compare
>> >>>
>> >>> [yohgaki@dev github-php-src]$ ./php-bin -r
>> 'var_dump(str_compare("abc",
>> >>> "abc"));'
>> >>> bool(true)
>> >>> [yohgaki@dev github-php-src]$ ./php-bin -r
>> >>> 'var_dump(str_compare("asfasdf",
>> >>> "slkjojoeiwrj"));'
>> >>> bool(false)
>> >>>
>> >>> It's quick patch made less than 30 min.
>> >>> So it can be improved, I suppose.
>> >>>
>> >>
>> >> I thought it would be better to compare performance difference.
>> >> Added more functions to play with.
>> >> There are
>> >>
>> >> bool str_siphash_compare(str, str) - siphash. timing safe. (64bit)
>> >> bool str_xxhash32_compare(str, str) - xxhash. timing safe. (32bit)
>> >> bool str_md5_compare(str, str) - md5. Timing safe (128bit)
>> >> bool str_byte_compare(str, str) - Byte compare. Timing safe. No
>> >> division.
>> >> bool str_byte_compare2(str, str) - Byte compare. Timing safe. With
>> >> division. (Modulo as this RFC)
>> >> bool str_compare(str, str) - plain strncmp(). Not timing safe.
>> >>
>> >> I didn't took bench mark and did minimum tests.
>> >> I appreciate if anyone take benchmark.
>> >>
>> >
>> > Added yet another function to compare suggested by Lester.
>> >
>> > bool str_word_compare(str, str)
>> >
>> > This function compares data word by word rather than byte by byte.
>> > It supposed to be faster for large data.
>> >
>>
>> I took a benchmark. str_compare() is not timing safe. It's there for
>> reference.
>>
>> str_siphash_compare Elapsed: 1.389824 Iterations: 1000000 DataSize: 8
>> str_xxhash32_compare Elapsed: 1.241737 Iterations: 1000000 DataSize: 8
>> str_md5_compare Elapsed: 3.029127 Iterations: 1000000 DataSize: 8
>> str_byte_compare Elapsed: 1.236183 Iterations: 1000000 DataSize: 8
>> str_byte_compare2 Elapsed: 1.269901 Iterations: 1000000 DataSize: 8
>> str_word_compare Elapsed: 1.273266 Iterations: 1000000 DataSize: 8
>> str_compare Elapsed: 1.181425 Iterations: 1000000 DataSize: 8
>>
>> str_byte_compare() is the winner for small data.
>> I'm a little surprised that str_xxhash32_compare() is the second.
>> str_word_compare() is marginally slower.
>>
>> str_siphash_compare Elapsed: 2.341025 Iterations: 1000000 DataSize: 128
>> str_xxhash32_compare Elapsed: 1.560131 Iterations: 1000000 DataSize: 128
>> str_md5_compare Elapsed: 6.055007 Iterations: 1000000 DataSize: 128
>> str_byte_compare Elapsed: 1.799050 Iterations: 1000000 DataSize: 128
>> str_byte_compare2 Elapsed: 2.163229 Iterations: 1000000 DataSize: 128
>> str_word_compare Elapsed: 1.337508 Iterations: 1000000 DataSize: 128
>> str_compare Elapsed: 1.194582 Iterations: 1000000 DataSize: 128
>>
>> str_word_compare() is the winner for relatively large data.
>>
>> It seems str_word_compare() is the way to go.
>>
>
> Quite honestly, I don't think that an absolute difference of 0.46 μs per
> function call should be used to pick the "winner" from the line-up you're
> made.
>
> That said, the implementation of str_word_compare
introduces a MIN()
> branch and the second loop should loop until MAX(b1_len, b2_len)
instead
> of b2_len
alone; imo you've just made it harder to prove that it actually
> prevents timing attacks.
>
> From what has been discussed so far, to me, the question that needs an
> answer is:
>
> "Should we have a function that attempts to provide a constant comparison
> for strings of different lengths?"
>
> If not, reduce the function to just a length mismatch branch and a loop;
> length information is leaked as long as both strings are of difference
> lengths only.
>
> If so, then we need to create and run a simulation program to prove that a
> particular algorithm is indeed suitable with a certain amount of confidence
> for the most common scenarios.
>
>
>>
>> Regards,
>>
>> --
>> Yasuo Ohgaki
>> [email protected]
>>
>
>
>
> --
> --
> Tjerk
--
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative