Re: [VOTE] Timing attack safe string comparison function

From: Date: Mon, 10 Feb 2014 12:39:26 +0000
Subject: Re: [VOTE] Timing attack safe string comparison function
References: 1 2 3 4 5 6  Groups: php.internals 
Request: Send a blank email to [email protected] to get a copy of this message
Hi,

On 10 February 2014 09:29, Tjerk Meesters <[email protected]> wrote:
>> str_word_compare() is the winner for relatively large data.
>>
>> It seems str_word_compare() is the way to go.
>>
>
> Quite honestly, I don't think that an absolute difference of 0.46 μs per
> function call should be used to pick the "winner" from the line-up you're
> made.
>
> That said, the implementation of str_word_compare introduces a MIN()
> branch and the second loop should loop until MAX(b1_len, b2_len) instead
> of b2_len alone; imo you've just made it harder to prove that it actually
> prevents timing attacks.

Which is a significant risk. This is extremely easy to get wrong which
is why adopting an established solution that has actually withstood
examination is better than going with something that perhaps hasn’t..
There’s also Keep It Simple, Stupid ;).

> From what has been discussed so far, to me, the question that needs an
> answer is:
>
> "Should we have a function that attempts to provide a constant comparison
> for strings of different lengths?"
>
> If not, reduce the function to just a length mismatch branch and a loop;
> length information is leaked as long as both strings are of difference
> lengths only.
>
> If so, then we need to create and run a simulation program to prove that a
> particular algorithm is indeed suitable with a certain amount of confidence
> for the most common scenarios.

We do not need to protect length information because we’re not
protecting strings of arbitrary length. We’re protecting hashes which
have fixed lengths across a small range of commonly used types, i.e. a
hacker should already have a good idea of what the length is. So what
is length hiding? It’s security through obscurity.

The function docs should just state that it only accepts strings of
equal length or will throw an error. Users can then make merry,
bikeshedding workarounds on their own time, like hashing both strings
to an equal length and then comparing the hashes. That sort of thing.
We may even find that we already do that…

Paddy

--
Pádraic Brady

http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative

On 10 February 2014 09:29, Tjerk Meesters <[email protected]> wrote:
> Hi,
>
> On Mon, Feb 10, 2014 at 9:15 AM, Yasuo Ohgaki <[email protected]> wrote:
>
>> Hi all,
>>
>> On Mon, Feb 10, 2014 at 9:24 AM, Yasuo Ohgaki <[email protected]> wrote:
>>
>> > On Fri, Feb 7, 2014 at 10:39 AM, Yasuo Ohgaki <[email protected]>
>> wrote:
>> >
>> >> On Fri, Feb 7, 2014 at 8:05 AM, Yasuo Ohgaki <[email protected]>
>> wrote:
>> >>
>> >>> I made SipHash version of str_compare() as a sample.
>> >>> There is timing safe php_compare(), which is stolen from BSD.
>> >>>
>> >>> https://github.com/yohgaki/php-src/compare/PHP-5.6-rfc-hash-compare
>> >>>
>> >>> [yohgaki@dev github-php-src]$ ./php-bin -r
>> 'var_dump(str_compare("abc",
>> >>> "abc"));'
>> >>> bool(true)
>> >>> [yohgaki@dev github-php-src]$ ./php-bin -r
>> >>> 'var_dump(str_compare("asfasdf",
>> >>> "slkjojoeiwrj"));'
>> >>> bool(false)
>> >>>
>> >>> It's quick patch made less than 30 min.
>> >>> So it can be improved, I suppose.
>> >>>
>> >>
>> >> I thought it would be better to compare performance difference.
>> >> Added more functions to play with.
>> >> There are
>> >>
>> >>  bool str_siphash_compare(str, str) - siphash. timing safe. (64bit)
>> >>   bool str_xxhash32_compare(str, str) - xxhash. timing safe. (32bit)
>> >>  bool str_md5_compare(str, str) - md5. Timing safe (128bit)
>> >>   bool str_byte_compare(str, str) - Byte compare. Timing safe. No
>> >> division.
>> >>  bool str_byte_compare2(str, str) - Byte compare. Timing safe. With
>> >> division. (Modulo as this RFC)
>> >>  bool str_compare(str, str) - plain strncmp(). Not timing safe.
>> >>
>> >> I didn't took bench mark and did minimum tests.
>> >> I appreciate if anyone take benchmark.
>> >>
>> >
>> > Added yet another function to compare suggested by Lester.
>> >
>> > bool str_word_compare(str, str)
>> >
>> > This function compares data word by word rather than byte by byte.
>> > It supposed to be faster for large data.
>> >
>>
>> I took a benchmark. str_compare() is not timing safe. It's there for
>> reference.
>>
>> str_siphash_compare  Elapsed: 1.389824   Iterations: 1000000 DataSize: 8
>> str_xxhash32_compare Elapsed: 1.241737   Iterations: 1000000 DataSize: 8
>> str_md5_compare      Elapsed: 3.029127   Iterations: 1000000 DataSize: 8
>> str_byte_compare     Elapsed: 1.236183   Iterations: 1000000 DataSize: 8
>> str_byte_compare2    Elapsed: 1.269901   Iterations: 1000000 DataSize: 8
>> str_word_compare     Elapsed: 1.273266   Iterations: 1000000 DataSize: 8
>> str_compare          Elapsed: 1.181425   Iterations: 1000000 DataSize: 8
>>
>> str_byte_compare() is the winner for small data.
>> I'm a little surprised that str_xxhash32_compare() is the second.
>> str_word_compare() is marginally slower.
>>
>> str_siphash_compare  Elapsed: 2.341025   Iterations: 1000000 DataSize: 128
>> str_xxhash32_compare Elapsed: 1.560131   Iterations: 1000000 DataSize: 128
>> str_md5_compare      Elapsed: 6.055007   Iterations: 1000000 DataSize: 128
>> str_byte_compare     Elapsed: 1.799050   Iterations: 1000000 DataSize: 128
>> str_byte_compare2    Elapsed: 2.163229   Iterations: 1000000 DataSize: 128
>> str_word_compare     Elapsed: 1.337508   Iterations: 1000000 DataSize: 128
>> str_compare          Elapsed: 1.194582   Iterations: 1000000 DataSize: 128
>>
>> str_word_compare() is the winner for relatively large data.
>>
>> It seems str_word_compare() is the way to go.
>>
>
> Quite honestly, I don't think that an absolute difference of 0.46 μs per
> function call should be used to pick the "winner" from the line-up you're
> made.
>
> That said, the implementation of str_word_compare introduces a MIN()
> branch and the second loop should loop until MAX(b1_len, b2_len) instead
> of b2_len alone; imo you've just made it harder to prove that it actually
> prevents timing attacks.
>
> From what has been discussed so far, to me, the question that needs an
> answer is:
>
> "Should we have a function that attempts to provide a constant comparison
> for strings of different lengths?"
>
> If not, reduce the function to just a length mismatch branch and a loop;
> length information is leaked as long as both strings are of difference
> lengths only.
>
> If so, then we need to create and run a simulation program to prove that a
> particular algorithm is indeed suitable with a certain amount of confidence
> for the most common scenarios.
>
>
>>
>> Regards,
>>
>> --
>> Yasuo Ohgaki
>> [email protected]
>>
>
>
>
> --
> --
> Tjerk



-- 

--
Pádraic Brady

http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative


Thread (54 messages)

« previous php.internals (#72438) next »