Re: Re: Fwd: little request :)

From: Date: Mon, 10 Feb 2014 10:36:18 +0000
Subject: Re: Re: Fwd: little request :)
References: 1 2 3 4  Groups: php.internals 
Request: Send a blank email to [email protected] to get a copy of this message
Hi Alexander,


On Thu, Feb 6, 2014 at 10:24 PM, Solar Designer <[email protected]> wrote:

> On Wed, Feb 05, 2014 at 11:20:27AM +0100, Pierre Joye wrote:
> > About the timing attack RFC, I have asked for some review and advice
> > and here is a useful one already, thanks Alex :)
> >
> > Please keep him as CC as I do not know if he is on this list.
>
> Thanks.  I am in fact not subscribed, and I appreciate the CC's.
> (I also find the "little request" message Subject weird for use on a
> public mailing list, but now it's stuck.)
>
> On Thu, Feb 06, 2014 at 07:57:17AM +0900, Yasuo Ohgaki wrote:
> > Another way to protect from timing attack is have a random sleep, but
> > this is tricky also. Too short or too long sleep doesn't help.
>
> Let's not even consider this.  Adding random noise doesn't necessarily
> prevent the signal from being heard, even if the noise is louder than
> the signal.
>
> There's a related approach, though: blinding.  Instead of comparing the
> two strings directly, they may be hashed along with some secret input
> (may be a newly generated random number) and the hashes then compared.
> There are some arguments in favor of this approach when it's implemented
> in a scripting language - e.g., if someone were writing code in PHP -
> but when we're talking about native implementation (to be part of PHP),
> those arguments don't apply.  Some references to this approach were in
> yesterday's and today's Twitter discussion between @zooko and others,
> including me.  There were lots of tweets.  You may take a look at:
>
>
> https://www.isecpartners.com/blog/2011/february/double-hmac-verification.aspx
>
> http://openwall.info/wiki/people/solar/algorithms/challenge-response-authentication#Timing-attacks
>
> https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/util/hashutil.py?annotate=blame&rev=a4a6c02ef8ae2e0edb30bb0051873ffca6af6fc0#L205
> https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2165
>
> When the secret component is very low entropy, this attack may work:
>
> http://lists.randombit.net/pipermail/cryptography/2011-December/001866.html
> http://lists.randombit.net/pipermail/cryptography/2011-December/001874.html
>
> @mik235 pointed out that Python 3.4+ already does something similar for
> the == operator all the time, by using SipHash and, if I understood
> correctly, caching SipHash values along with the strings (just like it's
> also done for the lengths).  There's no attempt at not leaking the
> length.  I haven't looked at the code, so this is merely my current
> understanding based on @mik235's tweets.
>
> On Wed, Feb 05, 2014 at 03:00:10PM -0800, Stas Malyshev wrote:
> > I'm not sure the issue of hiding length is a real one here. Most crypto
> > algorithms would have fixed-length or one-of-the-very-short-list length
> > results, and in this scenario it is usually obvious from the app code
> > what the hash length would be, at least within very small number of
> > options, so I'm not sure how big a deal it is.
>
> You're right, for most uses it should not matter.
>
> On Thu, Feb 06, 2014 at 01:06:26AM +0100, Rouven We?ling wrote:
> > I may just be blind, because it's getting late. I don't see any integer
> division occurring.
>
> As Tjerk Meesters pointed out, the % operator involves integer division.
> This function can easily spend 10x more time doing the (variable time)
> integer division than anything else, thereby exacerbating timing leaks.
>
> > > Is MAX() implemented without branching?  Hardly.
> >
> > It's not (for reference the code is below). I think the impact of this
> is reduced by the fact, that unless the length is 0 always the same branch
> is used.
>
> Yes, it will most likely only leak whether the secret's length is 0 or not.
>
> > I think we should just document this as potentially leaking information
> about the length.
>
> I agree.
>
> > If we do find ways to reduce this, by all means, we should take them.
>
> Some of them may involve tradeoffs or in some cases actually make the
> timing leaks worse (while reducing them in other cases), so it's not so
> simple.  For example, this suggestion:
>
> On Thu, Feb 06, 2014 at 12:06:35PM +0900, Yasuo Ohgaki wrote:
> > For example, read few random bytes and add them to get number of
> > iterations, then iterate something for the number. Although it's not
> > perfect, it makes much harder than nothing.
>
> This may potentially make the leak even worse: the random noise may be
> filtered out given enough measurements, whereas the multiple iterations
> may be amplifying the signal.  Let's not do something so questionable.
>
> On Thu, Feb 06, 2014 at 11:51:55AM +0800, Tjerk Meesters wrote:
> > Btw, could we use the name hash_equals() instead? This function
> returns a
> > boolean whereas a standard comparison functions return -1, 0, 1.
>
> Makes sense to me.  In the Twitter discussion, it was pointed out that
> Tor project actually has a memcmp()-like constant time comparison
> function, but ends up using it to compare only for equal/not-equal
> anyway.  So the extra complexity of supporting 3 possible return values
> is not needed.
>
> Also, I recommend an approach-neutral, purpose-centric function name.
> This operation doesn't have to be constant-time to achieve the end goal.
> It may as well be e.g. random time, as long as it does not leak info
> about the secret via timings.  And the approach taken may change in a
> future version, as long as the function's properties are retained (or
> even improved upon).  e.g. Tahoe-LAFS went with timing_safe_compare()
> today:
>
>
> https://github.com/zooko/tahoe-lafs/commit/a79b6927af57dc701af2f9037f11accc324a9cfb
>
> but you may use a name more consistent with PHP's other related function
> names, and also emphasizing that the return value is TRUE if the strings
> match (if this is how you define the function to behave).  e.g.
> timing_safe_eq() or hash_eq().
>
> On Thu, Feb 06, 2014 at 01:53:29PM +0900, Yasuo Ohgaki wrote:
> > hash_compare() may only compare hashed values as the function
> > name imply. If length differs, it may return FALSE simply. (As well as
> > wrong type)
> >
> > This way, we don't have to worry about length leak and constant
> operation.
> > Hash string length is not a secret.
>
> Right.
>
> > If user have 'raw data' (e.g. raw password,
> > etc) to compare, make them apply hash function first.
>
> ... which will leak the length via timing of the hash function.
>
> More from Twitter, on trying to hide the length (no, that discussion was
> not about PHP, but in general):
>
> <@SteveBellovin> @marshray @tqbf You could, I suppose, "compare" dummy
> data to reach a constant length, making sure you match cache line alignment.
>
> As you can see, after getting rid of the modulo division, we quickly get
> to consider how our memory reads pattern differs from what we'd have
> when comparing against a sufficiently long secret string.  It's tricky.
>

I've given this problem some more thought and the following looked
promising; it would be great to get your feedback on it.

Basically, to remove the modulo and MAX(), I've used a combined buffer
that's used for the comparison, in two ways:

1. PAD := known (padded with \0 based on given_len)
2. PAD := known CONCAT given

The comparison is straightforward because we already know the buffer is at
least given_len.

Implementations of above can be found here:
1. https://github.com/datibbaw/php-src/compare/master...const-cmp
2. https://github.com/datibbaw/php-src/compare/master...const-cmp-2

Bottom line is that this attempts to compromise memory against time
stability.


>
> Alexander
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>


-- 
--
Tjerk


Thread (42 messages)

« previous php.internals (#72435) next »