On Tue, Jul 16, 2024, at 01:08, Rob Landers wrote:
>
>
> On Mon, Jul 15, 2024, at 23:29, Tim Düsterhus wrote:
>> Hi
>>
>> On 7/15/24 16:12, Rob Landers wrote:
>> > This always gets me. "safer" doesn't have a consistent meaning. For
>>
>> Yes it does. SHA-256 is safer than MD5. And on modern CPUs with sha_ni
>> extensions, it's also faster. The following is on an Intel i7-1365U:
>>
>> > $ openssl speed md5 sha1 sha256 sha512
>> > *snip*
>> > version: 3.0.10
>> > built on: Wed Feb 21 10:45:39 2024 UTC
>> > options: bn(64,64)
>> > compiler: *snip*
>> > CPUINFO: OPENSSL_ia32cap=0x7ffaf3ffffebffff:0x98c027bc239c27eb
>> > The 'numbers' are in 1000s of bytes per second processed.
>> > type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
>> > md5       114683.10k   286174.51k   550288.90k   715171.50k   783611.22k   788556.46k
>> > sha1      138578.57k   440607.38k  1082163.29k  1674088.45k  2017296.38k  2047377.41k
>> > sha256    150670.11k   460483.71k  1054829.57k  1553830.57k  1807897.94k  1823981.57k
>> > sha512     41246.76k   181566.07k   341457.66k   645468.50k   781042.81k   804296.02k
>>
>> ----
>>
>> > example, if you were to want to create a "content addressable
>> > address" using a hash and it needs to fit inside a 128 bit number
>> > (such as a GUID), you may be tempted to take SHA-X and just truncate
>> > it. However, this biases the resulting numbers, which this bias may
>>
>> This is false. For a hash algorithm to be considered cryptographically
>> secure (which I consider to be a reasonable definition of "safe"), it -
>> among other properties - needs to have the "avalanche effect" property,
>> which means that any change in the input is going to affect each output
>> bit with 50% probability.
>
> From a practical perspective across hundreds of millions of hashes of unique ids, I can say
> that there is a practical and detectable bias when truncating SHA-256 hashes. Enough that we were
> having to throw out a/b test results… I’m not going to write a paper on it and I’m not going
> to bother arguing the point that no hash function is perfect, but I will point out that “theory”
> and “reality” don’t always agree.
I have been corrected. The issue was due to a modulo operation deeper in the code, which was causing the bias.
— Rob
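
For anyone who wants to reproduce the distinction discussed above, here is a minimal Python sketch. It is not the code from the A/B testing system mentioned in this thread. The function names, the sequential integer ids, and the choice of 100 buckets over a single hash byte are all assumptions made purely for illustration. It checks two things: that each bit of a SHA-256 digest truncated to 128 bits is set in roughly half of the hashes, and that reducing a small uniform range with a modulus that does not divide it evenly produces uneven buckets.

```python
import hashlib
from collections import Counter


def truncated_hash(user_id: int, n_bytes: int = 16) -> bytes:
    """Return the first n_bytes of SHA-256 over the decimal form of the id."""
    return hashlib.sha256(str(user_id).encode()).digest()[:n_bytes]


def bit_frequencies(ids, n_bytes: int = 16) -> list:
    """Fraction of hashes in which each bit of the truncated digest is set."""
    n_bits = n_bytes * 8
    ones = [0] * n_bits
    total = 0
    for user_id in ids:
        value = int.from_bytes(truncated_hash(user_id, n_bytes), "big")
        for bit in range(n_bits):
            ones[bit] += (value >> bit) & 1
        total += 1
    return [count / total for count in ones]


def exact_modulo_counts(range_size: int = 256, buckets: int = 100) -> Counter:
    """How many values in [0, range_size) land in each bucket under `% buckets`."""
    return Counter(value % buckets for value in range(range_size))


def bucket_counts(ids, buckets: int = 100) -> Counter:
    """Assign each id to a bucket using one hash byte and a modulus (hypothetical scheme)."""
    return Counter(truncated_hash(user_id, 1)[0] % buckets for user_id in ids)


if __name__ == "__main__":
    ids = range(20000)  # assumed sample of sequential ids

    freqs = bit_frequencies(ids)
    print("bit frequency min/max:", min(freqs), max(freqs))

    exact = exact_modulo_counts()
    # 256 = 2 * 100 + 56, so buckets 0..55 receive three byte values
    # and buckets 56..99 receive only two.
    print("byte values per bucket 0 and 99:", exact[0], exact[99])

    observed = bucket_counts(ids)
    low = sum(observed[b] for b in range(56)) / 56
    high = sum(observed[b] for b in range(56, 100)) / 44
    print("mean ids per bucket, 0..55 vs 56..99:", low, high)
```

The bit frequencies should all sit close to 0.5, while the lower buckets should receive about 1.5 times as many ids as the upper ones. With a full 128-bit value and a small bucket count the modulo skew is negligible; it only becomes visible when the range being reduced is small relative to the modulus, as with the single byte used here.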