Question: Architecture-specific performance differences between blake3 and blake2 #516

@tasleson

Summary

I’ve been evaluating the Rust blake3 crate for potential use. Data, code, and benchmarks are available here.

Results

  • x86: blake3 is consistently faster than blake2.
  • ppcle / s390x / aarch64: blake3 is generally slower than blake2.
    • Enabling Rayon parallelism sometimes improves the blake3 results.
    • In some cases blake3 is still slower than blake2 even with Rayon (example).

Questions

  • Is this expected behavior on non-x86 architectures (e.g., SIMD gaps, missing intrinsics)?
  • Or is my sample code / benchmarking harness flawed?
  • Are there recommended tuning options or build flags for ppcle, s390x, and aarch64?
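
To make the harness question concrete, here is a minimal sketch of the kind of timing loop such a comparison depends on. It is not the linked benchmark code. The names `best_time`, `mib_per_sec`, and `placeholder_sum` are hypothetical, and the placeholder workload exists only so the sketch compiles without external crates. In a real run the closure would be something like `|d| blake3::hash(d)`, built in release mode.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` over `data` for `warmup` untimed passes, then `iters` timed
/// passes, and returns the fastest single timed pass.
fn best_time<F, R>(data: &[u8], warmup: u32, iters: u32, mut f: F) -> Duration
where
    F: FnMut(&[u8]) -> R,
{
    // Warm caches and let CPU frequency settle before measuring.
    for _ in 0..warmup {
        black_box(f(black_box(data)));
    }
    let mut best = Duration::MAX;
    for _ in 0..iters {
        let start = Instant::now();
        // black_box keeps the optimizer from discarding the unused result.
        black_box(f(black_box(data)));
        best = best.min(start.elapsed());
    }
    best
}

/// Converts a byte count and an elapsed time into MiB/s.
fn mib_per_sec(bytes: usize, elapsed: Duration) -> f64 {
    (bytes as f64 / (1024.0 * 1024.0)) / elapsed.as_secs_f64()
}

/// Hypothetical stand-in for a hash function (NOT a real hash).
fn placeholder_sum(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

fn main() {
    // 16 MiB input: large enough that per-call overhead is negligible.
    let data = vec![0xA5u8; 16 * 1024 * 1024];
    let t = best_time(&data, 3, 10, placeholder_sum);
    println!("placeholder: {:.1} MiB/s", mib_per_sec(data.len(), t));
}
```

Taking the minimum over several passes, rather than a single run, reduces noise from scheduling and frequency scaling. Whether the linked harness already does this is exactly the kind of thing worth checking.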
