BLAKE-256 Hash Function

BLAKE-256 is the hashing algorithm used in Decred. It is an extremely flexible function for hardware and is very fast in software, both of which are desirable properties for a block hashing algorithm. BLAKE is a versatile and user-oriented design that performs well on all platforms, from embedded systems to resource-constrained hardware and requires low resources for implementation ¹ ² ³.

It is worth noting that BLAKE is a finalist of The National Institute of Standards and Technology (NIST) SHA-3 competition, a public competition for alternatives to SHA-1 and SHA-2 with better efficiency and resilience to future attacks. During the competition, BLAKE was thoroughly analyzed and attacked to identify vulnerabilities by notable cryptanalysts ¹. Below sections contain information regarding BLAKE-256 components, security and performance in hardware and software.

Algorithm	Word	Message	Block	Digest	Salt	Rounds
BLAKE-256	32-bit	64-bit	512-bit	256-bit	128-bit	14

Table 1: Characteristics of the BLAKE-256 hash function

BLAKE Components

BLAKE is built on previously analyzed, and reliable components; the hash iterative framework (HAIFA) of Biham and Dunkelman ⁴, the local wide-pipe introduced by the LAKE hash function⁶, and the core function inspired by ChaCha function of Bernstein ⁵. The use of well understood building blocks makes BLAKE a hash function that is simple to analyze and implement for cryptanalysts and implementers alike.

HAIFA iteration mode

The construction of a hash output is typically done by splitting the input data referred to as a ‘message’, into small blocks of fixed length and processed iteratively using a cryptographic compression function. The combination of calls to a compression function for processing the input data is an iteration mode (also referred to as a Domain Extender). The Merkle–Damgård (M-D) construction is the classic iteration mode used by MD5, SHA1, RIPEMD-160, and SHA2 hash functions. M-D splits the variable length input message into equal-sized message blocks x (i.e., 512 bits) of n blocks, the last block is padded as required and appended with the length of the message.

Merkle Damgard Construction

Figure 1: Merkle–Damgård Construction

As shown in Figure 1 above, the output of the compression function f known as a chaining value CV and the initial inputs are the Initialization Vector IV shown as CV₀, with h(x) being the output after all message blocks are processed. M-D maintains the collision resistance of the compression function however it lacks the pre-image and second pre-image resistance. As a result generic attacks such as the multicollision, the long message second-preimage, herding and length extension are possible on M-D based hash functions ⁷ ⁸ ⁹ ¹⁰ ¹¹.

BLAKE uses HAIFA, which maintains the valuable properties of the M-D construction and adds to the security and scalability of the transformation. HAIFA is essentially an M-D construction but with a mandatory counter (number of bits hashed so far) and an optional salt (random data that used as an additional input).

HAIFA Construction

Figure 2: HAIFA Construction

This iteration mode solves many of the internal collision problems with the M-D construction and provides resistance to herding, long message second pre-image, and length extension attacks as the additional salt and counter simulate distinct compression functions for each data block processed ⁴ ¹². HAIFA is both prefix-free and suffix-free and, as a result, is collision-resistant as well as indifferentiable from a random oracle, assuming the underlying compression function is ideal ¹ ¹³ ¹⁴.

The local wide-pipe and ChaCha inspired core function

Local wide-pipe

Figure 3: Local wide-pipe construction of BLAKE’s compression function, inherited from LAKE hash function ⁶

BLAKE uses the LAKE local wide-pipe structure for its strong security guarantees against collision attacks ¹ ⁶ and a core function G inspired by the stream cipher ChaCha for its simplicity and security ¹. The compression function takes as input four values: a chaining value, a message block, an optional salt and a mandatory counter. In the case of BLAKE-256:

Chaining Value	Message Block	Salt	Counter
256-bit	512-bit	128-bit	64-bit

The structure then works in three steps:

Initialization of a large internal state represented by 16 words in a 4x4 word matrix.
14 rounds of the G function with each round consisting of eight round-dependent transformations.
Finalization after a series of rounds, a new sequence of values taken from the matrix using sequence of initial values and salt.

A more detailed description can be found in the SHA-3 proposal BLAKE.

Security Evaluation Criteria

The security of a cryptographic hash function is determined by the number of queries or guesses required to solve the following problems:

Pre-image resistance (one-way): Given the y as an output of the hash function, it is computationally infeasible to find message x such that h(x) = y. ⁴

Figure 6: Pre-image resistance
Second pre-image resistance (weak collision resistant): Given x, it is computationally infeasible to find a second pre-image x’ ≠ x such that h(x) = h(x’) ⁴

Figure 7: Second pre-image resistance
Collision resistance (strong collision-resistant): It is computationally infeasible to find any two distinct inputs x, x’ that hash to the same output such that h(x) = h(x’) ⁴

Figure 8: Collision resistance
Length-extension resistance: Based on the hash of an unknown message h(x) and the length of the message len(x) it is not possible to choose a message x’ to calculate h(x’) such that h(x’) = h(x).

Figure 9: Length extension attack: Since x and x’ share the same first n blocks, the hash value h(x) is the intermediate hash value after first n blocks when computing h(x’) ¹⁷

In the first three properties, the phrase computationally infeasible means it would take the fastest computer a long time to solve the problem, e.g., billions of years, making it impossible in practice. Also, the fourth property, resistance to Length-extension attacks, is an additional requirement outlined in NIST’s evaluation criteria for its SHA-3 candidates hash function ¹⁵.

BLAKE was subject to a great deal of depth during cryptanalysis, more so than other candidates likely due to its similarity with ARX (modular addition, rotation, and XOR) based hash functions, which meant that many existing techniques and tools existed for beginning the analysis ¹⁶.

“Keccak received a significant amount of cryptanalysis, although not quite the depth of analysis applied to BLAKE.”

“BLAKE and Keccak have very large security margins.”

“Skein and BLAKE have no known distinguishing attacks that come close to threatening their full-round versions.”

NIST in Third-Round Report of the SHA-3 Cryptographic Hash Algorithm Competition¹⁶

BLAKE-256 Security bounds

Attack	NIST’s Security Requirements	BLAKE-256 Proven Generic Security
Preimage	256-bit	256-bit
Second-Preimage	256-bit	256-bit
Collision	128-bit	128-bit
Length-extension	Should be resistant	Immune
Indifferentiablity	Not specified	128-bit

Table 3: Minimum Security Requirements for a Hash Function and proven security of BLAKE-256 in Bits ¹⁵ ¹⁶

Analysis of BLAKE’s Domain extender proved that BLAKE-256 is secure against preimage, second-preimage, and collision attack up to 2²⁵⁶, 2²⁵⁶, and 2¹²⁸ queries, respectively and is indifferentiable from a random oracle up to 2¹²⁸ queries. Additionally, the counter in BLAKE protects against length extension attacks, as it assures consecutive compressions are different.

Software Performance

A significant amount of data on the performance of BLAKE and other SHA-3 finalists is available on the eBASH site with the best comparative presentation of data being the “shootout” graphs. The shootout graphs include data from AMD64 processor models, X86 processor implementations, ARM-NEON and 32-bit RISC (Reduced Instruction Set Computers). The throughput is stated in machine cycles-per-byte, where fewer cycles indicate better performance. BLAKE-256 is notable for its high performance on x86-64 microarchitecture, being faster for short messages than SHA-256 ¹⁸ despite being considered to have a much higher security margin at 14-rounds.

“Skein and BLAKE have the best overall software performance.”

“On 32-bit machines (mainly ARM processors) without vector units, BLAKE-256 is the overall leader, although it has only a small throughput advantage over SHA-256”

“On small embedded computers, BLAKE-256 has the best overall performance. BLAKE-256’s throughput is usually at or near the top, although occasionally slightly less than SHA-256, while BLAKE-256 generally has smaller memory requirements than SHA-2 and most of the other candidates.”

NIST in Third-Round Report of the SHA-3 Cryptographic Hash Algorithm Competition¹⁶

In the context of Decred, BLAKE-256 has excellent support for parallelism in both multi-core and single-core instruction-level scenarios (e.g. AVX2). This is important because the vast majority of validation takes place in general purpose hardware while mining takes place in specialized hardware.

Hardware Performance

Data on the performance of hardware implementations of Field Programmable Gate Arrays (FPGA) and Application Specific Integrated Circuits (ASIC), for BLAKE-256 and other SHA-3 finalists is available on the GMU CERG Database similar to the eBASH site. BLAKE was found to provide the best performance of any algorithm for very compact FPGA imlementations, which should also be true for ASIC implementations ¹⁶. High-performance implementations of BLAKE-256 in FPGAs or ASICs provide the same throughput as SHA-256 but require around twice the size, so BLAKE-256 throughput/area ratio is roughly half that of SHA-2. However, BLAKE has a significant advantage in hardware due to its high flexibility for hardware implementations ².

“Reasonably efficient, folded or serial implementations of all of the algorithms seem achievable, but BLAKE and Grøstl seem the most flexible.”

NIST in Third-Round Report of the SHA-3 Cryptographic Hash Algorithm Competition¹⁶

In Decred, only a single round of BLAKE-256 hashing occurs versus two rounds of SHA-256 in Bitcoin due to its technical shortcomings. The combined effect is that it takes much less energy to find a solution for a given hash rate.

References

Aumasson J., Henzen L., Meier W., Phan R. 2010. SHA-3 proposal BLAKE ↩↩↩↩↩
Aumasson J., Henzen L., Meier W., Phan R. 2014. The hash function BLAKE ↩↩
Balasch, J., Ege, B., Eisenbarth, T., Gérard, B., Gong, Z., Güneysu, T., Heyse, S., Kerckhof, S., Koeune, F., Plos, T., Pöppelmann, T., Regazzoni, F., Standaert, F.-X., Van Assche, G., Van Keer, R., van Oldeneel tot Oldenzeel, L., von Maurich, I.: Compact implementation and performance evaluation of hash functions in ATtiny devices ↩
Eli Biham, Orr Dunkelman. A Framework for Iterative Hash Functions — HAIFA ↩↩↩↩↩
Bernstein, D.J.: ChaCha, a variant of Salsa20 ↩
Aumasson, J.P., Meier, W., Phan, R.C.W.: The hash function family LAKE ↩↩↩
Thai Duong, Juliano Rizzo: Flickr’s API Signature Forgery Vulnerability ↩
Antoine Joux. Multi-collisions in iterated hash functions ↩
Richard Drews Dean. Formal Aspects of Mobile Code Security ↩
John Kelsey and Bruce Schneier. Second preimages on n-bit hash functions for much less than n 2 work ↩
John Kelsey and Tadayoshi Kohno. Herding hash functions and the Nostradamus attack ↩
Aumasson J., Henzen L., Meier W., Phan R. 2014. The hash function BLAKE Page 118. ↩
Aumasson J., Henzen L., Meier W., Phan R. 2014. The hash function BLAKE Page 149-150. ↩
Andreeva, Elena; Mennink, Bart; Preneel, Bart, Security Reductions of the Second Round SHA-3 Candidates ↩
National Institute for Standards and Technology. Announcing Request for Candidate Algorithm Nominations for a New Cryptographic Hash Algorithm (SHA3) Family ↩↩
Third-Round Report of the SHA-3 Cryptographic Hash Algorithm Competition ↩↩↩↩↩↩
Denton, Ben & Adhami, R. (2012). Modern Hash Function Construction ↩
Bernstein D. and Lange T. 2013. eBACS: ECRYPT benchmarking of cryptographic systems ↩