AVX2 implementation of BLAKE2b
author Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sun, 14 Jan 2018 14:48:17 +0000 (16:48 +0200)
committer Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sun, 4 Feb 2018 16:51:44 +0000 (18:51 +0200)
commit af7fc732f9a7af7a70276f1e8364d2132db314f1
tree 5bb1b92c3821a41fa09ec4cc586ca108f3c3e3b7
parent ffdc6f3623a0bcb41324d562340b2cd1c288e387

* cipher/Makefile.am: Add 'blake2b-amd64-avx2.S'.
* cipher/blake2.c (USE_AVX2, ASM_FUNC_ABI, ASM_EXTRA_STACK)
(_gcry_blake2b_transform_amd64_avx2): New.
(BLAKE2B_CONTEXT) [USE_AVX2]: Add 'use_avx2'.
(blake2b_transform): Rename to ...
(blake2b_transform_generic): ... this.
(blake2b_transform): New.
(blake2b_final): Pass 'ctx' pointer to transform function instead of
'S'.
(blake2b_init_ctx): Check HW features and enable AVX2 implementation
if supported.
* cipher/blake2b-amd64-avx2.S: New.
* configure.ac: Add 'blake2b-amd64-avx2.lo'.
--
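
The dispatch described in the entries above can be sketched roughly as
follows. This is a minimal illustration, not the code from
cipher/blake2.c: every 'sketch_*' name and the SKETCH_HWF_AVX2 bit are
hypothetical, and the two transform functions are stubs that only return
a tag so the chosen path can be observed. The point is that the feature
check happens once at context init and the per-call wrapper only tests a
flag stored in the context.

```c
#include <stddef.h>

/* Hypothetical feature bit; the real code queries the library's
   hardware feature flags instead. */
#define SKETCH_HWF_AVX2 (1u << 0)

typedef struct
{
  unsigned int use_avx2; /* decided once, at context init */
} sketch_ctx;

/* Stub standing in for the portable C transform. */
static unsigned int
sketch_transform_generic (sketch_ctx *c, const void *blks, size_t nblks)
{
  (void)c; (void)blks; (void)nblks;
  return 1; /* tag: generic path taken */
}

/* Stub standing in for the AVX2 assembly transform. */
static unsigned int
sketch_transform_avx2 (sketch_ctx *c, const void *blks, size_t nblks)
{
  (void)c; (void)blks; (void)nblks;
  return 2; /* tag: AVX2 path taken */
}

/* Record whether the AVX2 path may be used for this context. */
static void
sketch_init_ctx (sketch_ctx *c, unsigned int hw_features)
{
  c->use_avx2 = !!(hw_features & SKETCH_HWF_AVX2);
}

/* Wrapper that takes the context pointer and picks an implementation. */
static unsigned int
sketch_transform (sketch_ctx *c, const void *blks, size_t nblks)
{
  if (c->use_avx2)
    return sketch_transform_avx2 (c, blks, nblks);
  return sketch_transform_generic (c, blks, nblks);
}
```

Passing the context rather than the bare state to the wrapper is what
lets it see the flag, which is presumably why blake2b_final now passes
'ctx' instead of 'S'.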

Benchmark on Intel Core i7-4790K (4.0 GHz, no turbo):

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 BLAKE2B_512    |      1.07 ns/B     887.8 MiB/s      4.30 c/B

After (~1.4x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 BLAKE2B_512    |     0.771 ns/B    1236.8 MiB/s      3.08 c/B
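
As a sanity check on the "~1.4x" figure, the table columns can be related
with a little arithmetic. The helper names below are hypothetical and the
4.0 GHz clock is taken from the benchmark heading; this is only a sketch
of the conversion, not part of the benchmark tool.

```c
/* cycles per byte = nanoseconds per byte * clock frequency in GHz */
static double
sketch_cycles_per_byte (double ns_per_byte, double ghz)
{
  return ns_per_byte * ghz;
}

/* Speedup is the ratio of the old cost to the new cost. */
static double
sketch_speedup (double before, double after)
{
  return before / after;
}
```

With the reported numbers, sketch_speedup (4.30, 3.08) is just under 1.40,
and sketch_cycles_per_byte (0.771, 4.0) lands close to the reported
3.08 c/B.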

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
cipher/Makefile.am
cipher/blake2.c
cipher/blake2b-amd64-avx2.S [new file with mode: 0644]
configure.ac