New Poly1305 implementations
author Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sat, 6 Jan 2018 16:53:20 +0000 (18:53 +0200)
committer Jussi Kivilinna <jussi.kivilinna@iki.fi>
Tue, 9 Jan 2018 16:39:26 +0000 (18:39 +0200)
commit b9a471ccf5f02f89e25c7ccc29898d0e4e486099
tree 9b1ea7e0dc0dc57df8e6bb77194e5da59fa939bd
parent d39deb0a41dbeec81174704904d3d29c66d10d7e

* cipher/Makefile.am: Include '../mpi' for 'longlong.h'; Remove
'poly1305-sse2-amd64.S', 'poly1305-avx2-amd64.S' and
'poly1305-armv7-neon.S'.
* cipher/poly1305-armv7-neon.S: Remove.
* cipher/poly1305-avx2-amd64.S: Remove.
* cipher/poly1305-sse2-amd64.S: Remove.
* cipher/poly1305-internal.h (POLY1305_BLOCKSIZE)
(POLY1305_STATE): New.
(POLY1305_SYSV_FUNC_ABI, POLY1305_REF_BLOCKSIZE)
(POLY1305_REF_STATESIZE, POLY1305_REF_ALIGNMENT)
(POLY1305_USE_SSE2, POLY1305_SSE2_BLOCKSIZE, POLY1305_SSE2_STATESIZE)
(POLY1305_SSE2_ALIGNMENT, POLY1305_USE_AVX2, POLY1305_AVX2_BLOCKSIZE)
(POLY1305_AVX2_STATESIZE, POLY1305_AVX2_ALIGNMENT)
(POLY1305_USE_NEON, POLY1305_NEON_BLOCKSIZE, POLY1305_NEON_STATESIZE)
(POLY1305_NEON_ALIGNMENT, POLY1305_LARGEST_BLOCKSIZE)
(POLY1305_LARGEST_STATESIZE, POLY1305_LARGEST_ALIGNMENT)
(POLY1305_STATE_BLOCKSIZE, POLY1305_STATE_STATESIZE)
(POLY1305_STATE_ALIGNMENT, OPS_FUNC_ABI, poly1305_key_s)
(poly1305_ops_s): Remove.
(poly1305_context_s): Rewrite.
* cipher/poly1305.c (_gcry_poly1305_amd64_sse2_init_ext)
(_gcry_poly1305_amd64_sse2_finish_ext)
(_gcry_poly1305_amd64_sse2_blocks, poly1305_amd64_sse2_ops)
(poly1305_init_ext_ref32, poly1305_blocks_ref32)
(poly1305_finish_ext_ref32, poly1305_default_ops)
(_gcry_poly1305_amd64_avx2_init_ext)
(_gcry_poly1305_amd64_avx2_finish_ext)
(_gcry_poly1305_amd64_avx2_blocks)
(poly1305_amd64_avx2_ops, poly1305_get_state): Remove.
(poly1305_init): Rewrite.
(USE_MPI_64BIT, USE_MPI_32BIT): New.
[USE_MPI_64BIT] (ADD_1305_64, MUL_MOD_1305_64, poly1305_blocks)
(poly1305_final): New implementation using 64-bit limbs.
[USE_MPI_32BIT] (UMUL_ADD_32, ADD_1305_32, MUL_MOD_1305_32)
(poly1305_blocks): New implementation using 32-bit limbs.
(_gcry_poly1305_update, _gcry_poly1305_finish)
(_gcry_poly1305_init): Adapt to new implementation.
* configure.ac: Remove 'poly1305-sse2-amd64.lo',
'poly1305-avx2-amd64.lo' and 'poly1305-armv7-neon.lo'.
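
For readers unfamiliar with the limb arithmetic the entries above refer to, the following is a minimal sketch of Poly1305 in portable C. It is not the code from this commit: it uses the widely known five-limb radix-2^26 layout rather than the full-width 64-bit or 32-bit limbs named in the ChangeLog, and the names poly1305_sketch and load32_le are hypothetical. It only illustrates the accumulate, multiply and reduce modulo 2^130 - 5 steps that any such implementation has to perform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MASK26 0x3ffffffu

static uint32_t load32_le(const unsigned char *p)
{
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical one-shot helper, not a libgcrypt function.
   key[0..15] is r, key[16..31] is the pad added at the end. */
void poly1305_sketch(unsigned char tag[16], const unsigned char *msg,
                     size_t len, const unsigned char key[32])
{
  uint32_t r[5], s[5], h[5] = {0, 0, 0, 0, 0}, g[5], w[4], c, sel;
  uint64_t d[5], carry, f;
  int i, j, pass;

  /* Clamp r and split it into five 26-bit limbs. */
  r[0] = load32_le(key + 0) & 0x3ffffff;
  r[1] = (load32_le(key + 3) >> 2) & 0x3ffff03;
  r[2] = (load32_le(key + 6) >> 4) & 0x3ffc0ff;
  r[3] = (load32_le(key + 9) >> 6) & 0x3f03fff;
  r[4] = (load32_le(key + 12) >> 8) & 0x00fffff;
  for (i = 0; i < 5; i++)
    s[i] = r[i] * 5; /* 2^130 = 5 (mod p): wrapped terms are scaled by 5 */

  while (len > 0)
    {
      unsigned char block[17] = {0};
      size_t n = len < 16 ? len : 16;

      memcpy(block, msg, n);
      block[n] = 1; /* append the 1 bit (bit 128 for a full block) */
      msg += n;
      len -= n;

      /* h += block */
      h[0] += load32_le(block + 0) & MASK26;
      h[1] += (load32_le(block + 3) >> 2) & MASK26;
      h[2] += (load32_le(block + 6) >> 4) & MASK26;
      h[3] += (load32_le(block + 9) >> 6) & MASK26;
      h[4] += (load32_le(block + 12) >> 8) | ((uint32_t)block[16] << 24);

      /* d = h * r, folding limbs above 2^130 back in through s = 5 * r */
      for (i = 0; i < 5; i++)
        {
          d[i] = 0;
          for (j = 0; j < 5; j++)
            d[i] += (uint64_t)h[j] * (j <= i ? r[i - j] : s[5 + i - j]);
        }

      /* Partial carry propagation back to 26-bit limbs. */
      carry = 0;
      for (i = 0; i < 5; i++)
        {
          d[i] += carry;
          h[i] = (uint32_t)(d[i] & MASK26);
          carry = d[i] >> 26;
        }
      carry = carry * 5 + h[0];
      h[0] = (uint32_t)(carry & MASK26);
      h[1] += (uint32_t)(carry >> 26);
    }

  /* Two passes bring every limb below 2^26. */
  for (pass = 0; pass < 2; pass++)
    {
      c = 0;
      for (i = 0; i < 5; i++)
        {
          h[i] += c;
          c = h[i] >> 26;
          h[i] &= MASK26;
        }
      h[0] += c * 5;
    }

  /* g = h + 5 - 2^130; the carry out of the top limb is 1 iff h >= p. */
  c = 5;
  for (i = 0; i < 5; i++)
    {
      g[i] = h[i] + c;
      c = g[i] >> 26;
      g[i] &= MASK26;
    }
  sel = 0u - c; /* all ones selects g, zero keeps h */
  for (i = 0; i < 5; i++)
    h[i] = (h[i] & ~sel) | (g[i] & sel);

  /* Repack the low 128 bits into 32-bit words, add the pad, store. */
  w[0] = h[0] | (h[1] << 26);
  w[1] = (h[1] >> 6) | (h[2] << 20);
  w[2] = (h[2] >> 12) | (h[3] << 14);
  w[3] = (h[3] >> 18) | (h[4] << 8);
  f = 0;
  for (i = 0; i < 4; i++)
    {
      f += (uint64_t)w[i] + load32_le(key + 16 + 4 * i);
      for (j = 0; j < 4; j++)
        tag[4 * i + j] = (unsigned char)(f >> (8 * j));
      f >>= 32;
    }
}
```

Wider limbs need fewer partial products per block but rely on double-width multiplication and add-with-carry, which is presumably why the commit pulls in 'longlong.h' from '../mpi'. The sketch has not been reviewed for constant-time behaviour and is meant for reading, not for production use.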
--

Intel Core i7-4790K CPU @ 4.00GHz (x86_64):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |     0.284 ns/B    3358.6 MiB/s      1.14 c/B

Intel Core i7-4790K CPU @ 4.00GHz (i386):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |     0.888 ns/B    1073.9 MiB/s      3.55 c/B

Cortex-A53 @ 1152 MHz (armv7):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |      4.40 ns/B     216.7 MiB/s      5.07 c/B

Cortex-A53 @ 1152 MHz (aarch64):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |      2.60 ns/B     367.0 MiB/s      2.99 c/B
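
The three columns in the tables above are related by simple arithmetic, which the following sketch spells out. The helper names are hypothetical and not taken from libgcrypt, and the sketch assumes the CPU ran at the clock speed stated in each table heading.

```c
/* Hypothetical helpers for cross-checking the benchmark columns. */

/* cycles/byte = (nanoseconds/byte) * (cycles/nanosecond) */
double cycles_per_byte(double ns_per_byte, double clock_mhz)
{
  return ns_per_byte * clock_mhz / 1000.0;
}

/* MiB/s = (bytes/second) / 2^20 */
double mebibytes_per_sec(double ns_per_byte)
{
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}
```

For example, 0.284 ns/B at 4000 MHz gives about 1.136 c/B, matching the 1.14 c/B reported for x86_64. Recomputed values can differ in the last digit because the ns/B figures in the tables are already rounded.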

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
cipher/Makefile.am
cipher/poly1305-armv7-neon.S [deleted file]
cipher/poly1305-avx2-amd64.S [deleted file]
cipher/poly1305-internal.h
cipher/poly1305-sse2-amd64.S [deleted file]
cipher/poly1305.c
configure.ac