Add ARM/NEON implementation of Poly1305
author Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sun, 2 Nov 2014 14:01:11 +0000 (16:01 +0200)
committer Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sun, 2 Nov 2014 14:26:53 +0000 (16:26 +0200)
commit 0b520128551054d83fb0bb2db8873394f38de498
tree cba613c83ce9a044417a2084573211ee254654eb
parent c584f44543883346d5a565581ff99a0afce9c5e1

* cipher/Makefile.am: Add 'poly1305-armv7-neon.S'.
* cipher/poly1305-armv7-neon.S: New.
* cipher/poly1305-internal.h (POLY1305_USE_NEON)
(POLY1305_NEON_BLOCKSIZE, POLY1305_NEON_STATESIZE)
(POLY1305_NEON_ALIGNMENT): New.
* cipher/poly1305.c [POLY1305_USE_NEON]
(_gcry_poly1305_armv7_neon_init_ext)
(_gcry_poly1305_armv7_neon_finish_ext)
(_gcry_poly1305_armv7_neon_blocks, poly1305_armv7_neon_ops): New.
(_gcry_poly1305_init) [POLY1305_USE_NEON]: Select NEON implementation
if HWF_ARM_NEON set.
* configure.ac [neonsupport=yes]: Add 'poly1305-armv7-neon.lo'.
--

Add Andrew Moon's public domain NEON implementation of Poly1305. Original
source is available at: https://github.com/floodyberry/poly1305-opt
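
The ChangeLog entry for _gcry_poly1305_init says the NEON implementation is
selected when HWF_ARM_NEON is set. A minimal sketch of that kind of run-time
dispatch follows. It is not the code from cipher/poly1305.c: every name with
a "sketch_" or "SKETCH_" prefix is hypothetical, the feature bit value is
made up, and the two "blocks" functions are placeholders that only report
how many bytes would be left over after whole 16-byte Poly1305 blocks.

```c
#include <stddef.h>

/* Hypothetical stand-in for the HWF_ARM_NEON feature bit. */
#define SKETCH_HWF_ARM_NEON (1u << 0)

/* Hypothetical ops table: one entry per implementation. */
typedef struct sketch_poly1305_ops
{
  const char *name;
  /* Placeholder: returns the number of trailing bytes not consumed. */
  size_t (*blocks) (void *state, const unsigned char *msg, size_t len);
} sketch_poly1305_ops_t;

/* Placeholder for a portable C implementation (no real MAC computed). */
static size_t
sketch_blocks_generic (void *state, const unsigned char *msg, size_t len)
{
  (void) state;
  (void) msg;
  return len % 16;
}

/* Placeholder for an assembly-backed implementation (no real MAC computed). */
static size_t
sketch_blocks_neon (void *state, const unsigned char *msg, size_t len)
{
  (void) state;
  (void) msg;
  return len % 16;
}

static const sketch_poly1305_ops_t sketch_generic_ops =
  { "generic", sketch_blocks_generic };

static const sketch_poly1305_ops_t sketch_neon_ops =
  { "armv7-neon", sketch_blocks_neon };

/* Pick the NEON table only if the detected feature mask has the NEON bit
   set; otherwise fall back to the generic table.  */
static const sketch_poly1305_ops_t *
sketch_poly1305_select (unsigned int hw_features)
{
  if (hw_features & SKETCH_HWF_ARM_NEON)
    return &sketch_neon_ops;
  return &sketch_generic_ops;
}
```

A caller would run sketch_poly1305_select once at init time with the detected
feature mask and then call through the returned table, so the per-block path
carries no feature checks.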

Benchmark on Cortex-A8 (--cpu-mhz 1008):

Old:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |     12.34 ns/B     77.27 MiB/s     12.44 c/B

New:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |      2.12 ns/B     450.7 MiB/s      2.13 c/B
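
The three columns above are related by simple arithmetic, which the helpers
below sketch: cycles/byte is nanosecs/byte scaled by the clock given with
--cpu-mhz, and mebibytes/sec is the inverse of nanosecs/byte in units of
2^20 bytes. The function names are hypothetical and this is not the
benchmark tool's own code. Because the table shows rounded values,
recomputed figures can differ from it in the last digit.

```c
#include <stdio.h>

/* cycles/byte = (ns/byte) * (cycles/ns), where cycles/ns = MHz / 1000. */
static double
sketch_cycles_per_byte (double ns_per_byte, double cpu_mhz)
{
  return ns_per_byte * cpu_mhz / 1000.0;
}

/* MiB/s = bytes per second divided by 2^20. */
static double
sketch_mebibytes_per_sec (double ns_per_byte)
{
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Ratio of old to new time per byte. */
static double
sketch_speedup (double old_ns_per_byte, double new_ns_per_byte)
{
  return old_ns_per_byte / new_ns_per_byte;
}

/* Print one row in roughly the layout used by the table above. */
static void
sketch_print_row (const char *label, double ns_per_byte, double cpu_mhz)
{
  printf (" %-18s | %9.2f ns/B %9.2f MiB/s %9.2f c/B\n",
          label, ns_per_byte,
          sketch_mebibytes_per_sec (ns_per_byte),
          sketch_cycles_per_byte (ns_per_byte, cpu_mhz));
}
```

With the figures above, sketch_speedup (12.34, 2.12) is about 5.8, so the
NEON code processes data at roughly 5.8 times the rate of the old code on
this Cortex-A8.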

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
cipher/Makefile.am
cipher/poly1305-armv7-neon.S [new file with mode: 0644]
cipher/poly1305-internal.h
cipher/poly1305.c
configure.ac