chacha20: add SSSE3 assembly implementation
author Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sun, 11 May 2014 09:00:19 +0000 (12:00 +0300)
committer Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sun, 11 May 2014 09:00:19 +0000 (12:00 +0300)
commit def7d4cad386271c6d4e2f10aabe0cb4abd871e4
tree 173fee5e8ff2ed596a57cdae5c160811fe506d23
parent 23f33d57c9b6f2295a8ddfc9a8eee5a2c30cf406

* cipher/Makefile.am: Add 'chacha20-ssse3-amd64.S'.
* cipher/chacha20-ssse3-amd64.S: New.
* cipher/chacha20.c (USE_SSSE3): New macro.
[USE_SSSE3] (_gcry_chacha20_amd64_ssse3_blocks): New.
(chacha20_do_setkey): Select SSSE3 implementation if there is HW
support.
* configure.ac [host=x86-64]: Add 'chacha20-ssse3-amd64.lo'.
--

Add an SSSE3-optimized implementation of ChaCha20, based on the
implementation by Andrew Moon.
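
The selection step described in the chacha20_do_setkey entry can be pictured
as a function pointer chosen once at key setup. The following is only a
minimal sketch of that pattern, not the code from this commit: every name
prefixed with sketch_ or SKETCH_ is hypothetical, the feature mask is passed
in by the caller instead of being queried from the CPU, and the two block
functions are stubs that return a tag rather than computing ChaCha20.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a build-time macro such as USE_SSSE3; a real
   build would derive it from configure results, here it is simply on. */
#define SKETCH_USE_SSSE3 1

/* Hypothetical hardware feature bit (assumption, not the library's flag). */
#define SKETCH_HWF_SSSE3 (1u << 0)

/* Common signature shared by all block-function variants in this sketch. */
typedef unsigned int (*sketch_blocks_fn)(uint32_t *state, const uint8_t *src,
                                         uint8_t *dst, size_t nblks);

struct sketch_ctx {
  uint32_t input[16];      /* cipher state words */
  sketch_blocks_fn blocks; /* implementation chosen at key setup */
};

/* Stub for the portable C path; returns tag 0 instead of doing real work. */
static unsigned int sketch_blocks_generic(uint32_t *state, const uint8_t *src,
                                          uint8_t *dst, size_t nblks)
{
  (void)state; (void)src; (void)dst; (void)nblks;
  return 0;
}

#if SKETCH_USE_SSSE3
/* Stub for the assembly path; returns tag 1 instead of doing real work. */
static unsigned int sketch_blocks_ssse3(uint32_t *state, const uint8_t *src,
                                        uint8_t *dst, size_t nblks)
{
  (void)state; (void)src; (void)dst; (void)nblks;
  return 1;
}
#endif

/* Choose the block function once: start with the generic path, then
   switch to the SSSE3 path only if the feature bit is reported. */
static void sketch_setkey(struct sketch_ctx *ctx, unsigned int hw_features)
{
  assert(ctx != NULL);
  ctx->blocks = sketch_blocks_generic;
#if SKETCH_USE_SSSE3
  if (hw_features & SKETCH_HWF_SSSE3)
    ctx->blocks = sketch_blocks_ssse3;
#else
  (void)hw_features;
#endif
}
```

Choosing the pointer at key setup keeps the per-call path free of feature
checks, and the compile-time guard lets builds without the assembly file
fall back to the generic path.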

Before (Intel Haswell):

 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      1.97 ns/B     483.6 MiB/s      6.31 c/B
     STREAM dec |      1.97 ns/B     484.0 MiB/s      6.31 c/B

After:

 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     0.742 ns/B    1284.8 MiB/s      2.38 c/B
     STREAM dec |     0.741 ns/B    1286.5 MiB/s      2.37 c/B
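
For readers comparing the two tables, the columns are related by simple
arithmetic. The helpers below are a hypothetical sketch (none of these names
come from the benchmark tool) that converts ns/B to MiB/s, derives the
implied clock rate from the c/B and ns/B columns, and computes the speedup.
With the figures above, the speedup comes out at roughly 2.65x and the
implied clock at about 3.2 GHz; small differences from the printed MiB/s
values are expected because the ns/B column is rounded.

```c
#include <assert.h>

/* Throughput in MiB/s for a given cost in nanoseconds per byte. */
static double sketch_mib_per_sec(double ns_per_byte)
{
  assert(ns_per_byte > 0.0);
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Clock rate in GHz implied by the cycles/byte and ns/byte columns. */
static double sketch_implied_ghz(double cycles_per_byte, double ns_per_byte)
{
  assert(ns_per_byte > 0.0);
  return cycles_per_byte / ns_per_byte;
}

/* Speedup factor between two ns/byte measurements. */
static double sketch_speedup(double ns_before, double ns_after)
{
  assert(ns_after > 0.0);
  return ns_before / ns_after;
}
```

For example, sketch_speedup(1.97, 0.742) gives the encryption speedup and
sketch_implied_ghz(6.31, 1.97) the clock rate during the "before" run.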

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
cipher/Makefile.am
cipher/chacha20-ssse3-amd64.S [new file with mode: 0644]
cipher/chacha20.c
configure.ac