Add AES-NI/AVX accelerated Camellia implementation
author Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Wed, 23 Jan 2013 09:55:13 +0000 (11:55 +0200)
committer Werner Koch <wk@gnupg.org>
Tue, 19 Feb 2013 10:21:48 +0000 (11:21 +0100)
commit 63ac3ba07dba82fde040d31b90b4eff627bd92b9
tree c103c60a747faff8ebb8e1f7b72a9faa68ed089d
parent 4de62d80644228fc5db2a9f9c94a7eb633d8de2e

* configure.ac: Add option --disable-avx-support.
(HAVE_GCC_INLINE_ASM_AVX): New.
(ENABLE_AVX_SUPPORT): New.
(camellia) [ENABLE_AVX_SUPPORT, ENABLE_AESNI_SUPPORT]: Add
camellia_aesni_avx_x86-64.lo.
* cipher/Makefile.am (AM_CCASFLAGS): Add.
(EXTRA_libcipher_la_SOURCES): Add camellia_aesni_avx_x86-64.S.
* cipher/camellia-glue.c [ENABLE_AESNI_SUPPORT, ENABLE_AVX_SUPPORT]
[__x86_64__] (USE_AESNI_AVX): Add macro.
(struct Camellia_context) [USE_AESNI_AVX]: Add use_aesni_avx.
[USE_AESNI_AVX] (_gcry_camellia_aesni_avx_ctr_enc)
(_gcry_camellia_aesni_avx_cbc_dec): New prototypes to assembly
functions.
(camellia_setkey) [USE_AESNI_AVX]: Enable AES-NI/AVX if hardware
supports both.
(_gcry_camellia_ctr_enc) [USE_AESNI_AVX]: Add AES-NI/AVX code.
(_gcry_camellia_cbc_dec) [USE_AESNI_AVX]: Add AES-NI/AVX code.
* cipher/camellia_aesni_avx_x86-64.S: New.
* src/g10lib.h (HWF_INTEL_AVX): New.
* src/global.c (hwflist): Add HWF_INTEL_AVX.
* src/hwf-x86.c (detect_x86_gnuc) [ENABLE_AVX_SUPPORT]: Add detection
for AVX.
--
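
The feature test behind the "hardware supports both" condition can be
sketched as below. This is an illustration only, not the code added to
src/hwf-x86.c: the names demo_has_aesni_avx and demo_detect are
hypothetical. The bit positions are those of CPUID leaf 1, register ECX
(bit 25 for AES-NI, bit 28 for AVX). A complete AVX check would also
confirm that the operating system saves the AVX register state
(OSXSAVE and XGETBV), which this sketch leaves out.

```c
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
# include <cpuid.h>
#endif

#define DEMO_CPUID1_ECX_AESNI (1u << 25)  /* CPUID.1:ECX bit 25 */
#define DEMO_CPUID1_ECX_AVX   (1u << 28)  /* CPUID.1:ECX bit 28 */

/* Hypothetical helper: return 1 only if both feature bits are set
   in the given ECX value, else 0.  */
int
demo_has_aesni_avx (unsigned int ecx)
{
  return (ecx & DEMO_CPUID1_ECX_AESNI) && (ecx & DEMO_CPUID1_ECX_AVX);
}

/* Hypothetical wrapper: query the CPU where the compiler provides
   <cpuid.h>; report "not available" everywhere else.  */
int
demo_detect (void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
  unsigned int eax, ebx, ecx, edx;

  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;  /* CPUID leaf 1 not supported */
  return demo_has_aesni_avx (ecx);
#else
  return 0;
#endif
}
```

Keeping the bit test separate from the CPUID query makes the decision
easy to check without depending on the machine the code runs on.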

Before:
 Running each test 250 times.
                 ECB/Stream         CBC             CFB             OFB             CTR
              --------------- --------------- --------------- --------------- ---------------
 CAMELLIA128   2210ms  2200ms  2300ms  2050ms  2240ms  2250ms  2290ms  2270ms  2070ms  2070ms
 CAMELLIA256   2810ms  2800ms  2920ms  2670ms  2840ms  2850ms  2910ms  2890ms  2660ms  2640ms

After:
 Running each test 250 times.
                 ECB/Stream         CBC             CFB             OFB             CTR
              --------------- --------------- --------------- --------------- ---------------
 CAMELLIA128   2200ms  2220ms  2290ms   470ms  2240ms  2270ms  2270ms  2290ms   480ms   480ms
 CAMELLIA256   2820ms  2820ms  2900ms   600ms  2860ms  2860ms  2900ms  2920ms   620ms   620ms

The AES-NI/AVX implementation processes 16 blocks in parallel (256 bytes).
It is a byte-sliced implementation that uses AES-NI (SubBytes) for the
Camellia s-boxes, with the help of prefiltering and postfiltering. For
smaller data sets, the generic C implementation is used.
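
The split between the 16-block path and the generic fallback can be
sketched in C as follows. This is only an illustration under stated
assumptions: demo_ctx, demo_wide, demo_generic and demo_process are
hypothetical names, and the stand-in routines merely copy data instead
of encrypting it, so that the dispatch itself can be observed.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BLOCK_SIZE      16  /* Camellia block size in bytes */
#define DEMO_PARALLEL_BLOCKS 16  /* 16 blocks * 16 bytes = 256 bytes */

/* Hypothetical context: a flag standing in for use_aesni_avx plus
   call counters used only for demonstration.  */
struct demo_ctx
{
  int use_wide;
  size_t wide_calls;
  size_t generic_calls;
};

/* Stand-in for the 16-block assembly routine; copies 256 bytes.  */
static void
demo_wide (struct demo_ctx *ctx, unsigned char *out,
           const unsigned char *in)
{
  memcpy (out, in, DEMO_PARALLEL_BLOCKS * DEMO_BLOCK_SIZE);
  ctx->wide_calls++;
}

/* Stand-in for the generic C routine; copies a single block.  */
static void
demo_generic (struct demo_ctx *ctx, unsigned char *out,
              const unsigned char *in)
{
  memcpy (out, in, DEMO_BLOCK_SIZE);
  ctx->generic_calls++;
}

/* Use the wide path while at least 16 blocks remain, then finish
   the tail one block at a time with the generic routine.  */
void
demo_process (struct demo_ctx *ctx, unsigned char *out,
              const unsigned char *in, size_t nblocks)
{
  if (ctx->use_wide)
    while (nblocks >= DEMO_PARALLEL_BLOCKS)
      {
        demo_wide (ctx, out, in);
        out += DEMO_PARALLEL_BLOCKS * DEMO_BLOCK_SIZE;
        in += DEMO_PARALLEL_BLOCKS * DEMO_BLOCK_SIZE;
        nblocks -= DEMO_PARALLEL_BLOCKS;
      }

  while (nblocks > 0)
    {
      demo_generic (ctx, out, in);
      out += DEMO_BLOCK_SIZE;
      in += DEMO_BLOCK_SIZE;
      nblocks--;
    }
}
```

With 40 input blocks and the wide path enabled, this sketch makes two
wide calls (32 blocks) and eight generic calls for the remainder. An
input shorter than 16 blocks never reaches the wide path, which matches
the note above about smaller data sets.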

Speed-up for CBC decryption and CTR mode (large data): about 4.3x
(for example, CAMELLIA128 CTR: 2070ms / 480ms).

Tests were run on: Intel Core i5-2450M

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
(license boilerplate update by wk)
cipher/Makefile.am
cipher/camellia-glue.c
cipher/camellia.c
cipher/camellia_aesni_avx_x86-64.S [new file with mode: 0644]
configure.ac
src/g10lib.h
src/global.c
src/hwf-x86.c