cast5: add amd64 assembly implementation
author Jussi Kivilinna <jussi.kivilinna@iki.fi>
Fri, 24 May 2013 09:43:29 +0000 (12:43 +0300)
committer Werner Koch <wk@gnupg.org>
Fri, 24 May 2013 10:51:16 +0000 (12:51 +0200)
commit 0bdf26eea8cdbffefe7e37578f8f896c4f5f5275
tree 4e3f21742db99202068b3797dd1d04412d541037
parent ab8fc70b5f0c396a5bc941267f59166e860b8c5d

* cipher/Makefile.am: Add 'cast5-amd64.S'.
* cipher/cast5-amd64.S: New file.
* cipher/cast5.c (USE_AMD64_ASM): New macro.
(_gcry_cast5_s1tos4): Merge arrays s1, s2, s3, s4 to single array to
simplify access from assembly implementation.
(s1, s2, s3, s4): New macros pointing to subarrays in
_gcry_cast5_s1tos4.
[USE_AMD64_ASM] (_gcry_cast5_amd64_encrypt_block)
(_gcry_cast5_amd64_decrypt_block, _gcry_cast5_amd64_ctr_enc)
(_gcry_cast5_amd64_cbc_dec, _gcry_cast5_amd64_cfb_dec): New prototypes.
[USE_AMD64_ASM] (do_encrypt_block, do_decrypt_block, encrypt_block)
(decrypt_block): New functions.
(_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec, _gcry_cast5_cfb_dec)
(selftest_ctr, selftest_cbc, selftest_cfb): New functions.
(selftest): Call new bulk selftests.
* cipher/cipher.c (gcry_cipher_open) [USE_CAST5]: Register CAST5 bulk
functions for ctr-enc, cbc-dec and cfb-dec.
* configure.ac (cast5) [x86_64]: Add 'cast5-amd64.lo'.
* src/cipher.h (_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec)
(_gcry_cast5_cfb_dec): New prototypes.
--
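
The ChangeLog entry for cipher/cast5.c describes merging s1..s4 into one array
so that assembly can reach every table from a single symbol. A minimal sketch
of that layout idea follows. The names sbox_all, fill_placeholder_tables and
lookup_via_base are hypothetical, and the table contents are placeholders, not
the CAST5 S-boxes; the actual definition in cast5.c may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _gcry_cast5_s1tos4: four 256-entry tables kept
   in one contiguous object.  Values are placeholders, NOT CAST5 S-boxes. */
static uint32_t sbox_all[4][256];

/* Aliases in the spirit of the s1..s4 macros: existing C code can keep
   writing s1[i] while the storage is a single array. */
#define s1 sbox_all[0]
#define s2 sbox_all[1]
#define s3 sbox_all[2]
#define s4 sbox_all[3]

/* Fill every table with a recognisable placeholder: (table << 8) | index. */
static void fill_placeholder_tables(void)
{
  for (size_t t = 0; t < 4; t++)
    for (size_t i = 0; i < 256; i++)
      sbox_all[t][i] = (uint32_t)((t << 8) | i);
}

/* Model of an assembly-style access: one base address plus a byte offset
   of (table * 256 + index) * 4 selects the entry. */
static uint32_t lookup_via_base(size_t table, uint8_t index)
{
  const unsigned char *base = (const unsigned char *)sbox_all;
  uint32_t value;

  memcpy(&value, base + (table * 256 + index) * sizeof(uint32_t),
         sizeof value);
  return value;
}
```

With this layout, lookup_via_base(2, 5) and s3[5] read the same entry, so the
C and assembly code paths can share one table.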

Provides non-parallel implementations for a small speed-up and 4-way parallel
implementations that are accelerated on `out-of-order' CPUs.
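
Why bulk CBC decryption lends itself to a 4-way implementation can be sketched
in C. In CBC decryption each block-cipher call depends only on ciphertext, so
four calls can be issued back to back, whereas CBC encryption must wait for the
previous output block. The code below is an illustration under stated
assumptions: toy_encrypt and toy_decrypt are a hypothetical placeholder cipher
(not CAST5), cbc_decrypt_4way is not the commit's assembly routine, and input
and output buffers are assumed not to overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 8  /* CAST5 operates on 8-byte blocks */

/* Hypothetical toy cipher, NOT CAST5: XOR with the key, then add 1. */
static void toy_encrypt(const uint8_t *key, uint8_t *out, const uint8_t *in)
{
  for (size_t i = 0; i < BLOCK; i++)
    out[i] = (uint8_t)((in[i] ^ key[i]) + 1);
}

/* Inverse of toy_encrypt: subtract 1, then XOR with the key. */
static void toy_decrypt(const uint8_t *key, uint8_t *out, const uint8_t *in)
{
  for (size_t i = 0; i < BLOCK; i++)
    out[i] = (uint8_t)((uint8_t)(in[i] - 1) ^ key[i]);
}

/* CBC encryption is inherently serial: block n needs ciphertext n-1. */
static void cbc_encrypt(const uint8_t *key, uint8_t *iv, uint8_t *out,
                        const uint8_t *in, size_t nblocks)
{
  uint8_t tmp[BLOCK];

  for (size_t n = 0; n < nblocks; n++)
    {
      for (size_t i = 0; i < BLOCK; i++)
        tmp[i] = in[n * BLOCK + i] ^ iv[i];
      toy_encrypt(key, out + n * BLOCK, tmp);
      memcpy(iv, out + n * BLOCK, BLOCK);
    }
}

/* CBC decryption of exactly four blocks.  The four toy_decrypt calls have
   no data dependency on each other; the chaining XOR happens afterwards. */
static void cbc_decrypt_4way(const uint8_t *key, uint8_t *iv, uint8_t *out,
                             const uint8_t *in)
{
  uint8_t tmp[4][BLOCK];

  for (size_t b = 0; b < 4; b++)      /* independent cipher calls */
    toy_decrypt(key, tmp[b], in + b * BLOCK);

  for (size_t b = 0; b < 4; b++)      /* XOR with previous ciphertext */
    {
      const uint8_t *prev = (b == 0) ? iv : in + (b - 1) * BLOCK;
      for (size_t i = 0; i < BLOCK; i++)
        out[b * BLOCK + i] = tmp[b][i] ^ prev[i];
    }

  memcpy(iv, in + 3 * BLOCK, BLOCK);  /* next IV is the last ciphertext */
}
```

The same independence holds for CFB decryption and CTR mode, which matches the
three bulk functions the commit registers (ctr-enc, cbc-dec, cfb-dec).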

Speed old vs. new on AMD Phenom II X6 1055T:
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAST5         1.23x   1.22x   1.21x   2.86x   1.21x   2.83x   1.22x   1.17x   2.73x   2.73x

Speed old vs. new on Intel Core i5-2450M (Sandy-Bridge):
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAST5         1.00x   1.04x   1.06x   2.56x   1.06x   2.37x   1.03x   1.01x   2.43x   2.41x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
cipher/Makefile.am
cipher/cast5-amd64.S [new file with mode: 0644]
cipher/cast5.c
cipher/cipher.c
configure.ac
src/cipher.h