Optimize OCB offset calculation
author Jussi Kivilinna <jussi.kivilinna@iki.fi>
Mon, 10 Aug 2015 19:09:56 +0000 (22:09 +0300)
committer Jussi Kivilinna <jussi.kivilinna@iki.fi>
Mon, 10 Aug 2015 19:09:56 +0000 (22:09 +0300)
commit 49f52c67fb42c0656c8f9af655087f444562ca82
tree 2ef935a60649db8d61b3e1f36982788a15a10506
parent ce746936b6c210e602d106cfbf45cf60b408d871

* cipher/cipher-internal.h (ocb_get_l): New.
* cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate)
(ocb_crypt): Use 'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/camellia-glue.c (get_l): Remove.
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/rijndael-aesni.c (get_l): Add fast path for the 75% most
common offsets.
(aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Precalculate
offset array when block count matches parallel operation size.
* cipher/rijndael-ssse3-amd64.c (get_l): Add fast path for the 75%
most common offsets.
* cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Use
'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/serpent.c (get_l): Remove.
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/twofish.c (get_l): Remove.
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Use 'ocb_get_l'
instead of 'get_l'.
--

This patch optimizes the OCB offset calculation for the generic code
and for the assembly implementations with parallel block processing.
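
The two ideas named in the changelog can be illustrated with a minimal
sketch. It is not the code from this patch: every 'sketch_' name, the
table size of 16 and the chunk size of 8 blocks are assumptions made
for illustration only. OCB picks the offset mask for block index n as
L[ntz(n)], where ntz counts trailing zero bits. Half of all indices are
odd (ntz = 0) and another quarter have bit 1 set (ntz = 1), so two bit
tests presumably cover the 75% of offsets mentioned above. Likewise,
when a chunk of 8 blocks starts at a block count that is a multiple of
8, the first seven ntz values are always 0,1,0,2,0,1,0, so only the
last table pointer has to be looked up per chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE    16  /* OCB block size in bytes */
#define SKETCH_L_TABLE_SIZE  16  /* hypothetical number of precomputed L values */
#define SKETCH_PARALLEL       8  /* hypothetical parallel chunk size, in blocks */

/* Hypothetical stand-in for the cipher handle's table L_0, L_1, ...  */
typedef struct
{
  unsigned char L[SKETCH_L_TABLE_SIZE][SKETCH_BLOCK_SIZE];
} ocb_sketch_t;

/* Number of trailing zero bits of a nonzero block index.  */
static unsigned int
sketch_ntz (uint64_t n)
{
  unsigned int count = 0;

  assert (n != 0);
  while ((n & 1) == 0)
    {
      n >>= 1;
      count++;
    }
  return count;
}

/* Return L_{ntz(n)}.  Odd n (50% of indices) and n with bit 1 set
   (another 25%) skip the trailing-zero count entirely.  Returns NULL
   when ntz(n) falls outside the table; real code would need a slow
   path there.  */
static const unsigned char *
sketch_get_l (const ocb_sketch_t *c, uint64_t n)
{
  unsigned int ntz;

  if (n & 1)
    return c->L[0];
  if (n & 2)
    return c->L[1];

  ntz = sketch_ntz (n);
  return ntz < SKETCH_L_TABLE_SIZE ? c->L[ntz] : NULL;
}

/* Fill the chunk-independent part of the pointer array once.  The
   pattern below is written out for a chunk size of exactly 8.  */
static void
sketch_precalc_ls (const ocb_sketch_t *c,
                   const unsigned char *Ls[SKETCH_PARALLEL])
{
  size_t i;

  for (i = 0; i < SKETCH_PARALLEL; i += 4)
    {
      Ls[i + 0] = c->L[0];
      Ls[i + 1] = c->L[1];
      Ls[i + 2] = c->L[0];
    }
  Ls[3] = c->L[2];
  Ls[SKETCH_PARALLEL - 1] = NULL;  /* depends on chunk position */
}

/* Per chunk, only the last pointer has to be looked up.  BLKN is the
   number of blocks processed so far and must be a multiple of 8.  */
static void
sketch_set_last_l (const ocb_sketch_t *c,
                   const unsigned char *Ls[SKETCH_PARALLEL], uint64_t blkn)
{
  assert (blkn % SKETCH_PARALLEL == 0);
  Ls[SKETCH_PARALLEL - 1] = sketch_get_l (c, blkn + SKETCH_PARALLEL);
}
```

A caller would run 'sketch_precalc_ls' once per call and
'sketch_set_last_l' once per chunk, falling back to per-block
'sketch_get_l' lookups when the block count is not aligned to the
chunk size.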

Benchmark of OCB AES-NI on Intel Haswell:

 $ tests/bench-slope --cpu-mhz 3201 cipher aes

 Before:
  AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
         CTR enc |     0.274 ns/B    3483.9 MiB/s     0.876 c/B
         CTR dec |     0.273 ns/B    3490.0 MiB/s     0.875 c/B
         OCB enc |     0.289 ns/B    3296.1 MiB/s     0.926 c/B
         OCB dec |     0.299 ns/B    3189.9 MiB/s     0.957 c/B
        OCB auth |     0.260 ns/B    3670.0 MiB/s     0.832 c/B

 After:
  AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
         CTR enc |     0.273 ns/B    3489.4 MiB/s     0.875 c/B
         CTR dec |     0.273 ns/B    3487.5 MiB/s     0.875 c/B
         OCB enc |     0.248 ns/B    3852.8 MiB/s     0.792 c/B
         OCB dec |     0.261 ns/B    3659.5 MiB/s     0.834 c/B
        OCB auth |     0.227 ns/B    4205.5 MiB/s     0.726 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
cipher/camellia-glue.c
cipher/cipher-internal.h
cipher/cipher-ocb.c
cipher/rijndael-aesni.c
cipher/rijndael-ssse3-amd64.c
cipher/rijndael.c
cipher/serpent.c
cipher/twofish.c