blowfish: add three rounds parallel handling to generic C implementation
author Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sun, 31 Mar 2019 15:30:25 +0000 (18:30 +0300)
committer Jussi Kivilinna <jussi.kivilinna@iki.fi>
Sun, 31 Mar 2019 15:45:28 +0000 (18:45 +0300)
commit ced7508c857c0cc37da2299a393e5b167dd28e54
tree 05a357921e5e3b2ea7d1d88805facf01a216245e
parent 4ec566b3689eff4a712eacfcbb4161eb243bb1df

* cipher/blowfish.c (BLOWFISH_ROUNDS): Remove.
[BLOWFISH_ROUNDS != 16] (function_F): Remove.
(F): Replace big-endian and little-endian versions with a single
endian-neutral version.
(R3, do_encrypt_3, do_decrypt_3): New.
(_gcry_blowfish_ctr_enc, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_cfb_dec): Use new three block functions.
--
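
The idea behind the entries above can be illustrated with a minimal
sketch. This is not the code from cipher/blowfish.c: toy_bf_ctx,
toy_bf_init, toy_F, toy_encrypt_1, toy_encrypt_3 and toy_selfcheck are
hypothetical names, and the tables are filled from a simple generator
instead of the Blowfish key schedule. It shows an F function that
selects bytes with shifts and masks only, so one version serves both
byte orders, and a three-block routine that runs each round on three
independent blocks before moving to the next round, presumably so the
CPU can overlap the S-box lookups.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy context; real Blowfish derives these tables from the key. */
typedef struct {
  uint32_t p[18];
  uint32_t s[4][256];
} toy_bf_ctx;

/* Fill the tables from a simple LCG.  NOT the Blowfish key schedule. */
static void toy_bf_init(toy_bf_ctx *c, uint32_t seed)
{
  uint32_t x = seed;
  int i, j;

  for (i = 0; i < 18; i++)
    c->p[i] = (x = x * 1664525u + 1013904223u);
  for (j = 0; j < 4; j++)
    for (i = 0; i < 256; i++)
      c->s[j][i] = (x = x * 1664525u + 1013904223u);
}

/* Endian-neutral F: bytes are picked with shifts and masks, never by
   reading the word through a byte pointer. */
static uint32_t toy_F(const toy_bf_ctx *c, uint32_t x)
{
  return ((c->s[0][x >> 24] + c->s[1][(x >> 16) & 0xff])
          ^ c->s[2][(x >> 8) & 0xff]) + c->s[3][x & 0xff];
}

/* One block: 16 Feistel rounds, two per loop iteration. */
static void toy_encrypt_1(const toy_bf_ctx *c, uint32_t *l, uint32_t *r)
{
  uint32_t xl = *l, xr = *r;
  int i;

  for (i = 0; i < 16; i += 2)
    {
      xl ^= c->p[i];     xr ^= toy_F(c, xl);
      xr ^= c->p[i + 1]; xl ^= toy_F(c, xr);
    }
  *l = xr ^ c->p[17];
  *r = xl ^ c->p[16];
}

/* Three blocks: the same rounds, but each round is applied to all
   three blocks before the next round starts. */
static void toy_encrypt_3(const toy_bf_ctx *c, uint32_t l[3], uint32_t r[3])
{
  uint32_t xl[3], xr[3];
  int i, k;

  for (k = 0; k < 3; k++)
    {
      xl[k] = l[k];
      xr[k] = r[k];
    }
  for (i = 0; i < 16; i += 2)
    {
      for (k = 0; k < 3; k++)
        { xl[k] ^= c->p[i];     xr[k] ^= toy_F(c, xl[k]); }
      for (k = 0; k < 3; k++)
        { xr[k] ^= c->p[i + 1]; xl[k] ^= toy_F(c, xr[k]); }
    }
  for (k = 0; k < 3; k++)
    {
      l[k] = xr[k] ^ c->p[17];
      r[k] = xl[k] ^ c->p[16];
    }
}

/* Check that the three-block path matches three one-block calls. */
static int toy_selfcheck(uint32_t seed)
{
  toy_bf_ctx c;
  uint32_t l1[3] = { 0x01234567u, 0x89abcdefu, 0xdeadbeefu };
  uint32_t r1[3] = { 0x00000000u, 0xffffffffu, 0x0badf00du };
  uint32_t l3[3], r3[3];
  int k;

  toy_bf_init(&c, seed);
  for (k = 0; k < 3; k++)
    {
      l3[k] = l1[k];
      r3[k] = r1[k];
      toy_encrypt_1(&c, &l1[k], &r1[k]);
    }
  toy_encrypt_3(&c, l3, r3);
  for (k = 0; k < 3; k++)
    {
      assert(l1[k] == l3[k]);
      assert(r1[k] == r3[k]);
    }
  return 1;
}
```

Calling toy_selfcheck with any seed confirms that both paths produce
the same output. Interleaving only helps where the block cipher inputs
do not depend on each other, which is the case for CTR encryption and
for CBC and CFB decryption.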

Benchmark on aarch64 (Cortex-A53, 816 MHz):

Before:
 BLOWFISH       |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC dec |     29.58 ns/B     32.24 MiB/s     24.13 c/B
        CFB dec |     33.38 ns/B     28.57 MiB/s     27.24 c/B
        CTR enc |     34.18 ns/B     27.90 MiB/s     27.89 c/B
After (~60%-70% faster):
 BLOWFISH       |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC dec |     18.18 ns/B     52.45 MiB/s     14.84 c/B
        CFB dec |     19.67 ns/B     48.50 MiB/s     16.05 c/B
        CTR enc |     19.77 ns/B     48.25 MiB/s     16.13 c/B

Benchmark on i386 (Haswell, 4000 MHz):

Before:
 BLOWFISH       |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC dec |      6.10 ns/B     156.4 MiB/s     24.39 c/B
        CFB dec |      6.39 ns/B     149.2 MiB/s     25.56 c/B
        CTR enc |      6.73 ns/B     141.6 MiB/s     26.93 c/B
After (~80% faster):
 BLOWFISH       |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC dec |      3.46 ns/B     275.5 MiB/s     13.85 c/B
        CFB dec |      3.53 ns/B     270.4 MiB/s     14.11 c/B
        CTR enc |      3.56 ns/B     268.0 MiB/s     14.23 c/B
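
As a cross-check of the tables above, the following sketch shows how
the "faster" percentages and the cycles/byte column relate to the
nanosecs/byte column. The helper names percent_faster and
cycles_per_byte are hypothetical and not part of any benchmark tool;
the formulas assume that "faster" means throughput gain and that the
stated clock frequency is exact.

```c
#include <assert.h>

/* Throughput gain in percent when time per byte drops from
   before_ns_per_byte to after_ns_per_byte. */
static double percent_faster(double before_ns_per_byte,
                             double after_ns_per_byte)
{
  assert(after_ns_per_byte > 0.0);
  return (before_ns_per_byte / after_ns_per_byte - 1.0) * 100.0;
}

/* Cycles per byte from nanoseconds per byte and the clock in MHz:
   ns/B * (cycles per ns), where cycles per ns = MHz / 1000. */
static double cycles_per_byte(double ns_per_byte, double clock_mhz)
{
  return ns_per_byte * clock_mhz / 1000.0;
}
```

For the aarch64 CBC dec row, percent_faster(29.58, 18.18) gives roughly
63%, and cycles_per_byte(29.58, 816) gives about 24.1, in line with the
table.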

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>