libgcrypt.git
2 years agoUpdate NEWS with release info from 1.7.4 to 1.7.6.
Werner Koch [Fri, 27 Jan 2017 08:13:07 +0000 (09:13 +0100)]
Update NEWS with release info from 1.7.4 to 1.7.6.

--

2 years agorijndael-ssse3-amd64: fix building on x32
Jussi Kivilinna [Mon, 23 Jan 2017 18:01:32 +0000 (20:01 +0200)]
rijndael-ssse3-amd64: fix building on x32

* cipher/rijndael-ssse3-amd64.c: Use 64-bit call instructions
with 64-bit registers.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agobufhelp: use 'may_alias' attribute unaligned pointer types
Jussi Kivilinna [Mon, 23 Jan 2017 17:48:28 +0000 (19:48 +0200)]
bufhelp: use 'may_alias' attribute unaligned pointer types

* configure.ac (gcry_cv_gcc_attribute_may_alias)
(HAVE_GCC_ATTRIBUTE_MAY_ALIAS): New check for 'may_alias' attribute.
* cipher/bufhelp.h (BUFHELP_FAST_UNALIGNED_ACCESS): Enable only if
HAVE_GCC_ATTRIBUTE_MAY_ALIAS is defined.
[BUFHELP_FAST_UNALIGNED_ACCESS] (bufhelp_int_t, bufhelp_u32_t)
(bufhelp_u64_t): Add 'may_alias' attribute.
* src/g10lib.h (fast_wipememory_t): Add HAVE_GCC_ATTRIBUTE_MAY_ALIAS
defined check; Add 'may_alias' attribute.
--

Attribute 'may_alias' was missing from bufhelp unaligned memory access
pointer types, and was causing problems with newer GCC versions (with
more aggressive optimization). This patch fixes broken Camellia-CFB
with '-O3 -flto' flags with GCC-6 on x86-64 and generic GCM with
default '-O2' on x32.
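
As an illustration of the pattern described above, here is a minimal
sketch of an unaligned, aliasing-safe word type. The names
'u32_alias_t', 'FAST_UNALIGNED' and 'xor_buf' are hypothetical and
not taken from bufhelp.h; the sketch assumes a GCC-compatible
compiler and falls back to a byte loop otherwise.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__)
/* Hypothetical type modelled on the idea behind bufhelp_u32_t:
   'packed' permits unaligned addresses, 'may_alias' exempts accesses
   through this type from strict-aliasing assumptions. */
typedef struct { uint32_t v; } __attribute__((packed, may_alias)) u32_alias_t;
# define FAST_UNALIGNED 1
#endif

/* XOR LEN bytes of SRC into DST; the buffers may be unaligned. */
void xor_buf(void *dst, const void *src, size_t len)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

#ifdef FAST_UNALIGNED
  /* Word-wise fast path, only used when 'may_alias' is available. */
  while (len >= sizeof(u32_alias_t))
    {
      ((u32_alias_t *)d)->v ^= ((const u32_alias_t *)s)->v;
      d += sizeof(u32_alias_t);
      s += sizeof(u32_alias_t);
      len -= sizeof(u32_alias_t);
    }
#endif
  /* Byte-wise tail, and generic path without the attribute. */
  while (len--)
    *d++ ^= *s++;
}
```

Without 'may_alias', the compiler may assume that the word-wise
stores cannot modify objects of other types and reorder or drop
accesses accordingly.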

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agorandom: Call getrandom before select and emitting a progress callback.
Werner Koch [Wed, 18 Jan 2017 09:24:06 +0000 (10:24 +0100)]
random: Call getrandom before select and emitting a progress callback.

* random/rndlinux.c (_gcry_rndlinux_gather_random): Move the getrandom
call before the select.
--

A select for getrandom does not make any sense because there is no
file descriptor for getrandom.  Thus if getrandom is available we now
select only when we want to read from the blocking /dev/random.  In
most cases this avoids all progress callbacks.
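
The ordering described above can be sketched as follows. This is not
the code from rndlinux.c: 'gather_random' is a hypothetical helper,
the 3 second timeout is an assumption, and the progress callback is
only marked by a comment. It assumes a libc that provides
<sys/random.h> on Linux.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __linux__
# include <sys/random.h>
#endif

/* Fill BUF with LEN random bytes.  Returns 0 on success, -1 on error. */
int gather_random(unsigned char *buf, size_t len)
{
  size_t got = 0;
  int use_getrandom = 1;
  int fd = -1;
  ssize_t n;

  while (got < len)
    {
#ifdef __linux__
      if (use_getrandom)
        {
          /* No file descriptor is involved, so no select is needed. */
          n = getrandom(buf + got, len - got, 0);
          if (n > 0)
            {
              got += (size_t)n;
              continue;
            }
          if (n < 0 && errno == EINTR)
            continue;
          use_getrandom = 0;    /* Unavailable: fall back to the device. */
        }
#else
      use_getrandom = 0;
#endif
      if (fd == -1 && (fd = open("/dev/random", O_RDONLY)) == -1)
        return -1;

      /* Only the blocking device is worth a select. */
      fd_set rfds;
      struct timeval tv = { 3, 0 };
      FD_ZERO(&rfds);
      FD_SET(fd, &rfds);
      int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
      if (rc == 0 || (rc < 0 && errno == EINTR))
        continue;               /* A progress callback would go here. */
      if (rc < 0)
        break;

      n = read(fd, buf + got, len - got);
      if (n > 0)
        got += (size_t)n;
      else if (n == 0 || errno != EINTR)
        break;
    }

  if (fd != -1)
    close(fd);
  return got == len ? 0 : -1;
}
```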

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agompi: amd64: fix too large jump alignment in mpih-rshift
Jussi Kivilinna [Wed, 4 Jan 2017 20:30:26 +0000 (22:30 +0200)]
mpi: amd64: fix too large jump alignment in mpih-rshift

* mpi/amd64/mpih-rshift.S (_gcry_mpih_rshift): Use 16-byte alignment
with 'ALIGN(4)' instead of 256-byte.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agorijndael-ssse3: move assembly functions to separate source-file
Jussi Kivilinna [Wed, 4 Jan 2017 17:16:26 +0000 (19:16 +0200)]
rijndael-ssse3: move assembly functions to separate source-file

* cipher/Makefile.am: Add 'rijndael-ssse3-amd64-asm.S'.
* cipher/rijndael-ssse3-amd64-asm.S: Moved assembly functions
here ...
* cipher/rijndael-ssse3-amd64.c: ... from this file.
(_gcry_aes_ssse3_enc_preload, _gcry_aes_ssse3_dec_preload)
(_gcry_aes_ssse3_schedule_core, _gcry_aes_ssse3_encrypt_core)
(_gcry_aes_ssse3_decrypt_core): New.
(vpaes_ssse3_prepare_enc, vpaes_ssse3_prepare_dec)
(_gcry_aes_ssse3_do_setkey, _gcry_aes_ssse3_prepare_decryption)
(do_vpaes_ssse3_enc, do_vpaes_ssse3_dec): Update to use external
assembly functions; remove 'aes_const_ptr' variable usage.
(_gcry_aes_ssse3_encrypt, _gcry_aes_ssse3_decrypt)
(_gcry_aes_ssse3_cfb_enc, _gcry_aes_ssse3_cbc_enc)
(_gcry_aes_ssse3_ctr_enc, _gcry_aes_ssse3_cfb_dec)
(_gcry_aes_ssse3_cbc_dec, ssse3_ocb_enc, ssse3_ocb_dec)
(_gcry_aes_ssse3_ocb_auth): Remove 'aes_const_ptr' variable usage.
* configure.ac: Add 'rijndael-ssse3-amd64-asm.lo'.
--

After this change, libgcrypt can be compiled with -flto optimization
enabled on x86-64.

GnuPG-bug-id: 2882
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agoAdd AVX2/vpgather bulk implementation of Twofish
Jussi Kivilinna [Wed, 4 Jan 2017 08:18:36 +0000 (10:18 +0200)]
Add AVX2/vpgather bulk implementation of Twofish

* cipher/Makefile.am: Add 'twofish-avx2-amd64.S'.
* cipher/twofish-avx2-amd64.S: New.
* cipher/twofish.c (USE_AVX2): New.
(TWOFISH_context) [USE_AVX2]: Add 'use_avx2' member.
(ASM_FUNC_ABI): New.
(twofish_setkey): Add check for AVX2 and fast VPGATHER HW features.
(_gcry_twofish_avx2_ctr_enc, _gcry_twofish_avx2_cbc_dec)
(_gcry_twofish_avx2_cfb_dec, _gcry_twofish_avx2_ocb_enc)
(_gcry_twofish_avx2_ocb_dec, _gcry_twofish_avx2_ocb_auth): New.
(_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec, _gcry_twofish_cfb_dec)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Add AVX2 bulk
handling.
(selftest_ctr, selftest_cbc, selftest_cfb): Increase nblocks from
3+X to 16+X.
* configure.ac: Add 'twofish-avx2-amd64.lo'.
* src/g10lib.h (HWF_INTEL_FAST_VPGATHER): New.
* src/hwf-x86.c (detect_x86_gnuc): Add detection for
HWF_INTEL_FAST_VPGATHER.
* src/hwfeatures.c (HWF_INTEL_FAST_VPGATHER): Add
"intel-fast-vpgather" for HWF_INTEL_FAST_VPGATHER.
--

Benchmark on Intel Core i3-6100 (3.7 GHz):

Before:
 TWOFISH        |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      4.25 ns/B     224.5 MiB/s     15.71 c/B
        ECB dec |      4.16 ns/B     229.5 MiB/s     15.38 c/B
        CBC enc |      4.53 ns/B     210.4 MiB/s     16.77 c/B
        CBC dec |      2.71 ns/B     351.6 MiB/s     10.04 c/B
        CFB enc |      4.60 ns/B     207.3 MiB/s     17.02 c/B
        CFB dec |      2.70 ns/B     353.5 MiB/s      9.98 c/B
        OFB enc |      4.25 ns/B     224.2 MiB/s     15.74 c/B
        OFB dec |      4.24 ns/B     225.0 MiB/s     15.68 c/B
        CTR enc |      2.72 ns/B     350.6 MiB/s     10.06 c/B
        CTR dec |      2.72 ns/B     350.7 MiB/s     10.06 c/B
        CCM enc |      7.25 ns/B     131.5 MiB/s     26.83 c/B
        CCM dec |      7.25 ns/B     131.5 MiB/s     26.83 c/B
       CCM auth |      4.57 ns/B     208.9 MiB/s     16.89 c/B
        GCM enc |      3.02 ns/B     315.3 MiB/s     11.19 c/B
        GCM dec |      3.02 ns/B     315.6 MiB/s     11.18 c/B
       GCM auth |     0.297 ns/B    3208.4 MiB/s      1.10 c/B
        OCB enc |      2.73 ns/B     349.7 MiB/s     10.09 c/B
        OCB dec |      2.82 ns/B     338.3 MiB/s     10.43 c/B
       OCB auth |      2.77 ns/B     343.7 MiB/s     10.27 c/B

After (CBC-dec & CFB-dec & CTR & OCB, ~1.5x faster):
 TWOFISH        |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      4.25 ns/B     224.2 MiB/s     15.74 c/B
        ECB dec |      4.15 ns/B     229.5 MiB/s     15.37 c/B
        CBC enc |      4.61 ns/B     206.8 MiB/s     17.06 c/B
        CBC dec |      1.75 ns/B     544.0 MiB/s      6.49 c/B
        CFB enc |      4.52 ns/B     211.0 MiB/s     16.72 c/B
        CFB dec |      1.72 ns/B     554.1 MiB/s      6.37 c/B
        OFB enc |      4.27 ns/B     223.3 MiB/s     15.80 c/B
        OFB dec |      4.28 ns/B     222.7 MiB/s     15.84 c/B
        CTR enc |      1.73 ns/B     549.9 MiB/s      6.42 c/B
        CTR dec |      1.75 ns/B     545.1 MiB/s      6.47 c/B
        CCM enc |      6.31 ns/B     151.2 MiB/s     23.34 c/B
        CCM dec |      6.42 ns/B     148.5 MiB/s     23.76 c/B
       CCM auth |      4.56 ns/B     208.9 MiB/s     16.89 c/B
        GCM enc |      1.90 ns/B     502.8 MiB/s      7.02 c/B
        GCM dec |      2.00 ns/B     477.8 MiB/s      7.38 c/B
       GCM auth |     0.300 ns/B    3178.6 MiB/s      1.11 c/B
        OCB enc |      1.76 ns/B     542.2 MiB/s      6.51 c/B
        OCB dec |      1.76 ns/B     540.7 MiB/s      6.53 c/B
       OCB auth |      1.76 ns/B     542.8 MiB/s      6.50 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agoAdd XTS cipher mode
Jussi Kivilinna [Fri, 6 Jan 2017 10:48:17 +0000 (12:48 +0200)]
Add XTS cipher mode

* cipher/Makefile.am: Add 'cipher-xts.c'.
* cipher/cipher-internal.h (gcry_cipher_handle): Add 'bulk.xts_crypt'
and 'u_mode.xts' members.
(_gcry_cipher_xts_crypt): New prototype.
* cipher/cipher-xts.c: New.
* cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey)
(cipher_reset, cipher_encrypt, cipher_decrypt): Add XTS mode handling.
* doc/gcrypt.texi: Add XTS mode to documentation.
* src/gcrypt.h.in (GCRY_CIPHER_MODE_XTS, GCRY_XTS_BLOCK_LEN): New.
* tests/basic.c (do_check_xts_cipher, check_xts_cipher): New.
(check_bulk_cipher_modes): Add XTS test-vectors.
(check_one_cipher_core, check_one_cipher, check_ciphers): Add XTS
testing support.
(check_cipher_modes): Add XTS test.
* tests/bench-slope.c (bench_xts_encrypt_init)
(bench_xts_encrypt_do_bench, bench_xts_decrypt_do_bench)
(xts_encrypt_ops, xts_decrypt_ops): New.
(cipher_modes, cipher_bench_one): Add XTS.
* tests/benchmark.c (cipher_bench): Add XTS testing.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agorijndael-ssse3: fix counter operand from read-only to read/write
Jussi Kivilinna [Wed, 4 Jan 2017 10:02:36 +0000 (12:02 +0200)]
rijndael-ssse3: fix counter operand from read-only to read/write

* cipher/rijndael-ssse3-amd64.c (_gcry_aes_ssse3_ctr_enc): Change
'ctrlow' operand from read-only to read-write.
--

With a read-only operand, the compiler is allowed to pass a temporary
register to the assembly block and throw away any calculation that
has been done on that register. On the other hand, the compiler is
also allowed to keep the operand value permanently in one register,
since the value is treated as read-only, in which case the code
effectively operates as expected. Which of the two happens depends on
the compiler version and the flags used.
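
A minimal sketch of the difference, assuming GCC-style extended asm
on x86-64 (other targets take a plain C fallback). 'bump_counter' is
a hypothetical helper and not the code from
_gcry_aes_ssse3_ctr_enc.

```c
#include <stdint.h>

/* Increment *CTR inside an asm block and return the new value. */
uint64_t bump_counter(uint64_t *ctr)
{
  uint64_t c = *ctr;

#if defined(__GNUC__) && defined(__x86_64__)
  /* "+r" declares C as read-write, so the compiler knows the asm
     changes it.  With an input-only "r" operand the compiler would
     be free to assume that C still holds its old value afterwards. */
  __asm__ ("addq $1, %0" : "+r" (c) : : "cc");
#else
  c += 1;
#endif

  *ctr = c;
  return c;
}
```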

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agoExtend GCRYCTL_PRINT_CONFIG to print compiler version.
Werner Koch [Tue, 3 Jan 2017 15:30:54 +0000 (16:30 +0100)]
Extend GCRYCTL_PRINT_CONFIG to print compiler version.

* src/global.c (print_config): Print version of libgpg-error and used
compiler.

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agotests: Add option --disable-hwf to the version utility.
Werner Koch [Tue, 3 Jan 2017 14:34:33 +0000 (15:34 +0100)]
tests: Add option --disable-hwf to the version utility.

* src/hwfeatures.c (_gcry_disable_hw_feature): Rewrite to allow
passing a colon delimited feature set.
(parse_hwf_deny_file): Remove unused var I.
* tests/version.c (main): Add options --verbose and --disable-hwf.

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agoAdd release info from 1.7.5
Werner Koch [Thu, 15 Dec 2016 08:49:47 +0000 (09:49 +0100)]
Add release info from 1.7.5

--

2 years agoFix regression in broken mlock detection.
Werner Koch [Thu, 15 Dec 2016 07:50:40 +0000 (08:50 +0100)]
Fix regression in broken mlock detection.

* acinclude.m4 (GNUPG_CHECK_MLOCK): Fix typo EGAIN->EAGAIN.
--

GnuPG-bug-id: 2870
Fixes-commit: 618b8978f46f4011c11512fd5f30c15e01652e2e
Co-authored-by: Nicolas Porcel <nicolasporcel06@gmail.com>
Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agotests: Check the result of all gcry_control operations.
Justus Winter [Tue, 13 Dec 2016 12:33:45 +0000 (13:33 +0100)]
tests: Check the result of all gcry_control operations.

--
Signed-off-by: Justus Winter <justus@g10code.com>
2 years agotests: Use common code for all tests.
Justus Winter [Tue, 13 Dec 2016 12:24:51 +0000 (13:24 +0100)]
tests: Use common code for all tests.

--
Also fix minor fallout from the format string warnings.

Signed-off-by: Justus Winter <justus@g10code.com>
2 years agotests: Rename 'show' to 'info'.
Justus Winter [Tue, 13 Dec 2016 11:20:32 +0000 (12:20 +0100)]
tests: Rename 'show' to 'info'.

--
Signed-off-by: Justus Winter <justus@g10code.com>
2 years agotests: Rename 'PGMNAME' to 'PGM'.
Justus Winter [Tue, 13 Dec 2016 11:12:26 +0000 (12:12 +0100)]
tests: Rename 'PGMNAME' to 'PGM'.

--
Signed-off-by: Justus Winter <justus@g10code.com>
2 years agotests: Rename 'errorcount' to 'error_count'.
Justus Winter [Tue, 13 Dec 2016 11:09:40 +0000 (12:09 +0100)]
tests: Rename 'errorcount' to 'error_count'.

--
Signed-off-by: Justus Winter <justus@g10code.com>
2 years agohwfeatures: add 'all' for disabling all hardware features
Jussi Kivilinna [Sat, 10 Dec 2016 10:29:12 +0000 (12:29 +0200)]
hwfeatures: add 'all' for disabling all hardware features

* .gitignore: Add 'tests/basic-disable-all-hwf'.
* configure.ac: Ditto.
* tests/Makefile.am: Ditto.
* src/hwfeatures.c (_gcry_disable_hw_feature): Match 'all' for
masking all HW features off.
(parse_hwf_deny_file): Use '_gcry_disable_hw_feature' for matching.
* tests/basic-disable-all-hwf.in: New.
--

Also add a new test to run 'basic' with all HWF disabled. With current
assembly implementations and build servers using new CPUs, generic
implementations are not being tested enough anymore and compiler
problems might go unnoticed.
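
The matching of 'all' within a colon delimited feature list can be
sketched like this. The feature table, its names ('feat-a' etc.) and
the function 'disable_features' are hypothetical and only illustrate
the idea; they are not the code from hwfeatures.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature table; names and bits are illustrative only. */
struct hwf_entry { unsigned int flag; const char *name; };
static const struct hwf_entry hwf_table[] = {
  { 1u << 0, "feat-a" },
  { 1u << 1, "feat-b" },
  { 1u << 2, "feat-c" },
};

/* Parse a colon delimited LIST of feature names and set the matching
   bits in *DISABLED.  'all' masks every feature off.  Empty items are
   ignored.  Returns 0 on success, -1 on an unknown name. */
int disable_features(const char *list, unsigned int *disabled)
{
  while (list && *list)
    {
      const char *end = strchr(list, ':');
      size_t n = end ? (size_t)(end - list) : strlen(list);
      size_t i;
      int found = 0;

      if (n == 3 && !strncmp(list, "all", 3))
        {
          *disabled = ~0u;
          found = 1;
        }
      else
        for (i = 0; i < sizeof hwf_table / sizeof hwf_table[0]; i++)
          if (strlen(hwf_table[i].name) == n
              && !strncmp(hwf_table[i].name, list, n))
            {
              *disabled |= hwf_table[i].flag;
              found = 1;
            }

      if (n && !found)
        return -1;
      list = end ? end + 1 : NULL;
    }
  return 0;
}
```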

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agotests/hashtest-256g: add missing executable extension for Win32
Jussi Kivilinna [Sat, 10 Dec 2016 10:29:12 +0000 (12:29 +0200)]
tests/hashtest-256g: add missing executable extension for Win32

* tests/hashtest-256g.in: Add @EXEEXT@.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agoOCB ARM CE: Move ocb_get_l handling to assembly part
Jussi Kivilinna [Sat, 10 Dec 2016 10:29:12 +0000 (12:29 +0200)]
OCB ARM CE: Move ocb_get_l handling to assembly part

* cipher/rijndael-armv8-aarch32-ce.S: Add OCB 'L_{ntz(i)}' calculation.
* cipher/rijndael-armv8-aarch64-ce.S: Ditto.
* cipher/rijndael-armv8-ce.c (_gcry_aes_ocb_enc_armv8_ce)
(_gcry_aes_ocb_dec_armv8_ce, _gcry_aes_ocb_auth_armv8_ce)
(ocb_crypt_fn_t): Updated arguments.
(_gcry_aes_armv8_ce_ocb_crypt, _gcry_aes_armv8_ce_ocb_auth): Remove
'ocb_get_l' handling and splitting input to 32 block chunks, instead
pass full buffers to assembly.
--

Performance on Cortex-A53 (AArch32):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        OCB enc |      1.63 ns/B     583.8 MiB/s      1.88 c/B
        OCB dec |      1.67 ns/B     572.1 MiB/s      1.92 c/B
       OCB auth |      1.33 ns/B     717.1 MiB/s      1.53 c/B

After (~12% faster):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        OCB enc |      1.47 ns/B     650.2 MiB/s      1.69 c/B
        OCB dec |      1.48 ns/B     644.5 MiB/s      1.70 c/B
       OCB auth |      1.19 ns/B     798.2 MiB/s      1.38 c/B

Performance on Cortex-A53 (AArch64):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        OCB enc |      1.29 ns/B     738.5 MiB/s      1.49 c/B
        OCB dec |      1.32 ns/B     723.5 MiB/s      1.52 c/B
       OCB auth |      1.15 ns/B     827.0 MiB/s      1.33 c/B

After (~8% faster):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        OCB enc |      1.21 ns/B     789.1 MiB/s      1.39 c/B
        OCB dec |      1.21 ns/B     789.2 MiB/s      1.39 c/B
       OCB auth |      1.10 ns/B     867.0 MiB/s      1.27 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agoOCB: Move large L handling from bottom to upper level
Jussi Kivilinna [Sat, 10 Dec 2016 10:29:12 +0000 (12:29 +0200)]
OCB: Move large L handling from bottom to upper level

* cipher/cipher-ocb.c (_gcry_cipher_ocb_get_l): Remove.
(ocb_get_L_big): New.
(_gcry_cipher_ocb_authenticate): L-big handling done in upper
processing loop, so that lower level never sees the case where
'aad_nblocks % 65536 == 0'; Add missing stack burn.
(ocb_aad_finalize): Add missing stack burn.
(ocb_crypt): L-big handling done in upper processing loop, so that
lower level never sees the case where 'data_nblocks % 65536 == 0'.
* cipher/cipher-internal.h (_gcry_cipher_ocb_get_l): Remove.
(ocb_get_l): Remove 'l_tmp' usage and simplify since input
is more limited now, 'N is not multiple of 65536'.
* cipher/rijndael-aesni.c (get_l): Remove.
(aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Remove
l_tmp; Use 'ocb_get_l'.
* cipher/rijndael-ssse3-amd64.c (get_l): Remove.
(ssse3_ocb_enc, ssse3_ocb_dec, _gcry_aes_ssse3_ocb_auth): Remove
l_tmp; Use 'ocb_get_l'.
* cipher/camellia-glue.c: Remove OCB l_tmp usage.
* cipher/rijndael-armv8-ce.c: Ditto.
* cipher/rijndael.c: Ditto.
* cipher/serpent.c: Ditto.
* cipher/twofish.c: Ditto.
--

Move large L value generation to the topmost level to simplify the
lower-level ocb_get_l, for greater performance and a simpler
implementation. This helps when implementing OCB in assembly, as
'ocb_get_l' no longer has a function call on the slow path.
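
The split between the two levels can be sketched as follows. The
table size of 16 and the helper names are assumptions for
illustration; this is not the code from cipher-ocb.c.

```c
#include <stdint.h>

#define L_TABLE_SIZE 16  /* Assumed: enough for n % 65536 != 0. */

/* Number of trailing zero bits of N; N must be non-zero. */
unsigned int ntz(uint64_t n)
{
  unsigned int count = 0;

  while (!(n & 1))
    {
      n >>= 1;
      count++;
    }
  return count;
}

/* Lower level: index into the precomputed L table for block N.  The
   caller guarantees N % 65536 != 0, so the result is always below
   L_TABLE_SIZE and no slow path is needed here. */
unsigned int l_index(uint64_t n)
{
  return ntz(n);
}

/* Upper level: count the blocks in [FIRST, FIRST + NBLOCKS) which
   need a separately computed "big" L value. */
uint64_t count_l_big(uint64_t first, uint64_t nblocks)
{
  uint64_t n, count = 0;

  for (n = first; n < first + nblocks; n++)
    if (n % 65536 == 0)
      count++;
  return count;
}
```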

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agoOCB: remove 'int64_t' usage
Jussi Kivilinna [Sat, 10 Dec 2016 10:29:12 +0000 (12:29 +0200)]
OCB: remove 'int64_t' usage

* cipher/cipher-ocb.c (double_block): Use alternative way to generate
sign-bit mask, without 'int64_t'.
--
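
One common way to build such a mask with unsigned arithmetic only is
sketched below. The block is assumed to be held as two 64-bit halves
in big-endian order; the function signature is hypothetical.

```c
#include <stdint.h>

/* Double a 128-bit block in GF(2^128), as used by OCB.  HI holds the
   most significant 64 bits, LO the least significant ones. */
void double_block(uint64_t *hi, uint64_t *lo)
{
  uint64_t l = *hi;
  uint64_t r = *lo;
  /* All ones if the top bit is set, else zero.  Unsigned negation
     replaces the arithmetic right shift of a signed 'int64_t'. */
  uint64_t mask = (uint64_t)0 - (l >> 63);

  l = (l << 1) | (r >> 63);
  r = (r << 1) ^ (mask & 0x87);

  *hi = l;
  *lo = r;
}
```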

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agorandom-drbg: use bufhelp function for big-endian store
Jussi Kivilinna [Sat, 10 Dec 2016 10:29:12 +0000 (12:29 +0200)]
random-drbg: use bufhelp function for big-endian store

* random/random-drbg.c (drbg_cpu_to_be32): Remove.
(drbg_ctr_df, drbg_hash_df): Use 'buf_put_be32' instead of
'drbg_cpu_to_be32'.
--
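
For reference, a big-endian store of this kind can be written
portably as below. 'put_be32' and 'get_be32' are stand-ins for
illustration and not the definitions from bufhelp.h.

```c
#include <stdint.h>

/* Store V at OUT in big-endian byte order; OUT may be unaligned. */
void put_be32(void *out, uint32_t v)
{
  unsigned char *p = out;

  p[0] = (unsigned char)(v >> 24);
  p[1] = (unsigned char)(v >> 16);
  p[2] = (unsigned char)(v >> 8);
  p[3] = (unsigned char)v;
}

/* Load a big-endian 32-bit value from IN; IN may be unaligned. */
uint32_t get_be32(const void *in)
{
  const unsigned char *p = in;

  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```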

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2 years agoAdd release info from 1.7.4
Werner Koch [Fri, 9 Dec 2016 14:57:33 +0000 (15:57 +0100)]
Add release info from 1.7.4

--

2 years agoImprove handling of mlock error codes.
Werner Koch [Fri, 9 Dec 2016 11:10:54 +0000 (12:10 +0100)]
Improve handling of mlock error codes.

* acinclude.m4 (GNUPG_CHECK_MLOCK): Check also for EAGAIN which is a
legitimate return code and does not indicate a broken mlock().
* src/secmem.c (lock_pool_pages): Test ERR instead of ERRNO which
could have been overwritten by cap_from_text et al.
--

  On FreeBSD, if there are not enough free pages, mlock() can return
  EAGAIN, as documented in mlock(2). That doesn't mean that mlock is
  broken. I suspect this same issue also exists on the other BSD's.

Suggested-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
This is (now) also true for Linux.
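
Both points can be sketched in a few lines: save errno right after
mlock(), and do not treat EAGAIN as a sign of a broken mlock(). The
helper names are hypothetical, and accepting EPERM as well is an
assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Return 1 if the result (RC, ERR) of an mlock() call suggests a
   broken mlock(), 0 otherwise. */
int mlock_looks_broken(int rc, int err)
{
  if (!rc)
    return 0;
  /* EAGAIN is legitimate: not enough lockable pages right now.
     Treating EPERM the same way is an assumption of this sketch. */
  if (err == EAGAIN || err == EPERM)
    return 0;
  return 1;
}

/* Try to lock N bytes at P.  Returns 0 unless mlock() looks broken. */
int lock_pages(void *p, size_t n)
{
  int rc = mlock(p, n);
  int err = rc ? errno : 0;  /* Save at once; later calls may change errno. */

  /* ... other library calls could run here and overwrite errno ... */

  return mlock_looks_broken(rc, err) ? -1 : 0;
}
```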

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agorandom: Eliminate unneeded memcpy invocations in the DRBG.
Stephan Mueller [Sat, 3 Dec 2016 18:18:01 +0000 (19:18 +0100)]
random: Eliminate unneeded memcpy invocations in the DRBG.

* random/random-drbg.c (drbg_hash): Remove arg 'outval' and return a
pointer instead.
(drbg_instantiate): Reduce size of scratchpad.
(drbg_hmac_update): Avoid use of scratch buffers for the hash.
(drbg_hmac_generate, drbg_hash_df): Ditto.
(drbg_hash_process_addtl): Ditto.
(drbg_hash_hashgen): Ditto.
(drbg_hash_generate): Ditto.

--
The gcry_md_read returns a pointer to the hash which can be directly
used instead of copying it into a scratch buffer. This eliminates a
number of memcpy invocations for HMAC and Hash DRBG and reduces the
memory footprint of the Hash DRBG by the block size of the used hash.

The performance increase is between 1 and 3 MB/s depending on the output
buffer size.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
ChangeLog entries above written by -wk.

2 years agorandom: Add performance improvements for the DRBG.
Stephan Mueller [Thu, 1 Dec 2016 16:15:10 +0000 (17:15 +0100)]
random: Add performance improvements for the DRBG.

* random/random-drbg.c (struct drbg_state_ops_s): New function
pointers 'crypto_init' and 'crypto_fini'.
(struct drbg_state_s): New fields 'priv_data', 'ctr_handle', and
'ctr_null'.
(drbg_hash_init, drbg_hash_fini): New.
(drbg_hmac_init, drbg_hmac_setkey): New.
(drbg_sym_fini, drbg_sym_init, drbg_sym_setkey): New.
(drbg_sym_ctr): New.
(drbg_ctr_bcc): Set the key.
(drbg_ctr_df): Ditto.
(drbg_hmac_update): Ditto.
(drbg_hmac_generate): Replace drbg_hmac by drbg_hash.
(drbg_hash_df): Ditto.
(drbg_hash_process_addtl): Ditto.
(drbg_hash_hashgen): Ditto.
(drbg_ctr_update): Rework.
(drbg_ctr_generate): Rework.
(drbg_ctr_ops): Init new function pointers.
(drbg_uninstantiate): Call fini function.
(drbg_instantiate): Call init function.

--
The performance improvements can be categorized as follows:

* Initialize the cipher handle of the backend ciphers once and re-use
  them for subsequent cipher invocations.

* Limit the invocation of setkey to the cases when the key is newly
  created.

* Use the AES CTR mode and rip out the counter maintenance in the DRBG
  code. This allows the use of accelerated CTR AES implementations. To
  use the CTR AES mode, a NULL buffer is created that is used as the
  "plaintext" to the CTR mode, because the DRBG CTR AES operation is the
  result of the encryption of the CTR (i.e. the NULL buffer makes the
  final XOR of the CTR AES mode a noop).
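
The last point can be checked with a small sketch. 'toy_encrypt' is
a toy stand-in for a block cipher (it is not AES and not secure) and
all names are hypothetical; the point is only that CTR-encrypting a
zero buffer yields the encrypted counter blocks themselves.

```c
#include <stddef.h>
#include <string.h>

#define BLK 16

/* Toy stand-in for a block cipher.  NOT secure, illustration only. */
void toy_encrypt(const unsigned char key[BLK], const unsigned char in[BLK],
                 unsigned char out[BLK])
{
  for (int i = 0; i < BLK; i++)
    out[i] = (unsigned char)((in[i] ^ key[i]) * 167u + i);
}

/* Increment the counter block as a big-endian integer. */
void ctr_inc(unsigned char ctr[BLK])
{
  for (int i = BLK - 1; i >= 0; i--)
    if (++ctr[i])
      break;
}

/* Generic CTR mode: OUT = IN xor E(CTR), E(CTR+1), ...  */
void ctr_crypt(const unsigned char key[BLK], unsigned char ctr[BLK],
               const unsigned char *in, unsigned char *out, size_t nblocks)
{
  unsigned char ks[BLK];

  for (size_t b = 0; b < nblocks; b++)
    {
      toy_encrypt(key, ctr, ks);
      for (int i = 0; i < BLK; i++)
        out[b * BLK + i] = in[b * BLK + i] ^ ks[i];
      ctr_inc(ctr);
    }
}
```

With an all-zero input the XOR does nothing, so the output equals
the keystream, which is what the CTR DRBG needs.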

The following timing measurements were made. The measurements do not
use a precise timing operation and should rather serve as a general
hint to the performance improvements.

 On a Broadwell i7 CPU:

 block size      |     4096     1024      128       32       16
 aes256 old      |   28MB/s   27MB/s   19MB/s   11MB/s    6MB/s
 aes128 old      |   29MB/s   32MB/s   23MB/s   15MB/s    9MB/s
 sha256 old      |   48MB/s   48MB/s   33MB/s   16MB/s    8MB/s
 hmac sha256 old |   15MB/s   15MB/s   10MB/s    5MB/s    2MB/s

 aes256 new      |  180MB/s  169MB/s   93MB/s   37MB/s   20MB/s
 aes128 new      |  240MB/s  221MB/s  125MB/s   51MB/s   27MB/s
 sha256 new      |   75MB/s   69MB/s   48MB/s   23MB/s   11MB/s
 hmac sha256 new |   37MB/s   34MB/s   21MB/s    8MB/s    4MB/s

Signed-off-by: Stephan Mueller <smueller@chronox.de>
ChangeLog entries above written by -wk

2 years agocipher: New function for reading the counter in CTR mode
Stephan Mueller [Thu, 1 Dec 2016 16:11:42 +0000 (17:11 +0100)]
cipher: New function for reading the counter in CTR mode

* cipher/cipher.c (gcry_cipher_getctr): New.
--
The API call allows reading the current counter of the CTR mode. The API
remains internal to libgcrypt and is not exported to external callers.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
ChangeLog entry above added by -wk

2 years agodoc: Remove comment that is not applicable any more.
Stephan Mueller [Sun, 27 Nov 2016 09:14:21 +0000 (10:14 +0100)]
doc: Remove comment that is not applicable any more.

--
Signed-off-by: Stephan Mueller <smueller@chronox.de>
2 years agodoc: Update NEWS.
Werner Koch [Wed, 7 Dec 2016 17:55:06 +0000 (18:55 +0100)]
doc: Update NEWS.

--

2 years agoDocument the overflow pools and add a stupid test case.
Werner Koch [Wed, 7 Dec 2016 16:01:19 +0000 (17:01 +0100)]
Document the overflow pools and add a stupid test case.

* tests/t-secmem.c (test_secmem_overflow): New func.
(main): Disable warning and call new function.

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agoImplement overflow secmem pools for xmalloc style allocators.
Werner Koch [Wed, 7 Dec 2016 15:59:57 +0000 (16:59 +0100)]
Implement overflow secmem pools for xmalloc style allocators.

* src/secmem.c (pooldesc_s): Add fields next, cur_alloced, and
cur_blocks.
(cur_alloced, cur_blocks): Remove vars.
(ptr_into_pool_p): Make it inline.
(stats_update): Add arg pool and update the new pool specific
counters.
(_gcry_secmem_malloc_internal): Add arg xhint and allocate overflow
pools as needed.
(_gcry_secmem_malloc): Pass XHINTS along.
(_gcry_secmem_realloc_internal): Ditto.
(_gcry_secmem_realloc): Ditto.
(_gcry_secmem_free_internal): Take multiple pools into account.  Add
return value to indicate whether the arg was freed.
(_gcry_secmem_free): Add return value to indicate whether the arg was
freed.
(_gcry_private_is_secure): Take multiple pools into account.
(_gcry_secmem_term): Release all pools.
(_gcry_secmem_dump_stats): Print stats for all pools.
* src/stdmem.c (_gcry_private_free): Replace _gcry_private_is_secure
test with a direct call of _gcry_secmem_free to avoid double checking.
--

This patch avoids process termination due to an out-of-secure-memory
condition in the MPI subsystem.  We consider reliable MPI computations
more important than terminating the process because of the need for
memory which is protected against being swapped out.  Using encrypted
swap is anyway a more reliable protection than those mlock'ed pages.
Note also that mlock'ed pages won't help against hibernation.
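
The idea of overflow pools can be sketched with a chain of simple
bump allocators. This is a heavily simplified illustration with
hypothetical names: it does no page locking, no per-block
bookkeeping, no alignment and no freeing of single blocks, all of
which a real secure memory allocator needs.

```c
#include <stddef.h>
#include <stdlib.h>

struct pool
{
  struct pool *next;   /* Next overflow pool or NULL. */
  size_t size;         /* Capacity in bytes. */
  size_t used;         /* Bytes handed out so far. */
  unsigned char *mem;
};

struct pool *pool_new(size_t size)
{
  struct pool *p = calloc(1, sizeof *p);

  if (p && !(p->mem = malloc(size)))
    {
      free(p);
      return NULL;
    }
  if (p)
    p->size = size;
  return p;
}

/* Allocate N bytes from MAINPOOL or one of its overflow pools.  If
   all pools are full, a new overflow pool is only created when XHINT
   is set, that is for xmalloc style callers which would otherwise
   terminate the process.  Returns NULL on failure. */
void *pool_alloc(struct pool *mainpool, size_t n, int xhint)
{
  struct pool *p, *last = mainpool;

  for (p = mainpool; p; p = p->next)
    {
      if (p->size - p->used >= n)
        {
          void *result = p->mem + p->used;
          p->used += n;
          return result;
        }
      last = p;
    }

  if (!xhint)
    return NULL;
  p = pool_new(n > mainpool->size ? n : mainpool->size);
  if (!p)
    return NULL;
  last->next = p;
  p->used = n;
  return p->mem;
}

/* Release MAINPOOL and all overflow pools chained to it. */
void pool_release_all(struct pool *mainpool)
{
  while (mainpool)
    {
      struct pool *next = mainpool->next;
      free(mainpool->mem);
      free(mainpool);
      mainpool = next;
    }
}
```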

GnuPG-bug-id: 2857
Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agoGive the secmem allocators a hint when a xmalloc calls them.
Werner Koch [Wed, 7 Dec 2016 09:37:50 +0000 (10:37 +0100)]
Give the secmem allocators a hint when a xmalloc calls them.

* src/secmem.c (_gcry_secmem_malloc): New not yet used arg XHINT.
(_gcry_secmem_realloc): Ditto.
* src/stdmem.c (_gcry_private_malloc_secure): New arg XHINT to be
passed to the secmem functions.
(_gcry_private_realloc): Ditto.
* src/g10lib.h (GCRY_ALLOC_FLAG_XHINT): New.
* src/global.c (do_malloc): Pass this flag as XHINT to the private
allocator.
(_gcry_malloc_secure): Factor code out to ...
(_gcry_malloc_secure_core): this.  Add arg XHINT.
(_gcry_realloc): Factor code out to ...
(_gcry_realloc_core): here.  Add arg XHINT.
(_gcry_strdup): Factor code out to ...
(_gcry_strdup_core): here.  Add arg XHINT.
(_gcry_xrealloc): Use the core function and pass true for XHINT.
(_gcry_xmalloc_secure): Ditto.
(_gcry_xstrdup): Ditto.

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agotests: New test t-secmem.
Werner Koch [Wed, 7 Dec 2016 09:01:39 +0000 (10:01 +0100)]
tests: New test t-secmem.

* src/secmem.c (_gcry_secmem_dump_stats): Add arg EXTENDED and adjust
caller.
* src/gcrypt-testapi.h (PRIV_CTL_DUMP_SECMEM_STATS): New.
* src/global.c (_gcry_vcontrol): Implement that.
* tests/t-secmem.c: New.
* tests/Makefile.am (tests_bin): Add that test.
--

This test does not do much right now.

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agoFix compiler warning about possible-NULL-dereference
Werner Koch [Tue, 6 Dec 2016 21:19:04 +0000 (22:19 +0100)]
Fix compiler warning about possible-NULL-dereference

* src/mpi.h (mpi_is_const, mpi_is_immutable): Do not check arg before
deref-ing.  They are only used at places where the arg shall not be NULL.
--

This was designed as a general purpose macro and written in a
defensive way.  However, if a NULL were passed to that macro, code
run in the else branch would deref the arg anyway.

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agoFix possible NULL-deref in gcry_log_debugsxp
Werner Koch [Tue, 6 Dec 2016 20:44:33 +0000 (21:44 +0100)]
Fix possible NULL-deref in gcry_log_debugsxp

* src/misc.c (_gcry_log_printsxp): Prevent passing NULL to strlen.

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agoReorganize code in secmem.c.
Werner Koch [Tue, 6 Dec 2016 20:20:54 +0000 (21:20 +0100)]
Reorganize code in secmem.c.

* src/secmem.c (pooldesc_t): New type to collect information about one
pool.
(pool_size): Remove.  Now a member of pooldesc_t.
(pool_okay): Ditto.
(pool_is_mmapped): Ditto.
(pool): Rename variable ...
(mainpool): And change type to pooldesc_t.
(ptr_into_pool_p): Add arg 'pool'.
(mb_get_next): Ditto.
(mb_get_prev): Ditto.
(mb_merge): Ditto.
(mb_get_new): Ditto.
(init_pool): Ditto.
(lock_pool): Rename to ...
(lock_pool_pages): this.
(secmem_init): Rename to ...
(_gcry_secmem_init_internal): this.  Add local var POOL and init with
address of MAINPOOL.
(_gcry_secmem_malloc_internal): Add local var POOL and init with
address of MAINPOOL.
(_gcry_private_is_secure): Ditto.
(_gcry_secmem_term): Ditto.
(_gcry_secmem_dump_stats): Ditto.
(_gcry_secmem_free_internal): Ditto.  Remove check for NULL arg.
(_gcry_secmem_free): Add check for NULL arg before taking the lock.
(_gcry_secmem_realloc): Factor most code out to ...
(_gcry_secmem_realloc_internal): this.
--

This change prepares future work to allow the use of several pools.

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agotests: Add PBKDF2 tests for Stribog512.
Dmitry Eremin-Solenikov [Fri, 25 Nov 2016 12:52:47 +0000 (15:52 +0300)]
tests: Add PBKDF2 tests for Stribog512.

* tests/t-kdf.c (check_pbkdf2): Add Stribog512 test cases from TC26's
additions to PKCS#5.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2 years agotests: Add Stribog HMAC tests from TC26ALG.
Dmitry Eremin-Solenikov [Fri, 25 Nov 2016 12:52:46 +0000 (15:52 +0300)]
tests: Add Stribog HMAC tests from TC26ALG.

* tests/basic.c (check_mac): add HMAC test vectors from TC26ALG document
for Stribog.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2 years agocipher: Add Stribog OIDs from TC26 space.
Dmitry Eremin-Solenikov [Fri, 25 Nov 2016 12:52:45 +0000 (15:52 +0300)]
cipher: Add Stribog OIDs from TC26 space.

* cipher/stribog.c (oid_spec_stribog256, oid_spec_stribog512): New.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2 years agotests: Fix memory leak.
Justus Winter [Fri, 25 Nov 2016 08:38:51 +0000 (09:38 +0100)]
tests: Fix memory leak.

* tests/basic.c (check_gost28147_cipher): Free cipher handles.

Fixes-commit: 4f5c26c73c66daf2e4aff966e43c22b2db7e0138
Signed-off-by: Justus Winter <justus@g10code.com>
2 years agoCast oid argument of gcry_cipher_set_sbox to disable compiler warning.
Dmitry Eremin-Solenikov [Wed, 23 Nov 2016 05:38:33 +0000 (08:38 +0300)]
Cast oid argument of gcry_cipher_set_sbox to disable compiler warning.

* src/gcrypt.h.in (gcry_cipher_set_sbox): Cast oid to (void *).

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2 years agogost: Rename tc26 s-box from A to Z.
Dmitry Eremin-Solenikov [Wed, 23 Nov 2016 05:38:32 +0000 (08:38 +0300)]
gost: Rename tc26 s-box from A to Z.

* cipher/gost-s-box.c (gost_sboxes): Rename TC26_A to TC26_Z as it is
the name that ended up in all standards.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2 years agotests: Add test to verify GOST 28147-89 against known results.
Dmitry Eremin-Solenikov [Wed, 23 Nov 2016 05:38:31 +0000 (08:38 +0300)]
tests: Add test to verify GOST 28147-89 against known results.

* tests/basic.c (check_gost28147_cipher): new test function.

--
Currently the only test executed against GOST 28147-89 cipher is a
basic cipher test: it checks that decoding of encoded text returns
the original plaintext. Add a function to verify the cipher against
test vectors.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2 years agocipher/gost28147: Fix CryptoPro-B S-BOX.
Dmitry Eremin-Solenikov [Wed, 16 Nov 2016 20:36:01 +0000 (23:36 +0300)]
cipher/gost28147: Fix CryptoPro-B S-BOX.

* cipher/gost-s-box.c: CryptoPro_B s-box missed one line, resulting in
incorrect encryption/decryption using that s-box.  Add missing data.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2 years agoPut blocking calls into Libgpg-error's system call clamp.
Werner Koch [Sat, 12 Nov 2016 10:34:49 +0000 (11:34 +0100)]
Put blocking calls into Libgpg-error's system call clamp.

* src/gcrypt.h.in (GCRYCTL_REINIT_SYSCALL_CLAMP): New.
* configure.ac: Require Libgpg-error 1.25.  Set version number to
1.8.0.
* src/gcrypt-int.h: Remove error code emulation.
* src/global.c (pre_syscall_func, post_syscall_func): New.
(global_init): Call gpgrt_get_syscall_clamp.
(_gcry_vcontrol) <GCRYCTL_REINIT_SYSCALL_CLAMP>: Ditto.
(_gcry_pre_syscall, _gcry_post_syscall): New.
* random/rndlinux.c (_gcry_rndlinux_gather_random): Use the new
functions.
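
The clamp pattern itself can be sketched as below: a pair of hooks
which, when set, bracket every blocking call. All names here are
hypothetical stand-ins, the demo hooks only track nesting, and the
sketch does not use the Libgpg-error API.

```c
#include <stddef.h>

/* Hooks run around blocking calls; NULL means "no clamp set". */
static void (*pre_hook)(void);
static void (*post_hook)(void);

void set_syscall_clamp(void (*pre)(void), void (*post)(void))
{
  pre_hook = pre;
  post_hook = post;
}

/* Run the possibly blocking function FN between the two hooks. */
int clamped_call(int (*fn)(void *), void *arg)
{
  int rc;

  if (pre_hook)
    pre_hook();
  rc = fn(arg);
  if (post_hook)
    post_hook();
  return rc;
}

/* Demo hooks and a demo "blocking" function which track nesting. */
int clamp_depth;
int depth_seen_in_call;

void demo_pre(void)  { clamp_depth++; }
void demo_post(void) { clamp_depth--; }

int demo_blocking(void *arg)
{
  depth_seen_in_call = clamp_depth;
  return *(int *)arg;
}
```

A threading library can use such hooks to release its lock before a
blocking call and take it again afterwards.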

Signed-off-by: Werner Koch <wk@gnupg.org>
2 years agocipher: Fix IDEA cipher for clearing memory.
NIIBE Yutaka [Tue, 1 Nov 2016 05:34:16 +0000 (14:34 +0900)]
cipher: Fix IDEA cipher for clearing memory.

* cipher/idea.c (invert_key): Use wipememory, since this kind of memset
may be removed by compiler optimization.
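
A minimal sketch of a wipe which the compiler has to keep, using
stores through a volatile pointer. 'wipe_memory' is a hypothetical
stand-in and not the definition of the wipememory macro.

```c
#include <stddef.h>

/* Overwrite LEN bytes at PTR with zeroes.  The stores go through a
   volatile pointer, so they count as observable side effects and may
   not be dropped, unlike a plain memset on memory that is never read
   again. */
void wipe_memory(void *ptr, size_t len)
{
  volatile unsigned char *p = ptr;

  while (len--)
    *p++ = 0;
}
```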

--
Reported-by: Zhaomo Yang and Brian Johannesmeyer
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
3 years agoGCM: Add bulk processing for ARMv8/AArch64 implementation
Jussi Kivilinna [Sun, 9 Oct 2016 09:53:48 +0000 (12:53 +0300)]
GCM: Add bulk processing for ARMv8/AArch64 implementation

* cipher/cipher-gcm-armv8-aarch64-ce.S: Add 6 blocks bulk processing.
--

Benchmark on Cortex-A53 (1152 MHz):

Before:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 GMAC_AES           |      1.30 ns/B     731.6 MiB/s      1.50 c/B

After (1.49x faster):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 GMAC_AES           |     0.873 ns/B    1092.1 MiB/s      1.01 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoGCM: Add bulk processing for ARMv8/AArch32 implementation
Jussi Kivilinna [Sun, 9 Oct 2016 09:52:55 +0000 (12:52 +0300)]
GCM: Add bulk processing for ARMv8/AArch32 implementation

* cipher/cipher-gcm-armv8-aarch32-ce.S: Add 4 blocks bulk processing.
* tests/basic.c (check_digests): Print correct data length for "?"
tests.
(check_one_mac): Add large 1000000-byte tests when input is "!" or
"?".
(check_mac): Add "?" test vectors for HMAC, CMAC, GMAC and POLY1305.
--

Benchmark on Cortex-A53 (1152 MHz):

Before:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 GMAC_AES           |     0.924 ns/B    1032.2 MiB/s      1.06 c/B

After (1.21x faster):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 GMAC_AES           |     0.764 ns/B    1248.2 MiB/s     0.880 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd Aarch64 assembly implementation of Twofish
Jussi Kivilinna [Wed, 27 Apr 2016 15:18:54 +0000 (18:18 +0300)]
Add Aarch64 assembly implementation of Twofish

* cipher/Makefile.am: Add 'twofish-aarch64.S'.
* cipher/twofish-aarch64.S: New.
* cipher/twofish.c: Enable USE_ARM_ASM if __AARCH64EL__ and
HAVE_COMPATIBLE_GCC_AARCH64_PLATFORM_AS defined.
* configure.ac [host=aarch64]: Add 'twofish-aarch64.lo'.
--

Patch adds ARMv8/Aarch64 implementation of Twofish.

Benchmark on Cortex-A53 (1152 MHz):

 Before:
 TWOFISH        |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     27.51 ns/B     34.67 MiB/s     31.69 c/B
        ECB dec |     26.37 ns/B     36.17 MiB/s     30.38 c/B
        CBC enc |     28.64 ns/B     33.29 MiB/s     33.00 c/B
        CBC dec |     26.21 ns/B     36.39 MiB/s     30.19 c/B
        CFB enc |     28.54 ns/B     33.42 MiB/s     32.88 c/B
        CFB dec |     27.40 ns/B     34.81 MiB/s     31.56 c/B
        OFB enc |     28.38 ns/B     33.61 MiB/s     32.69 c/B
        OFB dec |     28.37 ns/B     33.61 MiB/s     32.69 c/B
        CTR enc |     27.57 ns/B     34.60 MiB/s     31.76 c/B
        CTR dec |     27.57 ns/B     34.60 MiB/s     31.76 c/B
        CCM enc |     55.28 ns/B     17.25 MiB/s     63.69 c/B
        CCM dec |     55.29 ns/B     17.25 MiB/s     63.70 c/B
       CCM auth |     27.83 ns/B     34.27 MiB/s     32.06 c/B
        GCM enc |     28.86 ns/B     33.04 MiB/s     33.25 c/B
        GCM dec |     28.87 ns/B     33.04 MiB/s     33.25 c/B
       GCM auth |      1.30 ns/B     731.9 MiB/s      1.50 c/B
        OCB enc |     29.69 ns/B     32.12 MiB/s     34.20 c/B
        OCB dec |     28.50 ns/B     33.47 MiB/s     32.83 c/B
       OCB auth |     29.04 ns/B     32.84 MiB/s     33.45 c/B
                =

 After (~1.3x faster):
 TWOFISH        |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     19.97 ns/B     47.77 MiB/s     23.00 c/B
        ECB dec |     18.29 ns/B     52.16 MiB/s     21.06 c/B
        CBC enc |     20.94 ns/B     45.54 MiB/s     24.13 c/B
        CBC dec |     18.34 ns/B     52.00 MiB/s     21.13 c/B
        CFB enc |     20.83 ns/B     45.77 MiB/s     24.00 c/B
        CFB dec |     19.97 ns/B     47.76 MiB/s     23.00 c/B
        OFB enc |     20.94 ns/B     45.54 MiB/s     24.13 c/B
        OFB dec |     20.94 ns/B     45.54 MiB/s     24.13 c/B
        CTR enc |     20.19 ns/B     47.24 MiB/s     23.26 c/B
        CTR dec |     20.19 ns/B     47.24 MiB/s     23.26 c/B
        CCM enc |     40.53 ns/B     23.53 MiB/s     46.69 c/B
        CCM dec |     40.53 ns/B     23.53 MiB/s     46.69 c/B
       CCM auth |     20.40 ns/B     46.74 MiB/s     23.50 c/B
        GCM enc |     21.49 ns/B     44.39 MiB/s     24.75 c/B
        GCM dec |     21.48 ns/B     44.39 MiB/s     24.75 c/B
       GCM auth |      1.30 ns/B     731.8 MiB/s      1.50 c/B
        OCB enc |     22.15 ns/B     43.05 MiB/s     25.52 c/B
        OCB dec |     20.47 ns/B     46.58 MiB/s     23.59 c/B
       OCB auth |     21.64 ns/B     44.07 MiB/s     24.93 c/B
                =

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd Aarch64 assembly implementation of Camellia
Jussi Kivilinna [Wed, 27 Apr 2016 15:18:54 +0000 (18:18 +0300)]
Add Aarch64 assembly implementation of Camellia

* cipher/Makefile.am: Add 'camellia-aarch64.S'.
* cipher/camellia-aarch64.S: New.
* cipher/camellia-glue.c [USE_ARM_ASM][__aarch64__]: Set stack burn
size to zero.
* cipher/camellia.h: Enable USE_ARM_ASM if __AARCH64EL__ and
HAVE_COMPATIBLE_GCC_AARCH64_PLATFORM_AS defined.
* configure.ac [host=aarch64]: Add 'camellia-aarch64.lo'.
--

Patch adds ARMv8/Aarch64 implementation of Camellia.

Benchmark on Cortex-A53 (1152 MHz):

 Before:
 CAMELLIA128    |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     39.71 ns/B     24.01 MiB/s     45.75 c/B
        ECB dec |     39.72 ns/B     24.01 MiB/s     45.75 c/B
        CBC enc |     40.80 ns/B     23.38 MiB/s     47.00 c/B
        CBC dec |     39.66 ns/B     24.05 MiB/s     45.69 c/B
        CFB enc |     40.69 ns/B     23.44 MiB/s     46.88 c/B
        CFB dec |     39.66 ns/B     24.05 MiB/s     45.69 c/B
        OFB enc |     40.69 ns/B     23.44 MiB/s     46.88 c/B
        OFB dec |     40.69 ns/B     23.44 MiB/s     46.88 c/B
        CTR enc |     39.88 ns/B     23.91 MiB/s     45.94 c/B
        CTR dec |     39.88 ns/B     23.91 MiB/s     45.94 c/B
        CCM enc |     79.97 ns/B     11.92 MiB/s     92.13 c/B
        CCM dec |     79.97 ns/B     11.93 MiB/s     92.13 c/B
       CCM auth |     40.20 ns/B     23.72 MiB/s     46.31 c/B
        GCM enc |     41.18 ns/B     23.16 MiB/s     47.44 c/B
        GCM dec |     41.18 ns/B     23.16 MiB/s     47.44 c/B
       GCM auth |      1.30 ns/B     732.7 MiB/s      1.50 c/B
        OCB enc |     42.04 ns/B     22.69 MiB/s     48.43 c/B
        OCB dec |     42.03 ns/B     22.69 MiB/s     48.42 c/B
       OCB auth |     41.38 ns/B     23.05 MiB/s     47.67 c/B
                =
 CAMELLIA256    |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     52.36 ns/B     18.22 MiB/s     60.31 c/B
        ECB dec |     52.36 ns/B     18.22 MiB/s     60.31 c/B
        CBC enc |     53.39 ns/B     17.86 MiB/s     61.50 c/B
        CBC dec |     52.14 ns/B     18.29 MiB/s     60.06 c/B
        CFB enc |     53.28 ns/B     17.90 MiB/s     61.38 c/B
        CFB dec |     52.14 ns/B     18.29 MiB/s     60.06 c/B
        OFB enc |     53.17 ns/B     17.94 MiB/s     61.25 c/B
        OFB dec |     53.17 ns/B     17.94 MiB/s     61.25 c/B
        CTR enc |     52.36 ns/B     18.21 MiB/s     60.32 c/B
        CTR dec |     52.36 ns/B     18.21 MiB/s     60.32 c/B
        CCM enc |     105.0 ns/B      9.08 MiB/s     120.9 c/B
        CCM dec |     105.0 ns/B      9.08 MiB/s     120.9 c/B
       CCM auth |     52.74 ns/B     18.08 MiB/s     60.75 c/B
        GCM enc |     53.66 ns/B     17.77 MiB/s     61.81 c/B
        GCM dec |     53.66 ns/B     17.77 MiB/s     61.82 c/B
       GCM auth |      1.30 ns/B     732.3 MiB/s      1.50 c/B
        OCB enc |     54.54 ns/B     17.49 MiB/s     62.83 c/B
        OCB dec |     54.48 ns/B     17.50 MiB/s     62.77 c/B
       OCB auth |     53.89 ns/B     17.70 MiB/s     62.09 c/B
                =

 After (~1.7x faster):
 CAMELLIA128    |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     22.25 ns/B     42.87 MiB/s     25.63 c/B
        ECB dec |     22.25 ns/B     42.87 MiB/s     25.63 c/B
        CBC enc |     23.27 ns/B     40.97 MiB/s     26.81 c/B
        CBC dec |     22.14 ns/B     43.08 MiB/s     25.50 c/B
        CFB enc |     23.17 ns/B     41.17 MiB/s     26.69 c/B
        CFB dec |     22.14 ns/B     43.08 MiB/s     25.50 c/B
        OFB enc |     23.11 ns/B     41.26 MiB/s     26.63 c/B
        OFB dec |     23.11 ns/B     41.26 MiB/s     26.63 c/B
        CTR enc |     22.36 ns/B     42.65 MiB/s     25.76 c/B
        CTR dec |     22.36 ns/B     42.65 MiB/s     25.76 c/B
        CCM enc |     44.87 ns/B     21.26 MiB/s     51.69 c/B
        CCM dec |     44.87 ns/B     21.25 MiB/s     51.69 c/B
       CCM auth |     22.62 ns/B     42.15 MiB/s     26.06 c/B
        GCM enc |     23.66 ns/B     40.31 MiB/s     27.25 c/B
        GCM dec |     23.66 ns/B     40.31 MiB/s     27.25 c/B
       GCM auth |      1.30 ns/B     732.0 MiB/s      1.50 c/B
        OCB enc |     24.32 ns/B     39.21 MiB/s     28.02 c/B
        OCB dec |     24.32 ns/B     39.21 MiB/s     28.02 c/B
       OCB auth |     23.75 ns/B     40.15 MiB/s     27.36 c/B
                =
 CAMELLIA256    |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     29.08 ns/B     32.79 MiB/s     33.50 c/B
        ECB dec |     29.19 ns/B     32.67 MiB/s     33.63 c/B
        CBC enc |     30.11 ns/B     31.67 MiB/s     34.69 c/B
        CBC dec |     29.05 ns/B     32.83 MiB/s     33.47 c/B
        CFB enc |     30.00 ns/B     31.79 MiB/s     34.56 c/B
        CFB dec |     28.97 ns/B     32.91 MiB/s     33.38 c/B
        OFB enc |     29.95 ns/B     31.84 MiB/s     34.50 c/B
        OFB dec |     29.95 ns/B     31.84 MiB/s     34.50 c/B
        CTR enc |     29.19 ns/B     32.67 MiB/s     33.63 c/B
        CTR dec |     29.19 ns/B     32.67 MiB/s     33.63 c/B
        CCM enc |     58.54 ns/B     16.29 MiB/s     67.43 c/B
        CCM dec |     58.54 ns/B     16.29 MiB/s     67.44 c/B
       CCM auth |     29.46 ns/B     32.37 MiB/s     33.94 c/B
        GCM enc |     30.49 ns/B     31.28 MiB/s     35.12 c/B
        GCM dec |     30.49 ns/B     31.27 MiB/s     35.13 c/B
       GCM auth |      1.30 ns/B     731.6 MiB/s      1.50 c/B
        OCB enc |     31.16 ns/B     30.61 MiB/s     35.90 c/B
        OCB dec |     31.22 ns/B     30.55 MiB/s     35.96 c/B
       OCB auth |     30.59 ns/B     31.18 MiB/s     35.24 c/B
                =

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd ARMv8/AArch64 Crypto Extension implementation of AES
Jussi Kivilinna [Sun, 4 Sep 2016 10:41:02 +0000 (13:41 +0300)]
Add ARMv8/AArch64 Crypto Extension implementation of AES

* cipher/Makefile.am: Add 'rijndael-armv8-aarch64-ce.S'.
* cipher/rijndael-armv8-aarch64-ce.S: New.
* cipher/rijndael-internal.h (USE_ARM_CE): Enable for ARMv8/AArch64.
* configure.ac: Add 'rijndael-armv8-aarch64-ce.lo' and
'rijndael-armv8-ce.lo' for ARMv8/AArch64.
--

Improvement vs AArch64 assembly on Cortex-A53:

           AES-128  AES-192  AES-256
CBC enc:    13.19x   13.53x   13.76x
CBC dec:    20.53x   21.91x   22.60x
CFB enc:    14.29x   14.50x   14.63x
CFB dec:    20.42x   21.69x   22.50x
CTR:        18.29x   19.61x   20.53x
OCB enc:    15.21x   16.32x   17.12x
OCB dec:    14.95x   16.11x   16.88x
OCB auth:   16.73x   17.93x   18.66x

Benchmark on Cortex-A53 (1152 MHz):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     21.86 ns/B     43.62 MiB/s     25.19 c/B
        ECB dec |     22.68 ns/B     42.05 MiB/s     26.13 c/B
        CBC enc |     18.66 ns/B     51.10 MiB/s     21.50 c/B
        CBC dec |     18.72 ns/B     50.95 MiB/s     21.56 c/B
        CFB enc |     18.61 ns/B     51.25 MiB/s     21.44 c/B
        CFB dec |     18.61 ns/B     51.25 MiB/s     21.44 c/B
        OFB enc |     22.84 ns/B     41.75 MiB/s     26.31 c/B
        OFB dec |     22.84 ns/B     41.75 MiB/s     26.31 c/B
        CTR enc |     18.89 ns/B     50.50 MiB/s     21.76 c/B
        CTR dec |     18.89 ns/B     50.50 MiB/s     21.76 c/B
        CCM enc |     37.55 ns/B     25.40 MiB/s     43.25 c/B
        CCM dec |     37.55 ns/B     25.40 MiB/s     43.25 c/B
       CCM auth |     18.77 ns/B     50.80 MiB/s     21.63 c/B
        GCM enc |     20.18 ns/B     47.25 MiB/s     23.25 c/B
        GCM dec |     20.18 ns/B     47.25 MiB/s     23.25 c/B
       GCM auth |      1.30 ns/B     732.5 MiB/s      1.50 c/B
        OCB enc |     19.67 ns/B     48.48 MiB/s     22.66 c/B
        OCB dec |     19.73 ns/B     48.34 MiB/s     22.72 c/B
       OCB auth |     19.46 ns/B     49.00 MiB/s     22.42 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     25.39 ns/B     37.56 MiB/s     29.25 c/B
        ECB dec |     26.15 ns/B     36.47 MiB/s     30.13 c/B
        CBC enc |     22.08 ns/B     43.19 MiB/s     25.44 c/B
        CBC dec |     22.25 ns/B     42.87 MiB/s     25.63 c/B
        CFB enc |     22.03 ns/B     43.30 MiB/s     25.38 c/B
        CFB dec |     22.03 ns/B     43.29 MiB/s     25.38 c/B
        OFB enc |     26.26 ns/B     36.32 MiB/s     30.25 c/B
        OFB dec |     26.26 ns/B     36.32 MiB/s     30.25 c/B
        CTR enc |     22.30 ns/B     42.76 MiB/s     25.69 c/B
        CTR dec |     22.30 ns/B     42.76 MiB/s     25.69 c/B
        CCM enc |     44.38 ns/B     21.49 MiB/s     51.13 c/B
        CCM dec |     44.38 ns/B     21.49 MiB/s     51.13 c/B
       CCM auth |     22.20 ns/B     42.97 MiB/s     25.57 c/B
        GCM enc |     23.60 ns/B     40.41 MiB/s     27.19 c/B
        GCM dec |     23.60 ns/B     40.41 MiB/s     27.19 c/B
       GCM auth |      1.30 ns/B     732.4 MiB/s      1.50 c/B
        OCB enc |     23.09 ns/B     41.31 MiB/s     26.60 c/B
        OCB dec |     23.21 ns/B     41.09 MiB/s     26.74 c/B
       OCB auth |     22.88 ns/B     41.68 MiB/s     26.36 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     28.76 ns/B     33.17 MiB/s     33.13 c/B
        ECB dec |     29.46 ns/B     32.37 MiB/s     33.94 c/B
        CBC enc |     25.45 ns/B     37.48 MiB/s     29.31 c/B
        CBC dec |     25.50 ns/B     37.40 MiB/s     29.38 c/B
        CFB enc |     25.39 ns/B     37.56 MiB/s     29.25 c/B
        CFB dec |     25.39 ns/B     37.56 MiB/s     29.25 c/B
        OFB enc |     29.62 ns/B     32.19 MiB/s     34.13 c/B
        OFB dec |     29.62 ns/B     32.19 MiB/s     34.13 c/B
        CTR enc |     25.67 ns/B     37.15 MiB/s     29.57 c/B
        CTR dec |     25.67 ns/B     37.15 MiB/s     29.57 c/B
        CCM enc |     51.11 ns/B     18.66 MiB/s     58.88 c/B
        CCM dec |     51.11 ns/B     18.66 MiB/s     58.88 c/B
       CCM auth |     25.56 ns/B     37.32 MiB/s     29.44 c/B
        GCM enc |     26.96 ns/B     35.37 MiB/s     31.06 c/B
        GCM dec |     26.98 ns/B     35.35 MiB/s     31.08 c/B
       GCM auth |      1.30 ns/B     733.4 MiB/s      1.50 c/B
        OCB enc |     26.45 ns/B     36.05 MiB/s     30.47 c/B
        OCB dec |     26.53 ns/B     35.95 MiB/s     30.56 c/B
       OCB auth |     26.24 ns/B     36.34 MiB/s     30.23 c/B
                =

After:
Cipher:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      4.83 ns/B     197.5 MiB/s      5.56 c/B
        ECB dec |      4.99 ns/B     191.1 MiB/s      5.75 c/B
        CBC enc |      1.41 ns/B     675.5 MiB/s      1.63 c/B
        CBC dec |     0.911 ns/B    1046.9 MiB/s      1.05 c/B
        CFB enc |      1.30 ns/B     732.2 MiB/s      1.50 c/B
        CFB dec |     0.911 ns/B    1046.7 MiB/s      1.05 c/B
        OFB enc |      5.81 ns/B     164.3 MiB/s      6.69 c/B
        OFB dec |      5.81 ns/B     164.3 MiB/s      6.69 c/B
        CTR enc |      1.03 ns/B     924.0 MiB/s      1.19 c/B
        CTR dec |      1.03 ns/B     924.1 MiB/s      1.19 c/B
        CCM enc |      2.50 ns/B     381.8 MiB/s      2.88 c/B
        CCM dec |      2.50 ns/B     381.7 MiB/s      2.88 c/B
       CCM auth |      1.57 ns/B     606.1 MiB/s      1.81 c/B
        GCM enc |      2.33 ns/B     408.5 MiB/s      2.69 c/B
        GCM dec |      2.34 ns/B     408.4 MiB/s      2.69 c/B
       GCM auth |      1.30 ns/B     732.1 MiB/s      1.50 c/B
        OCB enc |      1.29 ns/B     736.6 MiB/s      1.49 c/B
        OCB dec |      1.32 ns/B     724.4 MiB/s      1.52 c/B
       OCB auth |      1.16 ns/B     819.6 MiB/s      1.34 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      5.48 ns/B     174.0 MiB/s      6.31 c/B
        ECB dec |      5.64 ns/B     169.0 MiB/s      6.50 c/B
        CBC enc |      1.63 ns/B     585.8 MiB/s      1.88 c/B
        CBC dec |      1.02 ns/B     935.8 MiB/s      1.17 c/B
        CFB enc |      1.52 ns/B     627.7 MiB/s      1.75 c/B
        CFB dec |      1.02 ns/B     935.9 MiB/s      1.17 c/B
        OFB enc |      6.46 ns/B     147.7 MiB/s      7.44 c/B
        OFB dec |      6.46 ns/B     147.7 MiB/s      7.44 c/B
        CTR enc |      1.14 ns/B     836.1 MiB/s      1.31 c/B
        CTR dec |      1.14 ns/B     835.9 MiB/s      1.31 c/B
        CCM enc |      2.83 ns/B     337.6 MiB/s      3.25 c/B
        CCM dec |      2.82 ns/B     338.0 MiB/s      3.25 c/B
       CCM auth |      1.79 ns/B     532.7 MiB/s      2.06 c/B
        GCM enc |      2.44 ns/B     390.3 MiB/s      2.82 c/B
        GCM dec |      2.44 ns/B     390.2 MiB/s      2.82 c/B
       GCM auth |      1.30 ns/B     731.9 MiB/s      1.50 c/B
        OCB enc |      1.41 ns/B     674.7 MiB/s      1.63 c/B
        OCB dec |      1.44 ns/B     662.0 MiB/s      1.66 c/B
       OCB auth |      1.28 ns/B     746.1 MiB/s      1.47 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      6.13 ns/B     155.5 MiB/s      7.06 c/B
        ECB dec |      6.29 ns/B     151.5 MiB/s      7.25 c/B
        CBC enc |      1.85 ns/B     516.8 MiB/s      2.13 c/B
        CBC dec |      1.13 ns/B     845.6 MiB/s      1.30 c/B
        CFB enc |      1.74 ns/B     549.5 MiB/s      2.00 c/B
        CFB dec |      1.13 ns/B     846.1 MiB/s      1.30 c/B
        OFB enc |      7.11 ns/B     134.2 MiB/s      8.19 c/B
        OFB dec |      7.11 ns/B     134.2 MiB/s      8.19 c/B
        CTR enc |      1.25 ns/B     763.5 MiB/s      1.44 c/B
        CTR dec |      1.25 ns/B     763.4 MiB/s      1.44 c/B
        CCM enc |      3.15 ns/B     302.9 MiB/s      3.63 c/B
        CCM dec |      3.15 ns/B     302.9 MiB/s      3.63 c/B
       CCM auth |      2.01 ns/B     474.2 MiB/s      2.32 c/B
        GCM enc |      2.55 ns/B     374.2 MiB/s      2.94 c/B
        GCM dec |      2.55 ns/B     373.7 MiB/s      2.94 c/B
       GCM auth |      1.30 ns/B     732.2 MiB/s      1.50 c/B
        OCB enc |      1.54 ns/B     617.6 MiB/s      1.78 c/B
        OCB dec |      1.57 ns/B     606.8 MiB/s      1.81 c/B
       OCB auth |      1.40 ns/B     679.8 MiB/s      1.62 c/B
                =

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd ARMv8/AArch64 Crypto Extension implementation of GCM
Jussi Kivilinna [Sun, 4 Sep 2016 10:41:02 +0000 (13:41 +0300)]
Add ARMv8/AArch64 Crypto Extension implementation of GCM

* cipher/Makefile.am: Add 'cipher-gcm-armv8-aarch64-ce.S'.
* cipher/cipher-gcm-armv8-aarch64-ce.S: New.
* cipher/cipher-internal.h (GCM_USE_ARM_PMULL): Enable on
ARMv8/AArch64.
--

Benchmark on Cortex-A53 (1152 MHz):

Before:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 GMAC_AES           |     15.54 ns/B     61.36 MiB/s     17.91 c/B

After (11.9x faster):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 GMAC_AES           |      1.30 ns/B     731.5 MiB/s      1.50 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd ARMv8/AArch64 Crypto Extension implementation of SHA-256
Jussi Kivilinna [Sun, 4 Sep 2016 10:41:02 +0000 (13:41 +0300)]
Add ARMv8/AArch64 Crypto Extension implementation of SHA-256

* cipher/Makefile.am: Add 'sha256-armv8-aarch64-ce.S'.
* cipher/sha256-armv8-aarch64-ce.S: New.
* cipher/sha256-armv8-aarch32-ce.S: Move round macros to correct
section.
* cipher/sha256.c (USE_ARM_CE): Enable on ARMv8/AArch64.
* configure.ac: Add 'sha256-armv8-aarch64-ce.lo'; Swap places for
'sha512-arm.lo' and 'sha256-armv8-aarch32-ce.lo'.
--

Benchmark on Cortex-A53 (1152 MHz):

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHA256         |     13.34 ns/B     71.51 MiB/s     15.36 c/B

After (7.2x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHA256         |      1.85 ns/B     516.3 MiB/s      2.13 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd ARMv8/AArch64 Crypto Extension implementation of SHA-1
Jussi Kivilinna [Sun, 4 Sep 2016 10:41:02 +0000 (13:41 +0300)]
Add ARMv8/AArch64 Crypto Extension implementation of SHA-1

* cipher/Makefile.am: Add 'sha1-armv8-aarch64-ce.S'.
* cipher/sha1-armv8-aarch64-ce.S: New.
* cipher/sha1.c (USE_ARM_CE): Enable on ARMv8/AArch64.
* configure.ac: Add 'sha1-armv8-aarch64-ce.lo'.
--

Benchmark on Cortex-A53 (1152 MHz):

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHA1           |      7.54 ns/B     126.4 MiB/s      8.69 c/B

After (4.3x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHA1           |      1.72 ns/B     553.0 MiB/s      1.99 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd AArch64 assembly implementation of AES
Jussi Kivilinna [Sun, 4 Sep 2016 10:41:02 +0000 (13:41 +0300)]
Add AArch64 assembly implementation of AES

* cipher/Makefile.am: Add 'rijndael-aarch64.S'.
* cipher/rijndael-aarch64.S: New.
* cipher/rijndael-internal.h: Enable USE_ARM_ASM if __AARCH64EL__ and
HAVE_COMPATIBLE_GCC_AARCH64_PLATFORM_AS defined.
* configure.ac (gcry_cv_gcc_aarch64_platform_as_ok): New check.
[host=aarch64]: Add 'rijndael-aarch64.lo'.
--

Patch adds ARMv8/Aarch64 implementation of AES.

Benchmark on Cortex-A53 (1536 MHz):

 Before:

 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     19.37 ns/B     49.22 MiB/s     29.76 c/B
        ECB dec |     19.85 ns/B     48.03 MiB/s     30.50 c/B
        CBC enc |     16.84 ns/B     56.62 MiB/s     25.87 c/B
        CBC dec |     16.81 ns/B     56.74 MiB/s     25.82 c/B
        CFB enc |     16.80 ns/B     56.75 MiB/s     25.81 c/B
        CFB dec |     16.81 ns/B     56.75 MiB/s     25.81 c/B
        OFB enc |     20.02 ns/B     47.64 MiB/s     30.75 c/B
        OFB dec |     20.02 ns/B     47.64 MiB/s     30.75 c/B
        CTR enc |     17.06 ns/B     55.91 MiB/s     26.20 c/B
        CTR dec |     17.06 ns/B     55.92 MiB/s     26.20 c/B
        CCM enc |     33.94 ns/B     28.10 MiB/s     52.13 c/B
        CCM dec |     33.94 ns/B     28.10 MiB/s     52.14 c/B
       CCM auth |     16.97 ns/B     56.18 MiB/s     26.07 c/B
        GCM enc |     28.70 ns/B     33.23 MiB/s     44.09 c/B
        GCM dec |     28.70 ns/B     33.23 MiB/s     44.09 c/B
       GCM auth |     11.66 ns/B     81.81 MiB/s     17.90 c/B
        OCB enc |     17.66 ns/B     53.99 MiB/s     27.13 c/B
        OCB dec |     17.61 ns/B     54.16 MiB/s     27.05 c/B
       OCB auth |     17.44 ns/B     54.69 MiB/s     26.78 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     21.82 ns/B     43.71 MiB/s     33.51 c/B
        ECB dec |     22.55 ns/B     42.30 MiB/s     34.63 c/B
        CBC enc |     19.33 ns/B     49.33 MiB/s     29.70 c/B
        CBC dec |     19.50 ns/B     48.91 MiB/s     29.95 c/B
        CFB enc |     19.29 ns/B     49.44 MiB/s     29.63 c/B
        CFB dec |     19.28 ns/B     49.46 MiB/s     29.61 c/B
        OFB enc |     22.49 ns/B     42.40 MiB/s     34.55 c/B
        OFB dec |     22.50 ns/B     42.38 MiB/s     34.56 c/B
        CTR enc |     19.53 ns/B     48.83 MiB/s     30.00 c/B
        CTR dec |     19.54 ns/B     48.80 MiB/s     30.02 c/B
        CCM enc |     38.91 ns/B     24.51 MiB/s     59.77 c/B
        CCM dec |     38.90 ns/B     24.51 MiB/s     59.76 c/B
       CCM auth |     19.45 ns/B     49.02 MiB/s     29.88 c/B
        GCM enc |     31.13 ns/B     30.63 MiB/s     47.82 c/B
        GCM dec |     31.14 ns/B     30.63 MiB/s     47.82 c/B
       GCM auth |     11.66 ns/B     81.80 MiB/s     17.91 c/B
        OCB enc |     20.15 ns/B     47.33 MiB/s     30.95 c/B
        OCB dec |     20.30 ns/B     46.98 MiB/s     31.18 c/B
       OCB auth |     19.92 ns/B     47.88 MiB/s     30.59 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     24.33 ns/B     39.19 MiB/s     37.38 c/B
        ECB dec |     25.23 ns/B     37.80 MiB/s     38.76 c/B
        CBC enc |     21.82 ns/B     43.71 MiB/s     33.51 c/B
        CBC dec |     22.18 ns/B     42.99 MiB/s     34.07 c/B
        CFB enc |     21.77 ns/B     43.80 MiB/s     33.44 c/B
        CFB dec |     21.77 ns/B     43.81 MiB/s     33.44 c/B
        OFB enc |     24.99 ns/B     38.16 MiB/s     38.39 c/B
        OFB dec |     24.99 ns/B     38.17 MiB/s     38.38 c/B
        CTR enc |     22.02 ns/B     43.32 MiB/s     33.82 c/B
        CTR dec |     22.02 ns/B     43.31 MiB/s     33.82 c/B
        CCM enc |     43.86 ns/B     21.74 MiB/s     67.38 c/B
        CCM dec |     43.87 ns/B     21.74 MiB/s     67.39 c/B
       CCM auth |     21.94 ns/B     43.48 MiB/s     33.69 c/B
        GCM enc |     33.66 ns/B     28.33 MiB/s     51.71 c/B
        GCM dec |     33.66 ns/B     28.33 MiB/s     51.70 c/B
       GCM auth |     11.69 ns/B     81.59 MiB/s     17.95 c/B
        OCB enc |     22.90 ns/B     41.65 MiB/s     35.17 c/B
        OCB dec |     23.25 ns/B     41.02 MiB/s     35.71 c/B
       OCB auth |     22.69 ns/B     42.03 MiB/s     34.85 c/B
                =

 After (~1.2x faster):

 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     16.40 ns/B     58.16 MiB/s     25.19 c/B
        ECB dec |     17.01 ns/B     56.07 MiB/s     26.13 c/B
        CBC enc |     13.99 ns/B     68.15 MiB/s     21.49 c/B
        CBC dec |     14.04 ns/B     67.94 MiB/s     21.56 c/B
        CFB enc |     13.96 ns/B     68.32 MiB/s     21.44 c/B
        CFB dec |     13.95 ns/B     68.34 MiB/s     21.43 c/B
        OFB enc |     17.14 ns/B     55.65 MiB/s     26.32 c/B
        OFB dec |     17.13 ns/B     55.67 MiB/s     26.31 c/B
        CTR enc |     14.17 ns/B     67.31 MiB/s     21.76 c/B
        CTR dec |     14.17 ns/B     67.29 MiB/s     21.77 c/B
        CCM enc |     28.16 ns/B     33.86 MiB/s     43.26 c/B
        CCM dec |     28.16 ns/B     33.87 MiB/s     43.26 c/B
       CCM auth |     14.08 ns/B     67.71 MiB/s     21.63 c/B
        GCM enc |     25.82 ns/B     36.94 MiB/s     39.66 c/B
        GCM dec |     25.82 ns/B     36.94 MiB/s     39.65 c/B
       GCM auth |     11.67 ns/B     81.74 MiB/s     17.92 c/B
        OCB enc |     14.78 ns/B     64.55 MiB/s     22.69 c/B
        OCB dec |     14.80 ns/B     64.43 MiB/s     22.74 c/B
       OCB auth |     14.59 ns/B     65.36 MiB/s     22.41 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     19.05 ns/B     50.07 MiB/s     29.25 c/B
        ECB dec |     19.62 ns/B     48.62 MiB/s     30.13 c/B
        CBC enc |     16.56 ns/B     57.59 MiB/s     25.44 c/B
        CBC dec |     16.69 ns/B     57.14 MiB/s     25.64 c/B
        CFB enc |     16.52 ns/B     57.71 MiB/s     25.38 c/B
        CFB dec |     16.52 ns/B     57.73 MiB/s     25.37 c/B
        OFB enc |     19.70 ns/B     48.41 MiB/s     30.26 c/B
        OFB dec |     19.69 ns/B     48.43 MiB/s     30.24 c/B
        CTR enc |     16.73 ns/B     57.00 MiB/s     25.70 c/B
        CTR dec |     16.73 ns/B     57.01 MiB/s     25.70 c/B
        CCM enc |     33.29 ns/B     28.65 MiB/s     51.13 c/B
        CCM dec |     33.29 ns/B     28.65 MiB/s     51.13 c/B
       CCM auth |     16.65 ns/B     57.29 MiB/s     25.57 c/B
        GCM enc |     28.39 ns/B     33.60 MiB/s     43.60 c/B
        GCM dec |     28.39 ns/B     33.59 MiB/s     43.60 c/B
       GCM auth |     11.64 ns/B     81.92 MiB/s     17.88 c/B
        OCB enc |     17.33 ns/B     55.03 MiB/s     26.62 c/B
        OCB dec |     17.40 ns/B     54.82 MiB/s     26.72 c/B
       OCB auth |     17.16 ns/B     55.59 MiB/s     26.35 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     21.56 ns/B     44.23 MiB/s     33.12 c/B
        ECB dec |     22.09 ns/B     43.17 MiB/s     33.93 c/B
        CBC enc |     19.09 ns/B     49.97 MiB/s     29.31 c/B
        CBC dec |     19.13 ns/B     49.86 MiB/s     29.38 c/B
        CFB enc |     19.04 ns/B     50.09 MiB/s     29.24 c/B
        CFB dec |     19.04 ns/B     50.08 MiB/s     29.25 c/B
        OFB enc |     22.22 ns/B     42.93 MiB/s     34.13 c/B
        OFB dec |     22.22 ns/B     42.92 MiB/s     34.13 c/B
        CTR enc |     19.25 ns/B     49.53 MiB/s     29.57 c/B
        CTR dec |     19.25 ns/B     49.55 MiB/s     29.57 c/B
        CCM enc |     38.33 ns/B     24.88 MiB/s     58.88 c/B
        CCM dec |     38.34 ns/B     24.88 MiB/s     58.88 c/B
       CCM auth |     19.17 ns/B     49.76 MiB/s     29.44 c/B
        GCM enc |     30.91 ns/B     30.86 MiB/s     47.47 c/B
        GCM dec |     30.91 ns/B     30.85 MiB/s     47.48 c/B
       GCM auth |     11.71 ns/B     81.47 MiB/s     17.98 c/B
        OCB enc |     19.85 ns/B     48.04 MiB/s     30.49 c/B
        OCB dec |     19.89 ns/B     47.95 MiB/s     30.55 c/B
       OCB auth |     19.67 ns/B     48.48 MiB/s     30.22 c/B
                =

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoPost release updates
Werner Koch [Wed, 17 Aug 2016 11:40:19 +0000 (13:40 +0200)]
Post release updates

--

3 years agoRelease 1.7.3 libgcrypt-1.7.3
Werner Koch [Wed, 17 Aug 2016 11:31:12 +0000 (13:31 +0200)]
Release 1.7.3

* configure.ac: Set LT version to C21/A1/R3.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agorandom: Hash continuous areas in the csprng pool.
Werner Koch [Mon, 8 Aug 2016 10:54:08 +0000 (12:54 +0200)]
random: Hash continuous areas in the csprng pool.

* random/random-csprng.c (mix_pool): Store the first hash at the end
of the pool.
--

This fixes a long-standing bug (since 1998) in Libgcrypt and GnuPG.
An attacker who obtains 580 bytes of random output from the
standard RNG can trivially predict the next 20 bytes of output.

For use in GnuPG this bug does not affect the default generation of
keys because running gpg for key creation creates at most 2 keys from
the pool: For a single 4096 bit RSA key 512 bytes of random data are
required and thus for the second key (encryption subkey), 20 bytes
could be predicted from the first key.  However, the security of
an OpenPGP key depends on the primary key (which was generated first)
and thus the 20 predictable bytes should not be a problem.  For the
default key length of 2048 bit nothing will be predictable.

For the former default of DSA+Elgamal keys it is complicated to give
an answer: For 2048 bit keys a pool of 30 non-secret candidate primes
of about 300 bits each is first created.  This reads at least 1140
bytes from the pool and thus parts could be predicted.  At some point
a 256 bit secret is read from the pool, which in the worst case might
be partly predictable.

The bug was found and reported by Felix Dörre and Vladimir Klebanov,
Karlsruhe Institute of Technology.  A paper describing the problem in
detail will shortly be published.

CVE-id: CVE-2016-6313
Signed-off-by: Werner Koch <wk@gnupg.org>
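
The RSA figures in this message can be checked with a little
arithmetic.  The sketch below takes the message's numbers at face
value and assumes a simplified model in which an RSA key consumes
exactly two primes of half the modulus size from the pool and nothing
else; the function names are hypothetical and this is not how
Libgcrypt accounts for pool usage.

```c
#include <assert.h>

/* Output bytes an attacker needs before the following 20 bytes become
   predictable (figure taken from the commit message).  */
#define PREDICTION_THRESHOLD_BYTES 580

/* Random bytes for one RSA key under the simplified model: two primes
   of MODULUS_BITS / 2 bits each.  */
static unsigned int
rsa_random_bytes (unsigned int modulus_bits)
{
  assert (modulus_bits % 16 == 0);
  return 2 * (modulus_bits / 2 / 8);
}

/* Return 1 if reading TOTAL_BYTES in sequence reaches past the
   threshold, so that some of the output may be predictable.  */
static int
output_may_be_predictable (unsigned int total_bytes)
{
  return total_bytes > PREDICTION_THRESHOLD_BYTES;
}
```

Under this model one 4096 bit key needs 512 bytes, which stays below
the threshold, while the second such key crosses it.  Two 2048 bit
keys together need 512 bytes, so nothing is predictable, which
matches the statement above.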
3 years agorandom: Improve the diagram showing the random mixing
Werner Koch [Mon, 8 Aug 2016 10:08:43 +0000 (12:08 +0200)]
random: Improve the diagram showing the random mixing

* random/random-csprng.c (mix_pool): Use DIGESTLEN instead of 20.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agocrc-intel-pclmul: split assembly block to ease register pressure
Jussi Kivilinna [Tue, 19 Jul 2016 10:20:53 +0000 (13:20 +0300)]
crc-intel-pclmul: split assembly block to ease register pressure

* cipher/crc-intel-pclmul.c (crc32_less_than_16): Split inline
assembly block handling 4 byte input into multiple blocks.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agorijndael-aesni: split assembly block to ease register pressure
Jussi Kivilinna [Tue, 19 Jul 2016 10:20:13 +0000 (13:20 +0300)]
rijndael-aesni: split assembly block to ease register pressure

* cipher/rijndael-aesni.c (do_aesni_ctr_4): Use single register
constraint for passing 'bige_addb' to assembly block; split
first inline assembly block into two parts.
--

Fixes compiling on i386 with GCC-4.8 and older.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd ARMv8/AArch32 Crypto Extension implementation of AES
Jussi Kivilinna [Thu, 14 Jul 2016 14:55:28 +0000 (17:55 +0300)]
Add ARMv8/AArch32 Crypto Extension implementation of AES

* cipher/Makefile.am: Add 'rijndael-armv8-ce.c' and
'rijndael-armv8-aarch32-ce.S'.
* cipher/rijndael-armv8-aarch32-ce.S: New.
* cipher/rijndael-armv8-ce.c: New.
* cipher/rijndael-internal.h (USE_ARM_CE): New.
(RIJNDAEL_context_s): Add 'use_arm_ce'.
* cipher/rijndael.c [USE_ARM_CE] (_gcry_aes_armv8_ce_setkey)
(_gcry_aes_armv8_ce_prepare_decryption)
(_gcry_aes_armv8_ce_encrypt, _gcry_aes_armv8_ce_decrypt)
(_gcry_aes_armv8_ce_cfb_enc, _gcry_aes_armv8_ce_cbc_enc)
(_gcry_aes_armv8_ce_ctr_enc, _gcry_aes_armv8_ce_cfb_dec)
(_gcry_aes_armv8_ce_cbc_dec, _gcry_aes_armv8_ce_ocb_crypt)
(_gcry_aes_armv8_ce_ocb_auth): New.
(do_setkey) [USE_ARM_CE]: Add ARM CE/AES HW feature check and key
setup for ARM CE.
(prepare_decryption, _gcry_aes_cfb_enc, _gcry_aes_cbc_enc)
(_gcry_aes_ctr_enc, _gcry_aes_cfb_dec, _gcry_aes_cbc_dec)
(_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth) [USE_ARM_CE]: Add
ARM CE support.
* configure.ac: Add 'rijndael-armv8-ce.lo' and
'rijndael-armv8-aarch32-ce.lo'.
--

Improvement vs ARM assembly on Cortex-A53:

           AES-128  AES-192  AES-256
CBC enc:   14.8x    12.8x    11.4x
CBC dec:   21.4x    20.5x    19.4x
CFB enc:   16.2x    13.6x    11.6x
CFB dec:   21.6x    20.5x    19.4x
CTR:       19.1x    18.6x    17.8x
OCB enc:   16.0x    16.2x    16.1x
OCB dec:   15.6x    15.9x    15.8x
OCB auth:  18.3x    18.4x    18.0x

Benchmark on Cortex-A53 (1152 MHz):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     24.42 ns/B     39.06 MiB/s     28.13 c/B
        ECB dec |     25.07 ns/B     38.05 MiB/s     28.88 c/B
        CBC enc |     21.05 ns/B     45.30 MiB/s     24.25 c/B
        CBC dec |     21.16 ns/B     45.07 MiB/s     24.38 c/B
        CFB enc |     21.05 ns/B     45.31 MiB/s     24.25 c/B
        CFB dec |     21.38 ns/B     44.61 MiB/s     24.62 c/B
        OFB enc |     26.15 ns/B     36.47 MiB/s     30.13 c/B
        OFB dec |     26.15 ns/B     36.47 MiB/s     30.13 c/B
        CTR enc |     21.17 ns/B     45.06 MiB/s     24.38 c/B
        CTR dec |     21.16 ns/B     45.06 MiB/s     24.38 c/B
        CCM enc |     42.32 ns/B     22.53 MiB/s     48.75 c/B
        CCM dec |     42.32 ns/B     22.53 MiB/s     48.75 c/B
       CCM auth |     21.17 ns/B     45.06 MiB/s     24.38 c/B
        GCM enc |     22.08 ns/B     43.19 MiB/s     25.44 c/B
        GCM dec |     22.08 ns/B     43.18 MiB/s     25.44 c/B
       GCM auth |     0.923 ns/B    1032.8 MiB/s      1.06 c/B
        OCB enc |     26.20 ns/B     36.40 MiB/s     30.18 c/B
        OCB dec |     25.97 ns/B     36.73 MiB/s     29.91 c/B
       OCB auth |     24.52 ns/B     38.90 MiB/s     28.24 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     27.83 ns/B     34.26 MiB/s     32.06 c/B
        ECB dec |     28.54 ns/B     33.42 MiB/s     32.88 c/B
        CBC enc |     24.47 ns/B     38.97 MiB/s     28.19 c/B
        CBC dec |     25.27 ns/B     37.74 MiB/s     29.11 c/B
        CFB enc |     25.08 ns/B     38.02 MiB/s     28.89 c/B
        CFB dec |     25.31 ns/B     37.68 MiB/s     29.16 c/B
        OFB enc |     29.57 ns/B     32.25 MiB/s     34.06 c/B
        OFB dec |     29.57 ns/B     32.25 MiB/s     34.06 c/B
        CTR enc |     25.24 ns/B     37.78 MiB/s     29.08 c/B
        CTR dec |     25.24 ns/B     37.79 MiB/s     29.08 c/B
        CCM enc |     49.81 ns/B     19.15 MiB/s     57.38 c/B
        CCM dec |     49.80 ns/B     19.15 MiB/s     57.37 c/B
       CCM auth |     24.58 ns/B     38.80 MiB/s     28.32 c/B
        GCM enc |     26.15 ns/B     36.47 MiB/s     30.13 c/B
        GCM dec |     26.11 ns/B     36.52 MiB/s     30.08 c/B
       GCM auth |     0.923 ns/B    1033.0 MiB/s      1.06 c/B
        OCB enc |     29.59 ns/B     32.23 MiB/s     34.09 c/B
        OCB dec |     29.42 ns/B     32.42 MiB/s     33.89 c/B
       OCB auth |     27.92 ns/B     34.16 MiB/s     32.16 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     31.20 ns/B     30.57 MiB/s     35.94 c/B
        ECB dec |     31.80 ns/B     29.99 MiB/s     36.63 c/B
        CBC enc |     27.83 ns/B     34.27 MiB/s     32.06 c/B
        CBC dec |     27.87 ns/B     34.21 MiB/s     32.11 c/B
        CFB enc |     27.88 ns/B     34.20 MiB/s     32.12 c/B
        CFB dec |     28.16 ns/B     33.87 MiB/s     32.44 c/B
        OFB enc |     32.93 ns/B     28.96 MiB/s     37.94 c/B
        OFB dec |     32.93 ns/B     28.96 MiB/s     37.94 c/B
        CTR enc |     27.95 ns/B     34.13 MiB/s     32.19 c/B
        CTR dec |     27.95 ns/B     34.12 MiB/s     32.20 c/B
        CCM enc |     55.88 ns/B     17.07 MiB/s     64.38 c/B
        CCM dec |     55.88 ns/B     17.07 MiB/s     64.38 c/B
       CCM auth |     27.95 ns/B     34.12 MiB/s     32.20 c/B
        GCM enc |     28.86 ns/B     33.05 MiB/s     33.25 c/B
        GCM dec |     28.87 ns/B     33.04 MiB/s     33.25 c/B
       GCM auth |     0.923 ns/B    1033.0 MiB/s      1.06 c/B
        OCB enc |     32.96 ns/B     28.94 MiB/s     37.97 c/B
        OCB dec |     32.73 ns/B     29.14 MiB/s     37.70 c/B
       OCB auth |     31.29 ns/B     30.48 MiB/s     36.04 c/B

After:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      5.10 ns/B     187.0 MiB/s      5.88 c/B
        ECB dec |      5.27 ns/B     181.0 MiB/s      6.07 c/B
        CBC enc |      1.41 ns/B     675.8 MiB/s      1.63 c/B
        CBC dec |     0.992 ns/B     961.7 MiB/s      1.14 c/B
        CFB enc |      1.30 ns/B     732.4 MiB/s      1.50 c/B
        CFB dec |     0.991 ns/B     962.7 MiB/s      1.14 c/B
        OFB enc |      7.05 ns/B     135.2 MiB/s      8.13 c/B
        OFB dec |      7.05 ns/B     135.2 MiB/s      8.13 c/B
        CTR enc |      1.11 ns/B     856.9 MiB/s      1.28 c/B
        CTR dec |      1.11 ns/B     857.0 MiB/s      1.28 c/B
        CCM enc |      2.58 ns/B     369.8 MiB/s      2.97 c/B
        CCM dec |      2.58 ns/B     369.5 MiB/s      2.97 c/B
       CCM auth |      1.58 ns/B     605.2 MiB/s      1.82 c/B
        GCM enc |      2.04 ns/B     467.9 MiB/s      2.35 c/B
        GCM dec |      2.04 ns/B     466.6 MiB/s      2.35 c/B
       GCM auth |     0.923 ns/B    1033.0 MiB/s      1.06 c/B
        OCB enc |      1.64 ns/B     579.8 MiB/s      1.89 c/B
        OCB dec |      1.66 ns/B     574.5 MiB/s      1.91 c/B
       OCB auth |      1.33 ns/B     715.5 MiB/s      1.54 c/B
                =
 AES192         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      5.64 ns/B     169.0 MiB/s      6.50 c/B
        ECB dec |      5.81 ns/B     164.3 MiB/s      6.69 c/B
        CBC enc |      1.90 ns/B     502.1 MiB/s      2.19 c/B
        CBC dec |      1.24 ns/B     771.7 MiB/s      1.42 c/B
        CFB enc |      1.84 ns/B     517.1 MiB/s      2.12 c/B
        CFB dec |      1.23 ns/B     772.5 MiB/s      1.42 c/B
        OFB enc |      7.60 ns/B     125.5 MiB/s      8.75 c/B
        OFB dec |      7.60 ns/B     125.6 MiB/s      8.75 c/B
        CTR enc |      1.36 ns/B     702.7 MiB/s      1.56 c/B
        CTR dec |      1.36 ns/B     702.5 MiB/s      1.56 c/B
        CCM enc |      3.31 ns/B     287.8 MiB/s      3.82 c/B
        CCM dec |      3.31 ns/B     288.0 MiB/s      3.81 c/B
       CCM auth |      2.06 ns/B     462.1 MiB/s      2.38 c/B
        GCM enc |      2.28 ns/B     418.4 MiB/s      2.63 c/B
        GCM dec |      2.28 ns/B     418.0 MiB/s      2.63 c/B
       GCM auth |     0.923 ns/B    1032.8 MiB/s      1.06 c/B
        OCB enc |      1.83 ns/B     520.1 MiB/s      2.11 c/B
        OCB dec |      1.84 ns/B     517.8 MiB/s      2.12 c/B
       OCB auth |      1.52 ns/B     626.1 MiB/s      1.75 c/B
                =
 AES256         |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      5.86 ns/B     162.7 MiB/s      6.75 c/B
        ECB dec |      6.02 ns/B     158.3 MiB/s      6.94 c/B
        CBC enc |      2.44 ns/B     390.5 MiB/s      2.81 c/B
        CBC dec |      1.45 ns/B     656.4 MiB/s      1.67 c/B
        CFB enc |      2.39 ns/B     399.5 MiB/s      2.75 c/B
        CFB dec |      1.45 ns/B     656.8 MiB/s      1.67 c/B
        OFB enc |      7.81 ns/B     122.1 MiB/s      9.00 c/B
        OFB dec |      7.81 ns/B     122.1 MiB/s      9.00 c/B
        CTR enc |      1.57 ns/B     605.8 MiB/s      1.81 c/B
        CTR dec |      1.57 ns/B     605.9 MiB/s      1.81 c/B
        CCM enc |      4.07 ns/B     234.3 MiB/s      4.69 c/B
        CCM dec |      4.07 ns/B     234.1 MiB/s      4.69 c/B
       CCM auth |      2.61 ns/B     365.7 MiB/s      3.00 c/B
        GCM enc |      2.50 ns/B     381.9 MiB/s      2.88 c/B
        GCM dec |      2.49 ns/B     382.3 MiB/s      2.87 c/B
       GCM auth |     0.926 ns/B    1029.7 MiB/s      1.07 c/B
        OCB enc |      2.05 ns/B     465.6 MiB/s      2.36 c/B
        OCB dec |      2.06 ns/B     462.0 MiB/s      2.38 c/B
       OCB auth |      1.74 ns/B     548.4 MiB/s      2.00 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd ARMv8/AArch32 Crypto Extension implementation of GCM
Jussi Kivilinna [Thu, 14 Jul 2016 14:55:28 +0000 (17:55 +0300)]
Add ARMv8/AArch32 Crypto Extension implementation of GCM

* cipher/Makefile.am: Add 'cipher-gcm-armv8-aarch32-ce.S'.
* cipher/cipher-gcm-armv8-aarch32-ce.S: New.
* cipher/cipher-gcm.c [GCM_USE_ARM_PMULL]
(_gcry_ghash_setup_armv8_ce_pmull, _gcry_ghash_armv8_ce_pmull)
(ghash_setup_armv8_ce_pmull, ghash_armv8_ce_pmull): New.
(setupM) [GCM_USE_ARM_PMULL]: Enable ARM PMULL implementation if
HWF_ARM_PMULL HW feature flag is enabled.
* cipher/cipher-gcm.h (GCM_USE_ARM_PMULL): New.
--

Benchmark on Cortex-A53 (1152 MHz):

Before:
                     |  nanosecs/byte   mebibytes/sec   cycles/byte
  GMAC_AES           |     24.10 ns/B     39.57 MiB/s     27.76 c/B

After (~26x faster):
                     |  nanosecs/byte   mebibytes/sec   cycles/byte
  GMAC_AES           |     0.924 ns/B    1032.2 MiB/s      1.06 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
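
The three columns of the benchmark tables above are related by simple
arithmetic, and the "~26x faster" figure follows from the two
nanoseconds-per-byte values.  The following is a minimal sketch of that
arithmetic only; the function names are hypothetical and are not taken
from tests/bench-slope.c.

```c
#include <assert.h>

/* Throughput in MiB/s for a given cost in nanoseconds per byte.  */
double demo_mib_per_sec (double ns_per_byte)
{
  assert (ns_per_byte > 0.0);
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Cycles per byte at the given clock frequency in MHz.  */
double demo_cycles_per_byte (double ns_per_byte, double mhz)
{
  return ns_per_byte * mhz / 1000.0;
}

/* Speed-up factor between two nanoseconds-per-byte measurements.  */
double demo_speedup (double before_ns, double after_ns)
{
  assert (after_ns > 0.0);
  return before_ns / after_ns;
}
```

With the GMAC_AES rows, demo_speedup (24.10, 0.924) is a little above 26,
and demo_cycles_per_byte (24.10, 1152.0) is about 27.76, matching the
"Before" row at 1152 MHz.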
3 years agoAdd ARMv8/AArch32 Crypto Extension implementation of SHA-256
Jussi Kivilinna [Thu, 14 Jul 2016 14:55:28 +0000 (17:55 +0300)]
Add ARMv8/AArch32 Crypto Extension implementation of SHA-256

* cipher/Makefile.am: Add 'sha256-armv8-aarch32-ce.S'.
* cipher/sha256-armv8-aarch32-ce.S: New.
* cipher/sha256.c (USE_ARM_CE): New.
(sha256_init, sha224_init): Check features for HWF_ARM_SHA1.
[USE_ARM_CE] (_gcry_sha256_transform_armv8_ce): New.
(transform) [USE_ARM_CE]: Use ARMv8 CE implementation if HW supports it.
(SHA256_CONTEXT): Add 'use_arm_ce'.
* configure.ac: Add 'sha256-armv8-aarch32-ce.lo'.
--

Benchmark on Cortex-A53 (1152 MHz):

Before:

                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA256         |     17.38 ns/B     54.88 MiB/s     20.02 c/B

After (~9.3x faster):

                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA256         |      1.85 ns/B     515.7 MiB/s      2.13 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd ARMv8/AArch32 Crypto Extension implementation of SHA-1
Jussi Kivilinna [Thu, 14 Jul 2016 14:55:28 +0000 (17:55 +0300)]
Add ARMv8/AArch32 Crypto Extension implementation of SHA-1

* cipher/Makefile.am: Add 'sha1-armv8-aarch32-ce.S'.
* cipher/sha1-armv7-neon.S (_gcry_sha1_transform_armv7_neon): Add
missing size.
* cipher/sha1-armv8-aarch32-ce.S: New.
* cipher/sha1.c (USE_ARM_CE): New.
(sha1_init): Check features for HWF_ARM_SHA1.
[USE_ARM_CE] (_gcry_sha1_transform_armv8_ce): New.
(transform) [USE_ARM_CE]: Use ARMv8 CE implementation if HW supports
it.
* cipher/sha1.h (SHA1_CONTEXT): Add 'use_arm_ce'.
* configure.ac: Add 'sha1-armv8-aarch32-ce.lo'.
--

Benchmark on Cortex-A53 (1152 MHz):

Before (SHA-1 NEON):

                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA1           |      6.62 ns/B     144.2 MiB/s      7.62 c/B

After (~3.8x faster):

                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA1           |      1.73 ns/B     552.2 MiB/s      1.99 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoAdd HW feature check for ARMv8 AArch64 and crypto extensions
Jussi Kivilinna [Thu, 14 Jul 2016 14:55:28 +0000 (17:55 +0300)]
Add HW feature check for ARMv8 AArch64 and crypto extensions

* configure.ac: Add '--disable-arm-crypto-support'; enable hwf-arm
module on 64-bit ARM.
(armcryptosupport, gcry_cv_gcc_inline_aarch32_crypto)
(gcry_cv_inline_asm_aarch64_neon)
(gcry_cv_gcc_inline_asm_aarch64_crypto): New.
* src/g10lib.h (HWF_ARM_AES, HWF_ARM_SHA1, HWF_ARM_SHA2)
(HWF_ARM_PMULL): New.
* src/hwf-arm.c [__aarch64__]: Enable building in AArch64 mode.
(feature_map_s): New.
[__arm__] (AT_HWCAP, AT_HWCAP2, HWCAP2_AES, HWCAP2_PMULL)
(HWCAP2_SHA1, HWCAP2_SHA2, arm_features): New.
[__aarch64__] (AT_HWCAP, AT_HWCAP2, HWCAP_ASIMD, HWCAP_AES)
(HWCAP_PMULL, HWCAP_SHA1, HWCAP_SHA2, arm_features): New.
(get_hwcap): Add reading of 'AT_HWCAP2'; Change auxv to use
'unsigned long'.
(detect_arm_at_hwcap): Add mapping of HWCAP/HWCAP2 to HWF flags.
(detect_arm_proc_cpuinfo): Add mapping of CPU features to HWF flags.
(_gcry_hwf_detect_arm): Use __ARM_NEON instead of legacy __ARM_NEON__.
* src/hwfeatures.c (hwflist): Add 'arm-aes', 'arm-sha1', 'arm-sha2'
and 'arm-pmull'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
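
The mapping of HWCAP2 bits to HWF flags described above can be pictured
as a small table walk.  This is only a sketch under stated assumptions:
the DEMO_* names and the numeric DEMO_HWF_* values are hypothetical and
do not reproduce src/hwf-arm.c or src/g10lib.h.  The HWCAP2 bit positions
are the AArch32 ones used by the Linux kernel.

```c
#include <stddef.h>

/* Illustrative flag values only; the real HWF_ARM_* constants are
   defined in src/g10lib.h and are not reproduced here.  */
enum
  {
    DEMO_HWF_ARM_AES   = 1 << 0,
    DEMO_HWF_ARM_SHA1  = 1 << 1,
    DEMO_HWF_ARM_SHA2  = 1 << 2,
    DEMO_HWF_ARM_PMULL = 1 << 3
  };

/* AArch32 HWCAP2 bits as exposed by the Linux kernel.  */
#define DEMO_HWCAP2_AES   (1UL << 0)
#define DEMO_HWCAP2_PMULL (1UL << 1)
#define DEMO_HWCAP2_SHA1  (1UL << 2)
#define DEMO_HWCAP2_SHA2  (1UL << 3)

/* One table entry: a kernel capability bit and the flag it enables.  */
struct demo_feature_map
{
  unsigned long hwcap2_flag;
  unsigned int hwf_flag;
};

static const struct demo_feature_map demo_arm_features[] =
  {
    { DEMO_HWCAP2_AES,   DEMO_HWF_ARM_AES },
    { DEMO_HWCAP2_SHA1,  DEMO_HWF_ARM_SHA1 },
    { DEMO_HWCAP2_SHA2,  DEMO_HWF_ARM_SHA2 },
    { DEMO_HWCAP2_PMULL, DEMO_HWF_ARM_PMULL }
  };

/* Translate a raw HWCAP2 word into the demo feature flags.  Bits that
   have no table entry are ignored.  */
unsigned int demo_map_hwcap2 (unsigned long hwcap2)
{
  unsigned int features = 0;
  size_t i;

  for (i = 0; i < sizeof demo_arm_features / sizeof demo_arm_features[0]; i++)
    if (hwcap2 & demo_arm_features[i].hwcap2_flag)
      features |= demo_arm_features[i].hwf_flag;

  return features;
}
```

On Linux with glibc the input word could be obtained with
getauxval (AT_HWCAP2) from <sys/auxv.h>; the commit itself describes
get_hwcap as reading the auxiliary vector directly.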
3 years agoPost release updates
Werner Koch [Thu, 14 Jul 2016 09:36:40 +0000 (11:36 +0200)]
Post release updates

--

3 years agoRelease 1.7.2 libgcrypt-1.7.2
Werner Koch [Thu, 14 Jul 2016 09:23:34 +0000 (11:23 +0200)]
Release 1.7.2

* configure.ac: Set LT version to C21/A1/R2.
* Makefile.am (distcheck-hook): New.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agoMerge branch 'master' into LIBGCRYPT-1-7-BRANCH
Werner Koch [Thu, 14 Jul 2016 09:19:22 +0000 (11:19 +0200)]
Merge branch 'master' into LIBGCRYPT-1-7-BRANCH

3 years agobuild: Update NEWS.
Werner Koch [Thu, 14 Jul 2016 09:15:38 +0000 (11:15 +0200)]
build: Update NEWS.

--

3 years agobuild: Update config.{guess,sub} to {2016-05-15,2016-06-20}.
Werner Koch [Wed, 13 Jul 2016 17:05:34 +0000 (19:05 +0200)]
build: Update config.{guess,sub} to {2016-05-15,2016-06-20}.

* build-aux/config.guess: Update.
* build-aux/config.sub: Update.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agoFix unaligned accesses with ldm/stm in ChaCha20 and Poly1305 ARM/NEON
Jussi Kivilinna [Thu, 7 Jul 2016 22:22:58 +0000 (01:22 +0300)]
Fix unaligned accesses with ldm/stm in ChaCha20 and Poly1305 ARM/NEON

* cipher/chacha20-armv7-neon.S (UNALIGNED_STMIA8)
(UNALIGNED_LDMIA4): New.
(_gcry_chacha20_armv7_neon_blocks): Use new helper macros instead of
ldm/stm instructions directly.
* cipher/poly1305-armv7-neon.S (UNALIGNED_LDMIA2)
(UNALIGNED_LDMIA4): New.
(_gcry_poly1305_armv7_neon_init_ext, _gcry_poly1305_armv7_neon_blocks)
(_gcry_poly1305_armv7_neon_finish_ext): Use new helper macros instead
of ldm instruction directly.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
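
The assembly helper macros are not shown in this log.  As a C analogue of
the underlying idea, a multi-word load must not assume that the source is
word aligned, so each word is assembled from individual bytes.  The
function name is hypothetical and the code is a sketch, not the macro
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load N little-endian 32-bit words from SRC, which may have any
   alignment, into the aligned array DST.  */
void demo_load_words_le (uint32_t *dst, const unsigned char *src, size_t n)
{
  size_t i;

  assert (dst != NULL && src != NULL);
  for (i = 0; i < n; i++)
    {
      const unsigned char *p = src + 4 * i;

      dst[i] = (uint32_t) p[0]
               | ((uint32_t) p[1] << 8)
               | ((uint32_t) p[2] << 16)
               | ((uint32_t) p[3] << 24);
    }
}
```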
3 years agobench-slope: add unaligned buffer mode
Jussi Kivilinna [Sun, 3 Jul 2016 15:39:40 +0000 (18:39 +0300)]
bench-slope: add unaligned buffer mode

* tests/bench-slope.c (unaligned_mode): New.
(do_slope_benchmark): Unalign buffer if unaligned mode is enabled.
(print_help, main): Add '--unaligned' parameter.
--

Patch adds --unaligned parameter to allow measurement of unaligned
buffer overhead.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
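
One way to produce a deliberately unaligned benchmark buffer is to align
a pointer first and then step one byte past the boundary.  This is a
sketch of the general technique, not the code in tests/bench-slope.c;
DEMO_ALIGN and demo_misaligned are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_ALIGN 16

/* Return a pointer inside BASE that is DEMO_ALIGN-aligned, or one byte
   past such a boundary if UNALIGNED is nonzero.  The caller must have
   allocated DEMO_ALIGN spare bytes in addition to the usable size.  */
unsigned char *demo_misaligned (unsigned char *base, int unaligned)
{
  uintptr_t addr = (uintptr_t) base;
  unsigned char *p;

  assert (base != NULL);
  p = base + ((DEMO_ALIGN - addr % DEMO_ALIGN) % DEMO_ALIGN);
  return unaligned ? p + 1 : p;
}
```

Benchmarking the same operation on both pointers then gives an estimate
of the overhead caused by unaligned input.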
3 years agoFix static build
Jussi Kivilinna [Fri, 1 Jul 2016 20:07:07 +0000 (23:07 +0300)]
Fix static build

* tests/pubkey.c (_gcry_pk_util_get_nbits): Make function 'static'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoDisallow encryption/decryption if key is not set
Jussi Kivilinna [Thu, 30 Jun 2016 18:51:50 +0000 (21:51 +0300)]
Disallow encryption/decryption if key is not set

* cipher/cipher.c (cipher_encrypt, cipher_decrypt): If mode is not
NONE, make sure that key is set.
* cipher/cipher-ccm.c (_gcry_cipher_ccm_set_nonce): Do not clear
'marks.key' when resetting state.
--

Reported-by: Andreas Metzler <ametzler@bebt.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
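
The check described above can be modelled as follows.  This is a
simplified stand-in with hypothetical types and error codes; it is not
the libgcrypt handle structure, and the error code that libgcrypt
actually returns is not stated in this log.

```c
#include <assert.h>
#include <stddef.h>

enum demo_mode { DEMO_MODE_NONE = 0, DEMO_MODE_CBC = 1 };
enum demo_err { DEMO_OK = 0, DEMO_ERR_MISSING_KEY = 1 };

/* Minimal model of a cipher handle: a mode and a "key was set" mark.  */
struct demo_cipher
{
  enum demo_mode mode;
  struct { unsigned int key:1; } marks;
};

/* Refuse to encrypt unless the mode is NONE or a key has been set.  */
enum demo_err demo_encrypt (const struct demo_cipher *c)
{
  assert (c != NULL);
  if (c->mode != DEMO_MODE_NONE && !c->marks.key)
    return DEMO_ERR_MISSING_KEY;
  /* A real implementation would dispatch to the mode handler here.  */
  return DEMO_OK;
}
```

Keeping 'marks.key' set across a nonce reset matters for the same
reason: clearing it would make the next encrypt call fail even though
the key is still in place.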
3 years agoAvoid unaligned accesses with ARM ldm/stm instructions
Jussi Kivilinna [Thu, 30 Jun 2016 18:34:46 +0000 (21:34 +0300)]
Avoid unaligned accesses with ARM ldm/stm instructions

* cipher/rijndael-arm.S: Remove __ARM_FEATURE_UNALIGNED ifdefs, always
compile with unaligned load/store code paths.
* cipher/sha512-arm.S: Ditto.
--

Reported-by: Michael Plass <mfpnb@plass-family.net>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoFix non-PIC reference in PIC for poly1305/ARMv7-NEON
Jussi Kivilinna [Thu, 30 Jun 2016 18:23:05 +0000 (21:23 +0300)]
Fix non-PIC reference in PIC for poly1305/ARMv7-NEON

* cipher/poly1305-armv7-neon.S (GET_DATA_POINTER): New.
(_gcry_poly1305_armv7_neon_init_ext): Use GET_DATA_POINTER.
--

Reported-by: Michael Plass <mfpnb@plass-family.net>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoFix wrong CPU feature #ifdef for SHA1/AVX
Jussi Kivilinna [Thu, 30 Jun 2016 18:17:32 +0000 (21:17 +0300)]
Fix wrong CPU feature #ifdef for SHA1/AVX

* cipher/sha1-avx-amd64.S: Check for HAVE_GCC_INLINE_ASM_AVX instead of
HAVE_GCC_INLINE_ASM_AVX2 & HAVE_GCC_INLINE_ASM_BMI2.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agorandom: Remove debug message about not supported getrandom syscall.
Werner Koch [Thu, 30 Jun 2016 11:00:50 +0000 (13:00 +0200)]
random: Remove debug message about not supported getrandom syscall.

* random/rndlinux.c (_gcry_rndlinux_gather_random): Remove log_debug
for getrandom error ENOSYS.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agotests: Do not test SHAKE128 et al with gcry_md_hash_buffer.
Werner Koch [Mon, 27 Jun 2016 15:22:18 +0000 (17:22 +0200)]
tests: Do not test SHAKE128 et al with gcry_md_hash_buffer.

* tests/benchmark.c (md_bench): Do not test variable-length algos
with gcry_md_hash_buffer.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agomd: Improve diagnostic when using SHAKE128 with gcry_md_hash_buffer.
Werner Koch [Mon, 27 Jun 2016 15:11:23 +0000 (17:11 +0200)]
md: Improve diagnostic when using SHAKE128 with gcry_md_hash_buffer.

* cipher/md.c (md_read): Detect missing read function.
(_gcry_md_hash_buffers): Return an error.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agoecc: Fix memory leak.
Werner Koch [Sat, 25 Jun 2016 18:52:47 +0000 (20:52 +0200)]
ecc: Fix memory leak.

* cipher/ecc.c (ecc_check_secret_key): Do not init point if already
set.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agodoc: Update yat2m.
Werner Koch [Sat, 25 Jun 2016 14:07:16 +0000 (16:07 +0200)]
doc: Update yat2m.

* doc/yat2m.c: Update from Libgpg-error
--

Taken from Libgpg-error
commit 9b5e3d1608922f4aaf9958e022431849d5a58501

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agotests: Add attributes to helper functions.
Werner Koch [Sat, 25 Jun 2016 14:09:20 +0000 (16:09 +0200)]
tests: Add attributes to helper functions.

* tests/t-common.h (die, fail, info): Add attributes.
* tests/random.c (die, inf): Ditto.
* tests/pubkey.c (die, fail, info): Add attributes.
* tests/fipsdrv.c (die): Add attribute.
(main): Take care of missing --key,--iv,--dt options.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agoImprove robustness and help lint.
Werner Koch [Sat, 25 Jun 2016 13:38:06 +0000 (15:38 +0200)]
Improve robustness and help lint.

* cipher/rsa.c (rsa_encrypt): Check for !DATA.
* cipher/md.c (search_oid): Check early for !OID.
(md_copy): Use gpg_err_code_from_syserror.  Replace chains of if(!err)
tests.
* cipher/cipher.c (search_oid): Check early for !OID.
* src/misc.c (do_printhex): Allow for BUFFER==NULL even with LENGTH>0.
* mpi/mpicoder.c (onecompl): Allow for A==NULL to help static
analyzers.
--

The change for md_copy is to help static analyzers which have no idea
that gpg_err_code_from_syserror will never return 0.  A gcc attribute
returns_nonzero would be nice to have.

Some changes are due to the fact that macros like mpi_is_immutable
gracefully handle a NULL arg but a static analyzer then considers that
the function allows for a NULL arg.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agocipher: Improve fatal error message for bad use of gcry_md_read.
Werner Koch [Thu, 23 Jun 2016 08:29:08 +0000 (10:29 +0200)]
cipher: Improve fatal error message for bad use of gcry_md_read.

* cipher/md.c (md_read): Use _gcry_fatal_error instead of BUG.
--

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agoecc: Default cofactor 1 for PUBKEY_FLAG_PARAM.
Niibe Yutaka [Thu, 16 Jun 2016 01:56:28 +0000 (10:56 +0900)]
ecc: Default cofactor 1 for PUBKEY_FLAG_PARAM.

* cipher/ecc.c (ecc_check_secret_key, ecc_sign, ecc_verify)
(ecc_encrypt_raw, ecc_decrypt_raw, compute_keygrip): Set default
cofactor as 1, when not specified.

--

GnuPG-bug-id: 2347
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
(backport from master
commit 0f3a069211d8d24a61aa0dc2cc6c4ef04cc4fab7)

3 years agoecc: Default cofactor 1 for PUBKEY_FLAG_PARAM.
Niibe Yutaka [Thu, 16 Jun 2016 01:56:28 +0000 (10:56 +0900)]
ecc: Default cofactor 1 for PUBKEY_FLAG_PARAM.

* cipher/ecc.c (ecc_check_secret_key, ecc_sign, ecc_verify)
(ecc_encrypt_raw, ecc_decrypt_raw, compute_keygrip): Set default
cofactor as 1, when not specified.

--

GnuPG-bug-id: 2347
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
3 years agoPost release updates
Werner Koch [Wed, 15 Jun 2016 07:50:31 +0000 (09:50 +0200)]
Post release updates

--

3 years agoRelease 1.7.1 libgcrypt-1.7.1
Werner Koch [Wed, 15 Jun 2016 07:34:02 +0000 (09:34 +0200)]
Release 1.7.1

3 years agoMerge branch 'master' into LIBGCRYPT-1-7-BRANCH
Werner Koch [Wed, 15 Jun 2016 07:24:02 +0000 (09:24 +0200)]
Merge branch 'master' into LIBGCRYPT-1-7-BRANCH

--

3 years agodoc: Describe envvars.
Werner Koch [Wed, 15 Jun 2016 07:18:31 +0000 (09:18 +0200)]
doc: Describe envvars.

* doc/gcrypt.texi: Add chapter Configuration.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agorandom: Change names of debug envvars.
Werner Koch [Wed, 15 Jun 2016 07:17:44 +0000 (09:17 +0200)]
random: Change names of debug envvars.

* random/rndunix.c (start_gatherer): Change GNUPG_RNDUNIX_DBG to
GCRYPT_RNDUNIX_DBG.
* random/rndw32.c (registry_poll): Change GNUPG_RNDW32_NOPERF to
GCRYPT_RNDW32_NOPERF.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agocipher: Assign OIDs to the Serpent cipher.
Werner Koch [Tue, 14 Jun 2016 13:53:10 +0000 (15:53 +0200)]
cipher: Assign OIDs to the Serpent cipher.

* cipher/serpent.c (serpent128_oids, serpent192_oids)
(serpent256_oids): New. Add them to the specs below.
(serpent128_aliases): Add "SERPENT-128".
(serpent256_aliases, serpent192_aliases): New.

Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agorsa: Implement blinding also for signing.
Werner Koch [Fri, 3 Jun 2016 13:42:53 +0000 (15:42 +0200)]
rsa: Implement blinding also for signing.

* cipher/rsa.c (rsa_decrypt): Factor blinding code out to ...
(secret_blinded): new.
(rsa_sign): Use blinding by default.
--

Although blinding of the RSA sign operation has a noticeable speed
loss, we better be on the safe side by using it by default.

Signed-off-by: Werner Koch <wk@gnupg.org>
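
Blinding can be illustrated with a toy-sized key.  The secret operation
is applied to m * r^e instead of m, and the factor r is divided out
afterwards, so the result is still m^d mod n.  This is a sketch with
hypothetical names and 64-bit arithmetic; it is not the code of
secret_blinded in cipher/rsa.c and must not be used for real keys.

```c
#include <assert.h>
#include <stdint.h>

/* (a * b) mod n; valid while n < 2^32 so the product fits in 64 bits.  */
static uint64_t demo_mulmod (uint64_t a, uint64_t b, uint64_t n)
{
  return (a % n) * (b % n) % n;
}

/* Square-and-multiply modular exponentiation.  */
static uint64_t demo_powmod (uint64_t base, uint64_t exp, uint64_t n)
{
  uint64_t result = 1 % n;

  base %= n;
  while (exp)
    {
      if (exp & 1)
        result = demo_mulmod (result, base, n);
      base = demo_mulmod (base, base, n);
      exp >>= 1;
    }
  return result;
}

/* Modular inverse via extended Euclid; requires gcd (a, n) == 1.  */
static uint64_t demo_invmod (uint64_t a, uint64_t n)
{
  int64_t t = 0, new_t = 1;
  int64_t r = (int64_t) n, new_r = (int64_t) (a % n);

  while (new_r != 0)
    {
      int64_t q = r / new_r;
      int64_t tmp = t - q * new_t;

      t = new_t;
      new_t = tmp;
      tmp = r - q * new_r;
      r = new_r;
      new_r = tmp;
    }
  assert (r == 1);
  return (uint64_t) (t < 0 ? t + (int64_t) n : t);
}

/* Unblinded secret operation: m^d mod n.  */
uint64_t demo_rsa_secret_plain (uint64_t m, uint64_t d, uint64_t n)
{
  return demo_powmod (m, d, n);
}

/* Blinded secret operation: ((m * r^e)^d) * r^-1 mod n.  R is the
   blinding value and must be coprime to N.  */
uint64_t demo_rsa_secret_blinded (uint64_t m, uint64_t e, uint64_t d,
                                  uint64_t n, uint64_t r)
{
  uint64_t blinded = demo_mulmod (m, demo_powmod (r, e, n), n);
  uint64_t s = demo_powmod (blinded, d, n);

  return demo_mulmod (s, demo_invmod (r, n), n);
}
```

For the textbook key n = 3233, e = 17, d = 2753, both functions return
the same value for any blinding value r coprime to n.  The extra
exponentiation and inversion are where the speed loss mentioned above
comes from.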
3 years agorandom: Remove debug output for getrandom(2) output.
Werner Koch [Fri, 3 Jun 2016 13:15:36 +0000 (15:15 +0200)]
random: Remove debug output for getrandom(2) output.

* random/rndlinux.c (_gcry_rndlinux_gather_random): Remove debug
output.
--

Fixes-commit: ee5a32226a7ca4ab067864e06623fc11a1768900
Signed-off-by: Werner Koch <wk@gnupg.org>
3 years agoFix gcc portability on Solaris 9 SPARC boxes.
Werner Koch [Mon, 7 Sep 2015 13:38:04 +0000 (15:38 +0200)]
Fix gcc portability on Solaris 9 SPARC boxes.

* mpi/longlong.h: Use __sparcv8 as alias for __sparc_v8__.
--

This patch has been in use by pkgsrc for
  SunOS mentok 5.9 Generic_117171-02 sun4u sparc SUNW,Sun-Fire-V240
since 2004.

GnuPG-bug-id: 1703
Signed-off-by: Werner Koch <wk@gnupg.org>
[cherry-pick of commit d281624]
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>