libgcrypt.git
5 years ago  Improve performance of SHA-512/ARM/NEON implementation
Jussi Kivilinna [Tue, 17 Dec 2013 13:35:38 +0000 (15:35 +0200)]
Improve performance of SHA-512/ARM/NEON implementation

* cipher/sha512-armv7-neon.S (RT01q, RT23q, RT45q, RT67q): New.
(round_0_63, round_64_79): Remove.
(rounds2_0_63, rounds2_64_79): New.
(_gcry_sha512_transform_armv7_neon): Add 'nblks' input; Handle multiple
input blocks; Use new round macros.
* cipher/sha512.c [USE_ARM_NEON_ASM]
(_gcry_sha512_transform_armv7_neon): Add 'num_blks'.
(transform) [USE_ARM_NEON_ASM]: Pass nblks to assembly.
--

Benchmarks on ARM Cortex-A8:

C-language:     139.1 c/B
Old ARM/NEON:   34.30 c/B
New ARM/NEON:   24.46 c/B

New vs C:       5.68x
New vs Old:     1.40x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Add AVX and AVX2/BMI implementations for SHA-256
Jussi Kivilinna [Tue, 17 Dec 2013 13:35:38 +0000 (15:35 +0200)]
Add AVX and AVX2/BMI implementations for SHA-256

* LICENSES: Add 'cipher/sha256-avx-amd64.S' and
'cipher/sha256-avx2-bmi2-amd64.S'.
* cipher/Makefile.am: Add 'sha256-avx-amd64.S' and
'sha256-avx2-bmi2-amd64.S'.
* cipher/sha256-avx-amd64.S: New.
* cipher/sha256-avx2-bmi2-amd64.S: New.
* cipher/sha256-ssse3-amd64.S: Use 'lea' instead of 'add' in a few
places for a tiny speed improvement.
* cipher/sha256.c (USE_AVX, USE_AVX2): New.
(SHA256_CONTEXT) [USE_AVX, USE_AVX2]: Add 'use_avx' and 'use_avx2'.
(sha256_init, sha224_init) [USE_AVX, USE_AVX2]: Initialize above
new context members.
[USE_AVX] (_gcry_sha256_transform_amd64_avx): New.
[USE_AVX2] (_gcry_sha256_transform_amd64_avx2): New.
(transform) [USE_AVX2]: Use AVX2 assembly if enabled.
(transform) [USE_AVX]: Use AVX assembly if enabled.
* configure.ac: Add 'sha256-avx-amd64.lo' and
'sha256-avx2-bmi2-amd64.lo'.
--

Patch adds fast AVX and AVX2/BMI2 implementations of SHA-256 by Intel
Corporation. The assembly source is licensed under 3-clause BSD license,
thus compatible with LGPL2.1+. Original source can be accessed at:
 http://www.intel.com/p/en_US/embedded/hwsw/technology/packet-processing#docs

Implementation is described in white paper
 "Fast SHA-256 Implementations on Intel® Architecture Processors"
 http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/sha-256-implementations-paper.html

Note: AVX implementation uses SHLD instruction to emulate RORQ, since it's
      faster on Intel Sandy-Bridge. However, on non-Intel CPUs SHLD is much
      slower than RORQ, so the AVX implementation is (for now) limited
      to Intel CPUs.
Note: AVX2 implementation also uses BMI2 instruction rorx, thus additional
      HWF flag.
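
The selection logic implied by the two notes above can be sketched in C as
follows. This is an illustration only, not the code added by the patch: all
TOY_* names and bit values are hypothetical stand-ins for the real HWF_*
flags and context members, and the order of preference (AVX2, then AVX, then
SSSE3, then generic C) is taken from the ChangeLog entries for 'transform'.

```c
#include <assert.h>

/* Hypothetical feature bits; the real HWF_* values are defined elsewhere. */
enum
{
  TOY_HWF_SSSE3     = 1 << 0,
  TOY_HWF_AVX       = 1 << 1,
  TOY_HWF_AVX2      = 1 << 2,
  TOY_HWF_BMI2      = 1 << 3,
  TOY_HWF_INTEL_CPU = 1 << 4
};

typedef enum { IMPL_GENERIC, IMPL_SSSE3, IMPL_AVX, IMPL_AVX2 } toy_impl_t;

typedef struct
{
  unsigned int use_ssse3:1;
  unsigned int use_avx:1;
  unsigned int use_avx2:1;
} toy_sha256_ctx_t;

/* Record which assembly paths are allowed for this context. */
static void
toy_sha256_init (toy_sha256_ctx_t *ctx, unsigned int features)
{
  assert (ctx);
  ctx->use_ssse3 = (features & TOY_HWF_SSSE3) != 0;
  /* The AVX path relies on SHLD being fast, so require an Intel CPU. */
  ctx->use_avx = (features & TOY_HWF_AVX) && (features & TOY_HWF_INTEL_CPU);
  /* The AVX2 path also uses the BMI2 instruction rorx. */
  ctx->use_avx2 = (features & TOY_HWF_AVX2) && (features & TOY_HWF_BMI2);
}

/* Pick the fastest permitted implementation. */
static toy_impl_t
toy_select_impl (const toy_sha256_ctx_t *ctx)
{
  assert (ctx);
  if (ctx->use_avx2)
    return IMPL_AVX2;
  if (ctx->use_avx)
    return IMPL_AVX;
  if (ctx->use_ssse3)
    return IMPL_SSSE3;
  return IMPL_GENERIC;
}
```

With this sketch, a CPU that reports AVX but not the Intel flag falls back
to the SSSE3 path.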

Benchmarks:

cpu                C-lang       SSSE3        AVX/AVX2     AVX/AVX2  AVX/AVX2
                                                          vs C      vs SSSE3
Intel i5-4570       13.86 c/B    10.27 c/B     8.70 c/B    1.59x     1.18x
Intel i5-2450M      17.25 c/B    12.36 c/B    10.31 c/B    1.67x     1.19x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Add AVX and AVX/BMI2 implementations for SHA-1
Jussi Kivilinna [Tue, 17 Dec 2013 13:35:38 +0000 (15:35 +0200)]
Add AVX and AVX/BMI2 implementations for SHA-1

* cipher/Makefile.am: Add 'sha1-avx-amd64.S' and
'sha1-avx-bmi2-amd64.S'.
* cipher/sha1-avx-amd64.S: New.
* cipher/sha1-avx-bmi2-amd64.S: New.
* cipher/sha1.c (USE_AVX, USE_BMI2): New.
(SHA1_CONTEXT) [USE_AVX]: Add 'use_avx'.
(SHA1_CONTEXT) [USE_BMI2]: Add 'use_bmi2'.
(sha1_init): Initialize 'use_avx' and 'use_bmi2'.
[USE_AVX] (_gcry_sha1_transform_amd64_avx): New.
[USE_BMI2] (_gcry_sha1_transform_amd64_bmi2): New.
(transform) [USE_BMI2]: Use BMI2 assembly if enabled.
(transform) [USE_AVX]: Use AVX assembly if enabled.
* configure.ac: Add 'sha1-avx-amd64.lo' and 'sha1-avx-bmi2-amd64.lo'.
--

Patch adds AVX (for Sandybridge and Ivybridge) and AVX/BMI2 (for Haswell)
optimized implementations of SHA-1.

Note: AVX implementation is currently limited to Intel CPUs due to use
      of SHLD instruction for faster rotations on Sandybridge.

Benchmarks:

cpu             C-version  SSSE3     AVX/(SHLD|BMI2) New vs C  New vs SSSE3
Intel i5-4570    8.84 c/B   4.61 c/B  3.86 c/B        2.29x     1.19x
Intel i5-2450M   9.45 c/B   5.30 c/B  4.39 c/B        2.15x     1.20x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  SHA-1/SSSE3: Improve performance on large buffers
Jussi Kivilinna [Tue, 17 Dec 2013 13:35:38 +0000 (15:35 +0200)]
SHA-1/SSSE3: Improve performance on large buffers

* cipher/sha1-ssse3-amd64.S (RNBLKS): New.
(_gcry_sha1_transform_amd64_ssse3): Handle multiple input blocks, with
software pipelining of next data block processing.
* cipher/sha1.c [USE_SSSE3] (_gcry_sha1_transform_amd64_ssse3): Add
'nblks'.
(transform) [USE_SSSE3]: Pass nblks to assembly function.
--

Patch gives a small improvement for large buffer processing: on Intel
i5-4570, speed goes from 4.80 c/B to 4.61 c/B.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Add bulk processing for hash transform functions
Jussi Kivilinna [Tue, 17 Dec 2013 13:35:38 +0000 (15:35 +0200)]
Add bulk processing for hash transform functions

* cipher/hash-common.c (_gcry_md_block_write): Preload 'hd->blocksize'
to stack, pass number of blocks to 'hd->bwrite'.
* cipher/hash-common.c (_gcry_md_block_write_t): Add 'nblks'.
* cipher/gostr3411-94.c: Rename 'transform' function to
'transform_blk', add new 'transform' function with 'nblks' as
additional input.
* cipher/md4.c: Ditto.
* cipher/md5.c: Ditto.
* cipher/rmd160.c: Ditto.
* cipher/sha1.c: Ditto.
* cipher/sha256.c: Ditto.
* cipher/sha512.c: Ditto.
* cipher/stribog.c: Ditto.
* cipher/tiger.c: Ditto.
* cipher/whirlpool.c: Ditto.
--

Pass number of blocks to algorithm for further optimizations.
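
The 'transform_blk' / 'transform' split described in the ChangeLog can be
sketched roughly as below. This is a minimal illustration under stated
assumptions, not the code from the patch: the toy context, the byte-sum
"digest", the block size and the returned burn depth are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLOCKSIZE 64

/* Hypothetical toy context: the "digest" is just a running byte sum. */
typedef struct { unsigned long state; } toy_ctx_t;

/* Process exactly one block; returns a stack burn depth hint in bytes
   (the value 32 is a placeholder). */
static unsigned int
transform_blk (toy_ctx_t *ctx, const unsigned char *data)
{
  size_t i;

  for (i = 0; i < TOY_BLOCKSIZE; i++)
    ctx->state += data[i];
  return 32;
}

/* Bulk entry point: the caller passes the number of whole blocks, so an
   optimized implementation can loop without returning to the caller. */
static unsigned int
transform (toy_ctx_t *ctx, const unsigned char *data, size_t nblks)
{
  unsigned int burn = 0;

  assert (ctx && (data || !nblks));
  while (nblks--)
    {
      burn = transform_blk (ctx, data);
      data += TOY_BLOCKSIZE;
    }
  return burn;
}
```

An assembly implementation can then replace the loop in 'transform' as a
whole, which is what the later SSSE3 and NEON commits in this log do.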

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Open new development branch.
Werner Koch [Mon, 16 Dec 2013 16:58:42 +0000 (17:58 +0100)]
Open new development branch.

--

5 years ago  Post release updates.
Werner Koch [Mon, 16 Dec 2013 16:49:56 +0000 (17:49 +0100)]
Post release updates.

--

5 years ago  Release 1.6.0.  libgcrypt-1.6.0
Werner Koch [Mon, 16 Dec 2013 16:38:55 +0000 (17:38 +0100)]
Release 1.6.0.

5 years ago  doc: Change yat2m to allow arbitrary condition names.
Werner Koch [Mon, 16 Dec 2013 15:54:53 +0000 (16:54 +0100)]
doc: Change yat2m to allow arbitrary condition names.

* doc/yat2m.c (MAX_CONDITION_NESTING): New.
(gpgone_defined): Remove.
(condition_s, condition_stack, condition_stack_idx): New.
(cond_is_active, cond_in_verbatim): New.
(add_predefined_macro, set_macro, macro_set_p): New.
(evaluate_conditions, push_condition, pop_condition): New.
(parse_file): Rewrite to use the condition stack.
(top_parse_file): Set predefined macros.
(main): Change -D to define arbitrary macros.
--

This change allows the use of conditionals other than "gpgone" and
thus makes "gpgtwoone" et al. actually work.  It now also tracks
conditionals across included files.
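
A condition stack of the kind named in the ChangeLog could look roughly like
the sketch below. The identifiers are borrowed from the ChangeLog, but the
bodies, the array sizes and the exact semantics are assumptions made for
illustration; the real yat2m.c may differ. ISSET is 1 for an "@ifset NAME"
style condition and 0 for an "@ifclear NAME" style condition.

```c
#include <assert.h>
#include <string.h>

#define MAX_CONDITION_NESTING 10
#define MAX_MACROS 16   /* hypothetical limit for this sketch */

typedef struct { const char *name; int isset; } condition_t;

static const char *macros[MAX_MACROS];
static int n_macros;
static condition_t condition_stack[MAX_CONDITION_NESTING];
static int condition_stack_idx;
static int cond_is_active = 1;

/* Return true if the macro NAME has been defined. */
static int
macro_set_p (const char *name)
{
  int i;

  for (i = 0; i < n_macros; i++)
    if (!strcmp (macros[i], name))
      return 1;
  return 0;
}

/* Define the macro NAME (for example from a -D option). */
static void
set_macro (const char *name)
{
  assert (n_macros < MAX_MACROS);
  if (!macro_set_p (name))
    macros[n_macros++] = name;
}

/* Output is active only if every condition on the stack holds. */
static void
evaluate_conditions (void)
{
  int i;

  cond_is_active = 1;
  for (i = 0; i < condition_stack_idx; i++)
    if (macro_set_p (condition_stack[i].name) != !!condition_stack[i].isset)
      cond_is_active = 0;
}

static void
push_condition (const char *name, int isset)
{
  assert (condition_stack_idx < MAX_CONDITION_NESTING);
  condition_stack[condition_stack_idx].name = name;
  condition_stack[condition_stack_idx].isset = isset;
  condition_stack_idx++;
  evaluate_conditions ();
}

static void
pop_condition (void)
{
  assert (condition_stack_idx > 0);
  condition_stack_idx--;
  evaluate_conditions ();
}
```

Because the stack is global rather than local to one call of the file
parser, nesting is naturally tracked across included files.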

Signed-off-by: Werner Koch <wk@gnupg.org>
From GnuPG master commit a15c35f37ed2b58805adc213029998aa3e52f038

5 years ago  tests: Add SHA-512 to the long hash test.
Werner Koch [Mon, 16 Dec 2013 11:43:50 +0000 (12:43 +0100)]
tests: Add SHA-512 to the long hash test.

* tests/hashtest.c (testvectors): Add vectors for 256GiB SHA-512.
* tests/hashtest-256g.in (algos): Add test for SHA-512.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Add configure option --enable-large-data-tests.
Werner Koch [Mon, 16 Dec 2013 10:43:22 +0000 (11:43 +0100)]
Add configure option --enable-large-data-tests.

* configure.ac: Add option --enable-large-data-tests.
* tests/hashtest-256g.in: New.
* tests/Makefile.am (EXTRA_DIST): Add hashtest-256g.in.
(TESTS): Split up into tests_bin, tests_bin_last, tests_sh, and
tests_sh_last.
(tests_sh_last): Add hashtest-256g.
(noinst_PROGRAMS): Add only tests_bin and tests_bin_last.
(bench-slope.log, hashtest-256g.log): New rules to enforce serial run.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  random: Call random progress handler more often.
Werner Koch [Mon, 16 Dec 2013 08:45:02 +0000 (09:45 +0100)]
random: Call random progress handler more often.

* random/rndlinux.c (_gcry_rndlinux_gather_random): Update progress
indicator earlier.
--

GnuPG-bug-id: 1531
Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  cipher: Normalize the MPIs used as input to secret key functions.
Werner Koch [Mon, 16 Dec 2013 08:22:10 +0000 (09:22 +0100)]
cipher: Normalize the MPIs used as input to secret key functions.

* cipher/dsa.c (sign): Normalize INPUT.
* cipher/elgamal.c (decrypt): Normalize A and B.
* cipher/rsa.c (secret): Normalize the INPUT.
(rsa_decrypt): Reduce DATA before passing to secret.
--

mpi_normalize is in general not required because extra leading zeroes
do not harm the computation.  However, adding extra all-zero limbs or
padding with multiples of N may be useful in side-channel attacks.
This is an extra precaution in case RSA blinding has been disabled.
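
What normalization means for a limb array can be shown with a small sketch.
The toy_mpi_t type below is a hypothetical simplification (limbs stored
least significant first), not Libgcrypt's internal MPI structure.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long toy_limb_t;

typedef struct
{
  toy_limb_t *d;    /* limbs, least significant first */
  size_t nlimbs;    /* number of limbs in use */
} toy_mpi_t;

/* Drop leading (most significant) all-zero limbs so that equal values
   always have the same limb count. */
static void
toy_mpi_normalize (toy_mpi_t *a)
{
  assert (a);
  while (a->nlimbs && !a->d[a->nlimbs - 1])
    a->nlimbs--;
}
```

After normalization an input padded with zero limbs has the same length as
the unpadded value, so the padding can no longer influence how many limbs
the secret key operation processes.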

CVE-id: CVE-2013-4576
Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Change dummy variable in mpih-div.c to mpi_limb_t type
Jussi Kivilinna [Mon, 16 Dec 2013 10:15:37 +0000 (12:15 +0200)]
Change dummy variable in mpih-div.c to mpi_limb_t type

* mpi/mpih-div.c (_gcry_mpih_mod_1, _gcry_mpih_divmod_1): Change dummy
variable to 'mpi_limb_t' type from 'int'.
--

Patch attempts to fix problem reported by Matthias Wachs:

 while updating our buildbots I got another compile error:

 On a OS X machine:

 Darwin luke.net.in.tum.de 11.3.0 Darwin Kernel Version 11.3.0: Thu Jan
 12 18:47:41 PST 2012; root:xnu-1699.24.23~1/RELEASE_X86_64 x86_64

 /bin/sh ../libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I.
 -I..  -I../src -I../src -I/opt/local/include -I/opt/local/include -g -O2
 -Wall -MT mpih-div.lo -MD -MP -MF .deps/mpih-div.Tpo -c -o mpih-div.lo
 mpih-div.c
 libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I.. -I../src -I../src
 -I/opt/local/include -I/opt/local/include -g -O2 -Wall -MT mpih-div.lo
 -MD -MP -MF .deps/mpih-div.Tpo -c mpih-div.c  -fno-common -DPIC -o
 .libs/mpih-div.o
 mpih-div.c: In function '_gcry_mpih_mod_1':
 mpih-div.c:183: error: unsupported inline asm: input constraint with a
 matching output constraint of incompatible type!
 make[2]: *** [mpih-div.lo] Error 1
 make[1]: *** [all-recursive] Error 1
 make: *** [all] Error 2

The new x86-64 inline assembly for MPI expects outputs to be limb-sized
variables (64-bit), but mpi/mpih-div.c was using a 32-bit dummy variable.
Apparently this mismatch between assembly output and variable sizes does not
fail on every platform.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Remove duplicate gcry_mac_hd_t typedef
Jussi Kivilinna [Mon, 16 Dec 2013 09:54:37 +0000 (11:54 +0200)]
Remove duplicate gcry_mac_hd_t typedef

* cipher/mac-internal.h (gcry_mac_hd_t): Remove.
--

Attempt to fix problem reported by Matthias Wachs:

 On a freebsd 9.1 amd64 and a debian Lenny x86 system:

 In file included from mac.c:27:
 mac-internal.h:22: error: redefinition of typedef 'gcry_mac_hd_t'
 ../src/gcrypt.h:1301: error: previous declaration of 'gcry_mac_hd_t' was
 here
 *** [mac.lo] Error code 1

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Use u64 for CCM data lengths
Jussi Kivilinna [Sun, 15 Dec 2013 18:07:54 +0000 (20:07 +0200)]
Use u64 for CCM data lengths

* cipher/cipher-ccm.c: Move code inside [HAVE_U64_TYPEDEF].
[HAVE_U64_TYPEDEF] (_gcry_cipher_ccm_set_lengths): Use 'u64' for
data lengths.
[!HAVE_U64_TYPEDEF] (_gcry_cipher_ccm_encrypt)
(_gcry_cipher_ccm_decrypt, _gcry_cipher_ccm_set_nonce)
(_gcry_cipher_ccm_authenticate, _gcry_cipher_ccm_get_tag)
(_gcry_cipher_ccm_check_tag): Dummy functions returning
GPG_ERR_NOT_SUPPORTED.
* cipher/cipher-internal.h (gcry_cipher_handle.u_mode.ccm)
(_gcry_cipher_ccm_set_lengths): Move inside [HAVE_U64_TYPEDEF] and use
u64 instead of size_t for CCM data lengths.
* cipher/cipher.c (_gcry_cipher_open_internal, cipher_reset)
(_gcry_cipher_ctl) [!HAVE_U64_TYPEDEF]: Return GPG_ERR_NOT_SUPPORTED
for CCM.
(_gcry_cipher_ctl) [HAVE_U64_TYPEDEF]: Use u64 for
GCRYCTL_SET_CCM_LENGTHS length parameters.
* tests/basic.c: Do not use CCM if !HAVE_U64_TYPEDEF.
* tests/bench-slope.c: Ditto.
* tests/benchmark.c: Ditto.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  tests: Prevent rare failure of gcry_pk_decrypt test.
Werner Koch [Sat, 14 Dec 2013 20:40:36 +0000 (21:40 +0100)]
tests: Prevent rare failure of gcry_pk_decrypt test.

* tests/basic.c (check_pubkey_crypt): Add special mode 1.
(main): Add option --loop.

--

This failure has been reported by Jussi Kivilinna.  The new loop
option was needed to track that down.  It took me up to 100 iterations
to trigger the bug.  With the fix applied I am currently at 1000
iterations with no problems.  The command line to trigger the problem was:

  ./basic --pubkey --verbose --loop -1 --die

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Minor fixes to SHA assembly implementations
Jussi Kivilinna [Sat, 14 Dec 2013 09:23:03 +0000 (11:23 +0200)]
Minor fixes to SHA assembly implementations

* cipher/Makefile.am: Correct 'sha256-avx*.S' to 'sha512-avx*.S'.
* cipher/sha1-ssse3-amd64.S: First line, correct filename.
* cipher/sha256-ssse3-amd64.S: Return correct stack burn depth.
* cipher/sha512-avx-amd64.S: Use 'vzeroall' to clear registers.
* cipher/sha512-avx2-bmi2-amd64.S: Ditto and return correct stack burn
depth.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  SHA-1/SSSE3: Do not check for Intel syntax assembly support
Jussi Kivilinna [Fri, 13 Dec 2013 23:11:32 +0000 (01:11 +0200)]
SHA-1/SSSE3: Do not check for Intel syntax assembly support

* cipher/sha1-ssse3-amd64.S: Remove check for
HAVE_INTEL_SYNTAX_PLATFORM_AS.
* cipher/sha1.c [USE_SSSE3]: Ditto.
--

SHA-1 SSSE3 implementation uses AT&T syntax so check for
HAVE_INTEL_SYNTAX_PLATFORM_AS is unnecessary.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Convert SHA-1 SSSE3 implementation from mixed asm&C to pure asm
Jussi Kivilinna [Fri, 13 Dec 2013 19:07:41 +0000 (21:07 +0200)]
Convert SHA-1 SSSE3 implementation from mixed asm&C to pure asm

* cipher/Makefile.am: Change 'sha1-ssse3-amd64.c' to
'sha1-ssse3-amd64.S'.
* cipher/sha1-ssse3-amd64.c: Remove.
* cipher/sha1-ssse3-amd64.S: New.
--

Mixed C&asm implementation appears to trigger GCC bugs easily. Therefore
convert SSSE3 implementation to pure assembly for safety.

Benchmarks also show a small speed improvement.

cpu             C&asm     asm
Intel i5-4570   5.22 c/B  5.09 c/B
Intel i5-2450M  7.24 c/B  7.00 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  SHA-1: Add SSSE3 implementation
Jussi Kivilinna [Fri, 13 Dec 2013 10:47:56 +0000 (12:47 +0200)]
SHA-1: Add SSSE3 implementation

* cipher/Makefile.am: Add 'sha1-ssse3-amd64.c'.
* cipher/sha1-ssse3-amd64.c: New.
* cipher/sha1.c (USE_SSSE3): New.
(SHA1_CONTEXT) [USE_SSSE3]: Add 'use_ssse3'.
(sha1_init) [USE_SSSE3]: Initialize 'use_ssse3'.
(transform): Rename to...
(_transform): this.
(transform): New.
* configure.ac [host=x86_64]: Add 'sha1-ssse3-amd64.lo'.
--

Patch adds SSSE3 implementation based on white paper "Improving the Performance
of the Secure Hash Algorithm (SHA-1)" at
 http://software.intel.com/en-us/articles/improving-the-performance-of-the-secure-hash-algorithm-1

Benchmarks:

cpu                Old        New        Diff
Intel i5-4570      9.02 c/B   5.22 c/B   1.72x
Intel i5-2450M     12.27 c/B  7.24 c/B   1.69x
Intel Core2 T8100  7.94 c/B   6.76 c/B   1.17x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Add missing register clearing to SHA-256 and SHA-512 assembly
Jussi Kivilinna [Fri, 13 Dec 2013 14:14:05 +0000 (16:14 +0200)]
Add missing register clearing to SHA-256 and SHA-512 assembly

* cipher/sha256-ssse3-amd64.S: Clear used XMM/YMM registers at return.
* cipher/sha512-avx-amd64.S: Ditto.
* cipher/sha512-avx2-bmi2-amd64.S: Ditto.
* cipher/sha512-ssse3-amd64.S: Ditto.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Update license information
Werner Koch [Fri, 13 Dec 2013 13:52:21 +0000 (14:52 +0100)]
Update license information

* LICENSES: New.
* Makefile.am (EXTRA_DIST): Add LICENSES.
* AUTHORS: Add list of copyright holders.
* README: Reference AUTHORS.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  doc: Minor manual fix.
Werner Koch [Fri, 13 Dec 2013 09:53:26 +0000 (10:53 +0100)]
doc: Minor manual fix.

--

5 years ago  Fix empty clobber in AVX2 assembly check
Jussi Kivilinna [Thu, 12 Dec 2013 22:00:08 +0000 (00:00 +0200)]
Fix empty clobber in AVX2 assembly check

* configure.ac (gcry_cv_gcc_inline_asm_avx2): Add "cc" as assembly
clobber.
--

Apparently empty clobbers only work in some cases on Linux, and fail on
mingw32.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Fix W32 build
Jussi Kivilinna [Thu, 12 Dec 2013 21:53:28 +0000 (23:53 +0200)]
Fix W32 build

* random/rndw32.c (registry_poll, slow_gatherer): Change gcry_xmalloc to
xmalloc, and gcry_xrealloc to xrealloc.
--

Patch fixes following errors:

../random/.libs/librandom.a(rndw32.o): In function `registry_poll':
.../libgcrypt/random/rndw32.c:434: undefined reference to `__gcry_USE_THE_UNDERSCORED_FUNCTION'
.../libgcrypt/random/rndw32.c:454: undefined reference to `__gcry_USE_THE_UNDERSCORED_FUNCTION'
../random/.libs/librandom.a(rndw32.o): In function `slow_gatherer':
.../random/rndw32.c:658: undefined reference to `__gcry_USE_THE_UNDERSCORED_FUNCTION'

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  SHA-512: Add AVX and AVX2 implementations for x86-64
Jussi Kivilinna [Thu, 12 Dec 2013 11:56:13 +0000 (13:56 +0200)]
SHA-512: Add AVX and AVX2 implementations for x86-64

* cipher/Makefile.am: Add 'sha512-avx-amd64.S' and
'sha512-avx2-bmi2-amd64.S'.
* cipher/sha512-avx-amd64.S: New.
* cipher/sha512-avx2-bmi2-amd64.S: New.
* cipher/sha512.c (USE_AVX, USE_AVX2): New.
(SHA512_CONTEXT) [USE_AVX]: Add 'use_avx'.
(SHA512_CONTEXT) [USE_AVX2]: Add 'use_avx2'.
(sha512_init, sha384_init) [USE_AVX]: Initialize 'use_avx'.
(sha512_init, sha384_init) [USE_AVX2]: Initialize 'use_avx2'.
[USE_AVX] (_gcry_sha512_transform_amd64_avx): New.
[USE_AVX2] (_gcry_sha512_transform_amd64_avx2): New.
(transform) [USE_AVX2]: Add call for AVX2 implementation.
(transform) [USE_AVX]: Add call for AVX implementation.
* configure.ac (HAVE_GCC_INLINE_ASM_BMI2): New check.
(sha512): Add 'sha512-avx-amd64.lo' and 'sha512-avx2-bmi2-amd64.lo'.
* doc/gcrypt.texi: Document 'intel-cpu' and 'intel-bmi2'.
* src/g10lib.h (HWF_INTEL_CPU, HWF_INTEL_BMI2): New.
* src/hwfeatures.c (hwflist): Add "intel-cpu" and "intel-bmi2".
* src/hwf-x86.c (detect_x86_gnuc): Check for HWF_INTEL_CPU and
HWF_INTEL_BMI2.
--

Patch adds fast AVX and AVX2 implementations of SHA-512 by Intel Corporation.
The assembly source is licensed under 3-clause BSD license, thus compatible
with LGPL2.1+. Original source can be accessed at:
 http://www.intel.com/p/en_US/embedded/hwsw/technology/packet-processing#docs

Implementation is described in white paper
 "Fast SHA512 Implementations on Intel® Architecture Processors"
 http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/fast-sha512-implementat$

Note: AVX implementation uses SHLD instruction to emulate RORQ, since it's
      faster on Intel Sandy-Bridge. However, on non-Intel CPUs SHLD is much
      slower than RORQ, so the AVX implementation is (for now) limited
      to Intel CPUs.
Note: AVX2 implementation also uses BMI2 instruction rorx, thus additional
      HWF flag.
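
Why SHLD can stand in for RORQ is easiest to see in a portable C model. The
sketch below is only an illustration of the arithmetic, not the assembly
from the patch; the toy_* helper names are hypothetical. SHLD shifts its
destination left and fills the vacated low bits from the top of the source,
so using the same register for both operands yields a rotate left, and a
rotate right by n equals a rotate left by 64 - n.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit SHLD: shift DST left by N bits and fill the low bits
   from the most significant bits of SRC.  Valid for 0 < N < 64. */
static uint64_t
toy_shld64 (uint64_t dst, uint64_t src, unsigned int n)
{
  assert (n > 0 && n < 64);
  return (dst << n) | (src >> (64 - n));
}

/* Plain rotate right, the operation RORQ performs. */
static uint64_t
toy_ror64 (uint64_t x, unsigned int n)
{
  assert (n > 0 && n < 64);
  return (x >> n) | (x << (64 - n));
}

/* Rotate right by N expressed as SHLD with both operands equal. */
static uint64_t
toy_ror64_via_shld (uint64_t x, unsigned int n)
{
  return toy_shld64 (x, x, 64 - n);
}
```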

Benchmarks:

cpu                 Old         SSSE3       AVX/AVX2   AVX/AVX2  AVX/AVX2
                                                       vs Old    vs SSSE3
Intel i5-4570       10.11 c/B    7.56 c/B   6.72 c/B   1.50x     1.12x
Intel i5-2450M      14.11 c/B   10.53 c/B   8.88 c/B   1.58x     1.18x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  SHA-512: Add SSSE3 implementation for x86-64
Jussi Kivilinna [Thu, 12 Dec 2013 10:43:08 +0000 (12:43 +0200)]
SHA-512: Add SSSE3 implementation for x86-64

* cipher/Makefile.am: Add 'sha512-ssse3-amd64.S'.
* cipher/sha512-ssse3-amd64.S: New.
* cipher/sha512.c (USE_SSSE3): New.
(SHA512_CONTEXT) [USE_SSSE3]: Add 'use_ssse3'.
(sha512_init, sha384_init) [USE_SSSE3]: Initialize 'use_ssse3'.
[USE_SSSE3] (_gcry_sha512_transform_amd64_ssse3): New.
(transform) [USE_SSSE3]: Call SSSE3 implementation.
* configure.ac (sha512): Add 'sha512-ssse3-amd64.lo'.
--

Patch adds fast SSSE3 implementation of SHA-512 by Intel Corporation. The
assembly source is licensed under 3-clause BSD license, thus compatible
with LGPL2.1+. Original source can be accessed at:
 http://www.intel.com/p/en_US/embedded/hwsw/technology/packet-processing#docs

Implementation is described in white paper
 "Fast SHA512 Implementations on Intel® Architecture Processors"
 http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/fast-sha512-implementations-ia-processors-paper.html

Benchmarks:

cpu                 Old         New         Diff
Intel i5-4570       10.11 c/B    7.56 c/B   1.33x
Intel i5-2450M      14.11 c/B   10.53 c/B   1.33x
Intel Core2 T8100   11.92 c/B   10.22 c/B   1.16x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  SHA-256: Add SSSE3 implementation for x86-64
Jussi Kivilinna [Wed, 11 Dec 2013 17:32:08 +0000 (19:32 +0200)]
SHA-256: Add SSSE3 implementation for x86-64

* cipher/Makefile.am: Add 'sha256-ssse3-amd64.S'.
* cipher/sha256-ssse3-amd64.S: New.
* cipher/sha256.c (USE_SSSE3): New.
(SHA256_CONTEXT) [USE_SSSE3]: Add 'use_ssse3'.
(sha256_init, sha224_init) [USE_SSSE3]: Initialize 'use_ssse3'.
(transform): Rename to...
(_transform): This.
[USE_SSSE3] (_gcry_sha256_transform_amd64_ssse3): New.
(transform): New.
* configure.ac (HAVE_INTEL_SYNTAX_PLATFORM_AS): New check.
(sha256): Add 'sha256-ssse3-amd64.lo'.
* doc/gcrypt.texi: Document 'intel-ssse3'.
* src/g10lib.h (HWF_INTEL_SSSE3): New.
* src/hwfeatures.c (hwflist): Add "intel-ssse3".
* src/hwf-x86.c (detect_x86_gnuc): Test for SSSE3.
--

Patch adds fast SSSE3 implementation of SHA-256 by Intel Corporation. The
assembly source is licensed under 3-clause BSD license, thus compatible
with LGPL2.1+. Original source can be accessed at:
 http://www.intel.com/p/en_US/embedded/hwsw/technology/packet-processing#docs

Implementation is described in white paper
 "Fast SHA-256 Implementations on Intel® Architecture Processors"
 http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/sha-256-implementations-paper.html

Benchmarks:

cpu                 Old         New         Diff
Intel i5-4570       13.99 c/B   10.66 c/B   1.31x
Intel i5-2450M      21.53 c/B   15.79 c/B   1.36x
Intel Core2 T8100   20.84 c/B   15.07 c/B   1.38x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  Add a configuration file to disable hardware features.
Werner Koch [Thu, 12 Dec 2013 19:26:56 +0000 (20:26 +0100)]
Add a configuration file to disable hardware features.

* src/hwfeatures.c: Include syslog.h and ctype.h.
(HWF_DENY_FILE): New.
(my_isascii): New.
(parse_hwf_deny_file): New.
(_gcry_detect_hw_features): Call it.

* src/mpicalc.c (main): Correctly initialize Libgcrypt.  Add options
"--print-config" and "--disable-hwf".

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Move list of hardware features to hwfeatures.c.
Werner Koch [Thu, 12 Dec 2013 17:53:39 +0000 (18:53 +0100)]
Move list of hardware features to hwfeatures.c.

* src/global.c (hwflist, disabled_hw_features): Move to ..
* src/hwfeatures.c: here.
(_gcry_disable_hw_feature): New.
(_gcry_enum_hw_features): New.
(_gcry_detect_hw_features): Remove arg DISABLED_FEATURES.
* src/global.c (print_config, _gcry_vcontrol, global_init): Adjust
accordingly.
--

It is better to keep the hardware feature info in one place.
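
Keeping the feature table, the disable-by-name function and the enumerator
in one file might look like the sketch below. It is an assumption-laden
illustration: the toy_* names, the flag values, the return conventions and
the signatures are hypothetical, and only the feature names are taken from
this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One table maps each feature bit to its user-visible name. */
static const struct { unsigned int flag; const char *desc; } toy_hwflist[] =
  {
    { 1u << 0, "intel-ssse3" },
    { 1u << 1, "intel-cpu" },
    { 1u << 2, "intel-bmi2" }
  };
#define TOY_N_HWF (sizeof toy_hwflist / sizeof toy_hwflist[0])

static unsigned int toy_disabled_hw_features;

/* Disable the feature NAME.  Returns 0 on success, -1 if unknown. */
static int
toy_disable_hw_feature (const char *name)
{
  size_t i;

  assert (name);
  for (i = 0; i < TOY_N_HWF; i++)
    if (!strcmp (toy_hwflist[i].desc, name))
      {
        toy_disabled_hw_features |= toy_hwflist[i].flag;
        return 0;
      }
  return -1;
}

/* Return the name of feature IDX, or NULL past the end of the table. */
static const char *
toy_enum_hw_features (size_t idx, unsigned int *r_flag)
{
  if (idx >= TOY_N_HWF)
    return NULL;
  if (r_flag)
    *r_flag = toy_hwflist[idx].flag;
  return toy_hwflist[idx].desc;
}

/* Detection no longer needs a DISABLED_FEATURES argument because the
   disabled set lives next to the table. */
static unsigned int
toy_detect_hw_features (unsigned int detected)
{
  return detected & ~toy_disabled_hw_features;
}
```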

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Remove macro hacks for internal vs. external functions.  Part 2 and last.
Werner Koch [Thu, 12 Dec 2013 14:13:09 +0000 (15:13 +0100)]
Remove macro hacks for internal vs. external functions.  Part 2 and last.

* src/visibility.h: Remove remaining define/undef hacks for symbol
visibility.  Add macros to detect the use of the public functions.
Change all affected functions by replacing them by the x-macros.
* src/g10lib.h: Add internal prototypes.
(xtrymalloc, xtrycalloc, xtrymalloc_secure, xtrycalloc_secure)
(xtryrealloc, xtrystrdup, xmalloc, xcalloc, xmalloc_secure)
(xcalloc_secure, xrealloc, xstrdup, xfree): New macros.

--

The use of xmalloc/xtrymalloc/xfree is a more common pattern than the
gcry_free etc. functions.  Those functions behave like those defined
by C and thus for better readability we use these macros and not
the underscore prefixed functions.
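
The macro pattern can be sketched as follows. The _toy_* functions are
hypothetical stand-ins for the underscore prefixed internal allocators, and
the last macro only illustrates how a public name can be made to fail at
link time when used inside the library; it is an assumption that the real
header works this way.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the underscore prefixed internal functions. */
static void *
_toy_xmalloc (size_t n)
{
  void *p = malloc (n ? n : 1);

  if (!p)
    abort ();   /* the "x" variants never return NULL */
  return p;
}

static char *
_toy_xstrdup (const char *s)
{
  char *p;

  assert (s);
  p = _toy_xmalloc (strlen (s) + 1);
  strcpy (p, s);
  return p;
}

static void
_toy_free (void *p)
{
  free (p);
}

/* Short, C-like convenience names for internal code. */
#define xmalloc(a)  _toy_xmalloc ((a))
#define xstrdup(a)  _toy_xstrdup ((a))
#define xfree(a)    _toy_free ((a))

/* Redirect the public name to an undefined symbol so that accidental
   internal use shows up as a link error. */
#define toy_public_xmalloc  _toy_USE_THE_UNDERSCORED_FUNCTION
```

Internal code then reads like ordinary C, for example
"p = xstrdup (name); ... xfree (p);".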

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  random: Add a feature to close device file descriptors.
Werner Koch [Wed, 11 Dec 2013 15:59:41 +0000 (16:59 +0100)]
random: Add a feature to close device file descriptors.

* src/gcrypt.h.in (GCRYCTL_CLOSE_RANDOM_DEVICE): New.
* src/global.c (_gcry_vcontrol): Call _gcry_random_close_fds.
* random/random.c (_gcry_random_close_fds): New.
* random/random-csprng.c (_gcry_rngcsprng_close_fds): New.
* random/random-fips.c (_gcry_rngfips_close_fds): New.
* random/random-system.c (_gcry_rngsystem_close_fds): New.
* random/rndlinux.c (open_device): Add arg retry.
(_gcry_rndlinux_gather_random): Add mode to close open fds.

* tests/random.c (check_close_random_device): New.
(main): Call new test.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Fix last commit (9a37470c)
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
Fix last commit (9a37470c)

* src/secmem.c (lock_pool): Remove remaining line.  Reported by Ian
Goldberg.

5 years ago  Fix one-off memory leak when built with Linux capability support.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
Fix one-off memory leak when built with Linux capability support.

* src/secmem.c (lock_pool, secmem_init): Use cap_free.  Reported by
Mike Crowe <mac@mcrowe.com>.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Update libtool to support Android.
David 'Digit' Turner [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
Update libtool to support Android.

* m4/libtool.m4: Add "linux*android*" case.  Taken from the libtool
repository.
--

The patch, which cleanly applies, is

  commit 8eeeb00daef8c4f720c9b79a0cdb89225d9909b6
  Author: David 'Digit' Turner <digit@google.com>
  Date:   Tue Oct 8 14:37:32 2013 -0700

  This patch adds proper Android support to libtool. The main
  issues are the following:

      - Versioned libraries are not supported by the platform and
        its build/packaging tools.

      - The dynamic linker is not GNU ld, there is no support for
        DT_RUNPATH.

      - Similarly, there is no ldconfig.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  tests: Speed up benchmarks in regression test mode.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
tests: Speed up benchmarks in regression test mode.

* tests/tsexp.c (check_extract_param): Fix compiler warning.
* tests/Makefile.am (TESTS_ENVIRONMENT): Set GCRYPT_IN_REGRESSION_TEST.
* tests/bench-slope.c (main): Speed up if in regression test mode.
* tests/benchmark.c (main): Ditto.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  tests: Add --csv option to bench-slope.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
tests: Add --csv option to bench-slope.

* tests/bench-slope.c (STR, STR2): New.
(csv_mode): New.
(num_measurement_repetitions): New.  Replace use of
NUM_MEASUREMENT_REPETITIONS by this.
(current_section_name, current_algo_name, current_mode_name): New.
(bench_print_result_csv): New.
(bench_print_result_std): Rename from bench_print_result.
(bench_print_result): New. Divert depending on CSV_MODE.
(bench_print_header, bench_print_footer): Take care of CSV_MODE.
(bench_print_algo, bench_print_mode): New.  Use them instead of
explicit printfs.
(main): Add options --csv and --repetitions.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  sexp: Allow long names and white space in gcry_sexp_extract_param.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
sexp: Allow long names and white space in gcry_sexp_extract_param.

* src/sexp.c (_gcry_sexp_vextract_param): Skip white space.  Support
long parameter names.
* tests/tsexp.c (check_extract_param): Add test cases for long parameter
names and white space.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  ecc: Merge partly duplicated code.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
ecc: Merge partly duplicated code.

* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_sign): Factor A hashing out to ...
(_gcry_ecc_eddsa_compute_h_d): new function.
* cipher/ecc-misc.c (_gcry_ecc_compute_public): Use new function.
(reverse_buffer): Remove.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  ecc: Remove unused internal function.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
ecc: Remove unused internal function.

* src/cipher-proto.h (gcry_pk_spec): Remove get_param.
* cipher/ecc-curves.c (_gcry_ecc_get_param_sexp): Merge in code from
_gcry_ecc_get_param.
(_gcry_ecc_get_param): Remove.
* cipher/ecc.c (_gcry_pubkey_spec_ecc): Remove _gcry_ecc_get_param.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Fix building on mingw32
Jussi Kivilinna [Fri, 6 Dec 2013 00:02:06 +0000 (02:02 +0200)]
Fix building on mingw32

* src/gcrypt-int.h: Include <types.h>.
--

'ulong' is not defined on W32, so we need to include "types.h" in
'gcrypt-int.h'.

 In file included from ../src/visibility.h:53:0,
                  from ../src/g10lib.h:39,
                  from compat.c:22:
 ../src/gcrypt-int.h:365:49: error: unknown type name 'ulong'

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years ago  ecc: Change OID for Ed25519.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
ecc: Change OID for Ed25519.

* cipher/ecc-curves.c (curve_aliases): Add more suitable OID for
Ed25519.
--

The formerly used OID has been assigned by Peter Gutmann for
Curve25519.  We better keep them distinct and assign a separate one
for Ed25519.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  Remove macro hacks for internal vs. external functions.  Part 1.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
Remove macro hacks for internal vs. external functions.  Part 1.

* src/visibility.h: Remove almost all define/undef hacks for symbol
visibility.  Add macros to detect the use of the public functions.
Change all affected functions by prefixing them explicitly with an
underscore and change all internal callers to call the underscore
prefixed versions.  Provide convenience macros from sexp and mpi
functions.
* src/visibility.c: Change all functions to use only gpg_err_code_t
and translate to gpg_error_t only in visibility.c.
--

The use of the macro magic made it hard to follow the function calls
in the source.  It was not easy to see if an internal or external
function (as defined by visibility.c) was called.  The change is quite
large but hopefully makes Libgcrypt easier to maintain.  Some
functions have not yet been fixed; this will be done soon.

Because Libgcrypt does not make use of any other libraries that use
libgpg-error, it is useless to always translate between gpg_error_t and
gpg_err_code_t (i.e. with and w/o error source identifier).  This
translation has now mostly been moved to the function wrappers in
visibility.c.  An additional advantage of using gpg_err_code_t is that
comparison can be done without using gpg_err_code().
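
The wrapper pattern can be sketched as below. All toy_* types, constants
and functions are hypothetical stand-ins chosen for illustration; the real
gpg_error_t layout and error values are defined by libgpg-error.

```c
#include <assert.h>

typedef unsigned int toy_error_t;      /* error code plus source identifier */
typedef unsigned int toy_err_code_t;   /* bare error code */

#define TOY_ERR_SOURCE_GCRYPT  1u
#define TOY_ERR_SOURCE_SHIFT   24
#define TOY_ERR_CODE_MASK      0xffffu
#define TOY_ERR_INV_ARG        45u     /* placeholder value */

/* Internal function: works with bare error codes only. */
static toy_err_code_t
_toy_do_work (int arg)
{
  return arg < 0 ? TOY_ERR_INV_ARG : 0;
}

/* Attach the source identifier to a non-zero error code. */
static toy_error_t
toy_make_error (toy_err_code_t ec)
{
  assert (!(ec & ~TOY_ERR_CODE_MASK));
  return ec ? ((TOY_ERR_SOURCE_GCRYPT << TOY_ERR_SOURCE_SHIFT) | ec) : 0;
}

/* Strip the source identifier again. */
static toy_err_code_t
toy_err_code (toy_error_t err)
{
  return err & TOY_ERR_CODE_MASK;
}

/* Public wrapper: the only place where the translation happens. */
static toy_error_t
toy_do_work (int arg)
{
  return toy_make_error (_toy_do_work (arg));
}
```

Internal callers can compare the result of _toy_do_work directly with an
error code, while external callers of toy_do_work must strip the source
identifier first.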

I am sorry for that large patch, but a series of patches would
actually be more work to audit.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years ago  mpi: add inline assembly for x86-64
Jussi Kivilinna [Wed, 4 Dec 2013 16:17:22 +0000 (18:17 +0200)]
mpi: add inline assembly for x86-64

* mpi/longlong.h [__x86_64] (add_ssaaaa, sub_ddmmss, umul_ppmm)
(udiv_qrnnd, count_leading_zeros, count_trailing_zeros): New.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agompi: fix gcry_mpi_powm for negative base.
NIIBE Yutaka [Wed, 4 Dec 2013 01:03:57 +0000 (10:03 +0900)]
mpi: fix gcry_mpi_powm for negative base.

* mpi/mpi-pow.c (gcry_mpi_powm) [USE_ALGORITHM_SIMPLE_EXPONENTIATION]:
Fix for the case where BASE is negative.
* tests/mpitests.c (test_powm): Add a test case of (-17)^6 mod 19.
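
For reference, the arithmetic behind the new test case can be checked
with plain 64-bit integers: -17 is congruent to 2 modulo 19, and
2^6 = 64 = 3*19 + 7, so (-17)^6 mod 19 is 7.  The sketch below is not
the code from mpi-pow.c; it is a minimal square-and-multiply routine
(the name powm_signed is made up here) that normalizes a negative base
before exponentiation.

```c
#include <assert.h>
#include <stdint.h>

/* Compute base^exp mod m for a possibly negative base.  The result is
   always in the range [0, m-1].  Requires 0 < m < 2^31 so that the
   intermediate products fit into 64 bits.  */
static int64_t
powm_signed (int64_t base, uint64_t exp, int64_t m)
{
  int64_t result;
  int64_t b;

  assert (m > 0 && m < ((int64_t) 1 << 31));
  result = 1 % m;

  /* The % operator of C may yield a negative remainder; bring the
     base into the range [0, m-1] first.  */
  b = base % m;
  if (b < 0)
    b += m;

  /* Plain right-to-left square-and-multiply.  */
  while (exp)
    {
      if (exp & 1)
        result = (result * b) % m;
      b = (b * b) % m;
      exp >>= 1;
    }
  return result;
}
```

Calling powm_signed (-17, 6, 19) yields 7.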

Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
5 years agoAdd build support for ppc64le.
Werner Koch [Tue, 22 Oct 2013 12:26:53 +0000 (14:26 +0200)]
Add build support for ppc64le.

* config.guess, config.sub: Update to latest version (2013-11-29).
* m4/libtool.m4: Add patches for ppc64le.
--

We don't want to update libtool, thus we use patches supplied by IBM.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agorijndael: fix compiler warning on aarch64
Jussi Kivilinna [Tue, 3 Dec 2013 12:03:09 +0000 (14:03 +0200)]
rijndael: fix compiler warning on aarch64

* cipher/rijndael.c (do_setkey): Use braces for empty if statement
instead of semicolon.
--

Patch fixes following warning:

 rijndael.c: In function 'do_setkey':
 rijndael.c:507:9: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]
          ;
          ^
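
A minimal sketch of the pattern follows.  It assumes, without quoting
rijndael.c, that the empty 'if' serves as an anchor for optional
'else if' branches which the preprocessor may remove; HAVE_FAST_PATH
and select_impl are hypothetical names.

```c
#include <assert.h>

/* Return 1 if the optional fast implementation is selected, 0 for the
   generic one.  HAVE_FAST_PATH is a hypothetical configuration macro.  */
static int
select_impl (int fast_ok)
{
  int impl = 0;

  assert (fast_ok == 0 || fast_ok == 1);  /* treated as a boolean */

  if (0)
    {
      /* Empty on purpose.  Writing a lone ';' here instead of the
         braces is what -Wempty-body complains about once no branch
         follows.  */
    }
#ifdef HAVE_FAST_PATH
  else if (fast_ok)
    impl = 1;
#endif

  return impl;
}
```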

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAdd aarch64 (arm64) mpi assembly
Jussi Kivilinna [Tue, 3 Dec 2013 11:57:02 +0000 (13:57 +0200)]
Add aarch64 (arm64) mpi assembly

* mpi/aarch64/mpi-asm-defs.h: New.
* mpi/aarch64/mpih-add1.S: New.
* mpi/aarch64/mpih-mul1.S: New.
* mpi/aarch64/mpih-mul2.S: New.
* mpi/aarch64/mpih-mul3.S: New.
* mpi/aarch64/mpih-sub1.S: New.
* mpi/config.links [host=aarch64-*-*]: Add configuration for aarch64
assembly.
* mpi/longlong.h [__aarch64__] (add_ssaaaa, sub_ddmmss, umul_ppmm)
(count_leading_zeros): New.
--

Add preliminary aarch64 assembly implementations for mpi.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoecc: Use constant time point operation for Twisted Edwards.
Werner Koch [Mon, 2 Dec 2013 16:09:04 +0000 (17:09 +0100)]
ecc: Use constant time point operation for Twisted Edwards.

* mpi/ec.c (_gcry_mpi_ec_mul_point): Try to do a constant time
operation if needed.
* tests/benchmark.c (main): Add option --use-secmem.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoecc: Make gcry_pk_testkey work for Ed25519.
Werner Koch [Mon, 2 Dec 2013 15:18:25 +0000 (16:18 +0100)]
ecc: Make gcry_pk_testkey work for Ed25519.

* cipher/ecc-misc.c (_gcry_ecc_compute_public): Add optional args G
and d.  Change all callers.
* cipher/ecc.c (gen_y_2): Remove.
(check_secret_key): Use generic public key compute function.  Adjust
for use with Ed25519 and EdDSA.
(nist_generate_key): Do not use the compliant key thingy for Ed25519.
(ecc_check_secret_key): Make parameter parsing similar to the other
functions.
* cipher/ecc-curves.c (domain_parms): Zero prefix some parameters so
that _gcry_ecc_update_curve_param works correctly.
* tests/keygen.c (check_ecc_keys): Add "param" flag.  Check all
Ed25519 keys.

5 years agoecc: Fix eddsa point decompression.
Werner Koch [Mon, 2 Dec 2013 15:06:40 +0000 (16:06 +0100)]
ecc: Fix eddsa point decompression.

* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_recover_x): Fix the negative
case.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoecc: Fix gcry_mpi_ec_curve_point for Weierstrass.
Werner Koch [Fri, 29 Nov 2013 16:14:33 +0000 (17:14 +0100)]
ecc: Fix gcry_mpi_ec_curve_point for Weierstrass.

* mpi/ec.c (_gcry_mpi_ec_curve_point): Use correct equation.
(ec_pow3): New.
(ec_p_init): Always copy B.
--

The code path was obviously never tested.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agompi: Introduce 4 user flags for gcry_mpi_t.
Werner Koch [Thu, 28 Nov 2013 08:07:15 +0000 (09:07 +0100)]
mpi: Introduce 4 user flags for gcry_mpi_t.

* src/gcrypt.h.in (GCRYMPI_FLAG_USER1, GCRYMPI_FLAG_USER2)
(GCRYMPI_FLAG_USER3, GCRYMPI_FLAG_USER4): New.
* mpi/mpiutil.c (gcry_mpi_set_flag, gcry_mpi_clear_flag)
(gcry_mpi_get_flag, _gcry_mpi_free): Implement them.
(gcry_mpi_set_opaque): Keep user flags.
--

The space for the flags in the MPI struct is free and thus we can help
applications to make use of some flags.  This is for example useful to
indicate that an MPI needs special processing before use.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoFix armv3 compile error
Vladimir 'φ-coder/phcoder' Serbinenko [Fri, 29 Nov 2013 07:56:43 +0000 (08:56 +0100)]
Fix armv3 compile error

* mpi/longlong.h [__arm__ && __ARM_ARCH < 4] (umul_ppmm): Use
__AND_CLOBBER_CC instead of __CLOBBER_CC.
--

ARMv3 code uses __CLOBBER_CC at the end of clobber list while it should have
been __AND_CLOBBER_CC.

[jk: add changelog, rebase on libgcrypt repository]
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agolonglong.h on mips with clang
Vladimir 'φ-coder/phcoder' Serbinenko [Fri, 22 Nov 2013 04:24:44 +0000 (05:24 +0100)]
longlong.h on mips with clang

* mpi/longlong.h [__mips__]: Use C-language version with clang.
--
clang doesn't recognise =l / =h assembly operand specifiers but apparently
handles C version well.

[jk: add changelog, rebase on libgcrypt repository, reformat changed line so it
 does not go over 80 characters]
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoCamellia: Tweaks for AES-NI implementations
Jussi Kivilinna [Sun, 24 Nov 2013 15:54:15 +0000 (17:54 +0200)]
Camellia: Tweaks for AES-NI implementations

* cipher/camellia-aesni-avx-amd64.S: Align stack to 16 bytes; tweak
key-setup for small speed up.
* cipher/camellia-aesni-avx2-amd64.S: Use vmovdqu even with aligned
stack; reorder vinsert128 instructions; use rbp for stack frame.
--

Use of 'vmovdqa' with ymm registers produces quite interesting scattering in
measurement timings. By using 'vmovdqu' instead, repeated measurements produce
more stable results.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAdd GMAC to MAC API
Jussi Kivilinna [Thu, 21 Nov 2013 19:34:21 +0000 (21:34 +0200)]
Add GMAC to MAC API

* cipher/Makefile.am: Add 'mac-gmac.c'.
* cipher/mac-gmac.c: New.
* cipher/mac-internal.h (gcry_mac_handle): Add 'u.gcm'.
(_gcry_mac_type_spec_gmac_aes, _gcry_mac_type_spec_gmac_twofish)
(_gcry_mac_type_spec_gmac_serpent, _gcry_mac_type_spec_gmac_seed)
(_gcry_mac_type_spec_gmac_camellia): New externs.
* cipher/mac.c (mac_list): Add GMAC specifications.
* doc/gcrypt.texi: Add mention of GMAC.
* src/gcrypt.h.in (gcry_mac_algos): Add GCM algorithms.
* tests/basic.c (check_one_mac): Add support for MAC IVs.
(check_mac): Add support for MAC IVs and add GMAC test vectors.
* tests/bench-slope.c (mac_bench): Iterate algorithm numbers to 499.
* tests/benchmark.c (mac_bench): Iterate algorithm numbers to 499.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: Move gcm_table initialization to setkey
Jussi Kivilinna [Wed, 20 Nov 2013 13:44:27 +0000 (15:44 +0200)]
GCM: Move gcm_table initialization to setkey

* cipher/cipher-gcm.c: Change all 'c->u_iv.iv' to
'c->u_mode.gcm.u_ghash_key.key'.
(_gcry_cipher_gcm_setkey): New.
(_gcry_cipher_gcm_initiv): Move ghash initialization to function above.
* cipher/cipher-internal.h (gcry_cipher_handle): Add
'u_mode.gcm.u_ghash_key'; Reorder 'u_mode.gcm' members for partial
clearing in gcry_cipher_reset.
(_gcry_cipher_gcm_setkey): New prototype.
* cipher/cipher.c (cipher_setkey): Add GCM setkey.
(cipher_reset): Clear 'u_mode' only partially for GCM.
--

GHASH tables can be generated at setkey time. No need to regenerate
for every new IV.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: Add support for split data buffers and online operation
Jussi Kivilinna [Wed, 20 Nov 2013 13:06:03 +0000 (15:06 +0200)]
GCM: Add support for split data buffers and online operation

* cipher/cipher-gcm.c (do_ghash_buf): Add buffering for less than
blocksize length input and padding handling.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt): Add handling
for AAD padding and check if data has already been padded.
(_gcry_cipher_gcm_authenticate): Check that AAD or data has not been
padded yet.
(_gcry_cipher_gcm_initiv): Clear padding marks.
(_gcry_cipher_gcm_tag): Add finalization and padding; Clear sensitive
data from cipher handle, since they are not used after generating tag.
* cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode.gcm.macbuf',
'u_mode.gcm.mac_unused', 'u_mode.gcm.ghash_data_finalized' and
'u_mode.gcm.ghash_aad_finalized'.
* tests/basic.c (check_gcm_cipher): Rename to...
(_check_gcm_cipher): ...this and add handling for different buffer step
lengths; Enable per byte buffer testing.
(check_gcm_cipher): Call _check_gcm_cipher with different buffer step
sizes.
--

Until now, GCM was expecting full data to be input in one go. This patch adds
support for feeding data continuously (for encryption/decryption/aad).
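
The buffering idea can be sketched independently of GCM: bytes that do
not fill a whole block are kept in the context until more input arrives
or the stream is finalized, at which point the remainder is zero-padded.
The sketch below is not the code from cipher-gcm.c.  The per-block
function is a toy stand-in for GHASH, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_LEN 16

struct stream_ctx
{
  unsigned char acc[BLOCK_LEN];  /* running state, updated per block */
  unsigned char buf[BLOCK_LEN];  /* partial block kept between calls */
  size_t unused;                 /* number of bytes pending in buf */
};

/* Toy stand-in for the real block function: rotate the state by one
   byte and XOR the block into it.  */
static void
process_block (struct stream_ctx *c, const unsigned char *blk)
{
  unsigned char first = c->acc[0];
  size_t i;

  for (i = 0; i + 1 < BLOCK_LEN; i++)
    c->acc[i] = c->acc[i + 1] ^ blk[i];
  c->acc[BLOCK_LEN - 1] = first ^ blk[BLOCK_LEN - 1];
}

static void
stream_write (struct stream_ctx *c, const unsigned char *in, size_t len)
{
  if (c->unused)
    {
      /* Top up the pending partial block first.  */
      size_t n = BLOCK_LEN - c->unused;

      if (n > len)
        n = len;
      memcpy (c->buf + c->unused, in, n);
      c->unused += n;
      in += n;
      len -= n;
      if (c->unused < BLOCK_LEN)
        return;
      process_block (c, c->buf);
      c->unused = 0;
    }
  for (; len >= BLOCK_LEN; in += BLOCK_LEN, len -= BLOCK_LEN)
    process_block (c, in);
  if (len)
    {
      /* Keep the tail for the next call.  */
      memcpy (c->buf, in, len);
      c->unused = len;
    }
}

/* Zero-pad and process a pending partial block.  */
static void
stream_final (struct stream_ctx *c)
{
  if (c->unused)
    {
      memset (c->buf + c->unused, 0, BLOCK_LEN - c->unused);
      process_block (c, c->buf);
      c->unused = 0;
    }
}

/* Return 1 if feeding DATA in pieces of STEP bytes gives the same
   state as feeding it in one go.  */
static int
split_matches_oneshot (const unsigned char *data, size_t len, size_t step)
{
  struct stream_ctx a, b;
  size_t off;

  assert (step > 0);
  memset (&a, 0, sizeof a);
  stream_write (&a, data, len);
  stream_final (&a);

  memset (&b, 0, sizeof b);
  for (off = 0; off < len; off += step)
    stream_write (&b, data + off, len - off < step ? len - off : step);
  stream_final (&b);

  return !memcmp (a.acc, b.acc, BLOCK_LEN);
}
```

This mirrors what the new per-byte buffer test is meant to exercise:
the result must not depend on how the input was split.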

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: Use size_t for buffer sizes
Jussi Kivilinna [Tue, 19 Nov 2013 21:26:27 +0000 (23:26 +0200)]
GCM: Use size_t for buffer sizes

* cipher/cipher-gcm.c (ghash, gcm_bytecounter_add, do_ghash_buf)
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt)
(_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_geniv)
(_gcry_cipher_gcm_tag): Use size_t for buffer lengths.
* cipher/cipher-internal.h (_gcry_cipher_gcm_encrypt)
(_gcry_cipher_gcm_decrypt, _gcry_cipher_gcm_authenticate): Use size_t
for buffer lengths.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: add FIPS mode restrictions
Jussi Kivilinna [Tue, 19 Nov 2013 21:26:27 +0000 (23:26 +0200)]
GCM: add FIPS mode restrictions

* cipher/cipher-gcm.c (_gcry_cipher_gcm_encrypt)
(_gcry_cipher_gcm_get_tag): Do not allow use in FIPS mode if setiv
was invoked directly.
(_gcry_cipher_gcm_setiv): Rename to...
(_gcry_cipher_gcm_initiv): ...this.
(_gcry_cipher_gcm_setiv): New setiv function with check for FIPS mode.
[TODO] (_gcry_cipher_gcm_getiv): New.
* cipher/cipher-internal.h (gcry_cipher_handle): Add
'u_mode.gcm.disallow_encryption_because_of_setiv_in_fips_mode'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: Add clearing and checking of marks.tag
Jussi Kivilinna [Tue, 19 Nov 2013 21:26:27 +0000 (23:26 +0200)]
GCM: Add clearing and checking of marks.tag

* cipher/cipher-gcm.c (_gcry_cipher_gcm_encrypt)
(_gcry_cipher_gcm_decrypt, _gcry_cipher_gcm_authenticate): Make sure
that tag has not been finalized yet.
(_gcry_cipher_gcm_setiv): Clear 'marks.tag'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: Add stack burning
Jussi Kivilinna [Tue, 19 Nov 2013 21:26:26 +0000 (23:26 +0200)]
GCM: Add stack burning

* cipher/cipher-gcm.c (do_ghash, ghash): Return stack burn depth.
(setupM): Wipe 'tmp' buffer.
(do_ghash_buf): Wipe 'tmp' buffer and add stack burning.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAdd aggregated bulk processing for GCM on x86-64
Jussi Kivilinna [Tue, 19 Nov 2013 21:26:26 +0000 (23:26 +0200)]
Add aggregated bulk processing for GCM on x86-64

* cipher/cipher-gcm.c [__x86_64__] (gfmul_pclmul_aggr4): New.
(ghash) [GCM_USE_INTEL_PCLMUL]: Add aggregated bulk processing
for __x86_64__.
(setupM) [__x86_64__]: Add initialization for aggregated bulk
processing.
--

Intel Haswell (x86-64):
Old:
AES     GCM enc |     0.990 ns/B     963.3 MiB/s      3.17 c/B
        GCM dec |     0.982 ns/B     970.9 MiB/s      3.14 c/B
       GCM auth |     0.711 ns/B    1340.8 MiB/s      2.28 c/B
New:
AES     GCM enc |     0.535 ns/B    1783.8 MiB/s      1.71 c/B
        GCM dec |     0.531 ns/B    1796.2 MiB/s      1.70 c/B
       GCM auth |     0.255 ns/B    3736.4 MiB/s     0.817 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: Tweak Intel PCLMUL ghash loop for small speed-up
Jussi Kivilinna [Tue, 19 Nov 2013 21:26:26 +0000 (23:26 +0200)]
GCM: Tweak Intel PCLMUL ghash loop for small speed-up

* cipher/cipher-gcm.c (do_ghash): Mark 'inline'.
[GCM_USE_INTEL_PCLMUL] (do_ghash_pclmul): Rename to...
[GCM_USE_INTEL_PCLMUL] (gfmul_pclmul): ...this and make inline function.
(ghash) [GCM_USE_INTEL_PCLMUL]: Preload data before ghash-pclmul loop.
--

Intel Haswell:
Old:
AES     GCM enc |      1.12 ns/B     853.5 MiB/s      3.58 c/B
        GCM dec |      1.12 ns/B     853.4 MiB/s      3.58 c/B
       GCM auth |     0.843 ns/B    1131.5 MiB/s      2.70 c/B
New:
AES     GCM enc |     0.990 ns/B     963.3 MiB/s      3.17 c/B
        GCM dec |     0.982 ns/B     970.9 MiB/s      3.14 c/B
       GCM auth |     0.711 ns/B    1340.8 MiB/s      2.28 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: Use counter mode code for speed-up
Jussi Kivilinna [Wed, 20 Nov 2013 13:01:51 +0000 (15:01 +0200)]
GCM: Use counter mode code for speed-up

* cipher/cipher-gcm.c (ghash): Add process for multiple blocks.
(gcm_bytecounter_add, gcm_add32_be128, gcm_check_datalen)
(gcm_check_aadlen_or_ivlen, do_ghash_buf): New functions.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt)
(_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_set_iv)
(_gcry_cipher_gcm_tag): Adjust to use above new functions and
counter mode functions for encryption/decryption.
* cipher/cipher-internal.h (gcry_cipher_handle): Remove 'length'; Add
'u_mode.gcm.(addlen|datalen|tagiv|datalen_over_limits)'.
(_gcry_cipher_gcm_setiv): Return gcry_err_code_t.
* cipher/cipher.c (cipher_setiv): Return error code.
(_gcry_cipher_setiv): Handle error code from 'cipher_setiv'.
--

Patch changes GCM to use counter mode code for bulk speed up and also adds data
length checks as given in NIST SP-800-38D section 5.2.1.1.

Bit length requirements from section 5.2.1.1:

 len(plaintext) <= 2^39-256 bits == 2^36-32 bytes == 2^32-2 blocks
 len(aad) <= 2^64-1 bits ~= 2^61-1 bytes
 len(iv) <= 2^64-1 bits ~= 2^61-1 bytes
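
The three forms of the plaintext limit are the same number:
(2^39-256)/8 = 2^36-32 and (2^36-32)/16 = 2^32-2.  A small sketch of
such a length check follows.  The names are hypothetical, and this is
not the gcm_check_datalen function from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define GCM_BLOCK_LEN 16

/* The plaintext limit expressed in bits, bytes and blocks.  */
static const uint64_t max_bits   = (UINT64_C (1) << 39) - 256;
static const uint64_t max_bytes  = (UINT64_C (1) << 36) - 32;
static const uint64_t max_blocks = (UINT64_C (1) << 32) - 2;

/* Confirm that the three expressions describe the same limit.  */
static void
check_limits (void)
{
  assert (max_bits % 8 == 0);
  assert (max_bits / 8 == max_bytes);
  assert (max_bytes % GCM_BLOCK_LEN == 0);
  assert (max_bytes / GCM_BLOCK_LEN == max_blocks);
}

/* Return 1 if a running total of TOTAL bytes plus ADD further bytes
   stays within the plaintext limit, else 0.  Written so that the
   addition cannot overflow.  */
static int
datalen_ok (uint64_t total, uint64_t add)
{
  if (total > max_bytes)
    return 0;
  return add <= max_bytes - total;
}
```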

Intel Haswell:
Old:
AES     GCM enc |      3.00 ns/B     317.4 MiB/s      9.61 c/B
        GCM dec |      1.96 ns/B     486.9 MiB/s      6.27 c/B
       GCM auth |     0.848 ns/B    1124.7 MiB/s      2.71 c/B
New:
AES     GCM enc |      1.12 ns/B     851.8 MiB/s      3.58 c/B
        GCM dec |      1.12 ns/B     853.7 MiB/s      3.57 c/B
       GCM auth |     0.843 ns/B    1131.4 MiB/s      2.70 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAdd Intel PCLMUL acceleration for GCM
Jussi Kivilinna [Tue, 19 Nov 2013 21:26:26 +0000 (23:26 +0200)]
Add Intel PCLMUL acceleration for GCM

* cipher/cipher-gcm.c (fillM): Rename...
(do_fillM): ...to this.
(ghash): Remove.
(fillM): New macro.
(GHASH): Use 'do_ghash' instead of 'ghash'.
[GCM_USE_INTEL_PCLMUL] (do_ghash_pclmul): New.
(ghash): New.
(setupM): New.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt)
(_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_setiv)
(_gcry_cipher_gcm_tag): Use 'ghash' instead of 'GHASH' and
'c->u_mode.gcm.u_tag.tag' instead of 'c->u_tag.tag'.
* cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): New.
(gcry_cipher_handle): Move 'u_tag' and 'gcm_table' under
'u_mode.gcm'.
* configure.ac (pclmulsupport, gcry_cv_gcc_inline_asm_pclmul): New.
* src/g10lib.h (HWF_INTEL_PCLMUL): New.
* src/global.c: Add "intel-pclmul".
* src/hwf-x86.c (detect_x86_gnuc): Add check for Intel PCLMUL.
--

Speed-up GCM for Intel CPUs.

Intel Haswell (x86-64):
Old:
AES     GCM enc |      5.17 ns/B     184.4 MiB/s     16.55 c/B
        GCM dec |      4.38 ns/B     218.0 MiB/s     14.00 c/B
       GCM auth |      3.17 ns/B     300.4 MiB/s     10.16 c/B
New:
AES     GCM enc |      3.01 ns/B     317.2 MiB/s      9.62 c/B
        GCM dec |      1.96 ns/B     486.9 MiB/s      6.27 c/B
       GCM auth |     0.848 ns/B    1124.8 MiB/s      2.71 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoGCM: GHASH optimizations
Jussi Kivilinna [Tue, 19 Nov 2013 21:26:26 +0000 (23:26 +0200)]
GCM: GHASH optimizations

* cipher/cipher-gcm.c [GCM_USE_TABLES] (gcmR, ghash): Replace with new.
[GCM_USE_TABLES] [GCM_TABLES_USE_U64] (bshift, fillM, do_ghash): New.
[GCM_USE_TABLES] [!GCM_TABLES_USE_U64] (bshift, fillM): Replace with
new.
[GCM_USE_TABLES] [!GCM_TABLES_USE_U64] (do_ghash): New.
(_gcry_cipher_gcm_tag): Remove extra memcpy to outbuf and use
buf_eq_const for comparing authentication tag.
* cipher/cipher-internal.h (gcry_cipher_handle): Different 'gcm_table'
for 32-bit and 64-bit platforms.
--

Patch improves GHASH speed.

Intel Haswell (x86-64):
Old:
       GCM auth |     26.22 ns/B     36.38 MiB/s     83.89 c/B
New:
       GCM auth |      3.18 ns/B     300.0 MiB/s     10.17 c/B

Intel Haswell (mingw32):
Old:
       GCM auth |     27.27 ns/B     34.97 MiB/s     87.27 c/B
New:
       GCM auth |      7.58 ns/B     125.7 MiB/s     24.27 c/B

Cortex-A8:
Old:
       GCM auth |     231.4 ns/B      4.12 MiB/s     233.3 c/B
New:
       GCM auth |     30.82 ns/B     30.94 MiB/s     31.07 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAdd some documentation for GCM mode
Jussi Kivilinna [Wed, 20 Nov 2013 14:21:19 +0000 (16:21 +0200)]
Add some documentation for GCM mode

* doc/gcrypt.texi: Add mention of GCM mode.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoInitial implementation of GCM
Dmitry Eremin-Solenikov [Tue, 19 Nov 2013 21:26:26 +0000 (23:26 +0200)]
Initial implementation of GCM

* cipher/Makefile.am: Add 'cipher-gcm.c'.
* cipher/cipher-ccm.c (_gcry_cipher_ccm_set_lengths)
(_gcry_cipher_ccm_authenticate, _gcry_cipher_ccm_tag)
(_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt): Change
'c->u_mode.ccm.tag' to 'c->marks.tag'.
* cipher/cipher-gcm.c: New.
* cipher/cipher-internal.h (GCM_USE_TABLES): New.
(gcry_cipher_handle): Add 'marks.tag', 'u_tag', 'length' and
'gcm_table'; Remove 'u_mode.ccm.tag'.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt)
(_gcry_cipher_gcm_setiv, _gcry_cipher_gcm_authenticate)
(_gcry_cipher_gcm_get_tag, _gcry_cipher_gcm_check_tag): New.
* cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey)
(cipher_encrypt, cipher_decrypt, _gcry_cipher_authenticate)
(_gcry_cipher_gettag, _gcry_cipher_checktag): Add GCM mode handling.
* src/gcrypt.h.in (gcry_cipher_modes): Add GCRY_CIPHER_MODE_GCM.
(GCRY_GCM_BLOCK_LEN): New.
* tests/basic.c (check_gcm_cipher): New.
(check_ciphers): Add GCM check.
(check_cipher_modes): Call 'check_gcm_cipher'.
* tests/bench-slope.c (bench_gcm_encrypt_do_bench)
(bench_gcm_decrypt_do_bench, bench_gcm_authenticate_do_bench)
(gcm_encrypt_ops, gcm_decrypt_ops, gcm_authenticate_ops): New.
(cipher_modes): Add GCM enc/dec/auth.
(cipher_bench_one): Limit GCM to block ciphers with 16 byte block-size.
* tests/benchmark.c (cipher_bench): Add GCM.
--

Currently it is still quite slow.

Still no support for generate_iv(). Is it really necessary?

TODO: Merge/reuse cipher-internal state used by CCM.

Changelog entry will be present in final patch submission.

Changes since v1:
- 6x-7x speedup.
- added bench-slope support

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
[jk: mangle new file through 'indent -nut']
[jk: few fixes]
[jk: changelog]

5 years agoCamellia: fix compiler warning
Jussi Kivilinna [Mon, 18 Nov 2013 18:27:35 +0000 (20:27 +0200)]
Camellia: fix compiler warning

* cipher/camellia-glue.c (camellia_setkey): Use braces around empty if
statement.
--

Patch silences following warning:

 camellia-glue.c: In function 'camellia_setkey':
 camellia-glue.c:183:5: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoTweak Camellia-AVX key-setup for small speed-up
Jussi Kivilinna [Tue, 19 Nov 2013 13:48:32 +0000 (15:48 +0200)]
Tweak Camellia-AVX key-setup for small speed-up

* cipher/camellia-aesni-avx-amd64.S (camellia_f): Merge S-function output
rotation with P-function.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAdd CMAC (Cipher-based MAC) to MAC API
Jussi Kivilinna [Thu, 14 Nov 2013 12:10:27 +0000 (14:10 +0200)]
Add CMAC (Cipher-based MAC) to MAC API

* cipher/Makefile.am: Add 'cipher-cmac.c' and 'mac-cmac.c'.
* cipher/cipher-cmac.c: New.
* cipher/cipher-internal.h (gcry_cipher_handle.u_mode): Add 'cmac'.
* cipher/cipher.c (gcry_cipher_open): Rename to...
(_gcry_cipher_open_internal): ...this and add CMAC.
(gcry_cipher_open): New wrapper that disallows use of internal
modes (CMAC) from outside.
(cipher_setkey, cipher_encrypt, cipher_decrypt)
(_gcry_cipher_authenticate, _gcry_cipher_gettag)
(_gcry_cipher_checktag): Add handling for CMAC mode.
(cipher_reset): Do not reset 'marks.key' and do not clear subkeys in
'u_mode' in CMAC mode.
* cipher/mac-cmac.c: New.
* cipher/mac-internal.h: Add CMAC support and algorithms.
* cipher/mac.c: Add CMAC algorithms.
* doc/gcrypt.texi: Add documentation for CMAC.
* src/cipher.h (gcry_cipher_internal_modes): New.
(_gcry_cipher_open_internal, _gcry_cipher_cmac_authenticate)
(_gcry_cipher_cmac_get_tag, _gcry_cipher_cmac_check_tag)
(_gcry_cipher_cmac_set_subkeys): New prototypes.
* src/gcrypt.h.in (gcry_mac_algos): Add CMAC algorithms.
* tests/basic.c (check_mac): Add CMAC test vectors.
--

Patch adds CMAC (Cipher-based MAC) as defined in RFC 4493 and NIST
Special Publication 800-38B.

Internally CMAC is added to cipher module, but is available to outside
only through MAC API.

[v2]:
 - Add documentation.
[v3]:
 - CMAC algorithm ids start from 201.
 - Coding style fixes.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAdd new MAC API, initially with HMAC
Jussi Kivilinna [Fri, 15 Nov 2013 10:28:07 +0000 (12:28 +0200)]
Add new MAC API, initially with HMAC

* cipher/Makefile.am: Add 'mac.c', 'mac-internal.h' and 'mac-hmac.c'.
* cipher/bufhelp.h (buf_eq_const): New.
* cipher/cipher-ccm.c (_gcry_cipher_ccm_tag): Use 'buf_eq_const' for
constant-time compare.
* cipher/mac-hmac.c: New.
* cipher/mac-internal.h: New.
* cipher/mac.c: New.
* doc/gcrypt.texi: Add documentation for MAC API.
* src/gcrypt-int.h [GPG_ERROR_VERSION_NUMBER < 1.13]
(GPG_ERR_MAC_ALGO): New.
* src/gcrypt.h.in (gcry_mac_handle, gcry_mac_hd_t, gcry_mac_algos)
(gcry_mac_flags, gcry_mac_open, gcry_mac_close, gcry_mac_ctl)
(gcry_mac_algo_info, gcry_mac_setkey, gcry_mac_setiv, gcry_mac_write)
(gcry_mac_read, gcry_mac_verify, gcry_mac_get_algo_maclen)
(gcry_mac_get_algo_keylen, gcry_mac_algo_name, gcry_mac_map_name)
(gcry_mac_reset, gcry_mac_test_algo): New.
* src/libgcrypt.def (gcry_mac_open, gcry_mac_close, gcry_mac_ctl)
(gcry_mac_algo_info, gcry_mac_setkey, gcry_mac_setiv, gcry_mac_write)
(gcry_mac_read, gcry_mac_verify, gcry_mac_get_algo_maclen)
(gcry_mac_get_algo_keylen, gcry_mac_algo_name, gcry_mac_map_name): New.
* src/libgcrypt.vers (gcry_mac_open, gcry_mac_close, gcry_mac_ctl)
(gcry_mac_algo_info, gcry_mac_setkey, gcry_mac_setiv, gcry_mac_write)
(gcry_mac_read, gcry_mac_verify, gcry_mac_get_algo_maclen)
(gcry_mac_get_algo_keylen, gcry_mac_algo_name, gcry_mac_map_name): New.
* src/visibility.c (gcry_mac_open, gcry_mac_close, gcry_mac_ctl)
(gcry_mac_algo_info, gcry_mac_setkey, gcry_mac_setiv, gcry_mac_write)
(gcry_mac_read, gcry_mac_verify, gcry_mac_get_algo_maclen)
(gcry_mac_get_algo_keylen, gcry_mac_algo_name, gcry_mac_map_name): New.
* src/visibility.h (gcry_mac_open, gcry_mac_close, gcry_mac_ctl)
(gcry_mac_algo_info, gcry_mac_setkey, gcry_mac_setiv, gcry_mac_write)
(gcry_mac_read, gcry_mac_verify, gcry_mac_get_algo_maclen)
(gcry_mac_get_algo_keylen, gcry_mac_algo_name, gcry_mac_map_name): New.
* tests/basic.c (check_one_mac, check_mac): New.
(main): Call 'check_mac'.
* tests/bench-slope.c (bench_print_header, bench_print_footer): Allow
variable algorithm name width.
(_cipher_bench, hash_bench): Update to above change.
(bench_hash_do_bench): Add 'gcry_md_reset'.
(bench_mac_mode, bench_mac_init, bench_mac_free, bench_mac_do_bench)
(mac_ops, mac_modes, mac_bench_one, _mac_bench, mac_bench): New.
(main): Add 'mac' benchmark options.
* tests/benchmark.c (mac_repetitions, mac_bench): New.
(main): Add 'mac' benchmark options.
--

Add MAC API, with HMAC algorithms. Internally uses HMAC functionality of the
MD module.
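
The changelog above adds 'buf_eq_const' for comparing tags in constant
time.  The general technique is to accumulate the byte differences and
only look at the accumulated value once, so that the running time does
not reveal the position of the first mismatch.  The following is an
illustrative sketch under that assumption; it is not the code from
bufhelp.h, and eq_const_sketch is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the two buffers are equal, else 0.  The loop always
   visits all LEN bytes and contains no data-dependent branch.  */
static int
eq_const_sketch (const void *a_, const void *b_, size_t len)
{
  const unsigned char *a = a_;
  const unsigned char *b = b_;
  unsigned int diff = 0;
  size_t i;

  assert (len == 0 || (a_ && b_));

  for (i = 0; i < len; i++)
    diff |= a[i] ^ b[i];

  /* DIFF is in the range 0..255.  Subtracting one wraps around only
     for DIFF == 0, which sets bit 8; for all other values bit 8 stays
     clear.  */
  return (int) (((diff - 1u) >> 8) & 1u);
}
```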

[v2]:
 - Add documentation for MAC API.
 - Change length argument for gcry_mac_read from size_t to size_t* for
   returning number of written bytes.
[v3]:
 - HMAC algorithm ids start from 101.
 - Fix coding style for new files.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoUse correct blocksize of 32 bytes for GOSTR3411-94 HMAC
Jussi Kivilinna [Sat, 16 Nov 2013 09:07:09 +0000 (11:07 +0200)]
Use correct blocksize of 32 bytes for GOSTR3411-94 HMAC

* cipher/md.c (md_open): Set macpads_Bsize to 32 for
GCRY_MD_GOSTR3411_94.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agocipher: use size_t for internal buffer lengths
Jussi Kivilinna [Fri, 15 Nov 2013 14:23:00 +0000 (16:23 +0200)]
cipher: use size_t for internal buffer lengths

* cipher/arcfour.c (do_encrypt_stream, encrypt_stream): Use 'size_t'
for buffer lengths.
* cipher/blowfish.c (_gcry_blowfish_ctr_enc, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_cfb_dec): Ditto.
* cipher/camellia-glue.c (_gcry_camellia_ctr_enc)
(_gcry_camellia_cbc_dec, _gcry_camellia_cfb_dec): Ditto.
* cipher/cast5.c (_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec)
(_gcry_cast5_cfb_dec): Ditto.
* cipher/cipher-aeswrap.c (_gcry_cipher_aeswrap_encrypt)
(_gcry_cipher_aeswrap_decrypt): Ditto.
* cipher/cipher-cbc.c (_gcry_cipher_cbc_encrypt)
(_gcry_cipher_cbc_decrypt): Ditto.
* cipher/cipher-ccm.c (_gcry_cipher_ccm_encrypt)
(_gcry_cipher_ccm_decrypt): Ditto.
* cipher/cipher-cfb.c (_gcry_cipher_cfb_encrypt)
(_gcry_cipher_cfb_decrypt): Ditto.
* cipher/cipher-ctr.c (_gcry_cipher_ctr_encrypt): Ditto.
* cipher/cipher-internal.h (gcry_cipher_handle->bulk)
(_gcry_cipher_cbc_encrypt, _gcry_cipher_cbc_decrypt)
(_gcry_cipher_cfb_encrypt, _gcry_cipher_cfb_decrypt)
(_gcry_cipher_ofb_encrypt, _gcry_cipher_ctr_encrypt)
(_gcry_cipher_aeswrap_encrypt, _gcry_cipher_aeswrap_decrypt)
(_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt): Ditto.
* cipher/cipher-ofb.c (_gcry_cipher_ofb_encrypt): Ditto.
* cipher/cipher-selftest.h (gcry_cipher_bulk_cbc_dec_t)
(gcry_cipher_bulk_cfb_dec_t, gcry_cipher_bulk_ctr_enc_t): Ditto.
* cipher/cipher.c (cipher_setkey, cipher_setiv, do_ecb_crypt)
(do_ecb_encrypt, do_ecb_decrypt, cipher_encrypt)
(cipher_decrypt): Ditto.
* cipher/rijndael.c (_gcry_aes_ctr_enc, _gcry_aes_cbc_dec)
(_gcry_aes_cfb_dec, _gcry_aes_cbc_enc, _gcry_aes_cfb_enc): Ditto.
* cipher/salsa20.c (salsa20_setiv, salsa20_do_encrypt_stream)
(salsa20_encrypt_stream, salsa20r12_encrypt_stream): Ditto.
* cipher/serpent.c (_gcry_serpent_ctr_enc, _gcry_serpent_cbc_dec)
(_gcry_serpent_cfb_dec): Ditto.
* cipher/twofish.c (_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec)
(_gcry_twofish_cfb_dec): Ditto.
* src/cipher-proto.h (gcry_cipher_stencrypt_t)
(gcry_cipher_stdecrypt_t, cipher_setiv_fuct_t): Ditto.
* src/cipher.h (_gcry_aes_cfb_enc, _gcry_aes_cfb_dec)
(_gcry_aes_cbc_enc, _gcry_aes_cbc_dec, _gcry_aes_ctr_enc)
(_gcry_blowfish_cfb_dec, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_ctr_enc, _gcry_cast5_cfb_dec, _gcry_cast5_cbc_dec)
(_gcry_cast5_ctr_enc, _gcry_camellia_cfb_dec, _gcry_camellia_cbc_dec)
(_gcry_camellia_ctr_enc, _gcry_serpent_cfb_dec, _gcry_serpent_cbc_dec)
(_gcry_serpent_ctr_enc, _gcry_twofish_cfb_dec, _gcry_twofish_cbc_dec)
(_gcry_twofish_ctr_enc): Ditto.
--

On 64-bit platforms, cipher module internally converts 64-bit size_t values
to 32-bit unsigned integers.
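
The effect of such a conversion can be shown with fixed-width types.
In the sketch below uint64_t stands in for a 64-bit size_t and uint32_t
for a 32-bit unsigned int; the function names are hypothetical and do
not appear in the cipher module.

```c
#include <assert.h>
#include <stdint.h>

/* Old-style interface sketch: the length parameter is 32 bits wide.  */
static uint32_t
len_seen_by_narrow_callee (uint32_t nbytes)
{
  return nbytes;
}

/* Return the length a narrow callee sees for a 64-bit request.  The
   cast makes explicit the conversion that otherwise happens silently
   at the call.  */
static uint32_t
narrow_call (uint64_t requested)
{
  return len_seen_by_narrow_callee ((uint32_t) requested);
}

/* A request of 4 GiB + 16 bytes arrives as 16 bytes.  */
static void
truncation_demo (void)
{
  assert (narrow_call ((UINT64_C (1) << 32) + 16) == 16);
  assert (narrow_call (4096) == 4096);
}
```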

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoCamellia: Add AVX/AES-NI key setup
Jussi Kivilinna [Fri, 15 Nov 2013 14:23:00 +0000 (16:23 +0200)]
Camellia: Add AVX/AES-NI key setup

* cipher/camellia-aesni-avx-amd64.S (key_bitlength, key_table): New
order of fields in ctx.
(camellia_f, vec_rol128, vec_ror128): New macros.
(__camellia_avx_setup128, __camellia_avx_setup256)
(_gcry_camellia_aesni_avx_keygen): New functions.
* cipher/camellia-aesni-avx2-amd64.S (key_bitlength, key_table): New
order of fields in ctx.
* cipher/camellia-arm.S (CAMELLIA_TABLE_BYTE_LEN, key_length): Remove
unused macros.
* cipher/camellia-glue.c (CAMELLIA_context): Move keytable to head for
better alignment; Make 'use_aesni_avx' and 'use_aesni_avx2' bitfield
members.
[USE_AESNI_AVX] (_gcry_camellia_aesni_avx_keygen): New prototype.
(camellia_setkey) [USE_AESNI_AVX || USE_AESNI_AVX2]: Read hw features
to variable 'hwf' and match features from it.
(camellia_setkey) [USE_AESNI_AVX]: Use AES-NI/AVX key setup if
available.
--

Use AVX/AES-NI for key-setup for small speed-up.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAvoid unneeded stack burning with AES-NI and reduce number of 'decryption_prepared...
Jussi Kivilinna [Fri, 15 Nov 2013 14:23:00 +0000 (16:23 +0200)]
Avoid unneeded stack burning with AES-NI and reduce number of 'decryption_prepared' checks

* cipher/rijndael.c (RIJNDAEL_context): Make 'decryption_prepared',
'use_padlock' and 'use_aesni' 1-bit members in bitfield.
(do_setkey): Move 'hwfeatures' inside [USE_AESNI || USE_PADLOCK].
(do_aesni_enc_aligned): Rename to...
(do_aesni_enc): ...this, as function does not require aligned input.
(do_aesni_dec_aligned): Rename to...
(do_aesni_dec): ...this, as function does not require aligned input.
(do_aesni): Remove.
(rijndael_encrypt): Call 'do_aesni_enc' instead of 'do_aesni'.
(rijndael_decrypt): Call 'do_aesni_dec' instead of 'do_aesni'.
(check_decryption_preparation): New.
(do_decrypt): Remove 'decryption_prepared' check.
(rijndael_decrypt): Ditto and call 'check_decryption_preparation'.
(_gcry_aes_cbc_dec): Ditto.
(_gcry_aes_cfb_enc): Add 'burn_depth' and burn stack only when needed.
(_gcry_aes_cbc_enc): Ditto.
(_gcry_aes_ctr_enc): Ditto.
(_gcry_aes_cfb_dec): Ditto.
(_gcry_aes_cbc_dec): Ditto and correct clearing of 'savebuf'.
--

Patch is mostly about reducing overhead for short buffers.

Results on Intel i5-4570:

After:
 $ tests/benchmark --cipher-repetitions 1000 --cipher-with-keysetup cipher aes
 Running each test 1000 times.
                 ECB/Stream         CBC             CFB             OFB             CTR             CCM
              --------------- --------------- --------------- --------------- --------------- ---------------
 AES            480ms   540ms  1750ms   300ms  1630ms   300ms  1640ms  1640ms   350ms   350ms  2130ms  2140ms

Before:
 $ tests/benchmark --cipher-repetitions 1000 --cipher-with-keysetup cipher aes
 Running each test 1000 times.
                 ECB/Stream         CBC             CFB             OFB             CTR             CCM
              --------------- --------------- --------------- --------------- --------------- ---------------
 AES            520ms   590ms  1760ms   310ms  1640ms   310ms  1610ms  1600ms   360ms   360ms  2150ms  2160ms

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agomd: Fix hashing for data >= 256 GB
Werner Koch [Thu, 14 Nov 2013 22:40:41 +0000 (23:40 +0100)]
md: Fix hashing for data >= 256 GB

* cipher/hash-common.h (gcry_md_block_ctx): Add "nblocks_high".
* cipher/hash-common.c (_gcry_md_block_write): Bump NBLOCKS_HIGH.
* cipher/md4.c (md4_init, md4_final): Take care of NBLOCKS_HIGH.
* cipher/md5.c (md5_init, md5_final): Ditto.
* cipher/rmd160.c (_gcry_rmd160_init, rmd160_final): Ditto.
* cipher/sha1.c (sha1_init, sha1_final): Ditto.
* cipher/sha256.c (sha256_init, sha224_init, sha256_final): Ditto.
* cipher/sha512.c (sha512_init, sha384_init, sha512_final): Ditto.
* cipher/tiger.c (do_init, tiger_final): Ditto.
* cipher/whirlpool.c (whirlpool_final): Ditto.

* cipher/md.c (gcry_md_algo_info): Add GCRYCTL_SELFTEST.
(_gcry_md_selftest): Return "not implemented" as required.
* tests/hashtest.c: New.
* tests/genhashdata.c: New.
* tests/Makefile.am (TESTS): Add hashtest.
(noinst_PROGRAMS): Add genhashdata
--

Problem found by Denis Corbin and analyzed by Yuriy Kaminskiy.

sha512 and whirlpool should not have this problem because they use
64-bit types for counting the blocks. However, a similar fix has been
employed to allow for really huge sizes, even though it will be very
hard to test them.
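
The 256 GB limit is what one would expect from counting 64-byte blocks in a 32-bit counter: 2^32 blocks of 64 bytes each are 2^38 bytes, i.e. 256 GiB. The following is only a minimal sketch of carrying into a second counter word; the struct and function names are hypothetical and are not the code in cipher/hash-common.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a block-hash context; only the
   counters matter for this illustration.  */
struct block_counter
{
  uint32_t nblocks;       /* Low 32 bits of the processed block count.  */
  uint32_t nblocks_high;  /* High 32 bits, bumped when the low word wraps.  */
};

/* Count one processed block, carrying into the high word on wrap.  */
static void
count_block (struct block_counter *c)
{
  assert (c != NULL);
  if (!++c->nblocks)
    c->nblocks_high++;
}

/* Total message length in bits, assuming 64-byte blocks.  Valid as
   long as the bit count itself fits into 64 bits.  */
static uint64_t
total_bits (const struct block_counter *c)
{
  uint64_t blocks = ((uint64_t) c->nblocks_high << 32) | c->nblocks;

  return blocks * 64 * 8;
}
```

With only the low word, the block count would silently wrap to zero after 2^32 blocks and the length encoded in the final padding would be wrong.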

The test vectors have been produced by sha{1,224,256}sum and the
genhashdata tool.  A sequence of 'a' is used for them because a test
using one million 'a' is commonly used for test vectors.  More test
vectors are required.  Running the large tests needs to be done
manually for now:

  ./hashtest --gigs 256

tests all algorithms,

  ./hashtest --gigs 256 sha1 sha224 sha256

only the given ones.  A configure option to include these tests in the
standard regression suite will be useful.  The tests will take looong.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoecc: Fix key generation for a plain Ed25519 key.
Christian Grothoff [Mon, 11 Nov 2013 15:04:30 +0000 (16:04 +0100)]
ecc: Fix key generation for a plain Ed25519 key.

* cipher/ecc.c (nist_generate_key): Use custom code for ED25519.
--

I wish there were an RFC for Curve25519 - the description in the
paper is easy to misunderstand for a non-mathematician.  Source code
and a paper are nice but a proper description (like those in the HAC)
would be better.  Problem spotted by Florian Dold.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoecc: Fix some memory leaks
Christian Grothoff [Mon, 11 Nov 2013 15:04:30 +0000 (16:04 +0100)]
ecc: Fix some memory leaks

* cipher/ecc-curves.c (_gcry_mpi_ec_new): Free ec->b before assigning.
* cipher/ecc.c (nist_generate_key): Release Q.
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_genkey): Ditto.
--

_gcry_mpi_ec_new: Fixing memory leak detected with valgrind; if 'b' is
non-NULL, the code in ec_p_init (ec.c:379) already makes a copy of
'b', so before we clobber ctx->b here, we need to at least release the
old value (however, it would of course be nicer to not first make a
copy of b in the first place, but this is the most localized change to
get rid of the memory leak).

nist_generate_key: Fixing rather obvious local leak; Q is first
initialized, then used, copied into the result but never released.

5 years agoecc: Change keygrip computation for Ed25519+EdDSA.
Werner Koch [Mon, 11 Nov 2013 18:14:40 +0000 (19:14 +0100)]
ecc: Change keygrip computation for Ed25519+EdDSA.

* cipher/ecc.c (compute_keygrip): Rework.
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_ensure_compact): New.
* cipher/ecc-curves.c (_gcry_ecc_update_curve_param): New.
* tests/keygrip.c (key_grips): Add flag param and test cases for
Ed25519.
--

The keygrip for Ed25519+EdDSA has not yet been used - thus it is
possible to change it.  Using the compact representation saves us from
recovering x from the standard representation.  Compacting is
basically free.

5 years agompi: Add special format GCRYMPI_FMT_OPAQUE.
Werner Koch [Mon, 11 Nov 2013 10:07:56 +0000 (11:07 +0100)]
mpi: Add special format GCRYMPI_FMT_OPAQUE.

* src/gcrypt.h.in (GCRYMPI_FMT_OPAQUE): New.
(_gcry_sexp_nth_opaque_mpi): Remove.
* src/sexp.c (gcry_sexp_nth_mpi): Add support for GCRYMPI_FMT_OPAQUE.
(_gcry_sexp_vextract_param): Replace removed function by
GCRYMPI_FMT_OPAQUE.
--

Using a new formatting mode is easier than adding a dedicated
extraction function for opaque MPIs.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoFix error output in CTR selftest
Jussi Kivilinna [Sun, 10 Nov 2013 19:32:29 +0000 (21:32 +0200)]
Fix error output in CTR selftest

* cipher/cipher-selftest.c (_gcry_selftest_helper_ctr): Change
fprintf(stderr,...) to syslog(); Correct error output for bulk
IV check, plaintext mismatch => ciphertext mismatch.
--

The 'fprintf's were debugging leftovers that leaked into the commit.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoFix Serpent-AVX2 and Camellia-AVX2 counter modes
Jussi Kivilinna [Sat, 9 Nov 2013 20:39:19 +0000 (22:39 +0200)]
Fix Serpent-AVX2 and Camellia-AVX2 counter modes

* cipher/camellia-aesni-avx2-amd64.S
(_gcry_camellia_aesni_avx2_ctr_enc): Byte-swap before checking for
overflow handling.
* cipher/camellia-glue.c (selftest_ctr_128, selftest_cfb_128)
(selftest_cbc_128): Add 16 to nblocks.
* cipher/cipher-selftest.c (_gcry_selftest_helper_ctr): Add test with
non-overflowing IV and modify overflow IV to detect broken endianness
handling.
* cipher/serpent-avx2-amd64.S (_gcry_serpent_avx2_ctr_enc): Byte-swap
before checking for overflow handling; Fix crazy-mixed-endian IV
construction to big-endian.
* cipher/serpent.c (selftest_ctr_128, selftest_cfb_128)
(selftest_cbc_128): Add 8 to nblocks.
--

The selftest for CTR was setting the counter-IV to all '0xff' except the
last byte.  As a result, Serpent-AVX2 and Camellia-AVX2 passed the tests
even with broken endianness handling.

Patch corrects the CTR selftest and fixes the broken implementations.
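
For reference, CTR mode treats the 16-byte counter block as a big-endian integer, so a carry out of the last byte has to move towards the first byte. The following is a minimal portable sketch of that increment; the function name is hypothetical and this is not the AVX2 assembly touched by this patch.

```c
#include <assert.h>
#include <stddef.h>

#define CTR_BLOCKSIZE 16

/* Add 1 to a 128-bit big-endian counter: start at the last byte and
   propagate the carry towards the first byte.  */
static void
ctr_increment_be (unsigned char ctr[CTR_BLOCKSIZE])
{
  int i;

  assert (ctr != NULL);
  for (i = CTR_BLOCKSIZE - 1; i >= 0; i--)
    {
      ctr[i]++;
      if (ctr[i])
        break;  /* No wrap in this byte, so the carry stops here.  */
    }
}
```

A test IV is only useful for catching endianness bugs if a wrong byte order produces a different next counter value than this reference does.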

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agocipher/gost28147: optimization: use precomputed S-box tables
Sergey V [Sat, 9 Nov 2013 16:10:10 +0000 (20:10 +0400)]
cipher/gost28147: optimization: use precomputed S-box tables

* cipher/gost.h (GOST28147_context): Remove unneeded subst and
subst_set members.
* cipher/gost28147.c (max): Remove unneeded macro.
(test_sbox): Replace with new precomputed tables.
(gost_set_subst): Remove function.
(gost_val): Use new S-box tables.
(gost_encrypt_block, gost_decrypt_block): Tweak to use new ctx and
S-box tables.
--

Use generated 8->8 S-boxes with precomputed bitwise shifts and
bitwise rotations. So in the round function gost_val() we no longer
need to do these operations.
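
The idea can be sketched as follows. After the subkey addition, the GOST 28147-89 round function substitutes eight 4-bit nibbles and rotates the result left by 11 bits; two neighbouring 4-bit S-boxes can be merged into one 256-entry table whose entries are already shifted and rotated. This is only an illustration: the S-box values below are placeholders generated by a formula, not a real GOST parameter set, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t
rotl32 (uint32_t x, unsigned int n)
{
  assert (n > 0 && n < 32);
  return (x << n) | (x >> (32 - n));
}

/* Placeholder 4-bit S-box for row ROW (a bijection on 0..15).  */
static unsigned int
demo_sbox4 (unsigned int row, unsigned int x)
{
  return (x * 5 + row) & 15;
}

/* Reference: substitute nibble by nibble, then rotate left by 11.  */
static uint32_t
subst_rot_naive (uint32_t v)
{
  uint32_t r = 0;
  unsigned int i;

  for (i = 0; i < 8; i++)
    r |= (uint32_t) demo_sbox4 (i, (v >> (4 * i)) & 15) << (4 * i);
  return rotl32 (r, 11);
}

/* Four 8->32 tables with shift and rotation folded into the entries.  */
static uint32_t tab[4][256];

static void
build_tables (void)
{
  unsigned int t, b;

  for (t = 0; t < 4; t++)
    for (b = 0; b < 256; b++)
      {
        uint32_t lo = demo_sbox4 (2 * t, b & 15);
        uint32_t hi = demo_sbox4 (2 * t + 1, b >> 4);

        tab[t][b] = rotl32 (((hi << 4) | lo) << (8 * t), 11);
      }
}

/* Table-driven version: four lookups and three ORs per call.  */
static uint32_t
subst_rot_tables (uint32_t v)
{
  return (tab[0][v & 0xff] | tab[1][(v >> 8) & 0xff]
          | tab[2][(v >> 16) & 0xff] | tab[3][v >> 24]);
}
```

The two versions agree because a rotation distributes over the OR of values that occupy disjoint bit positions.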

Before this patch:

 GOST28147      |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     24.00 ns/B     39.74 MiB/s         - c/B
        ECB dec |     26.41 ns/B     36.11 MiB/s         - c/B
        CBC enc |     24.57 ns/B     38.81 MiB/s         - c/B
        CBC dec |     26.58 ns/B     35.88 MiB/s         - c/B
        CFB enc |     24.79 ns/B     38.46 MiB/s         - c/B
        CFB dec |     24.72 ns/B     38.57 MiB/s         - c/B
        OFB enc |     24.38 ns/B     39.12 MiB/s         - c/B
        OFB dec |     24.35 ns/B     39.16 MiB/s         - c/B
        CTR enc |     24.83 ns/B     38.41 MiB/s         - c/B
        CTR dec |     25.27 ns/B     37.73 MiB/s         - c/B

After:

 GOST28147      |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     16.29 ns/B     58.55 MiB/s         - c/B
        ECB dec |     16.30 ns/B     58.50 MiB/s         - c/B
        CBC enc |     16.94 ns/B     56.29 MiB/s         - c/B
        CBC dec |     16.81 ns/B     56.72 MiB/s         - c/B
        CFB enc |     17.13 ns/B     55.66 MiB/s         - c/B
        CFB dec |     16.84 ns/B     56.63 MiB/s         - c/B
        OFB enc |     16.69 ns/B     57.13 MiB/s         - c/B
        OFB dec |     16.71 ns/B     57.08 MiB/s         - c/B
        CTR enc |     17.01 ns/B     56.06 MiB/s         - c/B
        CTR dec |     17.05 ns/B     55.93 MiB/s         - c/B

Signed-off-by: Sergey V <sftp.mtuci@gmail.com>
5 years agoFix tail handling for AES-NI counter mode
Jussi Kivilinna [Sat, 9 Nov 2013 19:04:14 +0000 (21:04 +0200)]
Fix tail handling for AES-NI counter mode

* cipher/rijndael.c (do_aesni_ctr): Fix outputting of updated
counter-IV.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoecc: Improve gcry_pk_get_curve.
Werner Koch [Fri, 8 Nov 2013 16:41:42 +0000 (17:41 +0100)]
ecc: Improve gcry_pk_get_curve.

* cipher/ecc-curves.c (_gcry_ecc_fill_in_curve): Factor some code out
to ..
(find_domain_parms_idx): new.
(_gcry_ecc_get_curve): Find by curve name on error.
--

This change allows the use of an input with just the curve name which
can be used to test whether a given curve has been implemented.  It is
required because due to the "param" flag change the caller usually
does not have the key parameters available.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agocipher: Avoid signed divisions in idea.c
Werner Koch [Fri, 8 Nov 2013 16:21:02 +0000 (17:21 +0100)]
cipher: Avoid signed divisions in idea.c

* cipher/idea.c (mul_inv): Use unsigned division.
--

Reported-by: Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@gmail.com>
  Hello, all. While compiling in an environment with only libgcc
  subset for ARM, I found out that idea.c uses signed divisions:
  Reading the code this seems to be unintended. Inlined patch replaces
  them with more appropriate unsigned division.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoecc: Implement the "nocomp" flag for key generation.
Werner Koch [Fri, 8 Nov 2013 09:07:40 +0000 (10:07 +0100)]
ecc: Implement the "nocomp" flag for key generation.

* cipher/ecc.c (ecc_generate): Support the "nocomp" flag.
* tests/keygen.c (check_ecc_keys): Add a test for it.

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoecc: Make "noparam" the default and replace by "param".
Werner Koch [Fri, 8 Nov 2013 08:53:32 +0000 (09:53 +0100)]
ecc: Make "noparam" the default and replace by "param".

* src/cipher.h (PUBKEY_FLAG_NOCOMP): New.
(PUBKEY_FLAG_NOPARAM): Remove.
(PUBKEY_FLAG_PARAM): New.
* cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Support the new
flags and ignore the obsolete "noparam" flag.
* cipher/ecc-curves.c (_gcry_ecc_fill_in_curve): Return the curve name
also for curves selected by NBITS.
(_gcry_mpi_ec_new): Support the "param" flag.
* cipher/ecc.c (ecc_generate, ecc_sign, ecc_verify): Ditto.
* tests/keygen.c (check_ecc_keys): Remove the "noparam" flag.
--

This is an API change but there are not many ECC users yet, and the
"param" flag can easily be added by those who really need the
parameters (e.g. if private keys have been stored without the curve
name).

Note that no version of Libgcrypt with support for "noparam" has been
released but for the sake of projects already working with the master
version we don't bail out on "noparam".

Signed-off-by: Werner Koch <wk@gnupg.org>
5 years agoFix decryption function size in AES AMD64 assembly
Jussi Kivilinna [Thu, 7 Nov 2013 10:33:59 +0000 (12:33 +0200)]
Fix decryption function size in AES AMD64 assembly

* cipher/rijndael-amd64.S (_gcry_aes_amd64_decrypt_block): Set '.size'
for '_gcry_aes_amd64_decrypt_block', not '..._encrypt_block'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoChange 64-bit shift to 32-bit in AES AMD64 assembly
Jussi Kivilinna [Thu, 7 Nov 2013 10:24:04 +0000 (12:24 +0200)]
Change 64-bit shift to 32-bit in AES AMD64 assembly

* cipher/rijndael-amd64.S (do16bit_shr): Change 'shrq' to 'shrl'.
--

64-bit shift is not needed here as registers are used for 32-bit values.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoSpeed-up AES-NI key setup
Jussi Kivilinna [Wed, 6 Nov 2013 13:52:37 +0000 (15:52 +0200)]
Speed-up AES-NI key setup

* cipher/rijndael.c [USE_AESNI] (m128i_t): Remove.
[USE_AESNI] (u128_t): New.
[USE_AESNI] (aesni_do_setkey): New.
(do_setkey) [USE_AESNI]: Move AES-NI accelerated key setup to
'aesni_do_setkey'.
(do_setkey): Call _gcry_get_hw_features only once. Clear stack after
use in generic key setup part.
(rijndael_setkey): Remove stack burning.
(prepare_decryption) [USE_AESNI]: Use 'u128_t' instead of 'm128i_t' to
avoid compiler generated SSE2 instructions and XMM register usage,
unroll 'aesimc' setup loop
(prepare_decryption): Clear stack after use.
[USE_AESNI] (do_aesni_enc_aligned): Update comment about alignment.
(do_decrypt): Do not burn stack after prepare_decryption.
--

Patch improves the speed of AES key setup with AES-NI instructions. Patch also
removes the problematic use of vector typedef, which might cause interference
with XMM register usage in AES-NI accelerated code.

New:
 $ tests/benchmark --cipher-with-keysetup --cipher-repetitions 1000 cipher aes aes192 aes256
 Running each test 1000 times.
                 ECB/Stream         CBC             CFB             OFB             CTR             CCM
              --------------- --------------- --------------- --------------- --------------- ---------------
 AES            520ms   590ms  1760ms   310ms  1640ms   300ms  1620ms  1610ms   350ms   360ms  2160ms  2140ms
 AES192         640ms   680ms  2030ms   370ms  1920ms   350ms  1890ms  1880ms   400ms   410ms  2490ms  2490ms
 AES256         730ms   780ms  2330ms   430ms  2210ms   420ms  2170ms  2180ms   470ms   480ms  2830ms  2840ms

Old:
 $ tests/benchmark --cipher-with-keysetup --cipher-repetitions 1000 cipher aes aes192 aes256
 Running each test 1000 times.
                 ECB/Stream         CBC             CFB             OFB             CTR             CCM
              --------------- --------------- --------------- --------------- --------------- ---------------
 AES            670ms   740ms  1910ms   470ms  1790ms   470ms  1770ms  1760ms   520ms   510ms  2310ms  2310ms
 AES192         820ms   860ms  2220ms   550ms  2110ms   540ms  2070ms  2070ms   600ms   590ms  2670ms  2680ms
 AES256         920ms   970ms  2510ms   620ms  2390ms   600ms  2360ms  2370ms   650ms   660ms  3020ms  3020ms

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAvoid burn stack in Arcfour setkey
Jussi Kivilinna [Mon, 4 Nov 2013 19:54:33 +0000 (21:54 +0200)]
Avoid burn stack in Arcfour setkey

* cipher/arcfour.c (arcfour_setkey): Remove stack burning.
--

Stack is already cleared in do_arcfour_setkey and GCC is inlining
do_arcfour_setkey into arcfour_setkey which renders this _gcry_burn_stack
broken anyway.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoAvoid burn_stack in CAST5 setkey
Jussi Kivilinna [Tue, 5 Nov 2013 10:30:23 +0000 (12:30 +0200)]
Avoid burn_stack in CAST5 setkey

* cipher/cast5.c (do_cast_setkey): Use wipememory instead of memset.
(cast_setkey): Remove stack burning.
--

Burning stack does not work properly when compiler inlines static functions,
therefore use wipememory to clear stack after use instead of relying on
_gcry_burn_stack.
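
As a rough illustration of clearing temporaries locally rather than burning the stack from the caller, the sketch below overwrites a buffer through a volatile pointer so that the stores are not removed as dead code. The names wipe_buf and demo_setkey are hypothetical; this is not libgcrypt's wipememory macro or the CAST5 key schedule.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrite LEN bytes at PTR with zeros.  The volatile qualifier
   keeps the compiler from dropping the stores as dead.  */
static void
wipe_buf (void *ptr, size_t len)
{
  volatile unsigned char *p = ptr;

  assert (ptr != NULL || len == 0);
  while (len--)
    *p++ = 0;
}

/* Hypothetical key setup: derives a value through a temporary buffer
   and clears that buffer itself before returning.  */
static unsigned int
demo_setkey (const unsigned char *key, size_t keylen)
{
  unsigned char tmp[32];
  unsigned int sum = 0;
  size_t i;

  assert (keylen <= sizeof tmp);
  for (i = 0; i < keylen; i++)
    {
      tmp[i] = key[i] ^ 0x5a;
      sum += tmp[i];
    }
  wipe_buf (tmp, sizeof tmp);  /* Works whether or not we get inlined.  */
  return sum;
}
```

Clearing the named buffer does not depend on guessing how much stack the callee used, which is what breaks once the compiler inlines a static function.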

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoImprove Serpent key setup speed
Jussi Kivilinna [Mon, 4 Nov 2013 19:28:22 +0000 (21:28 +0200)]
Improve Serpent key setup speed

* cipher/serpent.c (SBOX, SBOX_INVERSE): Remove index argument.
(serpent_subkeys_generate): Use smaller temporary arrays for subkey
generation and perform stack clearing locally.
(serpent_setkey_internal): Use wipememory to clear stack and remove
_gcry_burn_stack.
(serpent_setkey): Remove unneeded _gcry_burn_stack.
--

Avoid using large arrays and large stack burning to gain extra speed for
key setup.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoModify encrypt/decrypt arguments for in-place
Jussi Kivilinna [Sun, 3 Nov 2013 20:07:19 +0000 (22:07 +0200)]
Modify encrypt/decrypt arguments for in-place

* cipher/cipher.c (gcry_cipher_encrypt, gcry_cipher_decrypt): Modify
local arguments if in-place operation.
--

Modify encrypt/decrypt argument variables instead of calling subfunction with
different arguments. This allows compiler to inline the subfunction for small
speedup.
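
The pattern can be sketched like this, assuming the convention that a NULL input buffer requests in-place operation. The function names and the XOR "cipher" are hypothetical stand-ins, not the code in cipher/cipher.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worker; XOR with a fixed byte stands in for a cipher.  */
static void
demo_crypt (unsigned char *out, const unsigned char *in, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    out[i] = in[i] ^ 0xaa;
}

/* If IN is NULL, operate in place on OUT.  Only the local argument
   variables are adjusted, so there is a single call site for the
   worker, which the compiler is then free to inline.  */
static int
demo_encrypt (unsigned char *out, size_t outsize,
              const unsigned char *in, size_t inlen)
{
  assert (out != NULL);
  if (!in)
    {
      in = out;
      inlen = outsize;
    }
  if (outsize < inlen)
    return -1;  /* Output buffer too short.  */
  demo_crypt (out, in, inlen);
  return 0;
}
```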

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
5 years agoSpeed up Stribog
Jussi Kivilinna [Tue, 1 Oct 2013 18:47:53 +0000 (21:47 +0300)]
Speed up Stribog

* cipher/stribog.c (STRIBOG_TABLES): Remove.
(Pi): Remove.
[!STRIBOG_TABLES] (A, strido): Remove.
(stribog_table): New table pre-reordered with Pi values.
(strido): Rewrite for new table.
(LPSX): Rewrite for new table.
(xor): Remove.
(g): Small tweaks.
--

Patch optimizes the table-lookup implementation a bit. Patch also removes
the unused non-table implementation from source.

On Intel Core i5-4570 (amd64, 3.2 GHz):

After:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 STRIBOG256     |      9.22 ns/B     103.4 MiB/s     29.53 c/B
 STRIBOG512     |      9.23 ns/B     103.4 MiB/s     29.53 c/B

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 STRIBOG256     |     30.17 ns/B     31.61 MiB/s     96.56 c/B
 STRIBOG512     |     30.20 ns/B     31.57 MiB/s     96.68 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>