libgcrypt.git
Jussi Kivilinna [Sat, 17 Aug 2013 10:41:03 +0000 (13:41 +0300)]
mpi: add ARMv6 assembly

* mpi/armv6/mpi-asm-defs.h: New.
* mpi/armv6/mpih-add1.S: New.
* mpi/armv6/mpih-mul1.S: New.
* mpi/armv6/mpih-mul2.S: New.
* mpi/armv6/mpih-mul3.S: New.
* mpi/armv6/mpih-sub1.S: New.
* mpi/config.links [arm]: Enable ARMv6 assembly.
--

Add mpi assembly for ARMv6 (or later). These are partly based on ARM assembly
found in GMP 4.2.1.
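
As a rough orientation for readers, the limb primitives that the new assembly
files replace can be sketched in portable C. This is only an illustration: the
names sketch_add_n and sketch_mul_1 and the 32-bit limb_t typedef are
assumptions made for this example and are not taken from the mpi sources.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumed limb size: 32 bits, as on ARMv6 */

/* res[i] = a[i] + b[i] with carry propagation; returns the final carry. */
static limb_t sketch_add_n(limb_t *res, const limb_t *a, const limb_t *b,
                           size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        res[i] = (limb_t)t;          /* low 32 bits */
        carry = (limb_t)(t >> 32);   /* 0 or 1 */
    }
    return carry;
}

/* res[i] = low limb of a[i] * m plus the carry-in; returns the last high limb. */
static limb_t sketch_mul_1(limb_t *res, const limb_t *a, size_t n, limb_t m)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)a[i] * m + carry;   /* cannot overflow 64 bits */
        res[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}
```

Hand-written assembly can keep the carry in the processor flags and use the
multiply-accumulate instructions directly, which is presumably where the
speed-up over compiler-generated code comes from.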

Old vs new (Cortex-A8, 1 GHz):

Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit        1.14x     1.10x       1.13x
ECDSA 224 bit        1.11x     1.12x       1.12x
ECDSA 256 bit        1.20x     1.13x       1.14x
ECDSA 384 bit        1.13x     1.21x       1.21x
ECDSA 521 bit        1.17x     1.20x       1.22x
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit             -     1.31x       1.60x
RSA 2048 bit             -     1.41x       1.47x
RSA 3072 bit             -     1.50x       1.63x
RSA 4096 bit             -     1.50x       1.57x
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -     1.39x       1.38x
DSA 2048/224             -     1.50x       1.51x
DSA 3072/256             -     1.59x       1.64x

NEW:

Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         70ms    1750ms      3170ms
ECDSA 224 bit         90ms    2210ms      4250ms
ECDSA 256 bit        100ms    2710ms      5170ms
ECDSA 384 bit        230ms    5670ms     11040ms
ECDSA 521 bit        540ms   13370ms     25870ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         360ms    2200ms        50ms
RSA 2048 bit        2770ms   11900ms       150ms
RSA 3072 bit        6680ms   32530ms       270ms
RSA 4096 bit       10320ms   69440ms       460ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -     990ms       910ms
DSA 2048/224             -    3830ms      3410ms
DSA 3072/256             -    8270ms      7030ms

OLD:

Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         80ms    1920ms      3580ms
ECDSA 224 bit        100ms    2470ms      4760ms
ECDSA 256 bit        120ms    3050ms      5870ms
ECDSA 384 bit        260ms    6840ms     13330ms
ECDSA 521 bit        630ms   16080ms     31500ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         450ms    2890ms        80ms
RSA 2048 bit        2320ms   16760ms       220ms
RSA 3072 bit       26300ms   48650ms       440ms
RSA 4096 bit       15700ms  103910ms       720ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -    1380ms      1260ms
DSA 2048/224             -    5740ms      5140ms
DSA 3072/256             -   13130ms     11510ms

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 17 Aug 2013 10:41:46 +0000 (13:41 +0300)]
Move ARMv6 detection to configure.ac

* cipher/blowfish-armv6.S: Replace __ARM_ARCH >= 6 checks with
HAVE_ARM_ARCH_V6.
* cipher/blowfish.c: Ditto.
* cipher/camellia-armv6.S: Ditto.
* cipher/camellia.h: Ditto.
* cipher/cast5-armv6.S: Ditto.
* cipher/cast5.c: Ditto.
* cipher/rijndael-armv6.S: Ditto.
* cipher/rijndael.c: Ditto.
* configure.ac: Add HAVE_ARM_ARCH_V6 check.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 17 Aug 2013 07:09:33 +0000 (10:09 +0300)]
Add optimized wipememory for ARM

* src/g10lib.h [__arm__] (fast_wipememory2_unaligned_head)
(fast_wipememory2): New macros.
--

The previous patch, which removed the _gcry_burn_stack optimization, causes
burn_stack to take over 30% CPU usage when looping 'benchmark cipher blowfish'
on ARM/Cortex-A8. Optimizing wipememory2 for ARM helps the situation a lot.
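
The general shape of such an optimization can be sketched as follows: handle
the unaligned head byte by byte, then wipe whole words. This is a minimal
sketch in plain C, not the macros from g10lib.h. The function name
sketch_wipememory2 is hypothetical, and the word stores assume that the
buffer's storage may legally be accessed as uint32_t (for example heap memory
or a uint32_t array).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper; fills len bytes at ptr with the byte value set. */
static void sketch_wipememory2(void *ptr, unsigned char set, size_t len)
{
    volatile unsigned char *p = ptr;

    /* Unaligned head: byte stores until p is 4-byte aligned. */
    while (len > 0 && ((uintptr_t)p % sizeof(uint32_t)) != 0) {
        *p++ = set;
        len--;
    }

    /* Aligned body: one 32-bit store per four bytes.  The volatile
       qualifier keeps the compiler from dropping the "dead" stores. */
    uint32_t pattern = set * 0x01010101u;
    volatile uint32_t *w = (volatile uint32_t *)(volatile void *)p;
    while (len >= sizeof(uint32_t)) {
        *w++ = pattern;
        len -= sizeof(uint32_t);
    }

    /* Tail: the remaining zero to three bytes. */
    p = (volatile unsigned char *)w;
    while (len > 0) {
        *p++ = set;
        len--;
    }
}
```

Storing a word at a time cuts the number of store instructions to roughly a
quarter for long buffers, which is consistent with the direction of the
numbers below.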

Old vs new (Cortex-A8):
                  ECB/Stream         CBC             CFB             OFB             CTR
               --------------- --------------- --------------- --------------- ---------------
IDEA            1.20x   1.18x   1.16x   1.15x   1.16x   1.18x   1.18x   1.16x   1.16x   1.17x
3DES            1.14x   1.14x   1.12x   1.13x   1.12x   1.13x   1.12x   1.13x   1.13x   1.15x
CAST5           1.66x   1.67x   1.43x   1.00x   1.48x   1.00x   1.44x   1.44x   1.04x   0.96x
BLOWFISH        1.56x   1.66x   1.47x   1.00x   1.54x   1.05x   1.44x   1.47x   1.00x   1.00x
AES             1.52x   1.42x   1.04x   1.00x   1.00x   1.00x   1.38x   1.37x   1.00x   1.00x
AES192          1.36x   1.36x   1.00x   1.00x   1.00x   1.04x   1.26x   1.22x   1.00x   1.04x
AES256          1.32x   1.31x   1.03x   1.00x   1.00x   1.00x   1.24x   1.30x   1.03x   0.97x
TWOFISH         1.31x   1.26x   1.23x   1.00x   1.25x   1.00x   1.24x   1.23x   1.00x   1.03x
ARCFOUR         1.05x   0.96x
DES             1.31x   1.33x   1.26x   1.29x   1.28x   1.29x   1.26x   1.29x   1.27x   1.29x
TWOFISH128      1.27x   1.24x   1.23x   1.00x   1.28x   1.00x   1.21x   1.26x   0.97x   1.06x
SERPENT128      1.19x   1.19x   1.15x   1.00x   1.14x   1.00x   1.17x   1.17x   0.98x   1.00x
SERPENT192      1.19x   1.24x   1.17x   1.00x   1.14x   1.00x   1.15x   1.17x   1.00x   1.00x
SERPENT256      1.16x   1.19x   1.17x   1.00x   1.14x   1.00x   1.15x   1.15x   1.00x   1.00x
RFC2268_40      1.00x   0.99x   1.00x   1.01x   1.00x   1.00x   1.03x   1.00x   1.01x   1.00x
SEED            1.20x   1.20x   1.18x   1.17x   1.17x   1.19x   1.18x   1.16x   1.19x   1.19x
CAMELLIA128     1.38x   1.34x   1.31x   1.00x   1.31x   1.00x   1.29x   1.32x   1.00x   1.00x
CAMELLIA192     1.27x   1.27x   1.23x   1.00x   1.25x   1.03x   1.20x   1.23x   1.00x   1.00x
CAMELLIA256     1.27x   1.27x   1.26x   1.00x   1.25x   1.03x   1.20x   1.23x   1.00x   1.00x
SALSA20         1.04x   1.00x

(Note: bulk encryption/decryption do burn_stack after full buffer processing,
instead of after each block.)

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 16 Aug 2013 16:44:55 +0000 (19:44 +0300)]
cipher: bufhelp: allow unaligned memory accesses on ARM

* cipher/bufhelp.h [__arm__ && __ARM_FEATURE_UNALIGNED]: Enable
BUFHELP_FAST_UNALIGNED_ACCESS.
--

Newer ARM systems support unaligned memory accesses, and on gcc-4.7 and onwards
this is indicated by the __ARM_FEATURE_UNALIGNED macro.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 17 Aug 2013 07:48:36 +0000 (10:48 +0300)]
Remove burn_stack optimization

* src/misc.c (_gcry_burn_stack): Remove SIZEOF_UNSIGNED_LONG == 4 or 8
optimization.
--

At least GCC 4.6 on Debian Wheezy (armhf) generates wrong code for burn_stack,
causing the recursive structure to be transformed into an iterative one without
updating the stack pointer between iterations. Therefore only the first 64
bytes of stack get zeroed. This appears to be fixed in GCC 4.7, but let's play
this safe and remove this optimization.

A better approach would probably be to add architecture-specific assembly
routine(s) that replace this generic function.
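
For readers unfamiliar with the technique, a recursive stack burner has
roughly the following shape. This is a sketch under assumptions: the names
sketch_wipe and sketch_burn_stack are hypothetical, and the code is not copied
from src/misc.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for the library's wipe routine: volatile byte
   stores, so the compiler cannot remove the writes as dead code. */
static void sketch_wipe(void *ptr, size_t len)
{
    volatile unsigned char *p = ptr;
    while (len--)
        *p++ = 0;
}

/* Each call is meant to own a fresh 64-byte frame further down the stack;
   the recursion is what moves the wiped window. */
static void sketch_burn_stack(int bytes)
{
    char buf[64];

    sketch_wipe(buf, sizeof buf);
    bytes -= (int)sizeof buf;
    if (bytes > 0)
        sketch_burn_stack(bytes);
}
```

The recursive call is in tail position, so a compiler may turn it into a
loop. If the loop then reuses the same frame for buf, every iteration wipes
the same 64 bytes, which matches the failure described above.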

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 16 Aug 2013 11:40:34 +0000 (14:40 +0300)]
camellia: add ARMv6 assembly implementation

* cipher/Makefile.am: Add 'camellia-armv6.S'.
* cipher/camellia-armv6.S: New file.
* cipher/camellia-glue.c [USE_ARMV6_ASM]
(_gcry_camellia_armv6_encrypt_block)
(_gcry_camellia_armv6_decrypt_block): New prototypes.
[USE_ARMV6_ASM] (Camellia_EncryptBlock, Camellia_DecryptBlock)
(camellia_encrypt, camellia_decrypt): New functions.
* cipher/camellia.c [!USE_ARMV6_ASM]: Compile encryption and decryption
routines if USE_ARMV6_ASM macro is _not_ defined.
* cipher/camellia.h (USE_ARMV6_ASM): New macro.
[!USE_ARMV6_ASM] (Camellia_EncryptBlock, Camellia_DecryptBlock): If
USE_ARMV6_ASM is defined, disable these function prototypes.
* configure.ac (camellia) [arm]: Add 'camellia-armv6.lo'.
--

Add optimized ARMv6 assembly implementation for Camellia. Implementation is tuned
for Cortex-A8. Unaligned access handling is done in assembly part.

For now, only enable this on little-endian systems as big-endian correctness
has not been tested yet.

Old vs new. Cortex-A8 (on Debian Wheezy/armhf):
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAMELLIA128   1.44x   1.47x   1.35x   1.34x   1.43x   1.39x   1.38x   1.36x   1.38x   1.39x
CAMELLIA192   1.60x   1.62x   1.52x   1.47x   1.56x   1.54x   1.52x   1.53x   1.52x   1.53x
CAMELLIA256   1.59x   1.60x   1.49x   1.47x   1.53x   1.54x   1.51x   1.50x   1.52x   1.53x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 16 Aug 2013 09:51:52 +0000 (12:51 +0300)]
blowfish: add ARMv6 assembly implementation

* cipher/Makefile.am: Add 'blowfish-armv6.S'.
* cipher/blowfish-armv6.S: New file.
* cipher/blowfish.c (USE_ARMV6_ASM): New macro.
[USE_ARMV6_ASM] (_gcry_blowfish_armv6_do_encrypt)
(_gcry_blowfish_armv6_encrypt_block)
(_gcry_blowfish_armv6_decrypt_block, _gcry_blowfish_armv6_ctr_enc)
(_gcry_blowfish_armv6_cbc_dec, _gcry_blowfish_armv6_cfb_dec): New
prototypes.
[USE_ARMV6_ASM] (do_encrypt, do_encrypt_block, do_decrypt_block)
(encrypt_block, decrypt_block): New functions.
(_gcry_blowfish_ctr_enc) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_blowfish_cbc_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_blowfish_cfb_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
* configure.ac (blowfish) [arm]: Add 'blowfish-armv6.lo'.
--

The patch provides non-parallel implementations for a small speed-up and 2-way
parallel implementations that get accelerated on multi-issue CPUs (hand-tuned
for in-order dual-issue Cortex-A8). Unaligned access handling is done in
assembly.

For now, only enable this on little-endian systems as big-endian correctness
has not been tested yet.

Old vs new (Cortex-A8, Debian Wheezy/armhf):

             ECB/Stream         CBC             CFB             OFB             CTR
          --------------- --------------- --------------- --------------- ---------------
BLOWFISH   1.28x   1.16x   1.21x   2.16x   1.26x   1.86x   1.21x   1.25x   1.89x   1.96x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 14 Aug 2013 18:06:15 +0000 (21:06 +0300)]
cast5: add ARMv6 assembly implementation

* cipher/Makefile.am: Add 'cast5-armv6.S'.
* cipher/cast5-armv6.S: New file.
* cipher/cast5.c (USE_ARMV6_ASM): New macro.
(CAST5_context) [USE_ARMV6_ASM]: New members 'Kr_arm_enc' and
'Kr_arm_dec'.
[USE_ARMV6_ASM] (_gcry_cast5_armv6_encrypt_block)
(_gcry_cast5_armv6_decrypt_block, _gcry_cast5_armv6_ctr_enc)
(_gcry_cast5_armv6_cbc_dec, _gcry_cast5_armv6_cfb_dec): New prototypes.
[USE_ARMV6_ASM] (do_encrypt_block, do_decrypt_block, encrypt_block)
(decrypt_block): New functions.
(_gcry_cast5_ctr_enc) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_cast5_cbc_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_cast5_cfb_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(do_cast_setkey) [USE_ARMV6_ASM]: Initialize 'Kr_arm_enc' and
'Kr_arm_dec'.
* configure.ac (cast5) [arm]: Add 'cast5-armv6.lo'.
--

Provides non-parallel implementations for a small speed-up and 2-way parallel
implementations that get accelerated on multi-issue CPUs (hand-tuned for
in-order dual-issue Cortex-A8). Unaligned access handling is done in assembly.

For now, only enable this on little-endian systems as big-endian correctness
has not been tested yet.

Old vs new (Cortex-A8, Debian Wheezy/armhf):

          ECB/Stream         CBC             CFB             OFB             CTR
       --------------- --------------- --------------- --------------- ---------------
CAST5   1.15x   1.12x   1.12x   2.07x   1.14x   1.60x   1.12x   1.13x   1.62x   1.63x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 14 Aug 2013 14:10:00 +0000 (17:10 +0300)]
rijndael: add ARMv6 assembly implementation

* cipher/Makefile.am: Add 'rijndael-armv6.S'.
* cipher/rijndael-armv6.S: New file.
* cipher/rijndael.c (USE_ARMV6_ASM): New macro.
[USE_ARMV6_ASM] (_gcry_aes_armv6_encrypt_block)
(_gcry_aes_armv6_decrypt_block): New prototypes.
(do_encrypt_aligned) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(do_encrypt): Disable input/output alignment when USE_ARMV6_ASM.
(do_decrypt_aligned) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(do_decrypt): Disable input/output alignment when USE_ARMV6_ASM.
* configure.ac (HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS): New check for
gcc/as compatibility with ARM assembly implementations.
(aes) [arm]: Add 'rijndael-armv6.lo'.
--

Add optimized ARMv6 assembly implementation for AES. Implementation is tuned
for Cortex-A8. Unaligned access handling is done in assembly part.

For now, only enable this on little-endian systems as big-endian correctness
has not been tested yet.

Old vs new. Cortex-A8 (on Debian Wheezy/armhf):
          ECB/Stream         CBC             CFB             OFB             CTR
       --------------- --------------- --------------- --------------- ---------------
AES     2.61x   3.12x   2.16x   2.59x   2.26x   2.25x   2.08x   2.08x   2.23x   2.23x
AES192  2.60x   3.06x   2.18x   2.65x   2.29x   2.29x   2.12x   2.12x   2.25x   2.27x
AES256  2.62x   3.09x   2.24x   2.72x   2.30x   2.34x   2.17x   2.19x   2.32x   2.32x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
NIIBE Yutaka [Thu, 8 Aug 2013 23:26:27 +0000 (08:26 +0900)]
cipher: fix memory leak.

* cipher/pubkey.c (gcry_pk_sign): Handle the specific case of ECC,
where there is a NULL which is not the sentinel.

--

This is a kind of makeshift fix, but since the MPI array API is internal
only and will be removed, it is better not to change the API now.

Werner Koch [Thu, 8 Aug 2013 13:16:48 +0000 (15:16 +0200)]
mpi: Clear immutable flag on the result of gcry_mpi_set.

* mpi/mpiutil.c (gcry_mpi_set): Reset immutable and const flags.
* tests/mpitests.c (test_const_and_immutable): Add a test for this.
--

gcry_mpi_set shall behave like gcry_mpi_copy and thus reset those
special flags.  Problem reported by Christian Grothoff.

Signed-off-by: Werner Koch <wk@gnupg.org>
NIIBE Yutaka [Tue, 6 Aug 2013 23:56:18 +0000 (08:56 +0900)]
tests: fix memory leaks.

* tests/benchmark.c (dsa_bench): Release SIG.

* tests/mpitests.c (test_powm): Release BASE, EXP, MOD, and RES.

* tests/prime.c (check_primes): Release PRIME.

* tests/tsexp.c (basic): Use intermediate variable M for constant.
Release S1, S2 and A.

Jussi Kivilinna [Wed, 7 Aug 2013 07:36:41 +0000 (10:36 +0300)]
Fix building on W32 (cannot export symbol 'gcry_sexp_get_buffer')

* src/libgcrypt.def: Change 'gcry_sexp_get_buffer' to
'gcry_sexp_nth_buffer'.
--

Commit 2d3e8d4d9 "sexp: Add function gcry_sexp_nth_buffer." added
'gcry_sexp_get_buffer' to libgcrypt.def, when it should have been
'gcry_sexp_nth_buffer'.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
NIIBE Yutaka [Tue, 6 Aug 2013 05:38:51 +0000 (14:38 +0900)]
cipher: fix another memory leak.

* cipher/ecc.c (ecc_get_curve): Free TMP.

NIIBE Yutaka [Tue, 6 Aug 2013 03:59:35 +0000 (12:59 +0900)]
tests: fix memory leaks.

* tests/pubkey.c (check_keys_crypt): Release L, X0, and X1.
(check_keys): Release X.

NIIBE Yutaka [Tue, 6 Aug 2013 03:57:10 +0000 (12:57 +0900)]
cipher: fix memory leaks.

* cipher/elgamal.c (elg_generate_ext): Free XVALUE.

* cipher/pubkey.c (sexp_elements_extract): Don't use IDX for loop.
Call mpi_free.
(sexp_elements_extract_ecc): Call mpi_free.

Werner Koch [Mon, 5 Aug 2013 16:58:41 +0000 (18:58 +0200)]
mpi: Improve gcry_mpi_invm to detect bad input.

* mpi/mpi-inv.c (gcry_mpi_invm): Return 0 for bad input.
--

Without this patch the function may enter an endless loop.
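
The idea of rejecting inputs that have no inverse before (or instead of)
looping can be illustrated with machine integers. This is a minimal sketch
and not the MPI code: sketch_invm is a hypothetical name, and treating
"bad input" as a zero value, a modulus of at most 1, or gcd(a, n) != 1 is an
assumption of this example.

```c
#include <stdint.h>

/* Stores the inverse of a modulo n in *x and returns 1, or returns 0
   (leaving *x untouched) when no inverse exists. */
static int sketch_invm(int64_t *x, int64_t a, int64_t n)
{
    if (n <= 1)
        return 0;            /* no useful modulus */
    a %= n;
    if (a < 0)
        a += n;
    if (a == 0)
        return 0;            /* zero has no inverse */

    /* Extended Euclid, tracking only the coefficient of a. */
    int64_t r0 = n, r1 = a;
    int64_t t0 = 0, t1 = 1;
    while (r1 != 0) {
        int64_t q = r0 / r1;
        int64_t r2 = r0 - q * r1;
        int64_t t2 = t0 - q * t1;
        r0 = r1; r1 = r2;
        t0 = t1; t1 = t2;
    }
    if (r0 != 1)
        return 0;            /* gcd(a, n) != 1 */
    if (t0 < 0)
        t0 += n;
    *x = t0;
    return 1;
}
```

A caller can then test the return value instead of trusting the output, for
example `if (!sketch_invm(&x, a, n)) { /* handle error */ }`.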

Signed-off-by: Werner Koch <wk@gnupg.org>
Dmitry Eremin-Solenikov [Wed, 31 Jul 2013 13:20:58 +0000 (17:20 +0400)]
Correct checks for ecc secret key

* cipher/ecc.c (check_secret_key): replace wrong comparison of Q and
sk->Q points with correct one.

--
Currently check_secret_key compares pointers to coordinates of Q
(calculated) and sk->Q (provided) points. Instead it should convert them
to affine representations and use mpi_cmp to compare coordinates.

This has an implication that keys that were (erroneously) verified as
valid could now become invalid.
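
Why a value comparison in affine coordinates is needed can be shown with a
toy example. The sketch below assumes Jacobian coordinates (x = X/Z^2,
y = Y/Z^3) over a tiny prime field. All names (SKETCH_P, sketch_point,
sketch_points_equal) are hypothetical, and the points are not checked for
curve membership.

```c
#include <stdint.h>

#define SKETCH_P 23u   /* toy prime modulus, for illustration only */

typedef struct { uint32_t x, y, z; } sketch_point;   /* Jacobian (X, Y, Z) */

/* b^e mod SKETCH_P by square-and-multiply; operands stay below 23. */
static uint32_t sketch_powm(uint32_t b, uint32_t e)
{
    uint32_t r = 1;
    b %= SKETCH_P;
    while (e) {
        if (e & 1)
            r = (r * b) % SKETCH_P;
        b = (b * b) % SKETCH_P;
        e >>= 1;
    }
    return r;
}

/* Affine x = X / Z^2 and y = Y / Z^3; returns 0 for the point at infinity. */
static int sketch_get_affine(uint32_t *ax, uint32_t *ay, const sketch_point *pt)
{
    if (pt->z % SKETCH_P == 0)
        return 0;
    uint32_t zinv = sketch_powm(pt->z, SKETCH_P - 2);   /* Fermat inverse */
    uint32_t zinv2 = (zinv * zinv) % SKETCH_P;
    *ax = (pt->x % SKETCH_P) * zinv2 % SKETCH_P;
    *ay = (pt->y % SKETCH_P) * zinv2 % SKETCH_P * zinv % SKETCH_P;
    return 1;
}

/* Compare coordinate values, not the addresses where they are stored. */
static int sketch_points_equal(const sketch_point *a, const sketch_point *b)
{
    uint32_t ax, ay, bx, by;
    if (!sketch_get_affine(&ax, &ay, a) || !sketch_get_affine(&bx, &by, b))
        return 0;
    return ax == bx && ay == by;
}
```

In this toy field (3, 10, 1) and (12, 11, 2) are two representations of the
same affine point, so comparing either the stored values or their addresses
directly would give the wrong answer.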

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Werner Koch [Mon, 29 Jul 2013 13:16:02 +0000 (15:16 +0200)]
sexp: Allow white space anywhere in a hex format.

* src/sexp.c (hextobyte): Remove.
(hextonibble): New.
(vsexp_sscan): Skip whitespace between hex nibbles.
--

Before that patch a string
  "(a #123"
  "    456#")
was not correctly parsed because white space was only allowed between
two hex digits but not in between nibbles.
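
The nibble-wise approach can be sketched in a few lines of C. This is an
illustration only and not the scanner from src/sexp.c: sketch_hextonibble and
sketch_hex_decode are hypothetical names, and rejecting an odd number of
digits is an assumption of this example.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int sketch_hextonibble(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes hex text into out, skipping white space anywhere, including
   between the two nibbles of a byte.  Returns the number of bytes written,
   or -1 on an invalid character, an odd digit count, or a full buffer. */
static long sketch_hex_decode(const char *s, unsigned char *out, size_t outlen)
{
    size_t n = 0;
    int have_high = 0;
    int high = 0;

    for (; *s; s++) {
        if (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r')
            continue;
        int v = sketch_hextonibble((unsigned char)*s);
        if (v < 0)
            return -1;
        if (!have_high) {
            high = v;               /* remember the first nibble */
            have_high = 1;
        } else {
            if (n >= outlen)
                return -1;
            out[n++] = (unsigned char)((high << 4) | v);
            have_high = 0;
        }
    }
    return have_high ? -1 : (long)n;
}
```

With this structure the input "123\n    456" decodes to the bytes 12 34 56,
even though the line break falls between the two nibbles of the second byte.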

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 29 Jul 2013 13:09:33 +0000 (15:09 +0200)]
Implement deterministic ECDSA as specified by rfc-6979.

* cipher/ecc.c (sign): Add args FLAGS and HASHALGO.  Convert an opaque
MPI as INPUT.  Implement rfc-6979.
(ecc_sign): Remove the opaque MPI code and pass FLAGS to sign.
(verify): Do not allocate and compute Y; it is not used.
(ecc_verify): Truncate the hash value if needed.
* tests/dsa-rfc6979.c (check_dsa_rfc6979): Add ECDSA test cases.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Fri, 26 Jul 2013 18:15:53 +0000 (20:15 +0200)]
Implement deterministic DSA as specified by rfc-6979.

* cipher/dsa.c (dsa_sign): Move opaque mpi extraction to sign.
(sign): Add args FLAGS and HASHALGO.  Implement deterministic DSA.
Add code path for R==0 to comply with the standard.
(dsa_verify): Left fill opaque mpi based hash values.
* cipher/dsa-common.c (int2octets, bits2octets): New.
(_gcry_dsa_gen_rfc6979_k): New.
* tests/dsa-rfc6979.c: New.
* tests/Makefile.am (TESTS): Add dsa-rfc6979.
--

This patch also fixes a recent patch (37d0a1e) which allows passing
the hash in a (hash) element.

Support for deterministic ECDSA will come soon.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Fri, 26 Jul 2013 17:22:36 +0000 (19:22 +0200)]
Allow the use of a private-key s-expression with gcry_pk_verify.

* cipher/pubkey.c (sexp_to_key): Fallback to private key.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 25 Jul 2013 09:17:52 +0000 (11:17 +0200)]
Mitigate a flush+reload cache attack on RSA secret exponents.

* mpi/mpi-pow.c (gcry_mpi_powm): Always perform the mpi_mul for
exponents in secure memory.
--

The attack is published as http://eprint.iacr.org/2013/448 :

Flush+Reload: a High Resolution, Low Noise, L3 Cache Side-Channel
Attack by Yuval Yarom and Katrina Falkner. 18 July 2013.

  Flush+Reload is a cache side-channel attack that monitors access to
  data in shared pages. In this paper we demonstrate how to use the
  attack to extract private encryption keys from GnuPG.  The high
  resolution and low noise of the Flush+Reload attack enables a spy
  program to recover over 98% of the bits of the private key in a
  single decryption or signing round. Unlike previous attacks, the
  attack targets the last level L3 cache. Consequently, the spy
  program and the victim do not need to share the execution core of
  the CPU. The attack is not limited to a traditional OS and can be
  used in a virtualised environment, where it can attack programs
  executing in a different VM.
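
The general countermeasure of multiplying on every exponent bit, whether the
bit is set or not, can be sketched with machine words. This is not the code
from mpi-pow.c: sketch_powm_always_mul is a hypothetical name, the modulus is
limited to 32 bits so that products fit in 64 bits, and the sketch makes no
claim of being fully constant-time.

```c
#include <stdint.h>

/* base^exp mod m using square-and-always-multiply. */
static uint64_t sketch_powm_always_mul(uint64_t base, uint64_t exp, uint32_t m)
{
    if (m <= 1)
        return 0;

    uint64_t result = 1;
    base %= m;

    for (int bit = 63; bit >= 0; bit--) {
        result = (result * result) % m;

        /* The multiplication is executed for every bit ... */
        uint64_t product = (result * base) % m;

        /* ... and only the selection depends on the exponent bit.
           mask is all ones when the bit is set, otherwise zero. */
        uint64_t mask = (uint64_t)0 - ((exp >> bit) & 1);
        result = (product & mask) | (result & ~mask);
    }
    return result;
}
```

Because the multiply routine is entered for each bit, watching whether its
code is present in the cache no longer distinguishes 1 bits from 0 bits in
the way the attack above relies on.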

(cherry picked from commit 55237c8f6920c6629debd23db65e90b42a3767de)

Werner Koch [Fri, 19 Jul 2013 16:14:38 +0000 (18:14 +0200)]
pk: Allow the use of a hash element for DSA sign and verify.

* cipher/pubkey.c (pubkey_sign): Add arg ctx and pass it to the sign
module.
(gcry_pk_sign): Pass CTX to pubkey_sign.
(sexp_data_to_mpi): Add flag rfc6979 and code to allow hash with *DSA.
* cipher/rsa.c (rsa_sign, rsa_verify): Return an error if an opaque
MPI is given for DATA/HASH.
* cipher/elgamal.c (elg_sign, elg_verify): Ditto.
* cipher/dsa.c (dsa_sign, dsa_verify): Convert a given opaque MPI.
* cipher/ecc.c (ecc_sign, ecc_verify): Ditto.
* tests/basic.c (check_pubkey_sign_ecdsa): Add a test for using a hash
element with DSA.
--

This patch allows the use of

  (data (flags raw)
    (hash sha256 #80112233445566778899AABBCCDDEEFF
                  000102030405060708090A0B0C0D0E0F#))

in addition to the old but more efficient

  (data (flags raw)
    (value #80112233445566778899AABBCCDDEEFF
            000102030405060708090A0B0C0D0E0F#))

for DSA and ECDSA.  With the hash element the flag "raw" must be
explicitly given because existing regression test code expects that
a conflict error is returned if no flags but a hash element is given.

Note that the hash algorithm name is currently not checked.  It may
eventually be used to cross-check the length of the provided hash
value.  It is suggested that the correct hash name is given - even if
a truncated hash value is used.

Finally this patch adds a way to pass the hash algorithm and flag
values to the signing module.  "rfc6979" has been implemented as a new
but not yet used flag.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Fri, 19 Jul 2013 13:54:03 +0000 (15:54 +0200)]
sexp: Add function gcry_sexp_nth_buffer.

* src/sexp.c (gcry_sexp_nth_buffer): New.
* src/visibility.c, src/visibility.h: Add function wrapper.
* src/libgcrypt.vers, src/libgcrypt.def: Add to API.
* src/gcrypt.h.in: Add prototype.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 18 Jul 2013 19:37:35 +0000 (21:37 +0200)]
Update AUTHORS with info on Salsa20.

--

Werner Koch [Thu, 18 Jul 2013 19:32:05 +0000 (21:32 +0200)]
Add support for Salsa20.

* src/gcrypt.h.in (GCRY_CIPHER_SALSA20): New.
* cipher/salsa20.c: New.
* configure.ac (available_ciphers): Add Salsa20.
* cipher/cipher.c: Register Salsa20.
(cipher_setiv): Allow to divert an IV to a cipher module.
* src/cipher-proto.h (cipher_setiv_func_t): New.
(cipher_extra_spec): Add field setiv.
* src/cipher.h: Declare Salsa20 definitions.
* tests/basic.c (check_stream_cipher): New.
(check_stream_cipher_large_block): New.
(check_cipher_modes): Run new test functions.
(check_ciphers): Add simple test for Salsa20.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Wed, 17 Jul 2013 14:55:37 +0000 (16:55 +0200)]
Typo fix in comment.

--

Werner Koch [Wed, 17 Jul 2013 14:55:02 +0000 (16:55 +0200)]
Allow gcry_mpi_dump to print opaque MPIs.

* mpi/mpicoder.c (gcry_mpi_dump): Detect and print opaque MPIs.
* tests/mpitests.c (test_opaque): New.
(main): Call new test.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Wed, 17 Jul 2013 13:54:32 +0000 (15:54 +0200)]
cipher: Prepare to pass extra info to the sign functions.

* src/gcrypt-module.h (gcry_pk_sign_t): Add parms flags and hashalgo.
* cipher/rsa.c (rsa_sign): Add parms and mark them as unused.
* cipher/dsa.c (dsa_sign): Ditto.
* cipher/elgamal.c (elg_sign): Ditto.
* cipher/pubkey.c (dummy_sign): Ditto.
(pubkey_sign): Pass 0 for the new args.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Wed, 17 Jul 2013 08:18:39 +0000 (10:18 +0200)]
Fix a special case bug in mpi_powm for e==0.

* mpi/mpi-pow.c (gcry_mpi_powm): For a zero exponent, make sure that
the result has been allocated.
--

This code triggered the problem:

    modulus = gcry_mpi_set_ui(NULL, 100);
    generator = gcry_mpi_set_ui(NULL, 3);
    exponent = gcry_mpi_set_ui(NULL, 0);
    result = gcry_mpi_new(0);
    gcry_mpi_powm(result, generator, exponent, modulus);

gcry_mpi_new(0) does not allocate the limb space thus it is not
possible to write even into the first limb.  Workaround was to use
gcry_mpi_new (1) but a real fix is better.

Reported-by: Ian Goldberg
Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 15 Jul 2013 07:46:38 +0000 (09:46 +0200)]
Register DCO for Dmitry Kasatkin.

--

Dmitry Eremin-Solenikov [Sat, 13 Jul 2013 14:50:05 +0000 (18:50 +0400)]
Fix memory leak in t-mpi-point test

* tests/t-mpi-point.c (basic_ec_math, basic_ec_math_simplified): add
calls to gcry_ctx_release() to free contexts after they become unused.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Jussi Kivilinna [Wed, 26 Jun 2013 12:28:49 +0000 (15:28 +0300)]
Fix 'Please include winsock2.h before windows.h' warnings with mingw32

* random/rndw32.c: include winsock2.h before windows.h.
* src/ath.h [_WIN32]: Ditto.
* tests/benchmark.c [_WIN32]: Ditto.
--

Patch silences warnings of following type:
/usr/lib/gcc/i686-w64-mingw32/4.6/../../../../i686-w64-mingw32/include/winsock2.h:15:2: warning: #warning Please include winsock2.h before windows.h [-Wcpp]

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 26 Jun 2013 13:57:00 +0000 (16:57 +0300)]
Remove duplicate header from mpi/amd64/mpih-mul2.S

* mpi/amd64/mpih-mul2.S: remove duplicated header.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Thu, 27 Jun 2013 11:40:12 +0000 (14:40 +0300)]
Fix i386/amd64 inline assembly "cc" clobbers

* cipher/bithelp.h [__GNUC__, __i386__] (rol, ror): add "cc" clobber
for inline assembly.
* cipher/cast5.c [__GNUC__, __i386__] (rol): Ditto.
* random/rndhw.c [USE_DRNG] (rdrand_long): Ditto.
* src/hmac256.c [__GNUC__, __i386__] (ror): Ditto.
* mpi/longlong.h [__i386__] (add_ssaaaa, sub_ddmmss, umul_ppmm)
(udiv_qrnnd, count_leading_zeros, count_trailing_zeros): Ditto.
--

These assembly snippets modify the condition flags but do not declare a "cc"
clobber.
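
For illustration, a rotate helper with the clobber declared might look like
the sketch below. The name sketch_rol is hypothetical, and the portable
fallback branch is an addition of this example so that the code also builds
on non-x86 targets.

```c
#include <stdint.h>

/* Rotate x left by n bits (0 <= n <= 31). */
static inline uint32_t sketch_rol(uint32_t x, int n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* ROL updates the carry and overflow flags, so the flags register is
       listed as clobbered ("cc") in the fourth operand section. */
    __asm__("roll %%cl, %0"
            : "=r"(x)
            : "0"(x), "c"(n)
            : "cc");
    return x;
#else
    /* Portable fallback, assumed here for non-x86 builds. */
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
#endif
}
```

Listing "cc" tells the compiler that it must not rely on a condition computed
before the asm statement still being held in the flags afterwards.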

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 3 Jul 2013 09:14:56 +0000 (12:14 +0300)]
bufhelp: Suppress 'cast increases required alignment' warning

* cipher/bufhelp.h (buf_xor, buf_xor_2dst, buf_xor_n_copy): Cast
to larger element pointer through (void *) to suppress -Wcast-align.
--

Patch disables bogus warnings caused by -Wcast-align. We know that byte
pointers are properly aligned at these phases, or that the hardware can handle
unaligned accesses.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 3 Jul 2013 08:32:25 +0000 (11:32 +0300)]
mpi: Add __ARM_ARCH for older GCC

* mpi/longlong.h [__arm__]: Construct __ARM_ARCH if not provided by
compiler.
--

GCC 4.8 defines __ARM_ARCH, which provides a forward-compatible way to detect
the ARM architecture. Use this when available and construct it otherwise.
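
Such a construction can be sketched with the per-architecture macros that
older GCC versions predefine (__ARM_ARCH_6__, __ARM_ARCH_7A__ and so on). The
name SKETCH_ARM_ARCH is hypothetical and is used here so that the compiler's
own __ARM_ARCH is never redefined. The list of variants is not guaranteed to
be complete.

```c
#if defined(__ARM_ARCH)
# define SKETCH_ARM_ARCH __ARM_ARCH      /* newer compilers provide it */
#elif defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
   || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
   || defined(__ARM_ARCH_7EM__)
# define SKETCH_ARM_ARCH 7
#elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
   || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) \
   || defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6T2__) \
   || defined(__ARM_ARCH_6M__)
# define SKETCH_ARM_ARCH 6
#elif defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5T__) \
   || defined(__ARM_ARCH_5E__) || defined(__ARM_ARCH_5TE__) \
   || defined(__ARM_ARCH_5TEJ__)
# define SKETCH_ARM_ARCH 5
#elif defined(__ARM_ARCH_4__) || defined(__ARM_ARCH_4T__)
# define SKETCH_ARM_ARCH 4
#else
# define SKETCH_ARM_ARCH 0               /* not ARM, or unknown variant */
#endif

/* Small accessor so the result can be inspected at run time. */
static int sketch_arm_arch(void)
{
    return SKETCH_ARM_ARCH;
}
```

Later code can then use a single numeric test such as
`#if SKETCH_ARM_ARCH >= 5` instead of enumerating every variant macro.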

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 3 Jul 2013 12:10:11 +0000 (15:10 +0300)]
mpi: add missing "cc" clobber for ARM assembly

* mpi/longlong.h [__arm__] (add_ssaaaa, sub_ddmmss): Add __CLOBBER_CC.
[__arm__][__ARM_ARCH <= 3] (umul_ppmm): Ditto.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 3 Jul 2013 08:14:56 +0000 (11:14 +0300)]
Tweak ARM inline assembly for mpi

* mpi/longlong.h [__arm__]: Enable inline assembly if __thumb2__ is
defined.
[__arm__]: Use __ARM_ARCH when defined.
[__arm__] [__ARM_ARCH >= 5] (count_leading_zeros): New.
--

Current ARM Linux distributions use EABI that enables thumb2, and therefore
inline assembly is disabled (because of the !defined(__thumb__) selector).
However, thumb2 allows the use of the assembly instructions that longlong.h
contains for ARM. So this patch enables inline assembly for ARM when
__thumb2__ is defined in addition to __thumb__.

Patch also adds optimization for count_leading_zeros() macro for ARM.
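
A leading-zero count built on the ARM clz instruction, with fallbacks for
other targets, could look like the following sketch. The name sketch_clz32 is
hypothetical, the preprocessor guards are assumptions of this example rather
than the ones used in longlong.h, and the builtin branch assumes a 32-bit
unsigned int.

```c
#include <stdint.h>

/* Number of leading zero bits in x; returns 32 for x == 0. */
static inline unsigned sketch_clz32(uint32_t x)
{
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_ARCH) \
    && __ARM_ARCH >= 5 && (!defined(__thumb__) || defined(__thumb2__))
    /* ARMv5 and later: a single instruction, not available in Thumb-1. */
    uint32_t n;
    __asm__("clz %0, %1" : "=r"(n) : "r"(x));
    return (unsigned)n;
#elif defined(__GNUC__)
    /* __builtin_clz is undefined for 0, so handle that case explicitly. */
    return x ? (unsigned)__builtin_clz(x) : 32u;
#else
    /* Portable fallback: shift until the top bit is set. */
    unsigned n = 0;
    if (x == 0)
        return 32u;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
#endif
}
```

Multi-precision division normalizes the divisor by its leading-zero count,
which is one plausible reason this macro shows up in the timings below.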

Results on Cortex-A8, 1 GHz:
===

Before:

Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         750ms    2780ms       110ms
RSA 2048 bit       14280ms   17250ms       300ms
RSA 3072 bit       38630ms   51300ms       650ms
RSA 4096 bit       60940ms  111430ms      1000ms
jussi@cubie:~/libgcrypt$ tests/benchmark dsa
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -    1410ms      1680ms
DSA 2048/224             -    6100ms      7390ms
DSA 3072/256             -   14350ms     17120ms
jussi@cubie:~/libgcrypt$ tests/benchmark ecc
Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         90ms    2160ms      3940ms
ECDSA 224 bit        110ms    2810ms      5400ms
ECDSA 256 bit        150ms    3570ms      6970ms
ECDSA 384 bit        340ms    8320ms     16420ms
ECDSA 521 bit        850ms   19760ms     38480ms

After:

jussi@cubie:~/libgcrypt$ tests/benchmark rsa
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         590ms    2230ms        80ms
RSA 2048 bit        2320ms   13090ms       240ms
RSA 3072 bit       60580ms   38420ms       460ms
RSA 4096 bit      115130ms   82250ms       750ms
jussi@cubie:~/libgcrypt$ tests/benchmark dsa
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -    1070ms      1290ms
DSA 2048/224             -    4500ms      5550ms
DSA 3072/256             -   10280ms     12200ms
jussi@cubie:~/libgcrypt$ tests/benchmark ecc
Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         70ms    1900ms      3560ms
ECDSA 224 bit        100ms    2490ms      4750ms
ECDSA 256 bit        120ms    3140ms      5920ms
ECDSA 384 bit        270ms    6990ms     13790ms
ECDSA 521 bit        680ms   17080ms     33490ms

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Wed, 26 Jun 2013 09:09:42 +0000 (11:09 +0200)]
Make gpg-error replacement defines more robust.

* configure.ac (AH_BOTTOM): Move GPG_ERR_ replacement defines to ...
* src/gcrypt-int.h: new file.
* src/visibility.h, src/cipher.h: Replace gcrypt.h by gcrypt-int.h.
* tests/: Ditto for all test files.
--

Defining newer gpg-error codes in config.h was not a good idea,
because config.h is usually included before gpg-error.h, so the
defines would also be applied inside gpg-error.h and lead to faulty
code there like

  typedef enum
    {
      [...]
      191 = 191,
      [...]
    };
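
The failure mode above can be sketched with stand-in names.  Everything
prefixed DEMO_ or demo_ below is hypothetical and not a real gpg-error
identifier; the version check is likewise an assumption about how such a
fallback might be guarded.  The point is only that the replacement define
appears after the enum, so it can no longer rewrite an enumerator name.

```c
/* Stand-in for an older gpg-error.h that lacks the newer code.
   All DEMO_* names are hypothetical. */
#define DEMO_VERSION_NUMBER 0x010b00   /* pretend this is version 1.11 */

typedef enum
  {
    DEMO_ERR_NO_ERROR = 0,
    DEMO_ERR_GENERAL = 1
  } demo_err_code_t;

/* Stand-in for the replacement defines.  Because this comes after the
   enum, the macro cannot turn an enumerator into "191 = 191". */
#if DEMO_VERSION_NUMBER < 0x010c00
# define DEMO_ERR_NO_CRYPT_CTX 191
#endif

static int
demo_no_crypt_ctx (void)
{
  return DEMO_ERR_NO_CRYPT_CTX;
}
```

Had the define been seen before an enum that already contained
DEMO_ERR_NO_CRYPT_CTX, the preprocessor would have replaced the enumerator
name with the number, which is the broken construct quoted above.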

6 years agoCheck if assembler is compatible with AMD64 assembly implementations cipher-amd64-optimizations
Jussi Kivilinna [Thu, 20 Jun 2013 11:20:36 +0000 (14:20 +0300)]
Check if assembler is compatible with AMD64 assembly implementations

* cipher/blowfish-amd64.S: Enable only if
HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS is defined.
* cipher/camellia-aesni-avx-amd64.S: Ditto.
* cipher/camellia-aesni-avx2-amd64.S: Ditto.
* cipher/cast5-amd64.S: Ditto.
* cipher/rijndael-amd64.S: Ditto.
* cipher/serpent-avx2-amd64.S: Ditto.
* cipher/serpent-sse2-amd64.S: Ditto.
* cipher/twofish-amd64.S: Ditto.
* cipher/blowfish.c: Use AMD64 assembly implementation only if
HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS is defined.
* cipher/camellia-glue.c: Ditto.
* cipher/cast5.c: Ditto.
* cipher/rijndael.c: Ditto.
* cipher/serpent.c: Ditto.
* cipher/twofish.c: Ditto.
* configure.ac: Check gcc/as compatibility with AMD64 assembly
implementations.
--

Later these checks can be split and the assembly implementations adapted to
handle different platforms, but for now disable the AMD64 assembly
implementations if the assembler does not appear able to handle them.
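
A minimal sketch of the gating described above.  The macro
HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS and the name USE_AMD64_ASM come from
this log; the extra __x86_64__ test and the demo_selected_impl helper are
assumptions made for illustration only.

```c
/* HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS is the configure result named in
   the log; everything prefixed demo_ is hypothetical. */
#undef USE_AMD64_ASM
#if defined(__x86_64__) && defined(HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS)
# define USE_AMD64_ASM 1
#endif

static const char *
demo_selected_impl (void)
{
#ifdef USE_AMD64_ASM
  return "amd64-asm";   /* the .S file would be assembled and called */
#else
  return "generic-c";   /* portable C code path */
#endif
}
```

When configure does not define the macro, the assembly path is compiled
out and the generic C code remains in use.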

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoOptimize _gcry_burn_stack for 32-bit and 64-bit architectures
Jussi Kivilinna [Sun, 9 Jun 2013 13:37:38 +0000 (16:37 +0300)]
Optimize _gcry_burn_stack for 32-bit and 64-bit architectures

* src/misc.c (_gcry_burn_stack): Add optimization for 32-bit and 64-bit
architectures.
--

Busy looping 'tests/benchmark --cipher-repetitions 10 cipher blowfish' on ARM
Cortex-A8 shows that _gcry_burn_stack takes 21% of CPU time. With this patch,
that number drops to 3.4%.

On AMD64 (Intel i5-4570) CPU usage for _gcry_burn_stack in the same test drops
from 3.5% to 1.1%.
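
The log does not say what the optimization is.  One plausible reading is
that the scratch buffer is cleared a machine word at a time instead of
byte by byte; the sketch below assumes exactly that.  demo_burn_stack is a
hypothetical helper, not the libgcrypt function, and it returns the number
of stack frames it used only so that the behaviour can be observed.

```c
#include <stddef.h>

/* Overwrite roughly BYTES bytes of stack below the caller by recursing
   through 64-byte frames.  The buffer is an array of words, so each store
   clears sizeof (unsigned long) bytes; volatile keeps the compiler from
   discarding the stores as dead. */
static int
demo_burn_stack (int bytes)
{
  volatile unsigned long buf[64 / sizeof (unsigned long)];
  size_t i;

  for (i = 0; i < sizeof buf / sizeof buf[0]; i++)
    buf[i] = 0;

  bytes -= (int) sizeof buf;
  if (bytes > 0)
    return 1 + demo_burn_stack (bytes);
  return 1;
}
```

With 8-byte words, a 64-byte frame needs 8 stores rather than 64, which is
the kind of saving that would matter for a function called after every
cipher operation.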

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoAdd Camellia AES-NI/AVX2 implementation
Jussi Kivilinna [Sun, 9 Jun 2013 13:37:38 +0000 (16:37 +0300)]
Add Camellia AES-NI/AVX2 implementation

* cipher/Makefile.am: Add 'camellia-aesni-avx2-amd64.S'.
* cipher/camellia-aesni-avx2-amd64.S: New file.
* cipher/camellia-glue.c (USE_AESNI_AVX2): New macro.
(CAMELLIA_context) [USE_AESNI_AVX2]: Add 'use_aesni_avx2'.
[USE_AESNI_AVX2] (_gcry_camellia_aesni_avx2_ctr_enc)
(_gcry_camellia_aesni_avx2_cbc_dec)
(_gcry_camellia_aesni_avx2_cfb_dec): New prototypes.
(camellia_setkey) [USE_AESNI_AVX2]: Check AVX2+AES-NI capable hardware
and set 'ctx->use_aesni_avx2'.
(_gcry_camellia_ctr_enc) [USE_AESNI_AVX2]: Add AVX2 accelerated code.
(_gcry_camellia_cbc_dec) [USE_AESNI_AVX2]: Add AVX2 accelerated code.
(_gcry_camellia_cfb_dec) [USE_AESNI_AVX2]: Add AVX2 accelerated code.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Grow 'nblocks'
so that AVX2 codepaths get tested.
* configure.ac (camellia) [avx2support, aesnisupport]: Add
'camellia-aesni-avx2-amd64.lo'.
--

Add new AVX2/AES-NI implementation of Camellia that processes 32 blocks in
parallel.
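
The way a bulk function can split its input between the wide code path and
a per-block fallback might look as follows.  This is a sketch under stated
assumptions: struct demo_ctx, its counters, and demo_bulk are hypothetical,
and a real implementation would call the assembly routine and the generic
block function where this code merely counts.

```c
#include <stddef.h>

#define DEMO_PARALLEL_BLOCKS 32   /* width quoted in the log above */

/* Hypothetical context; use_wide plays the role of a flag that is set at
   key setup when the CPU supports the wide implementation. */
struct demo_ctx
{
  int use_wide;
  size_t wide_calls;     /* number of 32-block batches processed */
  size_t single_calls;   /* number of blocks handled one at a time */
};

static void
demo_bulk (struct demo_ctx *ctx, size_t nblocks)
{
  if (ctx->use_wide)
    {
      /* Full batches go through the wide code path. */
      while (nblocks >= DEMO_PARALLEL_BLOCKS)
        {
          ctx->wide_calls++;
          nblocks -= DEMO_PARALLEL_BLOCKS;
        }
    }

  /* Whatever is left is processed block by block. */
  while (nblocks > 0)
    {
      ctx->single_calls++;
      nblocks--;
    }
}
```

A buffer shorter than 32 blocks never reaches the wide path in this
scheme, which is consistent with the selftests growing 'nblocks' so that
the AVX2 code is exercised.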

Speed old (AVX/AES-NI) vs. new (AVX2/AES-NI) on Intel Core i5-4570:
                 ECB/Stream         CBC             CFB             OFB             CTR
              --------------- --------------- --------------- --------------- ---------------
CAMELLIA128    1.00x   0.99x   1.00x   1.53x   1.00x   1.49x   1.00x   1.00x   1.54x   1.54x
CAMELLIA256    0.99x   1.00x   1.00x   1.50x   1.00x   1.50x   1.00x   1.00x   1.54x   1.52x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoAdd Serpent AVX2 implementation
Jussi Kivilinna [Sun, 9 Jun 2013 13:37:38 +0000 (16:37 +0300)]
Add Serpent AVX2 implementation

* cipher/Makefile.am: Add 'serpent-avx2-amd64.S'.
* cipher/serpent-avx2-amd64.S: New file.
* cipher/serpent.c (USE_AVX2): New macro.
(serpent_context_t) [USE_AVX2]: Add 'use_avx2'.
[USE_AVX2] (_gcry_serpent_avx2_ctr_enc, _gcry_serpent_avx2_cbc_dec)
(_gcry_serpent_avx2_cfb_dec): New prototypes.
(serpent_setkey_internal) [USE_AVX2]: Check for AVX2 capable hardware
and set 'use_avx2'.
(_gcry_serpent_ctr_enc) [USE_AVX2]: Use AVX2 accelerated functions.
(_gcry_serpent_cbc_dec) [USE_AVX2]: Use AVX2 accelerated functions.
(_gcry_serpent_cfb_dec) [USE_AVX2]: Use AVX2 accelerated functions.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Grow 'nblocks'
so that AVX2 codepaths are tested.
* configure.ac (serpent) [avx2support]: Add 'serpent-avx2-amd64.lo'.
--

Add new AVX2 implementation of Serpent that processes 16 blocks in parallel.

Speed old (SSE2) vs. new (AVX2) on Intel Core i5-4570:
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128    1.00x   1.00x   1.00x   2.10x   1.00x   2.16x   1.01x   1.00x   2.16x   2.18x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoAdd detection for Intel AVX2 instruction set
Jussi Kivilinna [Sun, 9 Jun 2013 13:37:38 +0000 (16:37 +0300)]
Add detection for Intel AVX2 instruction set

* configure.ac: Add option --disable-avx2-support.
(HAVE_GCC_INLINE_ASM_AVX2): New.
(ENABLE_AVX2_SUPPORT): New.
* src/g10lib.h (HWF_INTEL_AVX2): New.
* src/global.c (hwflist): Add HWF_INTEL_AVX2.
* src/hwf-x86.c [__i386__] (get_cpuid): Initialize registers to zero
before cpuid.
[__x86_64__] (get_cpuid): Initialize registers to zero before cpuid.
(detect_x86_gnuc): Store maximum cpuid level.
(detect_x86_gnuc) [ENABLE_AVX2_SUPPORT]: Add detection for AVX2.
--
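
The reason for storing the maximum cpuid level can be sketched as below.
AVX2 is reported in bit 5 of EBX for cpuid leaf 7, sub-leaf 0, and that
leaf may only be queried if the maximum basic leaf is at least 7.
demo_has_avx2 is a hypothetical pure helper that takes the register values
as arguments, so no inline assembly is needed here; it is not the code in
src/hwf-x86.c.

```c
#include <stdint.h>

/* MAX_LEVEL is EAX from cpuid leaf 0; EBX_LEAF7 is EBX from leaf 7,
   sub-leaf 0.  Returns 1 if the AVX2 feature bit is set. */
static int
demo_has_avx2 (uint32_t max_level, uint32_t ebx_leaf7)
{
  if (max_level < 7)
    return 0;                           /* leaf 7 is not available */
  return (ebx_leaf7 & (1u << 5)) != 0;  /* bit 5 of EBX reports AVX2 */
}
```

A complete check would also confirm that the operating system saves the
YMM register state; that part is left out of this sketch.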

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agotwofish: add amd64 assembly implementation
Jussi Kivilinna [Sun, 9 Jun 2013 13:37:38 +0000 (16:37 +0300)]
twofish: add amd64 assembly implementation

* cipher/Makefile.am: Add 'twofish-amd64.S'.
* cipher/twofish-amd64.S: New file.
* cipher/twofish.c (USE_AMD64_ASM): New macro.
[USE_AMD64_ASM] (_gcry_twofish_amd64_encrypt_block)
(_gcry_twofish_amd64_decrypt_block, _gcry_twofish_amd64_ctr_enc)
(_gcry_twofish_amd64_cbc_dec, _gcry_twofish_amd64_cfb_dec): New
prototypes.
[USE_AMD64_ASM] (do_twofish_encrypt, do_twofish_decrypt)
(twofish_encrypt, twofish_decrypt): New functions.
(_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec, _gcry_twofish_cfb_dec)
(selftest_ctr, selftest_cbc, selftest_cfb): New functions.
(selftest): Call new bulk selftests.
* cipher/cipher.c (gcry_cipher_open) [USE_TWOFISH]: Register Twofish
bulk functions for ctr-enc, cbc-dec and cfb-dec.
* configure.ac (twofish) [x86_64]: Add 'twofish-amd64.lo'.
* src/cipher.h (_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec)
(_gcry_twofish_cfb_dec): New prototypes.
--

Provides non-parallel implementations for small speed-up and 3-way parallel
implementations that get accelerated on `out-of-order' CPUs.

Speed old vs. new on Intel Core i5-4570:
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
TWOFISH128     1.08x  1.07x    1.10x  1.80x    1.09x  1.70x    1.08x  1.08x    1.70x  1.69x

Speed old vs. new on Intel Core2 T8100:
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
TWOFISH128     1.11x  1.10x    1.13x  1.65x    1.13x  1.62x    1.12x  1.11x    1.63x  1.59x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agorijndael: add amd64 assembly implementation
Jussi Kivilinna [Wed, 29 May 2013 13:40:27 +0000 (16:40 +0300)]
rijndael: add amd64 assembly implementation

* cipher/Makefile.am: Add 'rijndael-amd64.S'.
* cipher/rijndael-amd64.S: New file.
* cipher/rijndael.c (USE_AMD64_ASM): New macro.
[USE_AMD64_ASM] (_gcry_aes_amd64_encrypt_block)
(_gcry_aes_amd64_decrypt_block): New prototypes.
(do_encrypt_aligned) [USE_AMD64_ASM]: Use amd64 assembly function.
(do_encrypt): Disable input/output alignment when USE_AMD64_ASM is set.
(do_decrypt_aligned) [USE_AMD64_ASM]: Use amd64 assembly function.
(do_decrypt): Disable input/output alignment when USE_AMD64_ASM is set.
* configure.ac (aes) [x86-64]: Add 'rijndael-amd64.lo'.
--

Add optimized amd64 assembly implementation for AES.

Old vs new, on AMD Phenom II:
          ECB/Stream         CBC             CFB             OFB             CTR
       --------------- --------------- --------------- --------------- ---------------
AES     1.74x   1.72x   1.81x   1.85x   1.82x   1.76x   1.67x   1.64x   1.79x   1.81x
AES192  1.77x   1.77x   1.79x   1.88x   1.90x   1.80x   1.69x   1.69x   1.85x   1.81x
AES256  1.79x   1.81x   1.83x   1.89x   1.88x   1.82x   1.72x   1.70x   1.87x   1.89x

Old vs new, on Intel Core2:
          ECB/Stream         CBC             CFB             OFB             CTR
       --------------- --------------- --------------- --------------- ---------------
AES     1.77x   1.75x   1.78x   1.76x   1.76x   1.77x   1.75x   1.76x   1.76x   1.82x
AES192  1.80x   1.73x   1.81x   1.76x   1.79x   1.85x   1.77x   1.76x   1.80x   1.85x
AES256  1.81x   1.77x   1.81x   1.77x   1.80x   1.79x   1.78x   1.77x   1.81x   1.85x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoblowfish: add amd64 assembly implementation
Jussi Kivilinna [Wed, 29 May 2013 13:40:27 +0000 (16:40 +0300)]
blowfish: add amd64 assembly implementation

* cipher/Makefile.am: Add 'blowfish-amd64.S'.
* cipher/blowfish-amd64.S: New file.
* cipher/blowfish.c (USE_AMD64_ASM): New macro.
[USE_AMD64_ASM] (_gcry_blowfish_amd64_do_encrypt)
(_gcry_blowfish_amd64_encrypt_block)
(_gcry_blowfish_amd64_decrypt_block, _gcry_blowfish_amd64_ctr_enc)
(_gcry_blowfish_amd64_cbc_dec, _gcry_blowfish_amd64_cfb_dec): New
prototypes.
[USE_AMD64_ASM] (do_encrypt, do_encrypt_block, do_decrypt_block)
(encrypt_block, decrypt_block): New functions.
(_gcry_blowfish_ctr_enc, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_cfb_dec, selftest_ctr, selftest_cbc, selftest_cfb): New
functions.
(selftest): Call new bulk selftests.
* cipher/cipher.c (gcry_cipher_open) [USE_BLOWFISH]: Register Blowfish
bulk functions for ctr-enc, cbc-dec and cfb-dec.
* configure.ac (blowfish) [x86_64]: Add 'blowfish-amd64.lo'.
* src/cipher.h (_gcry_blowfish_ctr_enc, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_cfb_dec): New prototypes.
--

Add non-parallel functions for small speed-up and 4-way parallel functions for
modes of operation that support parallel processing.

Speed old vs. new on AMD Phenom II X6 1055T:
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
BLOWFISH      1.21x   1.12x   1.17x   3.52x   1.18x   3.34x   1.16x   1.15x   3.38x   3.47x

Speed old vs. new on Intel Core i5-2450M (Sandy-Bridge):
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
BLOWFISH      1.16x   1.10x   1.17x   2.98x   1.18x   2.88x   1.16x   1.15x   3.00x   3.02x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoecc: Simplify the compliant point generation.
Werner Koch [Fri, 24 May 2013 14:54:52 +0000 (16:54 +0200)]
ecc: Simplify the compliant point generation.

* cipher/ecc.c (generate_key): Use point_snatch_set, replaces unneeded
variable copies, etc.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoecc: Fix a minor flaw in the generation of K.
Werner Koch [Fri, 24 May 2013 13:52:37 +0000 (15:52 +0200)]
ecc: Fix a minor flaw in the generation of K.

* cipher/dsa.c (gen_k): Factor code out to ...
* cipher/dsa-common.c (_gcry_dsa_gen_k): new file and function.  Add
arg security_level and re-indent a bit.
* cipher/ecc.c (gen_k): Remove and change callers to _gcry_dsa_gen_k.
* cipher/dsa.c: Include pubkey-internal.h.
* cipher/Makefile.am (libcipher_la_SOURCES): Add dsa-common.c.
--

The ECDSA code used the simple $k = k \bmod p$ method, which introduces
a small bias.  We now use the bias-free method we have always used
with DSA.
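
Why plain reduction is biased can be seen with tiny numbers.  The log does
not describe the bias-free method itself; the sketch below uses rejection
sampling, a common approach, purely as an assumption.  All demo_ names are
hypothetical, and the generator is a toy that must not be used for real
keys.

```c
#include <stdint.h>

/* Tiny deterministic generator so the sketch is reproducible.  It is NOT
   a cryptographic RNG; real code would draw from a strong random source. */
static uint32_t demo_state = 12345u;

static uint32_t
demo_rand3 (void)   /* returns a 3-bit value, 0..7 */
{
  demo_state = demo_state * 1103515245u + 12345u;
  return (demo_state >> 16) & 7u;
}

/* Count how many of the eight 3-bit values map to RESIDUE under "mod Q". */
static int
demo_mod_hits (uint32_t q, uint32_t residue)
{
  int hits = 0;
  uint32_t r;

  for (r = 0; r < 8; r++)
    if (r % q == residue)
      hits++;
  return hits;
}

/* Rejection sampling: redraw until the value is already below Q, so every
   accepted value is equally likely. */
static uint32_t
demo_gen_k (uint32_t q)
{
  uint32_t r;

  do
    r = demo_rand3 ();
  while (r >= q);
  return r;
}
```

With q = 5, the residues 0, 1 and 2 are each hit by two of the eight
inputs while 3 and 4 are hit by one, so reduction favours small values.
Rejection sampling avoids this at the cost of occasional redraws.  A real
signature nonce must additionally be nonzero.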

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agocast5: add amd64 assembly implementation
Jussi Kivilinna [Fri, 24 May 2013 09:43:29 +0000 (12:43 +0300)]
cast5: add amd64 assembly implementation

* cipher/Makefile.am: Add 'cast5-amd64.S'.
* cipher/cast5-amd64.S: New file.
* cipher/cast5.c (USE_AMD64_ASM): New macro.
(_gcry_cast5_s1tos4): Merge arrays s1, s2, s3, s4 to single array to
simplify access from assembly implementation.
(s1, s2, s3, s4): New macros pointing to subarrays in
_gcry_cast5_s1tos4.
[USE_AMD64_ASM] (_gcry_cast5_amd64_encrypt_block)
(_gcry_cast5_amd64_decrypt_block, _gcry_cast5_amd64_ctr_enc)
(_gcry_cast5_amd64_cbc_dec, _gcry_cast5_amd64_cfb_dec): New prototypes.
[USE_AMD64_ASM] (do_encrypt_block, do_decrypt_block, encrypt_block)
(decrypt_block): New functions.
(_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec, _gcry_cast5_cfb_dec)
(selftest_ctr, selftest_cbc, selftest_cfb): New functions.
(selftest): Call new bulk selftests.
* cipher/cipher.c (gcry_cipher_open) [USE_CAST5]: Register CAST5 bulk
functions for ctr-enc, cbc-dec and cfb-dec.
* configure.ac (cast5) [x86_64]: Add 'cast5-amd64.lo'.
* src/cipher.h (_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec)
(_gcry_cast5_cfb_dec): New prototypes.
--

Provides non-parallel implementations for small speed-up and 4-way parallel
implementations that get accelerated on `out-of-order' CPUs.
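
The ChangeLog entry about merging s1, s2, s3 and s4 into a single array
with macros pointing at the subarrays can be pictured like this.  The
table contents are placeholders rather than real CAST5 S-box values, and
demo_sboxes and demo_sbox_offset are hypothetical names.

```c
#include <stdint.h>

/* One contiguous table, so assembly can reach all four S-boxes from a
   single base address.  Only the first entry of each row is filled in;
   the rest are zero-initialized placeholders. */
static const uint32_t demo_sboxes[4][256] = {
  { 0x11111111u }, { 0x22222222u }, { 0x33333333u }, { 0x44444444u }
};

/* The old names keep working in C code as views into the merged table. */
#define s1 demo_sboxes[0]
#define s2 demo_sboxes[1]
#define s3 demo_sboxes[2]
#define s4 demo_sboxes[3]

/* Byte offset of S-box N (1..4) from the table base, as assembly code
   would compute it. */
static unsigned int
demo_sbox_offset (int n)
{
  return (unsigned int) ((n - 1) * 256 * sizeof (uint32_t));
}
```

Existing C code that indexes s1[i] is unaffected, while assembly needs
only one symbol plus a fixed offset per S-box.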

Speed old vs. new on AMD Phenom II X6 1055T:
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAST5         1.23x   1.22x   1.21x   2.86x   1.21x   2.83x   1.22x   1.17x   2.73x   2.73x

Speed old vs. new on Intel Core i5-2450M (Sandy-Bridge):
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAST5         1.00x   1.04x   1.06x   2.56x   1.06x   2.37x   1.03x   1.01x   2.43x   2.41x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocipher-selftest: make selftest work with any block-size
Jussi Kivilinna [Fri, 24 May 2013 09:43:24 +0000 (12:43 +0300)]
cipher-selftest: make selftest work with any block-size

* cipher/cipher-selftest.c (_gcry_selftest_helper_cbc_128)
(_gcry_selftest_helper_cfb_128, _gcry_selftest_helper_ctr_128): Renamed
functions from '<name>_128' to '<name>'.
(_gcry_selftest_helper_cbc, _gcry_selftest_helper_cfb)
(_gcry_selftest_helper_ctr): Make work with different block sizes.
* cipher/cipher-selftest.h (_gcry_selftest_helper_cbc_128)
(_gcry_selftest_helper_cfb_128, _gcry_selftest_helper_ctr_128): Renamed
prototypes from '<name>_128' to '<name>'.
* cipher/camellia-glue.c (selftest_ctr_128, selftest_cfb_128)
(selftest_cbc_128): Change to use new function names.
* cipher/rijndael.c (selftest_ctr_128, selftest_cfb_128)
(selftest_cbc_128): Change to use new function names.
* cipher/serpent.c (selftest_ctr_128, selftest_cfb_128)
(selftest_cbc_128): Change to use new function names.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoserpent: add parallel processing for CFB decryption
Jussi Kivilinna [Thu, 23 May 2013 11:15:51 +0000 (14:15 +0300)]
serpent: add parallel processing for CFB decryption

* cipher/cipher.c (gcry_cipher_open): Add bulk CFB decryption function
for Serpent.
* cipher/serpent-sse2-amd64.S (_gcry_serpent_sse2_cfb_dec): New
function.
* cipher/serpent.c (_gcry_serpent_sse2_cfb_dec): New prototype.
(_gcry_serpent_cfb_dec): New function.
(selftest_cfb_128): New function.
(selftest): Call selftest_cfb_128.
* src/cipher.h (_gcry_serpent_cfb_dec): New prototype.
--

Patch makes Serpent-CFB decryption 4.0 times faster on Intel Sandy-Bridge and
2.7 times faster on AMD K10.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocamellia: add parallel processing for CFB decryption
Jussi Kivilinna [Thu, 23 May 2013 11:15:46 +0000 (14:15 +0300)]
camellia: add parallel processing for CFB decryption

* cipher/camellia-aesni-avx-amd64.S
(_gcry_camellia_aesni_avx_cfb_dec): New function.
* cipher/camellia-glue.c (_gcry_camellia_aesni_avx_cfb_dec): New
prototype.
(_gcry_camellia_cfb_dec): New function.
(selftest_cfb_128): New function.
(selftest): Call selftest_cfb_128.
* cipher/cipher.c (gcry_cipher_open): Add bulk CFB decryption function
for Camellia.
* src/cipher.h (_gcry_camellia_cfb_dec): New prototype.
--

Patch makes Camellia-CFB decryption 4.7 times faster on Intel Sandy-Bridge.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agorijndael: add parallel processing for CFB decryption with AES-NI
Jussi Kivilinna [Thu, 23 May 2013 11:15:41 +0000 (14:15 +0300)]
rijndael: add parallel processing for CFB decryption with AES-NI

* cipher/cipher-selftest.c (_gcry_selftest_helper_cfb_128): New
function for CFB selftests.
* cipher/cipher-selftest.h (_gcry_selftest_helper_cfb_128): New
prototype.
* cipher/rijndael.c [USE_AESNI] (do_aesni_enc_vec4): New function.
(_gcry_aes_cfb_dec) [USE_AESNI]: Add parallelized CFB decryption.
(selftest_cfb_128): New function.
(selftest): Call selftest_cfb_128.
--

CFB decryption can be parallelized for additional performance. On Intel
Sandy-Bridge processor, this change makes CFB decryption 4.6 times faster.
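
The reason CFB decryption parallelizes while CFB encryption does not can
be sketched with a toy cipher.  demo_E below is a hypothetical stand-in on
32-bit "blocks" with no security value, and the 16-block limit is an
arbitrary choice for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a block cipher's encryption function.  It only needs
   to be deterministic. */
static uint32_t
demo_E (uint32_t key, uint32_t x)
{
  x ^= key;
  x *= 2654435761u;
  return x ^ (x >> 15);
}

/* CFB encryption is inherently serial: each step needs the previous
   ciphertext block, which is only known after the previous step. */
static void
demo_cfb_enc (uint32_t key, uint32_t iv, const uint32_t *in,
              uint32_t *out, size_t n)
{
  uint32_t prev = iv;
  size_t i;

  for (i = 0; i < n; i++)
    {
      out[i] = in[i] ^ demo_E (key, prev);
      prev = out[i];
    }
}

/* CFB decryption: every cipher input is the IV or a ciphertext block, all
   known up front, so the demo_E calls in the first loop are independent
   and could run side by side.  N must not exceed 16 in this sketch. */
static void
demo_cfb_dec (uint32_t key, uint32_t iv, const uint32_t *in,
              uint32_t *out, size_t n)
{
  uint32_t ks[16];
  size_t i;

  for (i = 0; i < n; i++)
    ks[i] = demo_E (key, i == 0 ? iv : in[i - 1]);
  for (i = 0; i < n; i++)
    out[i] = in[i] ^ ks[i];
}
```

The first loop in demo_cfb_dec is where a vectorized or pipelined
implementation would process several blocks at once.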

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoAvoid compiler warning due to the global symbol setkey.
Werner Koch [Thu, 18 Apr 2013 12:40:43 +0000 (14:40 +0200)]
Avoid compiler warning due to the global symbol setkey.

* cipher/cipher-selftest.c (_gcry_selftest_helper_cbc_128)
(_gcry_selftest_helper_ctr_128): Rename setkey to setkey_func.
--

setkey is a POSIX.1 function declared in stdlib.h.

6 years agoserpent: add SSE2 accelerated amd64 implementation
Jussi Kivilinna [Thu, 23 May 2013 08:04:18 +0000 (11:04 +0300)]
serpent: add SSE2 accelerated amd64 implementation

* configure.ac (serpent): Add 'serpent-sse2-amd64.lo'.
* cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add
'serpent-sse2-amd64.S'.
* cipher/cipher.c (gcry_cipher_open) [USE_SERPENT]: Register bulk
functions for CBC-decryption and CTR-mode.
* cipher/serpent.c (USE_SSE2): New macro.
[USE_SSE2] (_gcry_serpent_sse2_ctr_enc, _gcry_serpent_sse2_cbc_dec):
New prototypes to assembler functions.
(serpent_setkey): Set 'serpent_init_done' before calling serpent_test.
(_gcry_serpent_ctr_enc): New function.
(_gcry_serpent_cbc_dec): New function.
(selftest_ctr_128): New function.
(selftest_cbc_128): New function.
(selftest): Call selftest_ctr_128 and selftest_cbc_128.
* cipher/serpent-sse2-amd64.S: New file.
* src/cipher.h (_gcry_serpent_ctr_enc): New prototype.
(_gcry_serpent_cbc_dec): New prototype.
--

[v2]: Converted to SSE2, to support all amd64 processors (SSE2 is a
      feature required by the AMD64 SysV ABI).

Patch adds word-sliced SSE2 implementation of Serpent for amd64 for speeding
up parallelizable workloads (CTR mode, CBC mode decryption). Implementation
processes eight blocks in parallel, with two four-block sets interleaved for
out-of-order scheduling.
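
CTR mode is parallelizable because each counter block depends only on the
counter value.  The sketch below prepares a batch of counter blocks up
front, assuming a big-endian counter over the whole 16-byte block; the
demo_ names are hypothetical and the cipher itself is left out.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BLOCK 16   /* Serpent block size in bytes */

/* Increment a big-endian counter block by one, with carry. */
static void
demo_ctr_inc (unsigned char ctr[DEMO_BLOCK])
{
  int i;

  for (i = DEMO_BLOCK - 1; i >= 0; i--)
    if (++ctr[i] != 0)
      break;
}

/* Write N consecutive counter blocks to OUT and advance CTR past them.
   No block depends on any cipher output, so the N encryptions that
   follow can be done in parallel. */
static void
demo_ctr_batch (unsigned char ctr[DEMO_BLOCK], unsigned char *out, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      memcpy (out + i * DEMO_BLOCK, ctr, DEMO_BLOCK);
      demo_ctr_inc (ctr);
    }
}
```

With n = 8 this matches the batch width mentioned above: eight counter
blocks are laid out first and can then be encrypted together.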

Speed old vs. new on Intel Core i5-2450M (Sandy-Bridge):
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128    1.00x   0.99x   1.00x   3.98x   1.00x   1.01x   1.00x   1.01x   4.04x   4.04x

Speed old vs. new on AMD Phenom II X6 1055T:
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128    1.02x   1.01x   1.00x   2.83x   1.00x   1.00x   1.00x   1.00x   2.72x   2.72x

Speed old vs. new on Intel Core2 Duo T8100:
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128    1.00x   1.02x   0.97x   4.02x   0.98x   1.01x   0.98x   1.00x   3.82x   3.91x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoSerpent: faster S-box implementation
Jussi Kivilinna [Thu, 23 May 2013 08:04:13 +0000 (11:04 +0300)]
Serpent: faster S-box implementation

* cipher/serpent.c (SBOX0, SBOX1, SBOX2, SBOX3, SBOX4, SBOX5, SBOX6)
(SBOX7, SBOX0_INVERSE, SBOX1_INVERSE, SBOX2_INVERSE, SBOX3_INVERSE)
(SBOX4_INVERSE, SBOX5_INVERSE, SBOX6_INVERSE, SBOX7_INVERSE): Replace
with new definitions.
--

These new S-box definitions are from the paper:
 D. A. Osvik, “Speeding up Serpent,” in Third AES Candidate Conference,
 (New York, New York, USA), p. 317–329, National Institute of Standards and
 Technology, 2000. Available at http://www.ii.uib.no/~osvik/pub/aes3.ps.gz

Although these were optimized for two-operand instructions on i386 and for
old Pentium-1 processors, they are slightly faster on current processors
on i386 and x86-64. On ARM, the performance of these S-boxes is about the
same as with the old S-boxes.

new vs old speed ratios (AMD K10, x86-64):
                 ECB/Stream         CBC             CFB             OFB             CTR
              --------------- --------------- --------------- --------------- ---------------
 SERPENT128     1.06x   1.02x   1.06x   1.02x   1.06x   1.06x   1.06x   1.05x   1.07x   1.07x

new vs old speed ratios (Intel Atom, i486):
                 ECB/Stream         CBC             CFB             OFB             CTR
              --------------- --------------- --------------- --------------- ---------------
 SERPENT128     1.12x   1.15x   1.12x   1.15x   1.13x   1.11x   1.12x   1.12x   1.12x   1.13x

new vs old speed ratios (ARM Cortex A8):
                 ECB/Stream         CBC             CFB             OFB             CTR
              --------------- --------------- --------------- --------------- ---------------
 SERPENT128     1.04x   1.02x   1.02x   0.99x   1.02x   1.02x   1.03x   1.03x   1.01x   1.01x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agow32: Fix installing of .def file.
Werner Koch [Thu, 18 Apr 2013 12:40:43 +0000 (14:40 +0200)]
w32: Fix installing of .def file.

* src/Makefile.am (install-def-file): Create libdir first.
--

Reported-by: LRN <lrn1986@gmail.com>
6 years agoRegister a DCO.
Werner Koch [Thu, 25 Apr 2013 11:00:16 +0000 (12:00 +0100)]
Register a DCO.

--

6 years agoAdd control commands to disable mlock and setuid dropping.
Werner Koch [Thu, 18 Apr 2013 12:40:43 +0000 (14:40 +0200)]
Add control commands to disable mlock and setuid dropping.

* src/gcrypt.h.in (GCRYCTL_DISABLE_LOCKED_SECMEM): New.
(GCRYCTL_DISABLE_PRIV_DROP): New.
* src/global.c (_gcry_vcontrol): Implement them.
* src/secmem.h (GCRY_SECMEM_FLAG_NO_MLOCK): New.
(GCRY_SECMEM_FLAG_NO_PRIV_DROP): New.
* src/secmem.c (no_mlock, no_priv_drop): New.
(_gcry_secmem_set_flags, _gcry_secmem_get_flags): Set and get them.
(lock_pool): Handle no_mlock and no_priv_drop.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoFix libtool 2.4.2 to correctly detect .def files.
Werner Koch [Thu, 18 Apr 2013 12:40:43 +0000 (14:40 +0200)]
Fix libtool 2.4.2 to correctly detect .def files.

* ltmain.sh (sed_uncomment_deffile): New.
(orig_export_symbols): Uncomment def file before testing for EXPORTS.
* m4/libtool.m4: Do the same for the generated code.
--

The old code was not correct in that it only looked at the first line
and put an EXPORTS keyword in front if missing.  Binutils 2.22
accepted a duplicated EXPORTS keyword but at least 2.23.2 is more
stringent and bails out without this fix.

There is no need to send this upstream.  Upstream's git master has a
lot of changes including a similar fix for this problem.  There are
no signs that a libtool 2.4.3 will be released to fix this problem and
thus we need to stick to our copy of 2.4.2 along with this patch.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoAdd AES bulk CBC decryption selftest
Jussi Kivilinna [Wed, 22 May 2013 11:11:10 +0000 (14:11 +0300)]
Add AES bulk CBC decryption selftest

* cipher/rijndael.c (selftest_cbc_128): New.
(selftest): Call selftest_cbc_128.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoChange AES bulk CTR encryption selftest to use new selftest helper function
Jussi Kivilinna [Wed, 22 May 2013 11:11:04 +0000 (14:11 +0300)]
Change AES bulk CTR encryption selftest to use new selftest helper function

* cipher/rijndael.c (selftest_ctr_128): Change to use new selftest
helper function.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoConvert bulk CTR and CBC selftest functions in Camellia to generic selftest helper...
Jussi Kivilinna [Wed, 22 May 2013 11:10:59 +0000 (14:10 +0300)]
Convert bulk CTR and CBC selftest functions in Camellia to generic selftest helper functions

* cipher/Makefile.am (libcipher_la_SOURCES): Add cipher-selftest files.
* cipher/camellia-glue.c (selftest_ctr_128, selftest_cbc_128): Change
to use the new selftest helper functions.
* cipher/cipher-selftest.c: New.
* cipher/cipher-selftest.h: New.
--

Convert selftest functions into generic helper functions for code sharing.

[v2]: use syslog for more detailed selftest error messages

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocamellia: add bulk CBC decryption selftest
Jussi Kivilinna [Wed, 22 May 2013 11:10:54 +0000 (14:10 +0300)]
camellia: add bulk CBC decryption selftest

* cipher/camellia-glue.c (selftest_cbc_128): New selftest function for
bulk CBC decryption.
(selftest): Add call to selftest_cbc_128.
--

Add selftest for the parallel code paths in bulk CBC decryption.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocamellia: Rename camellia_aesni_avx_x86-64.S to camellia-aesni-avx-amd64.S
Jussi Kivilinna [Wed, 22 May 2013 09:06:03 +0000 (12:06 +0300)]
camellia: Rename camellia_aesni_avx_x86-64.S to camellia-aesni-avx-amd64.S

* cipher/camellia_aesni_avx_x86-64.S: Remove.
* cipher/camellia-aesni-avx-amd64.S: New.
* cipher/Makefile.am: Use the new filename.
* configure.ac: Use the new filename.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoFix indentation and save on string space.
Werner Koch [Thu, 25 Apr 2013 11:00:16 +0000 (12:00 +0100)]
Fix indentation and save on string space.

* cipher/ecc.c (generate_key): Use the same string for both fatal
messages.

6 years agompi_sub( r, a, b ) expects r to be initialized; other minor cleanup in ecc generate_k...
Andrey [Mon, 20 May 2013 04:34:48 +0000 (21:34 -0700)]
mpi_sub( r, a, b ) expects r to be initialized; other minor cleanup in ecc generate_key compliant key generation.

This fixes the 'make check' of libgcrypt.

6 years agoGenerate ECC keys Q=(x,y) as compliant keys, enabling their compact representation... compliant-ecc-keygen
Andrey [Thu, 9 May 2013 21:38:46 +0000 (14:38 -0700)]
Generate ECC keys Q=(x,y) as compliant keys, enabling their compact representation as simply x.

See http://tools.ietf.org/html/draft-jivsov-ecc-compact for the method description and security proof.
This tweak doesn't change any format; it is only a preparation without any negative impact for future changes.

6 years agocipher: Fix regression in Padlock support.
Werner Koch [Thu, 18 Apr 2013 12:40:43 +0000 (14:40 +0200)]
cipher: Fix regression in Padlock support.

* cipher/rijndael.c (do_setkey): Remove dummy padlock key generation case
and use the standard one.
--

This is really a brown paper bag bug.  I should have been able to
fix it by a bit of code staring or by bisecting it myself.  Instead
Rafaël Carré did this, helped by the donation of a VIA Nano board
from Stefan Krüger.  Thanks to both of you.

(regression since commit b825c5db17292988d261fefdc83cbc43d97d4b02)

Signed-off-by: Werner Koch <wk@gnupg.org>
(cherry picked from commit f1f016855418aae561ede4472590d45a24ab4476)

6 years agompi: Yet another fix to get option flag munging right.
Werner Koch [Thu, 18 Apr 2013 12:40:43 +0000 (14:40 +0200)]
mpi: Yet another fix to get option flag munging right.

* cipher/Makefile.am (o_flag_munging): Yet another fix.

6 years agompi: Make using gcc's -Ofast easier.
Werner Koch [Mon, 18 Mar 2013 08:02:35 +0000 (09:02 +0100)]
mpi: Make using gcc's -Ofast easier.

* cipher/Makefile.am (o_flag_munging): Take -Ofast into account.
--

GnuPG-bug-id: 1468
(cherry picked from commit d313255350e6f397500ce23714ddec8780f32449)

6 years agoFix alignment problem in idea.c.
Werner Koch [Thu, 18 Apr 2013 12:40:43 +0000 (14:40 +0200)]
Fix alignment problem in idea.c.

* cipher/idea.c (cipher): Rework parameter use to fix alignment
problems.

* cipher/idea.c (FNCCAST_SETKEY, FNCCAST_CRYPT): Remove unused macros.

Signed-off-by: Werner Koch <wk@gnupg.org>
(cherry picked from 4cd279556777e02eda79973f68efaa4b741f9175)

6 years agoAdd some const attributes.
Vladimir Serbinenko [Thu, 18 Apr 2013 11:37:49 +0000 (13:37 +0200)]
Add some const attributes.

* cipher/md4.c (transform): Add const attribute.
* cipher/md5.c (transform): Ditto.
* cipher/rmd160.c (transform): Ditto.
--

This is the same as
  http://bzr.savannah.gnu.org/lh/grub/trunk/grub/revision/3685

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoFix alignment problem in serpent.c.
Vladimir Serbinenko [Thu, 18 Apr 2013 11:22:34 +0000 (13:22 +0200)]
Fix alignment problem in serpent.c.

* cipher/serpent.c (serpent_key_prepare): Fix misaligned access.
(serpent_setkey): Likewise.
(serpent_encrypt_internal): Likewise.
(serpent_decrypt_internal): Likewise.
(serpent_encrypt): Don't put an alignment-increasing cast.
(serpent_decrypt): Likewise.
(serpent_test): Likewise.
--

This is a port of the fix for the Libgcrypt code in GRUB:
  http://bzr.savannah.gnu.org/lh/grub/trunk/grub/revision/3685
GRUB is FSF copyrighted and thus we can use this code without a DCO.

Note that the above fix was not correct and failed the selftests, thus
I fixed this fix.
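
The kind of misaligned access at issue can be illustrated as follows.  The
log does not show the actual fix, so this is only a sketch of two portable
ways to read a 32-bit word from a byte buffer without an
alignment-increasing cast; both helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Risky pattern: *(const uint32_t *) buf assumes BUF is 4-byte aligned,
   which a byte pointer does not guarantee.  Two portable alternatives: */

/* Copy into a properly aligned object; yields host byte order. */
static uint32_t
demo_load32 (const unsigned char *buf)
{
  uint32_t v;

  memcpy (&v, buf, sizeof v);
  return v;
}

/* Assemble the value byte by byte; yields the little-endian
   interpretation on every host. */
static uint32_t
demo_load32_le (const unsigned char *buf)
{
  return (uint32_t) buf[0]
         | ((uint32_t) buf[1] << 8)
         | ((uint32_t) buf[2] << 16)
         | ((uint32_t) buf[3] << 24);
}
```

Either helper works for any buffer address, including odd offsets, which
matters on architectures that trap on or mishandle unaligned loads.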

GnuPG-bug-id: 1384
Signed-off-by: Werner Koch <wk@gnupg.org>
(cherry picked from commit 8eab66ad6852ec985bfb1e7fec35981d5e31148a)

6 years agoFix multiply by zero in gcry_mpi_ec_mul.
Werner Koch [Tue, 16 Apr 2013 16:59:22 +0000 (18:59 +0200)]
Fix multiply by zero in gcry_mpi_ec_mul.

* mpi/ec.c (_gcry_mpi_ec_mul_point): Handle case of SCALAR == 0.
* tests/t-mpi-point.c (basic_ec_math): Add a test case for this.
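
What a multiplication by zero should yield can be shown on a toy curve.
This sketch uses y^2 = x^3 + 2x + 2 over F_17 with affine coordinates,
chosen only because the numbers are small.  All demo_ names are
hypothetical and the code is unrelated to the implementation in mpi/ec.c.

```c
#define DEMO_P 17   /* field prime of the toy curve */
#define DEMO_A 2    /* curve coefficient a */

struct demo_point { int x, y, inf; };   /* inf != 0: point at infinity */

static int
demo_mod (int v)
{
  v %= DEMO_P;
  return v < 0 ? v + DEMO_P : v;
}

/* Modular inverse by brute force; fine for a 17-element field. */
static int
demo_inv (int v)
{
  int i;

  v = demo_mod (v);
  for (i = 1; i < DEMO_P; i++)
    if (demo_mod (v * i) == 1)
      return i;
  return 0;
}

static struct demo_point
demo_add (struct demo_point p, struct demo_point q)
{
  struct demo_point r = { 0, 0, 1 };
  int s;

  if (p.inf)
    return q;
  if (q.inf)
    return p;
  if (p.x == q.x && demo_mod (p.y + q.y) == 0)
    return r;                                   /* P + (-P) = infinity */
  if (p.x == q.x)                               /* doubling */
    s = demo_mod ((3 * p.x * p.x + DEMO_A) * demo_inv (2 * p.y));
  else                                          /* distinct points */
    s = demo_mod ((q.y - p.y) * demo_inv (q.x - p.x));
  r.x = demo_mod (s * s - p.x - q.x);
  r.y = demo_mod (s * (p.x - r.x) - p.y);
  r.inf = 0;
  return r;
}

/* Double-and-add.  The result starts as the point at infinity, so a
   scalar of zero returns it without ever touching P. */
static struct demo_point
demo_mul (unsigned int k, struct demo_point p)
{
  struct demo_point r = { 0, 0, 1 };

  while (k)
    {
      if (k & 1)
        r = demo_add (r, p);
      p = demo_add (p, p);
      k >>= 1;
    }
  return r;
}
```

For the point (5, 1) on this curve, demo_mul with a scalar of 0 returns
the point at infinity, and a scalar of 2 returns (6, 3).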

Signed-off-by: Werner Koch <wk@wheatstone.g10code.de>
6 years agoAdd macros to return pre-defined MPIs.
Werner Koch [Mon, 15 Apr 2013 09:52:54 +0000 (11:52 +0200)]
Add macros to return pre-defined MPIs.

* src/gcrypt.h.in (GCRYMPI_CONST_ONE, GCRYMPI_CONST_TWO)
(GCRYMPI_CONST_THREE, GCRYMPI_CONST_FOUR, GCRYMPI_CONST_EIGHT): New.
(_gcry_mpi_get_const): New private function.
* src/visibility.c (_gcry_mpi_get_const): New.
* src/visibility.h: Mark it visible.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoFix addition of EC points.
Werner Koch [Mon, 15 Apr 2013 09:11:58 +0000 (11:11 +0200)]
Fix addition of EC points.

* mpi/ec.c (_gcry_mpi_ec_add_points): Fix case of P1 given in affine
coordinates.
--

This was a plain copy and paste error, which was found due to explicit
use of affine coordinates by GNUnet's new pseudonyms code.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoAdd hack to allow using an "ecc" key for "ecdsa" or "ecdh".
Werner Koch [Thu, 11 Apr 2013 22:16:24 +0000 (00:16 +0200)]
Add hack to allow using an "ecc" key for "ecdsa" or "ecdh".

* cipher/pubkey.c (sexp_to_key): Add optional arg USE.
(gcry_pk_encrypt, gcry_pk_decrypt): Call sexp_to_key with usage encrypt.
(gcry_pk_sign, gcry_pk_verify): Call sexp_to_key with usage sign.
* tests/basic.c (show_sexp): New.
(check_pubkey_sign): Print test number and add cases for ecc.
(check_pubkey_sign_ecdsa): New.
(do_check_one_pubkey): Divert to new function.
--

The problem we try to address is that in the module specs both ECDSA
and ECDH have the same alias name "ecc".  This patch allows using, for
example, gcry_pk_verify with a key that has only "ecc" in it.

Signed-off-by: Werner Koch <wk@gnupg.org>
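
A rough Python sketch of the idea follows: when a key name matches more
than one algorithm, the intended usage breaks the tie.  The table
SPECS and the function lookup_algo are hypothetical names made up for
this illustration and do not mirror the data structures in
cipher/pubkey.c.

```python
# Hypothetical module table: both algorithms share the alias "ecc".
SPECS = [
    {"name": "ecdsa", "aliases": ("ecdsa", "ecc"), "use": "sign"},
    {"name": "ecdh", "aliases": ("ecdh", "ecc"), "use": "encrypt"},
]


def lookup_algo(key_name, use=None):
    """Return the algorithm name for KEY_NAME, or None if unknown.

    USE is optional; it is only consulted when the name is ambiguous.
    """
    matches = [spec for spec in SPECS if key_name in spec["aliases"]]
    if use is not None and len(matches) > 1:
        matches = [spec for spec in matches if spec["use"] == use]
    return matches[0]["name"] if matches else None
```

With this sketch, lookup_algo("ecc", "sign") selects "ecdsa" and
lookup_algo("ecc", "encrypt") selects "ecdh", while an unambiguous name
such as "ecdh" is returned regardless of the usage.
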
6 years agoAdd gcry_pubkey_get_sexp.
Werner Koch [Thu, 11 Apr 2013 18:27:46 +0000 (20:27 +0200)]
Add gcry_pubkey_get_sexp.

* src/gcrypt.h.in (GCRY_PK_GET_PUBKEY): New.
(GCRY_PK_GET_SECKEY): New.
(gcry_pubkey_get_sexp): New.
* src/visibility.c (gcry_pubkey_get_sexp): New.
* src/visibility.h (gcry_pubkey_get_sexp): Mark visible.
* src/libgcrypt.def, src/libgcrypt.vers: Add new function.
* cipher/pubkey-internal.h: New.
* cipher/Makefile.am (libcipher_la_SOURCES): Add new file.
* cipher/ecc.c: Include pubkey-internal.h
(_gcry_pk_ecc_get_sexp): New.
* cipher/pubkey.c: Include pubkey-internal.h and context.h.
(_gcry_pubkey_get_sexp): New.
* src/context.c (_gcry_ctx_find_pointer): New.
* src/cipher-proto.h: Add _gcry_pubkey_get_sexp.
* tests/t-mpi-point.c (print_sexp): New.
(context_param, basic_ec_math_simplified): Add tests for the new
function.

* configure.ac (NEED_GPG_ERROR_VERSION): Set to 1.11.
(AH_BOTTOM): Add error codes from gpg-error 1.12.
* src/g10lib.h (fips_not_operational): Use GPG_ERR_NOT_OPERATIONAL.

* mpi/ec.c (_gcry_mpi_ec_get_mpi): Fix computation of Q.
(_gcry_mpi_ec_get_point): Ditto.
--

While checking the new code I figured that the auto-computation of Q
must have led to a segv.  It seems we had no test case for that.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoRemove obsolete warning note from gcry_pk_keygrip.
Werner Koch [Thu, 11 Apr 2013 08:43:05 +0000 (10:43 +0200)]
Remove obsolete warning note from gcry_pk_keygrip.

--

The keygrip has long been a standard feature of libgcrypt.
The existence of the warning comment in gcrypt.h was an oversight.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoRemove unused code.
Werner Koch [Thu, 11 Apr 2013 08:38:22 +0000 (10:38 +0200)]
Remove unused code.

* cipher/pubkey.c (_gcry_pk_module_lookup, _gcry_pk_module_release)
(_gcry_pk_get_elements): Remove.
--

This code was only used by the removed ac interface.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoClarify DCO for Werner Koch
Werner Koch [Fri, 5 Apr 2013 16:28:01 +0000 (18:28 +0200)]
Clarify DCO for Werner Koch

--

All work on Libgcrypt done by Werner Koch is work made for hire by his
company.  Added as a mail style comment to the signed-off-by address.

6 years agoMake the Q parameter optional for ECC signing.
Werner Koch [Fri, 5 Apr 2013 16:08:36 +0000 (18:08 +0200)]
Make the Q parameter optional for ECC signing.

* cipher/ecc.c (ecc_sign): Remove the need for Q.
* cipher/pubkey.c (sexp_elements_extract_ecc): Make Q optional for a
private key.
(sexp_to_key): Add optional arg R_IS_ECC.
(gcry_pk_sign): Do not call gcry_pk_get_nbits for ECC keys.
* tests/pubkey.c (die): Make sure to print a LF.
(check_ecc_sample_key): New.
(main): Call new test.
--

Q is the actual public key which is not used for signing.  Thus we
can make it optional and even speed up the signing by parsing less
stuff.

Note: There seems to be a memory leak somewhere.  Running tests/pubkey
with just the new test enabled shows it.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoAdd test case for SCRYPT and rework the code.
Werner Koch [Fri, 5 Apr 2013 10:23:41 +0000 (12:23 +0200)]
Add test case for SCRYPT and rework the code.

* tests/t-kdf.c (check_scrypt): New.
(main): Call new test.

* configure.ac: Support disabling of the scrypt algorithm.  Make KDF
enabling similar to the other algorithm classes.  Disable scrypt if we
don't have a 64 bit type.
* cipher/memxor.c, cipher/memxor.h: Remove.
* cipher/scrypt.h: Remove.
* cipher/kdf-internal.h: New.
* cipher/Makefile.am: Remove files.  Add new file.  Move scrypt.c to
EXTRA_libcipher_la_SOURCES.
(GCRYPT_MODULES): Add GCRYPT_KDFS.
* src/gcrypt.h.in (GCRY_KDF_SCRYPT): Change value.
* cipher/kdf.c (pkdf2): Rename to _gcry_kdf_pkdf2.
(_gcry_kdf_pkdf2): Don't bail out for SALTLEN==0.
(gcry_kdf_derive): Allow for a passwordlen of zero for scrypt.  Check
for SALTLEN > 0 for GCRY_KDF_PBKDF2.  Pass algo to _gcry_kdf_scrypt.
(gcry_kdf_derive) [!USE_SCRYPT]: Return an error.
* cipher/scrypt.c: Replace memxor.h by bufhelp.h.  Replace scrypt.h by
kdf-internal.h.  Enable code only if HAVE_U64_TYPEDEF is defined.
Replace C99 types uint64_t, uint32_t, and uint8_t by libgcrypt types.
(_SALSA20_INPUT_LENGTH): Remove underscore from identifier.
(_scryptBlockMix): Replace memxor by buf_xor.
(_gcry_kdf_scrypt): Use gcry_malloc and gcry_free.  Check for integer
overflow.  Add hack to support blocksize of 1 for tests.  Return
errors from calls to _gcry_kdf_pkdf2.

* cipher/kdf.c (openpgp_s2k): Make static.
--

This patch prepares the addition of more KDF functions, brings the
code into Libgcrypt shape, adds a test case and makes the code more
robust.  For example, scrypt would have failed silently if Libgcrypt
was not built with SHA256 support.  Also fixed symbol naming for
systems without visibility support.

Signed-off-by: Werner Koch <wk@gnupg.org>
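
The integer overflow check mentioned for _gcry_kdf_scrypt can be
sketched as follows.  This is a Python illustration of the general
division-based technique, assuming the usual scrypt buffer sizes of
128*r*p bytes for the block buffer and 128*r*N bytes for the lookup
table; the function names and the size_max parameter are hypothetical
and the actual C code is not reproduced here.

```python
SIZE_MAX_64 = 2**64 - 1  # assumed limit of a 64-bit size_t


def checked_mul(a, b, size_max=SIZE_MAX_64):
    """Multiply A and B, refusing results that exceed SIZE_MAX."""
    if a != 0 and b > size_max // a:
        raise OverflowError("buffer size does not fit into size_t")
    return a * b


def scrypt_buffer_sizes(n, r, p, size_max=SIZE_MAX_64):
    """Return (block_buffer_bytes, table_bytes) for scrypt parameters."""
    r128 = checked_mul(128, r, size_max)
    block_bytes = checked_mul(r128, p, size_max)
    table_bytes = checked_mul(r128, n, size_max)
    return block_bytes, table_bytes
```

For N=16384, r=8, p=1 this gives a 1024 byte block buffer and a 16 MiB
table; absurdly large parameters raise OverflowError instead of
silently wrapping around.
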
6 years agoAdd the SCRYPT KDF function
Christian Grothoff [Thu, 4 Apr 2013 14:12:16 +0000 (16:12 +0200)]
Add the SCRYPT KDF function

* scrypt.c, scrypt.h: New files.
* memxor.c, memxor.h: New files.
* cipher/Makefile.am: Add new files.
* cipher/kdf.c (gcry_kdf_derive): Support GCRY_KDF_SCRYPT.
* src/gcrypt.h.in (GCRY_KDF_SCRYPT): New.
--

Signed-off-by: Christian Grothoff <christian@grothoff.org>
I added the ChangeLog entry and the missing signed-off line.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoDoc fix.
Werner Koch [Tue, 26 Mar 2013 20:21:41 +0000 (21:21 +0100)]
Doc fix.

--

6 years agoAdd DCO by Christian Grothoff
Werner Koch [Fri, 22 Mar 2013 10:57:46 +0000 (11:57 +0100)]
Add DCO by Christian Grothoff

--

6 years agoReplace deprecated AM_CONFIG_HEADER macro.
Werner Koch [Fri, 22 Mar 2013 10:44:15 +0000 (11:44 +0100)]
Replace deprecated AM_CONFIG_HEADER macro.

* configure.ac: s/AM_CONFIG_HEADER/AC_CONFIG_HEADER/

6 years agoDisable AES-NI support if as does not support SSSE3.
Werner Koch [Fri, 22 Mar 2013 10:41:11 +0000 (11:41 +0100)]
Disable AES-NI support if as does not support SSSE3.

* configure.ac (HAVE_GCC_INLINE_ASM_SSSE3): New test.
(ENABLE_AESNI_SUPPORT): Do not define without SSSE3 support.
(HAVE_GCC_INLINE_ASM_SSSE3, ENABLE_AVX_SUPPORT): Split up detection
and definition.
--

For example the assembler of FreeBSD 7.3 does not know about pshufb
and thus rijndael.c can't be compiled without using
--disable-aesni-support.  This checks that the toolchain can use SSSE3
instructions before trying to build with AES-NI support.

6 years agoFix make dependency regression.
Werner Koch [Thu, 21 Mar 2013 14:19:34 +0000 (15:19 +0100)]
Fix make dependency regression.

* src/Makefile.am (libgcrypt_la_DEPENDENCIES): Add missing backslash.
Reported by LRN.
--

Fixes-commit: 09ac5d8

6 years agoUse finer grained on-the-fly helper computations for EC.
Werner Koch [Wed, 20 Mar 2013 16:23:54 +0000 (17:23 +0100)]
Use finer grained on-the-fly helper computations for EC.

* src/ec-context.h (mpi_ec_ctx_s): Replace NEED_SYNC by a bitfield.
* mpi/ec.c (ec_p_sync): Remove.
(ec_get_reset, ec_get_a_is_pminus3, ec_get_two_inv_p): New.
(ec_p_init): Use ec_get_reset.
(_gcry_mpi_ec_set_mpi, _gcry_mpi_ec_dup_point)
(_gcry_mpi_ec_add_points): Replace ec_p_sync by the ec_get_ accessors.
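
The on-the-fly helper computation can be pictured with a small Python
sketch: each derived value has its own validity bit, is computed on
first use, and is invalidated when 'p' or 'a' changes.  The class
EcContext, its flag names and the computations counter are hypothetical;
only the accessor names echo ec_get_reset, ec_get_a_is_pminus3 and
ec_get_two_inv_p from the entry above.

```python
class EcContext:
    """Toy EC context caching helper values derived from p and a."""

    # One validity bit per helper value (hypothetical flag names).
    VALID_A_IS_PMINUS3 = 1 << 0
    VALID_TWO_INV_P = 1 << 1

    def __init__(self, p, a):
        self.p, self.a = p, a
        self.valid = 0           # no helper value computed yet
        self.computations = 0    # counts how often a helper is computed
        self._a_is_pminus3 = None
        self._two_inv_p = None

    def get_reset(self):
        """Invalidate all cached helper values."""
        self.valid = 0

    def set_param(self, name, value):
        """Change 'p' or 'a' and drop the now stale helper values."""
        if name not in ("p", "a"):
            raise ValueError("unknown parameter: " + name)
        setattr(self, name, value)
        self.get_reset()

    def get_a_is_pminus3(self):
        if not self.valid & self.VALID_A_IS_PMINUS3:
            self._a_is_pminus3 = self.a % self.p == (self.p - 3) % self.p
            self.valid |= self.VALID_A_IS_PMINUS3
            self.computations += 1
        return self._a_is_pminus3

    def get_two_inv_p(self):
        if not self.valid & self.VALID_TWO_INV_P:
            self._two_inv_p = pow(2, -1, self.p)  # inverse of 2 mod p
            self.valid |= self.VALID_TWO_INV_P
            self.computations += 1
        return self._two_inv_p
```

Compared with a single "need sync" flag, a caller that only needs one
helper value never pays for the other.
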

6 years agoAllow building with w64-mingw32
Werner Koch [Mon, 5 Nov 2012 18:21:51 +0000 (19:21 +0100)]
Allow building with w64-mingw32

* autogen.sh <--build-w32>: Support the w64-mingw32 toolchain.  Also
prepare for 64 bit building.
--

NB: Despite this change in autogen.sh, there is no support for 64
bit Windows yet.  The change has only been done to eventually allow
work on a W64 version.

6 years agoProvide GCRYPT_VERSION_NUMBER macro, add build info to the binary.
Werner Koch [Mon, 18 Mar 2013 14:31:34 +0000 (15:31 +0100)]
Provide GCRYPT_VERSION_NUMBER macro, add build info to the binary.

* src/gcrypt.h.in (GCRYPT_VERSION_NUMBER): New.
* configure.ac (VERSION_NUMBER): New ac_subst.
* src/global.c (_gcry_vcontrol): Move call to above function ...
(gcry_check_version): .. here.

* configure.ac (BUILD_REVISION, BUILD_FILEVERSION)
(BUILD_TIMESTAMP): Define on all platforms.
* compat/compat.c (_gcry_compat_identification): Include revision and
timestamp.
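
As an illustration of how such a numeric version is typically used, the
Python sketch below packs a version into a single integer so that
versions can be compared numerically.  It assumes one byte each for
major, minor and micro (0xMMmmpp); the commit message does not spell out
the layout, and the helper name version_number is hypothetical.

```python
def version_number(major, minor, micro):
    """Pack a version into one integer, one byte per component."""
    for part in (major, minor, micro):
        if not 0 <= part <= 0xFF:
            raise ValueError("each component must fit into one byte")
    return (major << 16) | (minor << 8) | micro


# Plain integer comparison then orders versions correctly.
NEEDED = version_number(1, 6, 0)
have_new_api = version_number(1, 6, 2) >= NEEDED
```

In C the same comparison would be done by the preprocessor against
GCRYPT_VERSION_NUMBER to handle minor API differences at build time.
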

6 years agoFix a memory leak in the new EC code.
Werner Koch [Wed, 20 Mar 2013 14:18:08 +0000 (15:18 +0100)]
Fix a memory leak in the new EC code.

* cipher/ecc.c (point_from_keyparam): Always call mpi_free on A.

6 years agoExtend the new EC interface and fix two bugs.
Werner Koch [Tue, 19 Mar 2013 14:12:07 +0000 (15:12 +0100)]
Extend the new EC interface and fix two bugs.

* src/ec-context.h (mpi_ec_ctx_s): Add field NEED_SYNC.
* mpi/ec.c (ec_p_sync): New.
(ec_p_init): Only set NEED_SYNC.
(_gcry_mpi_ec_set_mpi): Set NEED_SYNC for 'p' and 'a'.
(_gcry_mpi_ec_dup_point, _gcry_mpi_ec_add_points)
(_gcry_mpi_ec_mul_point): Call ec_p_sync.
(_gcry_mpi_ec_get_point): Recompute 'q' if needed.
(_gcry_mpi_ec_get_mpi): Ditto.  Also allow for names 'q', 'q.x',
'q.y', and 'g'.
* cipher/ecc.c (_gcry_mpi_ec_ec2os): New.

* cipher/ecc.c (_gcry_mpi_ec_new): Fix init from parameters 'Q'->'q',
'G'->'g'.
--

Note that the parameter names are all lowercase.  This patch fixes an
inconsistency.

The other bug was that changing the parameters D or A may have
resulted in wrong computations because helper variables were not
updated.  Now we delay the computation of those helper variables until
we need them.

6 years agompi: Add functions to manipulate an EC context.
Werner Koch [Fri, 15 Mar 2013 13:43:19 +0000 (14:43 +0100)]
mpi: Add functions to manipulate an EC context.

* src/gcrypt.h.in (gcry_mpi_ec_p_new): Remove.
(gcry_mpi_ec_new): New.
(gcry_mpi_ec_get_mpi): New.
(gcry_mpi_ec_get_point): New.
(gcry_mpi_ec_set_mpi): New.
(gcry_mpi_ec_set_point): New.
* src/visibility.c (gcry_mpi_ec_p_new): Remove.
* mpi/ec.c (_gcry_mpi_ec_p_new): Make it an internal function and
change to return an error code.
(_gcry_mpi_ec_get_mpi): New.
(_gcry_mpi_ec_get_point): New.
(_gcry_mpi_ec_set_mpi): New.
(_gcry_mpi_ec_set_point): New.
* src/mpi.h: Add new prototypes.
* src/ec-context.h: New.
* mpi/ec.c: Include that header.
(mpi_ec_ctx_s): Move to ec-context.h, add new fields, and put some
fields into an inner struct.
(point_copy): New.
* cipher/ecc.c (fill_in_curve): Allow passing NULL for R_NBITS.
(mpi_from_keyparam, point_from_keyparam): New.
(_gcry_mpi_ec_new): New.

* tests/t-mpi-point.c (test-curve): New.
(ec_p_new): New.  Use it instead of the removed gcry_mpi_ec_p_new.
(get_and_cmp_mpi, get_and_cmp_point): New.
(context_param): New test.
(basic_ec_math_simplified): New test.
(main): Call new tests.

* src/context.c (_gcry_ctx_get_pointer): Check for a NULL CTX.
--

gcry_mpi_ec_p_new() was a specialized version of the more general new
gcry_mpi_ec_new().  It was added to master only a few days ago, thus
removing it should be no problem.  A replacement can easily be
written (cf. t-mpi-point.c).

Note that gcry_mpi_ec_set_mpi and gcry_mpi_ec_set_point have not yet
been tested.

6 years agoAdd GCRYMPI_FLAG_CONST and make use of constants.
Werner Koch [Wed, 13 Mar 2013 14:08:33 +0000 (15:08 +0100)]
Add GCRYMPI_FLAG_CONST and make use of constants.

* src/gcrypt.h.in (GCRYMPI_FLAG_CONST): New.
* src/mpi.h (mpi_is_const, mpi_const): New.
(enum gcry_mpi_constants, MPI_NUMBER_OF_CONSTANTS): New.
* mpi/mpiutil.c (_gcry_mpi_init): New.
(constants): New.
(_gcry_mpi_free): Do not release a constant flagged MPI.
(gcry_mpi_copy): Clear the const and immutable flags.
(gcry_mpi_set_flag, gcry_mpi_clear_flag, gcry_mpi_get_flag): Support
GCRYMPI_FLAG_CONST.
(_gcry_mpi_const): New.
* src/global.c (global_init): Call _gcry_mpi_init.
* mpi/ec.c (mpi_ec_ctx_s): Remove fields one, two, three, four, and
eight.  Change all users to call mpi_const() instead.

* mpi/mpiutil.c (gcry_mpi_set_opaque): Check the immutable flag.
--

Allocating the trivial constants anew for every EC context is a waste
of memory and CPU cycles.  We instead provide a simple mechanism to
internally support such constants.  Using a new flag in the API also
allows marking an arbitrary MPI as constant.  The drawback of the
constants is that their memory will never be deallocated.  However,
that is what constants are about.