libgcrypt.git
Werner Koch [Thu, 18 Feb 2016 19:44:10 +0000 (20:44 +0100)]
random: Symbol name cleanup for random-drbg.c.

* random/random-drbg.c: Rename all static objects and macros from
"gcry_drbg" to "drbg".
(drbg_string_t): New typedef.
(drbg_gen_t): New typedef.
(drbg_state_t): New typedef.  Replace all "struct drbg_state_s *" by
this.
(_drbg_init_internal): Replace xcalloc_secure by xtrycalloc_secure so
that an error is actually returned.
(gcry_rngdrbg_cavs_test): Ditto.
(gcry_drbg_healthcheck_sanity): Ditto.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 18 Feb 2016 18:24:47 +0000 (19:24 +0100)]
random: Use our symbol name pattern also for drbg functions.

* random/random-drbg.c: Rename global functions from _gcry_drbg_*
to _gcry_rngdrbg_*.
* random/random.c: Adjust for this change.
* src/global.c: Ditto.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 18 Feb 2016 14:37:31 +0000 (15:37 +0100)]
random: Rename drbg.c to random-drbg.c.

* random/drbg.c: Rename to ...
* random/random-drbg.c: this.
* random/Makefile.am (librandom_la_SOURCES): Adjust accordingly.
--

We should stick to our naming conventions.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 18 Feb 2016 16:51:34 +0000 (17:51 +0100)]
random: Remove the new API introduced by the new DRBG.

* src/gcrypt.h.in (struct gcry_drbg_gen): Move to random/drbg.c.
(struct gcry_drbg_string): Ditto.
(gcry_drbg_string_fill): Ditto.
(gcry_randomize_drbg): Remove.
* random/drbg.c (parse_flag_string): New.
(_gcry_drbg_reinit): Change the way the arguments are passed.
* src/global.c (_gcry_vcontrol) <GCRYCTL_DRBG_REINIT>: Change calling
convention.
--

It does not make sense to extend the API for a somewhat questionable
feature.  For GCRYCTL_DRBG_REINIT we change to use a string with flags
and libgcrypt's native buffer data structure.
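
To illustrate the flag-string idea, here is a minimal sketch that parses
a space-separated flag string into a bitmask.  The flag names, the bit
values and the function demo_parse_flag_string are invented for this
example; they are not the table or the parser used in random/drbg.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; not the values used by libgcrypt.  */
#define DEMO_FLAG_SHA256  1u
#define DEMO_FLAG_HMAC    2u
#define DEMO_FLAG_PR      4u

static const struct { const char *name; unsigned int bit; } demo_flags[] = {
  { "sha256", DEMO_FLAG_SHA256 },
  { "hmac",   DEMO_FLAG_HMAC },
  { "pr",     DEMO_FLAG_PR }
};

/* Parse a blank-separated flag string into *R_FLAGS.
   Returns 0 on success and -1 if an unknown token is found.  */
int
demo_parse_flag_string (const char *string, unsigned int *r_flags)
{
  const size_t nflags = sizeof demo_flags / sizeof *demo_flags;
  unsigned int flags = 0;
  const char *s = string;

  while (*s)
    {
      size_t n, i;

      s += strspn (s, " \t");          /* Skip leading blanks.  */
      n = strcspn (s, " \t");          /* Length of the next token.  */
      if (!n)
        break;                         /* Only trailing blanks were left.  */
      for (i = 0; i < nflags; i++)
        if (strlen (demo_flags[i].name) == n
            && !strncmp (demo_flags[i].name, s, n))
          {
            flags |= demo_flags[i].bit;
            break;
          }
      if (i == nflags)
        return -1;                     /* Unknown flag name.  */
      s += n;
    }
  *r_flags = flags;
  return 0;
}
```

A string keeps the public header free of new constants: new flags can
be added to the table without touching the API.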

NB: GCRYCTL_DRBG_REINIT has not been tested!
Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 18 Feb 2016 14:37:32 +0000 (15:37 +0100)]
Add helper function _gcry_strtokenize.

* src/misc.c (_gcry_strtokenize): New.
--

The code has been taken from GnuPG and re-licensed to LGPLv2+ by me as
its original author.  Minor changes for use in Libgcrypt.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 18 Feb 2016 14:31:36 +0000 (15:31 +0100)]
random: Remove DRBG constants from the public API.

* src/gcrypt.h.in (GCRY_DRBG_): Remove all new flags to ...
* random/drbg.c: here.

Signed-off-by: Werner Koch <wk@gnupg.org>
Stephan Mueller [Tue, 16 Feb 2016 21:04:28 +0000 (22:04 +0100)]
random: Add SP800-90A DRBG

* random/drbg.c: New.
* random/random.c (_gcry_random_initialize): Replace rngfips init by
drbg init.
(_gcry_random_close_fds): Likewise.
(_gcry_random_dump_stats): Likewise.
(_gcry_random_is_faked): Likewise.
(do_randomize): Likewise.
(_gcry_random_selftest): Likewise.
(_gcry_create_nonce): Replace rngfips_create_nonce by drbg_randomize.
(_gcry_random_init_external_test): Remove.
(_gcry_random_run_external_test): Remove.
(_gcry_random_deinit_external_test): Remove.
* random/random.h (struct gcry_drbg_test_vector): New.
* src/gcrypt.h.in (struct gcry_drbg_gen): New.
(struct gcry_drbg_string): New.
(gcry_drbg_string_fill): New.
(gcry_randomize_drbg): New.
(GCRY_DRBG_): Lots of new macros.
* src/global.c (_gcry_vcontrol) <Init external random test>: Turn into
a nop.
(_gcry_vcontrol) <Deinit external random test>: Ditto.
(_gcry_vcontrol) <Run external random test>: Change.
(_gcry_vcontrol) <GCRYCTL_DRBG_REINIT>: New.

--

This patch set adds the SP800-90A DRBG for AES128, AES192, AES256 with
derivation function, SHA-1 through SHA-512 with derivation function,
HMAC SHA-1 through HMAC SHA-512. All DRBGs are provided with and without
prediction resistance. In addition, all DRBGs allow reseeding by the
caller.

The default DRBG is HMAC SHA-256 without prediction resistance.

The caller may re-initialize the DRBG with the control
GCRYCTL_DRBG_REINIT.

The patch replaces the invocation of the existing ANSI X9.31 DRNG. This
covers control calls 58 through 60. Control calls 58 and 60 are
simply deactivated. Control call 59 is replaced with the DRBG CAVS test
interface.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
ChangeLog entries added by -wk

Jussi Kivilinna [Sat, 13 Feb 2016 18:12:58 +0000 (20:12 +0200)]
bufhelp: disable unaligned memory accesses on powerpc

* cipher/bufhelp.h (BUFHELP_FAST_UNALIGNED_ACCESS): Disable for
__powerpc__ and __powerpc64__.

--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Andreas Metzler [Fri, 12 Feb 2016 13:19:23 +0000 (14:19 +0100)]
Document more non LGPL-licensed code.

--

Add license and copyright statement for cipher/arcfour-amd64.S (public
domain) and cipher/cipher-ocb.c (OCB license 1)

NIIBE Yutaka [Fri, 12 Feb 2016 04:50:02 +0000 (13:50 +0900)]
ecc: Do not validate input point for Curve25519.

* cipher/ecc.c (ecc_decrypt_raw): Curve25519 is an exception.

--

Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
NIIBE Yutaka [Wed, 10 Feb 2016 08:35:43 +0000 (17:35 +0900)]
ecc: Fix memory leaks on error.

* cipher/ecc.c (ecc_decrypt_raw): Go to leave to release memory.
* mpi/ec.c (_gcry_mpi_ec_curve_point): Likewise.

--

Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
NIIBE Yutaka [Tue, 9 Feb 2016 09:50:47 +0000 (18:50 +0900)]
doc: about commit 23b72901f8a5ba9a78485b235c7a917fbc8faae0

--

Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
Together with 88e1358962e902ff1cbec8d53ba3eee46407851a, it
could be an effective countermeasure to some chosen-ciphertext
attacks.

CVE-id: CVE-2015-7511

Thanks to Daniel Genkin, Lev Pachmanov, Itamar Pipman, and Eran
Tromer.   http://www.cs.tau.ac.IL/~tromer/ecdh/

NIIBE Yutaka [Tue, 24 Nov 2015 23:41:41 +0000 (08:41 +0900)]
ecc: input validation on ECDH.

* cipher/ecc.c (ecc_decrypt_raw): Validate the point.

--

Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
(forward port from LIBGCRYPT-1-6-BRANCH
 commit 28eb424e4427b320ec1c9c4ce56af25d495230bd)

Jussi Kivilinna [Mon, 8 Feb 2016 18:13:38 +0000 (20:13 +0200)]
Add ARM assembly implementation of SHA-512

* cipher/Makefile.am: Add 'sha512-arm.S'.
* cipher/sha512-arm.S: New.
* cipher/sha512.c (USE_ARM_ASM): New.
(_gcry_sha512_transform_arm): New.
(transform) [USE_ARM_ASM]: Use ARM assembly implementation instead of
generic.
* configure.ac: Add 'sha512-arm.lo'.
--

Benchmark on Cortex-A8 (armv6, 1008 MHz):

 Before:
                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA512         |     112.0 ns/B      8.52 MiB/s     112.9 c/B

 After (3.3x faster):
                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA512         |     34.01 ns/B     28.04 MiB/s     34.28 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
NIIBE Yutaka [Wed, 3 Feb 2016 03:24:46 +0000 (12:24 +0900)]
tests: Add a test for Curve25519.

* tests/Makefile.am (tests_bin): Add t-cv25519.
* tests/t-cv25519.c: New.

--

Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
NIIBE Yutaka [Tue, 2 Feb 2016 11:58:04 +0000 (20:58 +0900)]
ecc: Fix Curve25519 for data by older implementation.

* cipher/ecc-misc.c (gcry_ecc_mont_decodepoint): Fix code path for
short length data.

--

Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
NIIBE Yutaka [Tue, 2 Feb 2016 08:24:10 +0000 (17:24 +0900)]
ecc: More fixes for Curve25519.

* cipher/ecc-misc.c (gcry_ecc_mont_decodepoint): Fix removing of
prefix.  Clear the MSB, according to RFC7748.

--

This change fixes two things.

* Handle the case where the prefix 0x40 comes at the end when scanned
  as a standard MPI.

* Implement MSB handling.  Page 7 of RFC 7748 says the following about
  decoding the u-coordinate:

    When receiving such an array, implementations of X25519 (but not
    X448) MUST mask the most significant bit in the final byte.
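
The masking rule quoted above can be sketched as follows.  This is only
an illustration on a plain 32-byte little-endian buffer; the function
name demo_x25519_decode_u is made up, and the real code in
cipher/ecc-misc.c works on MPIs instead.

```c
#include <stdint.h>
#include <string.h>

/* Copy a 32-byte little-endian X25519 u-coordinate from IN to OUT and
   clear the most significant bit, which lives in the final byte.  */
void
demo_x25519_decode_u (uint8_t out[32], const uint8_t in[32])
{
  memcpy (out, in, 32);
  out[31] &= 0x7f;
}
```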

Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
NIIBE Yutaka [Tue, 2 Feb 2016 04:58:48 +0000 (13:58 +0900)]
ecc: Fix ECDH of Curve25519.

* cipher/ecc-misc.c (_gcry_ecc_mont_decodepoint): Fix calc of NBITS
and prefix detection.
* cipher/ecc.c (ecc_generate): Use NBITS instead of CTX->NBITS.
(ecc_encrypt_raw): Use NBITS from curve instead of from P.
Fix rawmpilen calculation.
(ecc_decrypt_raw): Likewise.  Add debug output.
--

This fixes commit dd3d06e7.  NBITS is defined as 256 in ecc-curves.c,
thus ecc_get_nbits returns 256.  But CTX->NBITS is 255 for a Montgomery
curve.

Jussi Kivilinna [Fri, 29 Jan 2016 15:42:41 +0000 (17:42 +0200)]
Update 'Interface changes' in NEWS

--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 29 Jan 2016 15:42:41 +0000 (17:42 +0200)]
Improve performance of generic SHA256 implementation

* cipher/sha256.c (R): Let caller do variable shuffling.
(Chro, Maj, Sum0, Sum1): Convert from inline functions to macros.
(W, I): New.
(transform_blk): Unroll round loop; inline message expansion to rounds
to make message expansion buffer smaller.
--

Benchmark on Cortex-A8 (armv6, 1008 MHz):

 Before:
                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA256         |     27.63 ns/B     34.52 MiB/s     27.85 c/B

 After (1.31x faster):
                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA256         |     20.97 ns/B     45.48 MiB/s     21.13 c/B

Benchmark on Cortex-A8 (armv7, 1008 MHz):

 Before:
                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA256         |     24.18 ns/B     39.43 MiB/s     24.38 c/B

 After (1.13x faster):
                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA256         |     21.28 ns/B     44.82 MiB/s     21.45 c/B

Benchmark on Intel Core i5-4570 (i386, 3.2 GHz):

 Before:
                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA256         |      5.78 ns/B     164.9 MiB/s     18.51 c/B

 After (1.06x faster):
                 |  nanosecs/byte   mebibytes/sec   cycles/byte
  SHA256         |      5.41 ns/B     176.1 MiB/s     17.33 c/B
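
For readers unfamiliar with the helpers named in the ChangeLog, the
sketch below shows what the SHA-256 logical functions from FIPS 180-4
look like when written as macros rather than inline functions.  The
DEMO_ names are ours, and this is not the exact text of
cipher/sha256.c; arguments are assumed to be of type uint32_t.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by N bits (0 < N < 32).  */
#define DEMO_ROR(x, n)  ((uint32_t)(((x) >> (n)) | ((x) << (32 - (n)))))

/* SHA-256 logical functions from FIPS 180-4, written as macros.  */
#define DEMO_CH(x, y, z)   (((x) & (y)) ^ (~(x) & (z)))
#define DEMO_MAJ(x, y, z)  (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
#define DEMO_SUM0(x) \
  (DEMO_ROR ((x), 2) ^ DEMO_ROR ((x), 13) ^ DEMO_ROR ((x), 22))
#define DEMO_SUM1(x) \
  (DEMO_ROR ((x), 6) ^ DEMO_ROR ((x), 11) ^ DEMO_ROR ((x), 25))
```

Macros guarantee expansion at the call site even with compilers that
decline to inline, which may be one reason for the change.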

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Thu, 28 Jan 2016 17:07:50 +0000 (19:07 +0200)]
Update NEWS

--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Thu, 28 Jan 2016 17:16:22 +0000 (18:16 +0100)]
doc: Fix typos in gcry_mpi_ec_new.

--
Reported-by: Hanno Böck <hanno@hboeck.de>
Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 28 Jan 2016 16:33:51 +0000 (17:33 +0100)]
ecc: New API function gcry_mpi_ec_decode_point.

* mpi/ec.c (_gcry_mpi_ec_decode_point): New.
* cipher/ecc-common.h: Move two prototypes to ...
* src/ec-context.h: here.
* src/gcrypt.h.in (gcry_mpi_ec_decode_point): New.
* src/libgcrypt.def (gcry_mpi_ec_decode_point): New.
* src/libgcrypt.vers (gcry_mpi_ec_decode_point): New.
* src/visibility.c (gcry_mpi_ec_decode_point): New.
* src/visibility.h: Add new function.
--

This new function makes the use of the gcry_mpi_ec_curve_point function
possible in many contexts.  Here is a code snippet which could be used
in gpg to check a point:

static gpg_error_t
check_point (PKT_public_key *pk, gcry_mpi_t m_point)
{
  gpg_error_t err;
  char *curve;
  gcry_ctx_t gctx = NULL;
  gcry_mpi_point_t point = NULL;

  /* Get the curve name from the first OpenPGP key parameter.  */
  curve = openpgp_oid_to_str (pk->pkey[0]);
  if (!curve)
    {
      err = gpg_error_from_syserror ();
      goto leave;
    }

  point = gcry_mpi_point_new (0);
  if (!point)
    {
      err = gpg_error_from_syserror ();
      goto leave;
    }

  err = gcry_mpi_ec_new (&gctx, NULL, curve);
  if (err)
    goto leave;

  err = gcry_mpi_ec_decode_point (point, m_point, gctx);
  if (err)
    goto leave;

  if (!gcry_mpi_ec_curve_point (point, gctx))
    err = gpg_error (GPG_ERR_BAD_DATA);

 leave:
  gcry_ctx_release (gctx);
  gcry_mpi_point_release (point);
  xfree (curve);
  return err;
}

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Fri, 15 Jan 2016 15:10:34 +0000 (16:10 +0100)]
Fix build problem for rndegd.c

* Makefile.am (DISTCHECK_CONFIGURE_FLAGS): Test all RND modules.
* random/rndegd.c (_gcry_rndegd_connect_socket)
(my_make_filename): Use functions with '_' prefix.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Fri, 15 Jan 2016 15:01:35 +0000 (16:01 +0100)]
random: Fix possible AIX problem with sysconf in rndunix.

* random/rndunix.c [HAVE_STDINT_H]: Include stdint.h.
(start_gatherer): Detect misbehaving sysconf.
--

See
GnuPG-bug-id: 1778
for the reason behind this patch.  There is no concrete bug report, but
this change should do no harm.
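
A minimal sketch of such a guard follows.  It assumes that
"misbehaving" means either an error return or an implausibly large
value such as INT32_MAX; the commit message does not spell this out.
The function names and the fallback of 20 descriptors are chosen for
the example only.

```c
#include <stdint.h>
#include <unistd.h>

/* Arbitrary fallback chosen for this sketch.  */
#define DEMO_FALLBACK_OPEN_MAX 20

/* Turn the raw value reported for the descriptor limit into something
   usable.  */
long
demo_sanitize_open_max (long reported)
{
  if (reported < 0)            /* Error or "indeterminate".  */
    return DEMO_FALLBACK_OPEN_MAX;
  if (reported >= INT32_MAX)   /* Implausibly large; treat as bogus.  */
    return DEMO_FALLBACK_OPEN_MAX;
  return reported;
}

/* Query the limit with sysconf and sanitize the answer.  */
long
demo_get_open_max (void)
{
  return demo_sanitize_open_max (sysconf (_SC_OPEN_MAX));
}
```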

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sun, 27 Dec 2015 11:39:45 +0000 (12:39 +0100)]
random: Take at most 25% from RDRAND

* random/rndlinux.c (_gcry_rndlinux_gather_random): Change use of
RDRAND from 50% to 25%.
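
The arithmetic behind such a cap can be sketched like this.  How
rndlinux.c actually accounts for hardware bytes is not shown in this
log, so the function below, including its name and parameters, is an
assumption made for illustration.

```c
#include <stddef.h>

/* Cap the number of bytes credited to the hardware source at
   LENGTH / DIVISOR.  A DIVISOR of 4 gives the 25% limit, a DIVISOR
   of 2 the former 50% limit.  */
size_t
demo_cap_hw_bytes (size_t n_hw, size_t length, size_t divisor)
{
  size_t limit = length / divisor;

  return n_hw > limit ? limit : n_hw;
}
```

With a request of 64 bytes, at most 16 would then be credited to the
hardware source, and the rest has to come from other sources.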

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Fri, 2 Oct 2015 13:05:19 +0000 (15:05 +0200)]
doc: Typo fix and .gitignore addition.

--

Justus Winter [Wed, 2 Dec 2015 11:49:59 +0000 (12:49 +0100)]
doc: Fix typo.

--
Signed-off-by: Justus Winter <justus@g10code.com>
Justus Winter [Mon, 7 Dec 2015 11:44:48 +0000 (12:44 +0100)]
cipher: Improve error handling.

* cipher/ecc.c (ecc_decrypt_raw): Improve error handling.
--
Found using the Clang Static Analyzer.

Signed-off-by: Justus Winter <justus@g10code.com>
Justus Winter [Mon, 7 Dec 2015 11:39:41 +0000 (12:39 +0100)]
cipher: Initialize 'flags'.

* cipher/ecc.c (ecc_encrypt_raw): Initialize 'flags' to 0.
--
Found using the Clang Static Analyzer.

Signed-off-by: Justus Winter <justus@g10code.com>
NIIBE Yutaka [Sat, 5 Dec 2015 01:08:51 +0000 (10:08 +0900)]
ecc: CHANGE point representation of Curve25519.

* cipher/ecc-misc.c (_gcry_ecc_mont_decodepoint): Decode point with
the prefix 0x40, additional 0x00 by MPI handling, and shorter octets
by MPI normalization.
* cipher/ecc.c (ecc_generate, ecc_encrypt_raw, ecc_decrypt_raw):
Always add the prefix 0x40.

--

The native little-endian point representation of Curve25519 does not
fit the existing practice of OpenPGP code, which assumes MPIs.  MPI
handling might insert a 0x00 at the beginning to avoid sign confusion,
and it might also remove leading 0x00 octets.  So it is safe to add
the prefix 0x40.

While we still support the old point representation without a prefix
in ecc_mont_decodepoint, new libgcrypt always adds the prefix.

Jussi Kivilinna [Thu, 3 Dec 2015 19:06:50 +0000 (21:06 +0200)]
chacha20: fix alignment of self-test context

* cipher/chacha20.c (selftest): Ensure 16-byte alignment for chacha20
context structure.
--
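
One common way to get an aligned context on the stack is to
over-allocate a byte buffer by 15 bytes and round the pointer up.  The
sketch below shows that technique; demo_ctx_t is a stand-in type, and
the code is not copied from cipher/chacha20.c.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cipher context that wants 16-byte
   alignment.  */
typedef struct { uint32_t input[16]; } demo_ctx_t;

/* Return a 16-byte aligned pointer inside BUF, which must provide at
   least sizeof (demo_ctx_t) + 15 bytes.  */
demo_ctx_t *
demo_align_ctx (unsigned char *buf)
{
  return (demo_ctx_t *)(((uintptr_t)buf + 15) & ~(uintptr_t)15);
}
```

A caller would declare "unsigned char buf[sizeof (demo_ctx_t) + 15];"
and use the returned pointer instead of a plain local context variable.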

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Thu, 3 Dec 2015 19:06:50 +0000 (21:06 +0200)]
salsa20: fix alignment of self-test context

* cipher/salsa20.c (selftest): Ensure 16-byte alignment for salsa20
context structure.
--

Reported-by: Carlos J Puga Medina <cpm@fbsd.es>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Justus Winter [Wed, 2 Dec 2015 11:12:55 +0000 (12:12 +0100)]
random: Drop fake entropy gathering function.

* random/random-csprng.c (faked_rng): Drop variable.
(gather_faked): Drop prototype and function.
(initialize): Drop fallback code.
(_gcry_rngcsprng_is_faked): Change accordingly.

--
The fake entropy gathering function is deemed too dangerous to be
used by accident, and is therefore removed.

This reverts commit 468a5796ffb1a7776db4004d534376c1b981d740.

Signed-off-by: Justus Winter <justus@g10code.com>
Justus Winter [Wed, 2 Dec 2015 10:54:40 +0000 (11:54 +0100)]
random: Fix selection of entropy gathering function.

* random/random-csprng.c (getfnc_gather_random): Do return NULL if no
usable entropy gathering function is found.  The callsite then
installs the fake gather function.

Signed-off-by: Justus Winter <justus@g10code.com>
NIIBE Yutaka [Thu, 26 Nov 2015 02:37:47 +0000 (11:37 +0900)]
ecc: minor improvement of point multiplication.

* mpi/ec.c (_gcry_mpi_ec_mul_point): Move ec_subm out of the loop.

NIIBE Yutaka [Wed, 25 Nov 2015 03:46:19 +0000 (12:46 +0900)]
ecc: Constant-time multiplication for Weierstrass curve.

* mpi/ec.c (_gcry_mpi_ec_mul_point): Use simple left-to-right binary
method for Weierstrass curve when SCALAR is secure.

NIIBE Yutaka [Wed, 25 Nov 2015 03:13:04 +0000 (12:13 +0900)]
mpi: fix gcry_mpi_swap_cond.

* mpi/mpiutil.c (_gcry_mpi_swap_cond): Relax the condition.

NIIBE Yutaka [Wed, 25 Nov 2015 01:52:57 +0000 (10:52 +0900)]
mpi: Fix mpi_set_cond and mpi_swap_cond.

* mpi/mpiutil.c (_gcry_mpi_set_cond, _gcry_mpi_swap_cond): Don't use
the operator of !!, but assume SET/SWAP is 0 or 1.

--

If the code generated for !! included a branch, it would defeat the
purpose of mpi_set_cond/mpi_swap_cond altogether.  It is better to
make sure that these functions are called with 0 or 1 for SET/SWAP.
Note that this requirement is met when SET/SWAP is the result of a
conditional expression using mpi_test_bit.
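
A branch-free conditional swap along these lines can be sketched as
below.  The sketch uses plain uint64_t words and a made-up name
instead of libgcrypt's limb type and mpiutil.c code.  It avoids
data-dependent branches at the source level; whether the generated
code is constant-time still depends on the compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the N-word arrays A and B if SWAP is 1, leave them alone if
   SWAP is 0.  SWAP must be exactly 0 or 1; no "!!" normalization is
   applied here.  */
void
demo_swap_cond (uint64_t *a, uint64_t *b, size_t n, unsigned long swap)
{
  uint64_t mask = (uint64_t)0 - (uint64_t)swap;  /* 0 or all ones.  */
  size_t i;

  for (i = 0; i < n; i++)
    {
      uint64_t x = mask & (a[i] ^ b[i]);

      a[i] ^= x;
      b[i] ^= x;
    }
}
```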

Reported-by: Taylor R Campbell.
NIIBE Yutaka [Wed, 25 Nov 2015 01:42:47 +0000 (10:42 +0900)]
ecc: multiplication of Edwards curve to be constant-time.

* mpi/ec.c (_gcry_mpi_ec_mul_point): Use point_swap_cond.

--

Reported-by: Taylor R Campbell.
NIIBE Yutaka [Wed, 25 Nov 2015 01:19:39 +0000 (10:19 +0900)]
ecc: Add point_resize and point_swap_cond.

* mpi/ec.c (point_resize, point_swap_cond): New.
(_gcry_mpi_ec_mul_point): Use point_resize and point_swap_cond.

--

Thanks to Taylor R Campbell for the suggestion.

Justus Winter [Tue, 17 Nov 2015 15:00:16 +0000 (16:00 +0100)]
cipher: Fix error handling.

* cipher/cipher.c (_gcry_cipher_ctl): Fix error handling.
--
Found using the Clang Static Analyzer.

Signed-off-by: Justus Winter <justus@g10code.com>
Jussi Kivilinna [Wed, 18 Nov 2015 07:44:18 +0000 (09:44 +0200)]
Tweak Keccak for small speed-up

* cipher/keccak_permute_32.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Track
rounds with round constant pointer instead of separate round counter.
* cipher/keccak_permute_64.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Ditto.
(KECCAK_F1600_ABSORB_FUNC_NAME): Tweak lanes pointer increment for bulk
absorb loops.
--

Patch makes small tweaks to improve performance.
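
The round-tracking tweak can be pictured with the toy loop below.  It
is a sketch only: the table has three entries instead of Keccak's 24,
the XOR stands in for a whole round, and all names are invented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-entry table; the real Keccak table has 24.  */
static const uint64_t demo_round_consts[3] = {
  0x0000000000000001ULL, 0x0000000000008082ULL, 0x800000000000808aULL
};

/* Variant 1: a separate round counter indexes the table.  */
uint64_t
demo_rounds_counter (uint64_t state)
{
  size_t round;

  for (round = 0; round < 3; round++)
    state ^= demo_round_consts[round];
  return state;
}

/* Variant 2: the constant pointer itself tracks the round.  */
uint64_t
demo_rounds_pointer (uint64_t state)
{
  const uint64_t *rc = demo_round_consts;
  const uint64_t *rc_end = demo_round_consts + 3;

  while (rc < rc_end)
    state ^= *rc++;
  return state;
}
```

The second variant keeps one loop variable instead of two, which can
save a register in a loop as tight as the Keccak permutation.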

Benchmark on Intel Haswell @ 3.2 GHz:

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |      2.27 ns/B     420.5 MiB/s      7.26 c/B
 SHAKE256       |      2.79 ns/B     341.4 MiB/s      8.94 c/B
 SHA3-224       |      2.64 ns/B     361.7 MiB/s      8.44 c/B
 SHA3-256       |      2.79 ns/B     341.4 MiB/s      8.94 c/B
 SHA3-384       |      3.65 ns/B     261.3 MiB/s     11.68 c/B
 SHA3-512       |      5.27 ns/B     181.0 MiB/s     16.86 c/B

After:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |      2.25 ns/B     423.5 MiB/s      7.21 c/B
 SHAKE256       |      2.77 ns/B     343.9 MiB/s      8.88 c/B
 SHA3-224       |      2.62 ns/B     364.1 MiB/s      8.38 c/B
 SHA3-256       |      2.77 ns/B     343.8 MiB/s      8.88 c/B
 SHA3-384       |      3.63 ns/B     262.6 MiB/s     11.63 c/B
 SHA3-512       |      5.23 ns/B     182.3 MiB/s     16.75 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 18 Nov 2015 07:44:18 +0000 (09:44 +0200)]
Update license information for CRC

* LICENSES: Remove 'Simple permissive' and 'IETF permissive' licenses
for 'cipher/crc.c' as result of rewrite of CRC implementations.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Justus Winter [Mon, 16 Nov 2015 11:18:47 +0000 (12:18 +0100)]
Fix typos found using codespell

* cipher/cipher-ocb.c: Fix typos.
* cipher/des.c: Likewise.
* cipher/dsa-common.c: Likewise.
* cipher/ecc.c: Likewise.
* cipher/pubkey.c: Likewise.
* cipher/rsa-common.c: Likewise.
* cipher/scrypt.c: Likewise.
* random/random-csprng.c: Likewise.
* random/random-fips.c: Likewise.
* random/rndw32.c: Likewise.
* src/cipher-proto.h: Likewise.
* src/context.c: Likewise.
* src/fips.c: Likewise.
* src/gcrypt.h.in: Likewise.
* src/global.c: Likewise.
* src/sexp.c: Likewise.
* tests/mpitests.c: Likewise.
* tests/t-lock.c: Likewise.

Signed-off-by: Justus Winter <justus@g10code.com>
Jussi Kivilinna [Sun, 1 Nov 2015 18:44:09 +0000 (20:44 +0200)]
Improve performance of Tiger hash algorithms

* cipher/tiger.c (tiger_round, pass, key_schedule): Convert functions
to macros.
(transform_blk): Pass variable names instead of pointers to 'pass'.
--

Benchmark results on Intel Haswell @ 3.2 GHz:

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 TIGER          |      3.25 ns/B     293.5 MiB/s     10.40 c/B

After (1.75x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 TIGER          |      1.85 ns/B     515.3 MiB/s      5.92 c/B

Benchmark results on Cortex-A8 @ 1008 MHz:

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 TIGER          |     63.42 ns/B     15.04 MiB/s     63.93 c/B

After (1.26x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 TIGER          |     49.99 ns/B     19.08 MiB/s     50.39 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 1 Nov 2015 14:06:26 +0000 (16:06 +0200)]
Add ARMv7/NEON implementation of Keccak

* cipher/Makefile.am: Add 'keccak-armv7-neon.S'.
* cipher/keccak-armv7-neon.S: New.
* cipher/keccak.c (USE_64BIT_ARM_NEON): New.
(NEED_COMMON64): Select if USE_64BIT_ARM_NEON.
[NEED_COMMON64] (round_consts_64bit): Rename to...
[NEED_COMMON64] (_gcry_keccak_round_consts_64bit): ...this; Add
terminator at end.
[USE_64BIT_ARM_NEON] (_gcry_keccak_permute_armv7_neon)
(_gcry_keccak_absorb_lanes64_armv7_neon, keccak_permute64_armv7_neon)
(keccak_absorb_lanes64_armv7_neon, keccak_armv7_neon_64_ops): New.
(keccak_init) [USE_64BIT_ARM_NEON]: Select ARM/NEON implementation
if supported by HW.
* cipher/keccak_permute_64.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Update
to use new round constant table.
* configure.ac: Add 'keccak-armv7-neon.lo'.
--

Patch adds ARMv7/NEON implementation of Keccak (SHAKE/SHA3). Patch
is based on public-domain implementation by Ronny Van Keer from
SUPERCOP package:
 https://github.com/floodyberry/supercop/blob/master/crypto_hash/\
keccakc1024/inplace-armv7a-neon/keccak2.s

Benchmark results on Cortex-A8 @ 1008 MHz:

Before (generic 32-bit bit-interleaved impl.):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |     83.00 ns/B     11.49 MiB/s     83.67 c/B
 SHAKE256       |     101.7 ns/B      9.38 MiB/s     102.5 c/B
 SHA3-224       |     96.13 ns/B      9.92 MiB/s     96.90 c/B
 SHA3-256       |     101.5 ns/B      9.40 MiB/s     102.3 c/B
 SHA3-384       |     131.4 ns/B      7.26 MiB/s     132.5 c/B
 SHA3-512       |     189.1 ns/B      5.04 MiB/s     190.6 c/B

After (ARM/NEON, ~3.2x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |     25.09 ns/B     38.01 MiB/s     25.29 c/B
 SHAKE256       |     30.95 ns/B     30.82 MiB/s     31.19 c/B
 SHA3-224       |     29.24 ns/B     32.61 MiB/s     29.48 c/B
 SHA3-256       |     30.95 ns/B     30.82 MiB/s     31.19 c/B
 SHA3-384       |     40.42 ns/B     23.59 MiB/s     40.74 c/B
 SHA3-512       |     58.37 ns/B     16.34 MiB/s     58.84 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 31 Oct 2015 19:29:56 +0000 (21:29 +0200)]
Optimize Keccak 64-bit absorb functions

* cipher/keccak.c [USE_64BIT] [__x86_64__] (absorb_lanes64_8)
(absorb_lanes64_4, absorb_lanes64_2, absorb_lanes64_1): New.
* cipher/keccak.c [USE_64BIT] [!__x86_64__] (absorb_lanes64_8)
(absorb_lanes64_4, absorb_lanes64_2, absorb_lanes64_1): New.
[USE_64BIT] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT] (keccak_absorb_lanes64): Remove.
[USE_64BIT_SHLD] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT_SHLD] (keccak_absorb_lanes64_shld): Remove.
[USE_64BIT_BMI2] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT_BMI2] (keccak_absorb_lanes64_bmi2): Remove.
* cipher/keccak_permute_64.h (KECCAK_F1600_ABSORB_FUNC_NAME): New.
--

Optimize 64-bit absorb functions for small speed-up. After this
change, 64-bit BMI2 implementation matches speed of fastest results
from SUPERCOP for Intel Haswell CPUs (long messages).

Benchmark on Intel Haswell @ 3.2 GHz:

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |      2.32 ns/B     411.7 MiB/s      7.41 c/B
 SHAKE256       |      2.84 ns/B     336.2 MiB/s      9.08 c/B
 SHA3-224       |      2.69 ns/B     354.9 MiB/s      8.60 c/B
 SHA3-256       |      2.84 ns/B     336.0 MiB/s      9.08 c/B
 SHA3-384       |      3.69 ns/B     258.4 MiB/s     11.81 c/B
 SHA3-512       |      5.30 ns/B     179.9 MiB/s     16.97 c/B

After:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |      2.27 ns/B     420.6 MiB/s      7.26 c/B
 SHAKE256       |      2.79 ns/B     341.4 MiB/s      8.94 c/B
 SHA3-224       |      2.64 ns/B     361.7 MiB/s      8.44 c/B
 SHA3-256       |      2.79 ns/B     341.5 MiB/s      8.94 c/B
 SHA3-384       |      3.65 ns/B     261.4 MiB/s     11.68 c/B
 SHA3-512       |      5.27 ns/B     181.0 MiB/s     16.87 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 31 Oct 2015 18:19:59 +0000 (20:19 +0200)]
Enable CRC test vectors with zero bytes

* tests/basic.c (check_digests): Enable CRC test-vectors with zero
bytes.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 18:34:50 +0000 (20:34 +0200)]
Keccak: Add SHAKE Extendable-Output Functions

* src/hash-common.c (_gcry_hash_selftest_check_one): Add handling for
XOFs.
* src/keccak.c (keccak_ops_t): Rename 'extract_inplace' to 'extract'
and add 'pos' argument.
(KECCAK_CONTEXT): Add 'suffix'.
(keccak_extract_inplace64): Rename to...
(keccak_extract64): ...this; Add handling for 'pos' argument.
(keccak_extract_inplace32bi): Rename to...
(keccak_extract32bi): ...this; Add handling for 'pos' argument.
(keccak_extract_inplace64): Rename to...
(keccak_extract64): ...this; Add handling for 'pos' argument.
(keccak_extract_inplace32bi_bmi2): Rename to...
(keccak_extract32bi_bmi2): ...this; Add handling for 'pos' argument.
(keccak_init): Setup 'suffix'; add SHAKE128 & SHAKE256.
(shake128_init, shake256_init): New.
(keccak_final): Do not initial permute for SHAKE output; use correct
suffix for SHAKE.
(keccak_extract): New.
(keccak_selftests_keccak): Add SHAKE128 & SHAKE256 test-vectors.
(run_selftests): Add SHAKE128 & SHAKE256.
(shake128_asn, oid_spec_shake128, shake256_asn, oid_spec_shake256)
(_gcry_digest_spec_shake128, _gcry_digest_spec_shake256): New.
* cipher/md.c (digest_list): Add SHAKE128 & SHAKE256.
* doc/gcrypt.texi: Ditto.
* src/cipher.h (_gcry_digest_spec_shake128)
(_gcry_digest_spec_shake256): New.
* src/gcrypt.h.in (GCRY_MD_SHAKE128, GCRY_MD_SHAKE256): New.
* tests/basic.c (check_one_md): Add XOF check; Add 'elen' argument.
(check_one_md_multi): Skip if algo is XOF.
(check_digests): Add SHAKE128 & SHAKE256 test vectors.
* tests/bench-slope.c (kdf_bench_one): Skip XOFs.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 16:57:15 +0000 (18:57 +0200)]
Few updates to documentation

* doc/gcrypt.texi: Add mention of new 'intel-fast-shld' hw feature
flag; Add mention of x86 RDRAND support in rndhw.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 15:59:33 +0000 (17:59 +0200)]
Add HMAC-SHA3 test vectors

* tests/basic.c (check_mac): Add HMAC_SHA3 test vectors.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 12:50:41 +0000 (14:50 +0200)]
md: add variable length output interface

* cipher/crc.c (_gcry_digest_spec_crc32)
(_gcry_digest_spec_crc32_rfc1510, _gcry_digest_spec_crc24_rfc2440): Set
'extract' NULL.
* cipher/gostr3411-94.c (_gcry_digest_spec_gost3411_94)
(_gcry_digest_spec_gost3411_cp): Ditto.
* cipher/keccak.c (_gcry_digest_spec_sha3_224)
(_gcry_digest_spec_sha3_256, _gcry_digest_spec_sha3_384)
(_gcry_digest_spec_sha3_512): Ditto.
* cipher/md2.c (_gcry_digest_spec_md2): Ditto.
* cipher/md4.c (_gcry_digest_spec_md4): Ditto.
* cipher/md5.c (_gcry_digest_spec_md5): Ditto.
* cipher/rmd160.c (_gcry_digest_spec_rmd160): Ditto.
* cipher/sha1.c (_gcry_digest_spec_sha1): Ditto.
* cipher/sha256.c (_gcry_digest_spec_sha224)
(_gcry_digest_spec_sha256): Ditto.
* cipher/sha512.c (_gcry_digest_spec_sha384)
(_gcry_digest_spec_sha512): Ditto.
* cipher/stribog.c (_gcry_digest_spec_stribog_256)
(_gcry_digest_spec_stribog_512): Ditto.
* cipher/tiger.c (_gcry_digest_spec_tiger)
(_gcry_digest_spec_tiger1, _gcry_digest_spec_tiger2): Ditto.
* cipher/whirlpool.c (_gcry_digest_spec_whirlpool): Ditto.
* cipher/md.c (md_enable): Do not allow combination of HMAC and
'extendable-output function'.
(md_final): Check if spec->read is NULL before calling.
(md_read): Ditto.
(md_extract, _gcry_md_extract): New.
* doc/gcrypt.texi: Add SHA3 algorithms and gcry_md_extract.
* src/cipher-proto.h (gcry_md_extract_t): New.
(gcry_md_spec_t): Add 'extract'.
* src/gcrypt-int.h (_gcry_md_extract): New.
* src/gcrypt.h.in (gcry_md_extract): New.
* src/libgcrypt.def: Add gcry_md_extract.
* src/libgcrypt.vers: Add gcry_md_extract.
* src/visibility.c (gcry_md_extract): New.
* src/visibility.h (gcry_md_extract): New.
--

Patch adds new interface for reading output from 'extendable-output
function' MD algorithms that can give variable length output (i.e.
SHAKE algorithms from FIPS-202). New function to read output is

 gpg_error_t gcry_md_extract(gcry_md_hd_t md, int algo,
     void *buffer, size_t length);

The function implicitly finalizes the algorithm so that no new input
can be given. Subsequent calls of the function return more output
bytes from the algorithm.
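
The calling contract described above (implicit finalization on the first
read, later reads continuing the output stream) can be modelled with a toy
context. This is only a sketch: the toy_* names are hypothetical, the mixing
functions are not cryptographic, and nothing here is SHAKE or libgcrypt code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy expandable-output context.  NOT cryptographic and NOT SHAKE;
   it only models the calling contract.  All names are hypothetical. */
struct toy_xof
{
  uint64_t state;
  int finalized;   /* set by the first extract call */
};

static void toy_init (struct toy_xof *x)
{
  x->state = 0xcbf29ce484222325ULL;
  x->finalized = 0;
}

/* Absorb input; refused once output has been read. */
static int toy_write (struct toy_xof *x, const void *buf, size_t len)
{
  const unsigned char *p = buf;

  if (x->finalized)
    return -1;
  while (len--)
    x->state = (x->state ^ *p++) * 0x100000001b3ULL;
  return 0;
}

/* Implicitly finalize on the first call, then keep producing output. */
static void toy_extract (struct toy_xof *x, void *buf, size_t len)
{
  unsigned char *p = buf;

  if (!x->finalized)
    {
      x->state ^= 0x1f;
      x->finalized = 1;
    }
  while (len--)
    {
      x->state = x->state * 6364136223846793005ULL + 1442695040888963407ULL;
      *p++ = (unsigned char)(x->state >> 56);
    }
}

/* Returns 1 if reading 16+16 bytes equals one 32-byte read and a
   write after the first read is rejected. */
static int toy_demo (void)
{
  struct toy_xof a, b;
  unsigned char one[32], two[32];

  toy_init (&a);
  toy_init (&b);
  toy_write (&a, "abc", 3);
  toy_write (&b, "abc", 3);
  toy_extract (&a, one, 32);
  toy_extract (&b, two, 16);
  toy_extract (&b, two + 16, 16);
  return memcmp (one, two, 32) == 0 && toy_write (&b, "x", 1) == -1;
}
```

A caller of the real interface would presumably follow the same pattern:
write all input first, then call the extract function as often as needed.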

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agomd: check hmac flag in prepare_macpads
Jussi Kivilinna [Sun, 25 Oct 2015 13:11:14 +0000 (15:11 +0200)]
md: check hmac flag in prepare_macpads

* cipher/md.c (prepare_macpads): Check hmac flag.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agokeccak: rewrite for improved performance
Jussi Kivilinna [Fri, 23 Oct 2015 19:30:48 +0000 (22:30 +0300)]
keccak: rewrite for improved performance

* cipher/Makefile.am: Add 'keccak_permute_32.h' and
'keccak_permute_64.h'.
* cipher/hash-common.h [USE_SHA3] (MD_BLOCK_MAX_BLOCKSIZE): Remove.
* cipher/keccak.c (USE_64BIT, USE_32BIT, USE_64BIT_BMI2)
(USE_64BIT_SHLD, USE_32BIT_BMI2, NEED_COMMON64, NEED_COMMON32BI)
(keccak_ops_t): New.
(KECCAK_STATE): Add 'state64' and 'state32bi' members.
(KECCAK_CONTEXT): Remove 'bctx'; add 'blocksize', 'count' and 'ops'.
(rol64, keccak_f1600_state_permute): Remove.
[NEED_COMMON64] (round_consts_64bit, keccak_extract_inplace64): New.
[NEED_COMMON32BI] (round_consts_32bit, keccak_extract_inplace32bi)
(keccak_absorb_lane32bi): New.
[USE_64BIT] (ANDN64, ROL64, keccak_f1600_state_permute64)
(keccak_absorb_lanes64, keccak_generic64_ops): New.
[USE_64BIT_SHLD] (ANDN64, ROL64, keccak_f1600_state_permute64_shld)
(keccak_absorb_lanes64_shld, keccak_shld_64_ops): New.
[USE_64BIT_BMI2] (ANDN64, ROL64, keccak_f1600_state_permute64_bmi2)
(keccak_absorb_lanes64_bmi2, keccak_bmi2_64_ops): New.
[USE_32BIT] (ANDN64, ROL64, keccak_f1600_state_permute32bi)
(keccak_absorb_lanes32bi, keccak_generic32bi_ops): New.
[USE_32BIT_BMI2] (ANDN64, ROL64, keccak_f1600_state_permute32bi_bmi2)
(pext, pdep, keccak_absorb_lane32bi_bmi2, keccak_absorb_lanes32bi_bmi2)
(keccak_extract_inplace32bi_bmi2, keccak_bmi2_32bi_ops): New.
(keccak_write): New.
(keccak_init): Adjust to KECCAK_CONTEXT changes; add implementation
selection based on HWF features.
(keccak_final): Adjust to KECCAK_CONTEXT changes; use selected 'ops'
for state manipulation.
(keccak_read): Adjust to KECCAK_CONTEXT changes.
(_gcry_digest_spec_sha3_224, _gcry_digest_spec_sha3_256)
(_gcry_digest_spec_sha3_384, _gcry_digest_spec_sha3_512): Use
'keccak_write' instead of '_gcry_md_block_write'.
* cipher/keccak_permute_32.h: New.
* cipher/keccak_permute_64.h: New.
--

Patch adds new generic 64-bit and 32-bit implementations and
optimized implementations for SHA3:
 - Generic 64-bit implementation based on 'simple' implementation
   from SUPERCOP package.
 - Generic 32-bit bit-interleaved implementation based on
   'simple32bi' implementation from SUPERCOP package.
 - Intel BMI2 optimized variants of 64-bit and 32-bit BI
   implementations.
 - Intel SHLD optimized variant of 64-bit implementation.
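
As a rough illustration of the bit-interleaved representation mentioned in
the list above: a 64-bit lane is stored as two 32-bit words holding its even
and odd bits, so that a 64-bit rotation becomes two 32-bit rotations. The
names below (lane_bi, bi_split, bi_rotl, ...) are made up for this sketch and
are not taken from keccak.c or SUPERCOP.

```c
#include <stdint.h>

/* A 64-bit lane split into its even-numbered and odd-numbered bits. */
struct lane_bi { uint32_t even, odd; };

static uint32_t rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

static uint64_t rotl64 (uint64_t x, unsigned n)
{
  n &= 63;
  return (x << n) | (x >> ((64 - n) & 63));
}

static struct lane_bi bi_split (uint64_t x)
{
  struct lane_bi r = { 0, 0 };
  unsigned i;

  for (i = 0; i < 32; i++)
    {
      r.even |= (uint32_t)((x >> (2 * i)) & 1) << i;
      r.odd  |= (uint32_t)((x >> (2 * i + 1)) & 1) << i;
    }
  return r;
}

static uint64_t bi_join (struct lane_bi v)
{
  uint64_t x = 0;
  unsigned i;

  for (i = 0; i < 32; i++)
    {
      x |= (uint64_t)((v.even >> i) & 1) << (2 * i);
      x |= (uint64_t)((v.odd >> i) & 1) << (2 * i + 1);
    }
  return x;
}

/* Rotate the interleaved lane left by N using only 32-bit rotations.
   For odd N the two halves swap roles. */
static struct lane_bi bi_rotl (struct lane_bi v, unsigned n)
{
  struct lane_bi r;

  n &= 63;
  if (n % 2 == 0)
    {
      r.even = rotl32 (v.even, n / 2);
      r.odd  = rotl32 (v.odd, n / 2);
    }
  else
    {
      r.even = rotl32 (v.odd, (n + 1) / 2);
      r.odd  = rotl32 (v.even, (n - 1) / 2);
    }
  return r;
}

/* Returns 1 if the interleaved rotation matches rotl64 for all N. */
static int bi_check (uint64_t x)
{
  unsigned n;

  for (n = 0; n < 64; n++)
    if (bi_join (bi_rotl (bi_split (x), n)) != rotl64 (x, n))
      return 0;
  return 1;
}
```

The bit-by-bit split and join are written for clarity only; an optimized
variant would use faster bit tricks for the conversion.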

Patch also makes proper use of the sponge construction to avoid
use of an additional input buffer.

Below are bench-slope benchmarks for new 64-bit implementations
made on Intel Core i5-4570 (no turbo, 3.2 Ghz, gcc-4.9.2).

Before (amd64):

 SHA3-224       |      3.92 ns/B     243.2 MiB/s     12.55 c/B
 SHA3-256       |      4.15 ns/B     230.0 MiB/s     13.27 c/B
 SHA3-384       |      5.40 ns/B     176.6 MiB/s     17.29 c/B
 SHA3-512       |      7.77 ns/B     122.7 MiB/s     24.87 c/B

After (generic 64-bit, amd64, 1.10x faster):

 SHA3-224       |      3.57 ns/B     267.4 MiB/s     11.42 c/B
 SHA3-256       |      3.77 ns/B     252.8 MiB/s     12.07 c/B
 SHA3-384       |      4.91 ns/B     194.1 MiB/s     15.72 c/B
 SHA3-512       |      7.06 ns/B     135.0 MiB/s     22.61 c/B

After (Intel SHLD 64-bit, amd64, 1.13x faster):

 SHA3-224       |      3.48 ns/B     273.7 MiB/s     11.15 c/B
 SHA3-256       |      3.68 ns/B     258.9 MiB/s     11.79 c/B
 SHA3-384       |      4.80 ns/B     198.7 MiB/s     15.36 c/B
 SHA3-512       |      6.89 ns/B     138.4 MiB/s     22.05 c/B

After (Intel BMI2 64-bit, amd64, 1.45x faster):

 SHA3-224       |      2.71 ns/B     352.1 MiB/s      8.67 c/B
 SHA3-256       |      2.86 ns/B     333.2 MiB/s      9.16 c/B
 SHA3-384       |      3.72 ns/B     256.2 MiB/s     11.91 c/B
 SHA3-512       |      5.34 ns/B     178.5 MiB/s     17.10 c/B

Benchmarks of new 32-bit implementations on Intel Core i5-4570
(no turbo, 3.2 Ghz, gcc-4.9.2):

Before (win32):

 SHA3-224       |     12.05 ns/B     79.16 MiB/s     38.56 c/B
 SHA3-256       |     12.75 ns/B     74.78 MiB/s     40.82 c/B
 SHA3-384       |     16.63 ns/B     57.36 MiB/s     53.22 c/B
 SHA3-512       |     23.97 ns/B     39.79 MiB/s     76.72 c/B

After (generic 32-bit BI, win32, 1.23x to 1.29x faster):

 SHA3-224       |      9.76 ns/B     97.69 MiB/s     31.25 c/B
 SHA3-256       |     10.27 ns/B     92.82 MiB/s     32.89 c/B
 SHA3-384       |     13.22 ns/B     72.16 MiB/s     42.31 c/B
 SHA3-512       |     18.65 ns/B     51.13 MiB/s     59.70 c/B

After (Intel BMI2 32-bit BI, win32, 1.66x to 1.70x faster):

 SHA3-224       |      7.26 ns/B     131.4 MiB/s     23.23 c/B
 SHA3-256       |      7.65 ns/B     124.7 MiB/s     24.47 c/B
 SHA3-384       |      9.87 ns/B     96.67 MiB/s     31.58 c/B
 SHA3-512       |     14.05 ns/B     67.85 MiB/s     44.99 c/B

Benchmarks of new 32-bit implementation on ARM Cortex-A8
(1008 Mhz, gcc-4.9.1):

Before:

 SHA3-224       |     148.6 ns/B      6.42 MiB/s     149.8 c/B
 SHA3-256       |     157.2 ns/B      6.07 MiB/s     158.4 c/B
 SHA3-384       |     205.3 ns/B      4.65 MiB/s     206.9 c/B
 SHA3-512       |     296.3 ns/B      3.22 MiB/s     298.6 c/B

After (1.56x faster):

 SHA3-224       |     96.12 ns/B      9.92 MiB/s     96.89 c/B
 SHA3-256       |     101.5 ns/B      9.40 MiB/s     102.3 c/B
 SHA3-384       |     131.4 ns/B      7.26 MiB/s     132.5 c/B
 SHA3-512       |     188.2 ns/B      5.07 MiB/s     189.7 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agohwf-x86: add detection for Intel CPUs with fast SHLD instruction
Jussi Kivilinna [Fri, 23 Oct 2015 19:39:47 +0000 (22:39 +0300)]
hwf-x86: add detection for Intel CPUs with fast SHLD instruction

* cipher/sha1.c (sha1_init): Use HWF_INTEL_FAST_SHLD instead of
HWF_INTEL_CPU.
* cipher/sha256.c (sha256_init, sha224_init): Ditto.
* cipher/sha512.c (sha512_init, sha384_init): Ditto.
* src/g10lib.h (HWF_INTEL_FAST_SHLD): New.
(HWF_INTEL_BMI2, HWF_INTEL_SSSE3, HWF_INTEL_PCLMUL, HWF_INTEL_AESNI)
(HWF_INTEL_RDRAND, HWF_INTEL_AVX, HWF_INTEL_AVX2)
(HWF_ARM_NEON): Update.
* src/hwf-x86.c (detect_x86_gnuc): Add detection of Intel Core
CPUs with fast SHLD/SHRD instruction.
* src/hwfeatures.c (hwflist): Add "intel-fast-shld".
--

Intel Core CPUs since codename sandy-bridge have been able to
execute SHLD/SHRD instructions faster than rotate instructions
ROL/ROR. Since SHLD/SHRD can be used to do rotation, some
optimized implementations (SHA1/SHA256/SHA512) use SHLD/SHRD
instructions in place of ROL/ROR.
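
Why a double shift can stand in for a rotate can be seen from a portable C
model of the instruction. This is only a sketch of the semantics; shld32 is a
hypothetical helper, not the inline assembly used by the optimized code.

```c
#include <stdint.h>

/* Model of a 32-bit SHLD: shift HI left by N bits and fill the
   vacated low bits from the top bits of LO. */
static uint32_t shld32 (uint32_t hi, uint32_t lo, unsigned n)
{
  n &= 31;
  if (n == 0)
    return hi;
  return (hi << n) | (lo >> (32 - n));
}

/* Plain rotate left, for comparison. */
static uint32_t rol32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Returns 1 if SHLD with both operands equal behaves as ROL. */
static int shld_is_rol (uint32_t x)
{
  unsigned n;

  for (n = 0; n < 32; n++)
    if (shld32 (x, x, n) != rol32 (x, n))
      return 0;
  return 1;
}
```

With both source operands set to the same register, the bits shifted in at
the bottom are exactly the bits shifted out at the top, which is a rotation.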

This patch provides more accurate detection of CPUs with
fast SHLD implementation.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoFix OCB amd64 assembly implementations for x32
Jussi Kivilinna [Sat, 24 Oct 2015 09:41:23 +0000 (12:41 +0300)]
Fix OCB amd64 assembly implementations for x32

* cipher/camellia-glue.c (_gcry_camellia_aesni_avx_ocb_enc)
(_gcry_camellia_aesni_avx_ocb_dec, _gcry_camellia_aesni_avx_ocb_auth)
(_gcry_camellia_aesni_avx2_ocb_enc, _gcry_camellia_aesni_avx2_ocb_dec)
(_gcry_camellia_aesni_avx2_ocb_auth, _gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): Change 'Ls' from pointer array to u64 array.
* cipher/serpent.c (_gcry_serpent_sse2_ocb_enc)
(_gcry_serpent_sse2_ocb_dec, _gcry_serpent_sse2_ocb_auth)
(_gcry_serpent_avx2_ocb_enc, _gcry_serpent_avx2_ocb_dec)
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Ditto.
* cipher/twofish.c (_gcry_twofish_amd64_ocb_enc)
(_gcry_twofish_amd64_ocb_dec, _gcry_twofish_amd64_ocb_auth)
(twofish_amd64_ocb_enc, twofish_amd64_ocb_dec, twofish_amd64_ocb_auth)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Ditto.
--

Pointers on x32 are 32-bit, but the amd64 assembly implementations
expect 64-bit pointers. Pass the 'Ls' array as 64-bit integers so
that the input array has the correct format for the assembly functions.
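
A minimal sketch of the idea, assuming a helper named fill_ls that does not
exist in the source: each pointer is widened to a 64-bit integer so that
every array slot is 8 bytes wide regardless of the ABI's pointer size.

```c
#include <stddef.h>
#include <stdint.h>

/* Store each pointer as a 64-bit integer.  On x32 a plain pointer
   array would have 4-byte slots, which 64-bit assembly would misread. */
static void fill_ls (uint64_t *Ls, const void *const *ptrs, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    Ls[i] = (uint64_t)(uintptr_t)ptrs[i];
}

/* Returns 1 if the slots are 8 bytes wide and round-trip correctly. */
static int ls_demo (void)
{
  unsigned char l0[16], l1[16];
  const void *ptrs[2];
  uint64_t Ls[2];

  ptrs[0] = l0;
  ptrs[1] = l1;
  fill_ls (Ls, ptrs, 2);
  return sizeof Ls[0] == 8
         && (const void *)(uintptr_t)Ls[0] == ptrs[0]
         && (const void *)(uintptr_t)Ls[1] == ptrs[1];
}
```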

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agobench-slope: add KDF/PBKDF2 benchmark
Jussi Kivilinna [Fri, 23 Oct 2015 19:24:47 +0000 (22:24 +0300)]
bench-slope: add KDF/PBKDF2 benchmark

* tests/bench-slope.c (bench_kdf_mode, bench_kdf_init, bench_kdf_free)
(bench_kdf_do_bench, kdf_ops, kdf_bench_one, kdf_bench): New.
(print_help): Add 'kdf'.
(main): Add KDF benchmarks.
--

Introduce KDF benchmarking to bench-slope. Output is given as
nanosecs/iter (and cycles/iter if --cpu-mhz is used). Only PBKDF2
is supported with this initial patch.
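
The cycles/iter column presumably follows from nanosecs/iter and the given
clock rate. A small sketch of that arithmetic, using a hypothetical helper
name that is not taken from bench-slope.c:

```c
/* Convert a time in nanoseconds to CPU cycles at CPU_MHZ megahertz:
   1 MHz corresponds to 0.001 cycles per nanosecond. */
static double ns_to_cycles (double nsecs, double cpu_mhz)
{
  return nsecs * cpu_mhz / 1000.0;
}
```

For instance, 882.4 ns at 3201 MHz comes out at roughly 2824.6 cycles, close
to the 2824.7 shown for PBKDF2-HMAC-MD5 in the first table; the small
difference is consistent with the nanosecond value being rounded for display.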

For example, below shows output of KDF bench-slope before
and after commit "md: keep contexts for HMAC in GcryDigestEntry",
on Intel Core i5-4570 @ 3.2 Ghz:

Before:

$ tests/bench-slope --cpu-mhz 3201 kdf
KDF:
                          |  nanosecs/iter   cycles/iter
 PBKDF2-HMAC-MD5          |          882.4        2824.7
 PBKDF2-HMAC-SHA1         |          832.6        2665.0
 PBKDF2-HMAC-RIPEMD160    |         1148.3        3675.6
 PBKDF2-HMAC-TIGER192     |         1339.6        4288.2
 PBKDF2-HMAC-SHA256       |         1460.5        4675.1
 PBKDF2-HMAC-SHA384       |         1723.2        5515.8
 PBKDF2-HMAC-SHA512       |         1729.1        5534.7
 PBKDF2-HMAC-SHA224       |         1424.0        4558.3
 PBKDF2-HMAC-WHIRLPOOL    |         2459.7        7873.5
 PBKDF2-HMAC-TIGER        |         1350.2        4322.1
 PBKDF2-HMAC-TIGER2       |         1348.7        4317.3
 PBKDF2-HMAC-GOSTR3411_94 |         7374.1       23604.4
 PBKDF2-HMAC-STRIBOG256   |         6060.0       19398.1
 PBKDF2-HMAC-STRIBOG512   |         7512.8       24048.3
 PBKDF2-HMAC-GOSTR3411_CP |         7378.3       23618.0
 PBKDF2-HMAC-SHA3-224     |         2789.6        8929.5
 PBKDF2-HMAC-SHA3-256     |         2785.1        8915.0
 PBKDF2-HMAC-SHA3-384     |         2955.5        9460.5
 PBKDF2-HMAC-SHA3-512     |         2859.7        9153.9
                          =

After:

$ tests/bench-slope --cpu-mhz 3201 kdf
KDF:
                          |  nanosecs/iter   cycles/iter
 PBKDF2-HMAC-MD5          |          405.9        1299.2
 PBKDF2-HMAC-SHA1         |          392.1        1255.0
 PBKDF2-HMAC-RIPEMD160    |          540.9        1731.5
 PBKDF2-HMAC-TIGER192     |          637.1        2039.4
 PBKDF2-HMAC-SHA256       |          691.8        2214.3
 PBKDF2-HMAC-SHA384       |          848.0        2714.3
 PBKDF2-HMAC-SHA512       |          875.7        2803.1
 PBKDF2-HMAC-SHA224       |          689.2        2206.0
 PBKDF2-HMAC-WHIRLPOOL    |         1535.6        4915.5
 PBKDF2-HMAC-TIGER        |          636.3        2036.7
 PBKDF2-HMAC-TIGER2       |          636.6        2037.7
 PBKDF2-HMAC-GOSTR3411_94 |         5311.5       17002.2
 PBKDF2-HMAC-STRIBOG256   |         4308.0       13790.0
 PBKDF2-HMAC-STRIBOG512   |         5767.4       18461.4
 PBKDF2-HMAC-GOSTR3411_CP |         5309.4       16995.4
 PBKDF2-HMAC-SHA3-224     |         1333.1        4267.2
 PBKDF2-HMAC-SHA3-256     |         1327.8        4250.4
 PBKDF2-HMAC-SHA3-384     |         1392.8        4458.3
 PBKDF2-HMAC-SHA3-512     |         1428.5        4572.7
                          =

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agomd: keep contexts for HMAC in GcryDigestEntry.
NIIBE Yutaka [Thu, 22 Oct 2015 00:58:24 +0000 (09:58 +0900)]
md: keep contexts for HMAC in GcryDigestEntry.

* cipher/md.c (struct gcry_md_context): Add flags.hmac.
Remove macpads and macpads_Bsize.
(md_open): Initialize flags.hmac.  Remove macpads initialization.
(md_enable): Allocate contexts when flags.hmac is enabled.
(md_copy): Remove macpads copying.  Add copying contexts.
(_gcry_md_reset): When flags.hmac is enabled, restore precomputed
context with input pad.
(md_close): Remove macpads wiping.
(md_final): When flags.hmac is enabled, compute hmac by precomputed
context with output pad.
(prepare_macpads): Prepare precomputed contexts with input pad and
output pad for each registered digest entry.
(_gcry_md_setkey): Just call prepare_macpads.

--

This change straightens out the HMAC computation and allows HMAC
to support multiple algorithms in the future.

Libgcrypt's code has the potential to compute digests for multiple
algorithms at once (currently, it's not enabled).  The HMAC code didn't
work well with multiple algorithms, because the macpads were only
allocated for one algorithm.  Now, they are allocated for each algorithm.

We now precompute hash contexts, instead of keeping the input pad and
output pad.  This can be a performance improvement, as described
in RFC 2104.
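
The precomputation can be sketched with a toy hash whose whole context is a
single 64-bit word. The hash (FNV-1a) is a stand-in chosen to keep the
example short; it is not secure and is not what md.c uses, and all names
below are hypothetical. The example assumes keys no longer than one block.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 64

/* Toy streaming hash (FNV-1a); the context is one 64-bit word. */
typedef uint64_t toy_ctx;

static toy_ctx toy_init (void) { return 0xcbf29ce484222325ULL; }

static toy_ctx toy_update (toy_ctx c, const unsigned char *p, size_t n)
{
  while (n--)
    c = (c ^ *p++) * 0x100000001b3ULL;
  return c;
}

static void toy_final (toy_ctx c, unsigned char out[8])
{
  int i;

  for (i = 0; i < 8; i++)
    out[i] = (unsigned char)(c >> (8 * i));
}

struct hmac_key { toy_ctx inner, outer; };

/* Once per key: absorb K^ipad and K^opad and save both contexts.
   Assumes LEN <= BLOCK; longer keys would be hashed first. */
static void hmac_setkey (struct hmac_key *k,
                         const unsigned char *key, size_t len)
{
  unsigned char pad[BLOCK];
  size_t i;

  for (i = 0; i < BLOCK; i++)
    pad[i] = (unsigned char)((i < len ? key[i] : 0) ^ 0x36);
  k->inner = toy_update (toy_init (), pad, BLOCK);
  for (i = 0; i < BLOCK; i++)
    pad[i] ^= 0x36 ^ 0x5c;          /* turn K^ipad into K^opad */
  k->outer = toy_update (toy_init (), pad, BLOCK);
}

/* Per message: continue from copies of the saved contexts. */
static void hmac_compute (const struct hmac_key *k,
                          const unsigned char *msg, size_t n,
                          unsigned char mac[8])
{
  unsigned char digest[8];

  toy_final (toy_update (k->inner, msg, n), digest);
  toy_final (toy_update (k->outer, digest, 8), mac);
}

/* Reference: rebuild and rehash both pads for every message. */
static void hmac_naive (const unsigned char *key, size_t len,
                        const unsigned char *msg, size_t n,
                        unsigned char mac[8])
{
  unsigned char ipad[BLOCK], opad[BLOCK], digest[8];
  size_t i;

  for (i = 0; i < BLOCK; i++)
    {
      unsigned char kb = i < len ? key[i] : 0;
      ipad[i] = kb ^ 0x36;
      opad[i] = kb ^ 0x5c;
    }
  toy_final (toy_update (toy_update (toy_init (), ipad, BLOCK), msg, n),
             digest);
  toy_final (toy_update (toy_update (toy_init (), opad, BLOCK), digest, 8),
             mac);
}

/* Returns 1 if both ways produce the same MAC. */
static int hmac_demo (void)
{
  static const unsigned char key[] = "secret key";
  static const unsigned char msg[] = "message";
  struct hmac_key k;
  unsigned char a[8], b[8];

  hmac_setkey (&k, key, sizeof key - 1);
  hmac_compute (&k, msg, sizeof msg - 1, a);
  hmac_naive (key, sizeof key - 1, msg, sizeof msg - 1, b);
  return memcmp (a, b, 8) == 0;
}
```

With saved contexts, each MAC skips hashing the two pad blocks, which matters
most for PBKDF2 where the same key is used for every iteration.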

Thanks to:

   Andrea Visconti, Simone Bossi, Hany Ragab and Alexandro Calò

for the discussion and their CANS 2015 paper, titled:

   On the weaknesses of PBKDF2

3 years agoFix double free on error.
NIIBE Yutaka [Thu, 15 Oct 2015 02:28:54 +0000 (11:28 +0900)]
Fix double free on error.

* src/hmac256.c (_gcry_hmac256_finalize): Don't free HD.

3 years agoFix gpg_error_t and gpg_err_code_t confusion.
NIIBE Yutaka [Wed, 14 Oct 2015 02:52:40 +0000 (11:52 +0900)]
Fix gpg_error_t and gpg_err_code_t confusion.

* src/gcrypt-int.h (_gcry_sexp_extract_param): Revert the change.
* cipher/dsa.c (dsa_check_secret_key): Ditto.
* src/sexp.c (_gcry_sexp_extract_param): Return gpg_err_code_t.

* src/gcrypt-int.h (_gcry_err_make_from_errno)
(_gcry_error_from_errno): Return gpg_error_t.
* cipher/cipher.c (_gcry_cipher_open_internal)
(_gcry_cipher_ctl, _gcry_cipher_ctl): Don't use gcry_error.
* src/global.c (_gcry_vcontrol): Likewise.
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_genkey): Use
 gpg_err_code_from_syserror.
* cipher/mac.c (mac_reset, mac_setkey, mac_setiv, mac_write)
(mac_read, mac_verify): Return gcry_err_code_t.
* cipher/rsa-common.c (mgf1): Use gcry_err_code_t for ERR.
* src/visibility.c (gcry_error_from_errno): Return gpg_error_t.

--

Revert a part of 73374fdd and fix the _gcry_sexp_extract_param
return type instead.

Fix similar coding mistakes, throughout.

3 years agoFix compiling AES/AES-NI implementation on linux-i386
Jussi Kivilinna [Tue, 13 Oct 2015 05:33:00 +0000 (08:33 +0300)]
Fix compiling AES/AES-NI implementation on linux-i386

* cipher/rijndael-aesni.c (do_aesni_ctr_4): Split assembly block in
two parts to reduce number of register constraints needed.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
3 years agoFix declaration of return type.
NIIBE Yutaka [Tue, 13 Oct 2015 03:28:00 +0000 (12:28 +0900)]
Fix declaration of return type.

* src/gcrypt-int.h (_gcry_sexp_extract_param): Return gpg_error_t.
* cipher/dsa.c (dsa_generate): Fix call to _gcry_sexp_extract_param.
* src/g10lib.h (_gcry_vcontrol): Return gcry_err_code_t.
* src/visibility.c (gcry_mpi_snatch): Fix call to _gcry_mpi_snatch.

--

GnuPG-bug-id: 2074

4 years agoImprove GCRYCTL_DISABLE_PRIV_DROP by also disabling cap_ calls.
Werner Koch [Mon, 7 Sep 2015 12:02:09 +0000 (14:02 +0200)]
Improve GCRYCTL_DISABLE_PRIV_DROP by also disabling cap_ calls.

* src/secmem.c (lock_pool, secmem_init): Do not call any cap_
functions if NO_PRIV_DROP is set.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agow32: Avoid a few compiler warnings.
Werner Koch [Fri, 4 Sep 2015 10:39:56 +0000 (12:39 +0200)]
w32: Avoid a few compiler warnings.

* cipher/cipher-selftest.c (_gcry_selftest_helper_cbc)
(_gcry_selftest_helper_cfb, _gcry_selftest_helper_ctr): Mark variable
as unused.
* random/rndw32.c (slow_gatherer): Avoid signed pointer mismatch
warning.
* src/secmem.c (init_pool): Avoid unused variable warning.
* tests/random.c (writen, readn): Include only if needed.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agow32: Fix alignment problem with AESNI on Windows >= 8
Werner Koch [Fri, 4 Sep 2015 10:32:16 +0000 (12:32 +0200)]
w32: Fix alignment problem with AESNI on Windows >= 8

* cipher/cipher-selftest.c (_gcry_cipher_selftest_alloc_ctx): New.
* cipher/rijndael.c (selftest_basic_128, selftest_basic_192)
(selftest_basic_256): Allocate context on the heap.
--

The stack alignment on Windows changed and because ld seems to limit
stack variables to an 8 byte alignment (we request 16), we get bus
errors from the selftests if AESNI is in use.
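
One way to get a 16-byte aligned context from the heap is to over-allocate
and round the pointer up. The helper below is a hypothetical sketch of that
technique; it is not the code of _gcry_cipher_selftest_alloc_ctx.

```c
#include <stdint.h>
#include <stdlib.h>

/* Return a zeroed, 16-byte aligned region of SIZE bytes.  *R_MEM
   receives the pointer that must later be passed to free().  */
static void *alloc_aligned16 (size_t size, void **r_mem)
{
  unsigned char *mem = calloc (1, size + 15);
  uintptr_t off;

  *r_mem = mem;
  if (!mem)
    return NULL;
  off = (16 - ((uintptr_t)mem & 15)) & 15;   /* bytes to skip, 0..15 */
  return mem + off;
}

/* Returns 1 if the returned context pointer is 16-byte aligned. */
static int aligned_demo (void)
{
  void *mem;
  void *ctx = alloc_aligned16 (100, &mem);
  int ok = ctx && ((uintptr_t)ctx & 15) == 0;

  free (mem);
  return ok;
}
```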

GnuPG-bug-id: 2085
Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agorsa: Add verify after sign to avoid Lenstra's CRT attack.
Werner Koch [Mon, 31 Aug 2015 21:13:27 +0000 (23:13 +0200)]
rsa: Add verify after sign to avoid Lenstra's CRT attack.

* cipher/rsa.c (rsa_sign): Check the CRT.
--

Failures in the computation of the CRT (e.g. due to faulty hardware) can
lead to a leak of the private key.  The standard precaution against
this is to verify the signature after signing.  GnuPG does this itself
and even has an option to disable this.  However, the low performance
impact of this extra precaution suggests that it should always be done
and Libgcrypt is the right place for it.  For decryption this is not done
because the application will detect the failure due to garbled
plaintext and in any case no key derived material will be sent to the
user.
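
The precaution can be illustrated with textbook RSA on tiny numbers
(p = 61, q = 53, e = 17). This is a toy sketch with made-up function names,
not the code in rsa.c; real keys are far larger and real signatures use
padding.

```c
#include <stdint.h>

/* Toy key: n = p*q, dp = d mod (p-1), dq = d mod (q-1), qinv = q^-1 mod p. */
enum { P = 61, Q = 53, N = 3233, E = 17, DP = 53, DQ = 49, QINV = 38 };

static uint64_t modpow (uint64_t b, uint64_t e, uint64_t m)
{
  uint64_t r = 1;

  b %= m;
  while (e)
    {
      if (e & 1)
        r = r * b % m;
      b = b * b % m;
      e >>= 1;
    }
  return r;
}

/* Sign M (M < N) using the CRT.  FAULT != 0 corrupts the mod-P half
   to simulate a hardware glitch.  The signature is stored and 0 is
   returned only if it verifies; otherwise -1 is returned. */
static int rsa_sign_checked (uint64_t m, int fault, uint64_t *sig)
{
  uint64_t m1 = modpow (m, DP, P);
  uint64_t m2 = modpow (m, DQ, Q);
  uint64_t h, s;

  if (fault)
    m1 = (m1 + 1) % P;
  h = QINV * ((m1 + P - m2 % P) % P) % P;
  s = m2 + h * Q;
  if (modpow (s, E, N) != m)      /* verify after sign */
    return -1;
  *sig = s;
  return 0;
}

/* Returns 1 if a good signature passes and a faulty one is withheld. */
static int rsa_demo (void)
{
  uint64_t s = 0;

  return rsa_sign_checked (65, 0, &s) == 0
         && modpow (s, E, N) == 65
         && rsa_sign_checked (65, 1, &s) == -1;
}
```

A faulty signature is still correct modulo one prime, so releasing it would
let an attacker recover that prime with a gcd computation; withholding it is
the point of the check.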

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoAdd pubkey algo id for EdDSA.
Werner Koch [Mon, 31 Aug 2015 20:41:12 +0000 (22:41 +0200)]
Add pubkey algo id for EdDSA.

* src/gcrypt.h.in (GCRY_PK_EDDSA): New.
--

These ids are not actually used by Libgcrypt but other software makes
use of such algorithm ids.  Thus we provide them here.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoAdd configure option --enable-build-timestamp.
Werner Koch [Tue, 25 Aug 2015 19:11:05 +0000 (21:11 +0200)]
Add configure option --enable-build-timestamp.

* configure.ac (BUILD_TIMESTAMP): Set to "<none>" by default.
--

This is based on
libgpg-error commit d620005fd1a655d591fccb44639e22ea445e4554
but changed to be disabled by default.  Check there for some
background.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agotests: Add missing files for the make distcheck target.
Werner Koch [Sun, 23 Aug 2015 15:20:18 +0000 (17:20 +0200)]
tests: Add missing files for the make distcheck target.

* tests/Makefile.am (EXTRA_DIST): Add sha3-x test vector files.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoChange SHA-3 algorithm ids
Werner Koch [Wed, 19 Aug 2015 10:43:43 +0000 (12:43 +0200)]
Change SHA-3 algorithm ids

* src/gcrypt.h.in (GCRY_MD_SHA3_224, GCRY_MD_SHA3_256)
(GCRY_MD_SHA3_384, GCRY_MD_SHA3_512): Change values.
--

By using algorithm ids outside of the RFC-4880 range we make debugging
of GnuPG easier.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoKeccak: Fix array indexes in θ step
Jussi Kivilinna [Wed, 12 Aug 2015 15:17:01 +0000 (18:17 +0300)]
Keccak: Fix array indexes in θ step

* cipher/keccak.c (keccak_f1600_state_permute): Fix indexes for D[5].
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoSimplify OCB offset calculation for parallel implementations
Jussi Kivilinna [Tue, 11 Aug 2015 04:22:16 +0000 (07:22 +0300)]
Simplify OCB offset calculation for parallel implementations

* cipher/camellia-glue.c (_gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): Precalculate Ls array always, instead of
just if 'blkn % <parallel blocks> == 0'.
* cipher/serpent.c (_gcry_serpent_ocb_crypt)
(_gcry_serpent_ocb_auth): Ditto.
* cipher/rijndael-aesni.c (get_l): Remove low-bit checks.
(aes_ocb_enc, aes_ocb_dec, _gcry_aes_aesni_ocb_auth): Handle leading
blocks until block counter is multiple of 4, so that parallel block
processing loop can use 'c->u_mode.ocb.L' array directly.
* tests/basic.c (check_ocb_cipher_largebuf): Rename to...
(check_ocb_cipher_largebuf_split): ...this and add option to process
large buffer as two split buffers.
(check_ocb_cipher_largebuf): New.
--

Patch simplifies the source and reduces object size.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd carryless 8-bit addition fast-path for AES-NI CTR mode
Jussi Kivilinna [Mon, 10 Aug 2015 17:48:02 +0000 (20:48 +0300)]
Add carryless 8-bit addition fast-path for AES-NI CTR mode

* cipher/rijndael-aesni.c (do_aesni_ctr_4): Do addition using
CTR in big-endian form, if least-significant byte does not overflow.
--

Patch improves AES-NI CTR speed by 20%.

Benchmark on Intel Haswell (3.2 Ghz):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        CTR enc |     0.273 ns/B    3489.8 MiB/s     0.875 c/B
        CTR dec |     0.273 ns/B    3491.0 MiB/s     0.874 c/B

After:
        CTR enc |     0.228 ns/B    4190.0 MiB/s     0.729 c/B
        CTR dec |     0.228 ns/B    4190.2 MiB/s     0.729 c/B
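
The fast path can be sketched in plain C: when adding a small value to a
big-endian counter does not overflow its least-significant byte, only that
byte changes and no carry propagation is needed. The function names are
hypothetical; the real code works on SSE registers in inline assembly.

```c
#include <stdint.h>
#include <string.h>

/* Generic big-endian addition with carry propagation. */
static void ctr_add_generic (unsigned char ctr[16], unsigned add)
{
  uint64_t carry = add;
  int i;

  for (i = 15; i >= 0 && carry; i--)
    {
      carry += ctr[i];
      ctr[i] = (unsigned char)carry;
      carry >>= 8;
    }
}

/* Take the carryless byte-wise path when the last byte cannot overflow. */
static void ctr_add (unsigned char ctr[16], unsigned add)
{
  if (add <= 255u - ctr[15])
    ctr[15] = (unsigned char)(ctr[15] + add);
  else
    ctr_add_generic (ctr, add);
}

/* Returns 1 if both functions agree for every last byte and small ADD. */
static int ctr_check (void)
{
  unsigned last, add;

  for (last = 0; last < 256; last++)
    for (add = 0; add <= 8; add++)
      {
        unsigned char a[16], b[16];

        memset (a, 0xff, 16);   /* long carry chain for the slow path */
        a[0] = 0;
        a[15] = (unsigned char)last;
        memcpy (b, a, 16);
        ctr_add (a, add);
        ctr_add_generic (b, add);
        if (memcmp (a, b, 16))
          return 0;
      }
  return 1;
}
```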

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd additional SHA3 test-vectors
Jussi Kivilinna [Sun, 9 Aug 2015 15:33:35 +0000 (18:33 +0300)]
Add additional SHA3 test-vectors

* tests/basic.c (check_digests): Allow datalen to be specified so that
input data can have byte with value 0x00; Include sha3-*.h header files
to test-vector structure.
* tests/sha3-224.h: New.
* tests/sha3-256.h: New.
* tests/sha3-384.h: New.
* tests/sha3-512.h: New.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd generic SHA3 implementation
Jussi Kivilinna [Mon, 10 Aug 2015 19:09:56 +0000 (22:09 +0300)]
Add generic SHA3 implementation

* cipher/hash-common.h (MD_BLOCK_MAX_BLOCKSIZE): Increase blocksize
when USE_SHA3 is enabled.
* cipher/keccak.c (SHA3_DELIMITED_SUFFIX, SHAKE_DELIMITED_SUFFIX): New.
(KECCAK_STATE): Add proper state.
(KECCAK_CONTEXT): Add 'outlen'.
(rol64, keccak_f1600_state_permute, transform_blk, transform): New.
(keccak_init): Add proper initialization.
(keccak_final): Add proper finalization.
(selftests_keccak): Add selftests.
(oid_spec_sha3_224, oid_spec_sha3_256, oid_spec_sha3_384)
(oid_spec_sha3_512): Add OID.
(_gcry_digest_spec_sha3_224, _gcry_digest_spec_sha3_256)
(_gcry_digest_spec_sha3_384, _gcry_digest_spec_sha3_512): Fix output
length.
* cipher/mac-hmac.c (map_mac_algo_to_md): Fix mapping for SHA3-512.
(hmac_get_keylen): Return proper blocksizes for SHA3 algorithms.
[USE_SHA3] (_gcry_mac_type_spec_hmac_sha3_224)
(_gcry_mac_type_spec_hmac_sha3_256, _gcry_mac_type_spec_hmac_sha3_384)
(_gcry_mac_type_spec_hmac_sha3_512): New.
* cipher/mac-internal.h [USE_SHA3] (_gcry_mac_type_spec_hmac_sha3_224)
(_gcry_mac_type_spec_hmac_sha3_256, _gcry_mac_type_spec_hmac_sha3_384)
(_gcry_mac_type_spec_hmac_sha3_512): New.
* cipher/mac.c (mac_list) [USE_SHA3]: Add SHA3 algorithms.
* cipher/md.c (md_open): Use proper SHA-3 blocksizes for HMAC macpads.
* tests/basic.c (check_digests): Add SHA3 test vectors.
--

Patch adds generic implementation for SHA3. Currently missing with this
patch:
 - HMAC SHA3 test vectors, not available from NIST (yet?)
 - ASNs

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoOptimize OCB offset calculation
Jussi Kivilinna [Mon, 10 Aug 2015 19:09:56 +0000 (22:09 +0300)]
Optimize OCB offset calculation

* cipher/cipher-internal.h (ocb_get_l): New.
* cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate)
(ocb_crypt): Use 'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/camellia-glue.c (get_l): Remove.
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/rijndael-aesni.c (get_l): Add fast path for 75% most common
offsets.
(aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Precalculate
offset array when block count matches parallel operation size.
* cipher/rijndael-ssse3-amd64.c (get_l): Add fast path for 75% most
common offsets.
* cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Use
'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/serpent.c (get_l): Remove.
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/twofish.c (get_l): Remove.
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Use 'ocb_get_l'
instead of 'get_l'.
--

Patch optimizes OCB offset calculation for generic code and
assembly implementations with parallel block processing.

Benchmark of OCB AES-NI on Intel Haswell:

 $ tests/bench-slope --cpu-mhz 3201 cipher aes

 Before:
  AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
         CTR enc |     0.274 ns/B    3483.9 MiB/s     0.876 c/B
         CTR dec |     0.273 ns/B    3490.0 MiB/s     0.875 c/B
         OCB enc |     0.289 ns/B    3296.1 MiB/s     0.926 c/B
         OCB dec |     0.299 ns/B    3189.9 MiB/s     0.957 c/B
        OCB auth |     0.260 ns/B    3670.0 MiB/s     0.832 c/B

 After:
  AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
         CTR enc |     0.273 ns/B    3489.4 MiB/s     0.875 c/B
         CTR dec |     0.273 ns/B    3487.5 MiB/s     0.875 c/B
         OCB enc |     0.248 ns/B    3852.8 MiB/s     0.792 c/B
         OCB dec |     0.261 ns/B    3659.5 MiB/s     0.834 c/B
        OCB auth |     0.227 ns/B    4205.5 MiB/s     0.726 c/B
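
Some background on why the offsets can be precalculated: in OCB
(RFC 7253) the offset for block i is updated with L[ntz(i)], where ntz is the
number of trailing zero bits. The sketch below, with hypothetical helper
names, shows the pattern for four consecutive blocks starting at a multiple
of four, and that three quarters of all block numbers need only L[0] or L[1].

```c
#include <stdint.h>

/* Number of trailing zero bits of N; N must be non-zero. */
static unsigned ntz (uint64_t n)
{
  unsigned c = 0;

  while (!(n & 1))
    {
      n >>= 1;
      c++;
    }
  return c;
}

/* With BLKN a multiple of 4, blocks BLKN+1 .. BLKN+4 use L[0], L[1],
   L[0] and one L entry that depends on the block number. */
static int ocb_pattern_holds (uint64_t blkn)
{
  return ntz (blkn + 1) == 0 && ntz (blkn + 2) == 1
         && ntz (blkn + 3) == 0 && ntz (blkn + 4) >= 2;
}

/* Count the block numbers in 1..TOTAL that need only L[0] or L[1]. */
static unsigned count_fast (unsigned total)
{
  unsigned i, n = 0;

  for (i = 1; i <= total; i++)
    if (ntz (i) < 2)
      n++;
  return n;
}
```

Only every fourth block number requires a lookup that depends on the
counter, which is what makes a mostly fixed offset array workable for 4-way
parallel code.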

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoecc: fix Montgomery curve bugs.
NIIBE Yutaka [Mon, 10 Aug 2015 10:09:16 +0000 (19:09 +0900)]
ecc: fix Montgomery curve bugs.

* cipher/ecc.c (check_secret_key): Y1 should not be NULL when check.
(ecc_check_secret_key): Support Montgomery curve.
* mpi/ec.c (_gcry_mpi_ec_curve_point): Fix condition.

4 years agoAdd framework to eventually support SHA3.
Werner Koch [Sat, 8 Aug 2015 08:47:55 +0000 (10:47 +0200)]
Add framework to eventually support SHA3.

* src/gcrypt.h.in (GCRY_MD_SHA3_224, GCRY_MD_SHA3_256)
(GCRY_MD_SHA3_384, GCRY_MD_SHA3_512): New.
(GCRY_MAC_HMAC_SHA3_224, GCRY_MAC_HMAC_SHA3_256)
(GCRY_MAC_HMAC_SHA3_384, GCRY_MAC_HMAC_SHA3_512): New.
* cipher/keccak.c: New with stub functions.
* cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add keccak.c.
* configure.ac (available_digests): Add sha3.
(USE_SHA3): New.
* src/fips.c (run_hmac_selftests): Add SHA3 to the required selftests.
* cipher/md.c (digest_list) [USE_SHA3]: Add standard SHA3 algos.
(md_open): Ditto for hmac processing.
* cipher/mac-hmac.c (map_mac_algo_to_md): Add mapping.
* cipher/hmac-tests.c (run_selftests): Prepare for tests.
* cipher/pubkey-util.c (get_hash_algo): Add "sha3-xxx".
--

Note that the GCRY_MD_SHA3_xxx algo ids are preliminary.  We should try to
sync them with OpenPGP.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agotools: Fix memory leak for functions "I" and "G".
Werner Koch [Thu, 6 Aug 2015 12:57:44 +0000 (14:57 +0200)]
tools: Fix memory leak for functions "I" and "G".

* src/mpicalc.c (do_inv, do_gcd): Init A after stack check.
--

Reported-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoecc: Free memory also when in error branch.
Ismo Puustinen [Wed, 5 Aug 2015 12:27:43 +0000 (15:27 +0300)]
ecc: Free memory also when in error branch.

* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_sign): Init DIGEST and goto
leave on error.
--

Fixing an issue found by static analysis.

Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Added DIGEST init and wrote Changelog.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoAdd Curve25519 support.
NIIBE Yutaka [Thu, 6 Aug 2015 08:31:41 +0000 (17:31 +0900)]
Add Curve25519 support.

* cipher/ecc-curves.c (curve_aliases, domain_parms): Add Curve25519.
* tests/curves.c (N_CURVES): It's 22 now.
* src/cipher.h (PUBKEY_FLAG_DJB_TWEAK): New.
* cipher/ecc-common.h (_gcry_ecc_mont_decodepoint): New.
* cipher/ecc-misc.c (_gcry_ecc_mont_decodepoint): New.
* cipher/ecc.c (nist_generate_key): Handle the case of
PUBKEY_FLAG_DJB_TWEAK and Montgomery curve.
(test_ecdh_only_keys, check_secret_key): Likewise.
(ecc_generate): Support Curve25519 which is Montgomery curve with flag
PUBKEY_FLAG_DJB_TWEAK and PUBKEY_FLAG_COMP.
(ecc_encrypt_raw): Get flags from KEYPARMS and handle
PUBKEY_FLAG_DJB_TWEAK and Montgomery curve.
(ecc_decrypt_raw): Likewise.
(compute_keygrip): Handle the case of PUBKEY_FLAG_DJB_TWEAK.
* cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist):
PUBKEY_FLAG_EDDSA implies PUBKEY_FLAG_DJB_TWEAK.
Parse "djb-tweak" for PUBKEY_FLAG_DJB_TWEAK.

--

With PUBKEY_FLAG_DJB_TWEAK, the secret key has its msb set and should
always be a multiple of the cofactor.
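
For reference, the scalar "clamping" from RFC 7748 has exactly these two
properties; the sketch below assumes that this is the tweak meant here and
uses a hypothetical function name. The scalar is a 32-byte little-endian
value.

```c
/* Clamp a little-endian 32-byte Curve25519 scalar as in RFC 7748. */
static void x25519_clamp (unsigned char k[32])
{
  k[0] &= 248;    /* clear the low 3 bits: multiple of the cofactor 8 */
  k[31] &= 127;   /* clear bit 255 */
  k[31] |= 64;    /* set bit 254, the msb of the 255-bit scalar */
}

/* Returns 1 if clamping all-ones and all-zero scalars gives the
   expected first and last bytes. */
static int clamp_demo (void)
{
  unsigned char a[32], b[32];
  int i;

  for (i = 0; i < 32; i++)
    {
      a[i] = 0xff;
      b[i] = 0x00;
    }
  x25519_clamp (a);
  x25519_clamp (b);
  return a[0] == 0xf8 && a[31] == 0x7f && b[0] == 0x00 && b[31] == 0x40;
}
```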

4 years agoReduce code size for Twofish key-setup and remove key dependent branch
Jussi Kivilinna [Mon, 13 Jul 2015 13:16:13 +0000 (16:16 +0300)]
Reduce code size for Twofish key-setup and remove key dependent branch

* cipher/twofish.c (poly_to_exp): Increase size by one, change type
from byte to u16 and insert '492' to index 0.
(exp_to_poly): Increase size by 256, let new cells have zero value.
(CALC_S): Execute unconditionally with help of modified tables.
(do_twofish_setkey): Change type for 'tmp' to 'unsigned int'; Un-unroll
CALC_K256 and CALC_K phases to reduce generated object size.
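
The general trick of replacing an "is the input zero?" branch with a sentinel
index into a zero-padded table can be shown with generic GF(2^8) log/antilog
tables. This sketch uses the AES polynomial 0x11b with generator 3 and its
own table layout; it is an illustration of the technique only and does not
reproduce the Twofish tables or the value 492.

```c
#include <stdint.h>

#define SENTINEL 510   /* log of zero: points into the all-zero tail */

static uint16_t log_tab[256];
static uint8_t exp_tab[SENTINEL + 255];   /* cells from SENTINEL on stay 0 */

/* Bitwise reference multiplication in GF(2^8) modulo 0x11b. */
static uint8_t gf_mul_ref (uint8_t a, uint8_t b)
{
  uint8_t r = 0;

  while (b)
    {
      if (b & 1)
        r ^= a;
      a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
      b >>= 1;
    }
  return r;
}

/* Build the tables; exp_tab is doubled so log sums need no reduction. */
static void gf_init (void)
{
  unsigned i;
  uint8_t x = 1;

  for (i = 0; i < 255; i++)
    {
      exp_tab[i] = exp_tab[i + 255] = x;
      log_tab[x] = (uint16_t)i;
      x = gf_mul_ref (x, 3);
    }
  log_tab[0] = SENTINEL;
}

/* Multiply a possibly-zero X by a non-zero constant C, without a branch:
   for X == 0 the index lands in the zero-filled tail of exp_tab. */
static uint8_t gf_mul_const (uint8_t x, uint8_t c)
{
  return exp_tab[log_tab[x] + log_tab[c]];
}

/* Returns 1 if the table version matches the reference for all inputs. */
static int gf_check (void)
{
  unsigned x, c;

  gf_init ();
  for (x = 0; x < 256; x++)
    for (c = 1; c < 256; c++)
      if (gf_mul_const ((uint8_t)x, (uint8_t)c)
          != gf_mul_ref ((uint8_t)x, (uint8_t)c))
        return 0;
  return 1;
}
```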
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoReduce amount of duplicated code in OCB bulk implementations
Jussi Kivilinna [Sun, 26 Jul 2015 20:39:51 +0000 (23:39 +0300)]
Reduce amount of duplicated code in OCB bulk implementations

* cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate)
(ocb_crypt): Change bulk function to return number of unprocessed
blocks.
* src/cipher.h (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth)
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth)
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Change return type
to 'size_t'.
* cipher/camellia-glue.c (get_l): Only if USE_AESNI_AVX or
USE_AESNI_AVX2 defined.
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_AESNI_AVX or
USE_AESNI_AVX2 defined; Remove unaccelerated common code.
* cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Change
return type to 'size_t' and return zero.
* cipher/serpent.c (get_l): Only if USE_SSE2, USE_AVX2 or USE_NEON
defined.
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_SSE2, USE_AVX2 or
USE_NEON defined; Remove unaccelerated common code.
* cipher/twofish.c (get_l): Only if USE_AMD64_ASM defined.
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_AMD64_ASM defined;
Remove unaccelerated common code.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd bulk OCB for Serpent SSE2, AVX2 and NEON implementations
Jussi Kivilinna [Sun, 26 Jul 2015 14:17:20 +0000 (17:17 +0300)]
Add bulk OCB for Serpent SSE2, AVX2 and NEON implementations

* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Serpent.
* cipher/serpent-armv7-neon.S: Add OCB assembly functions.
* cipher/serpent-avx2-amd64.S: Add OCB assembly functions.
* cipher/serpent-sse2-amd64.S: Add OCB assembly functions.
* cipher/serpent.c (_gcry_serpent_sse2_ocb_enc)
(_gcry_serpent_sse2_ocb_dec, _gcry_serpent_sse2_ocb_auth)
(_gcry_serpent_neon_ocb_enc, _gcry_serpent_neon_ocb_dec)
(_gcry_serpent_neon_ocb_auth, _gcry_serpent_avx2_ocb_enc)
(_gcry_serpent_avx2_ocb_dec, _gcry_serpent_avx2_ocb_auth): New
prototypes.
(get_l, _gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): New.
* src/cipher.h (_gcry_serpent_ocb_crypt)
(_gcry_serpent_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for serpent.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd bulk OCB for Twofish AMD64 implementation
Jussi Kivilinna [Tue, 7 Jul 2015 18:52:34 +0000 (21:52 +0300)]
Add bulk OCB for Twofish AMD64 implementation

* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Twofish.
* cipher/twofish-amd64.S: Add OCB assembly functions.
* cipher/twofish.c (_gcry_twofish_amd64_ocb_enc)
(_gcry_twofish_amd64_ocb_dec, _gcry_twofish_amd64_ocb_auth): New
prototypes.
(call_sysv_fn5, call_sysv_fn6, twofish_amd64_ocb_enc)
(twofish_amd64_ocb_dec, twofish_amd64_ocb_auth, get_l)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): New.
* src/cipher.h (_gcry_twofish_ocb_crypt)
(_gcry_twofish_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for Twofish.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years ago  Add bulk OCB for Camellia AES-NI/AVX and AES-NI/AVX2 implementations
Jussi Kivilinna [Tue, 7 Jul 2015 18:49:57 +0000 (21:49 +0300)]
Add bulk OCB for Camellia AES-NI/AVX and AES-NI/AVX2 implementations

* cipher/camellia-aesni-avx-amd64.S: Add OCB assembly functions.
* cipher/camellia-aesni-avx2-amd64.S: Add OCB assembly functions.
* cipher/camellia-glue.c (_gcry_camellia_aesni_avx_ocb_enc)
(_gcry_camellia_aesni_avx_ocb_dec, _gcry_camellia_aesni_avx_ocb_auth)
(_gcry_camellia_aesni_avx2_ocb_enc, _gcry_camellia_aesni_avx2_ocb_dec)
(_gcry_camellia_aesni_avx2_ocb_auth): New prototypes.
(get_l, _gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): New.
* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Camellia.
* src/cipher.h (_gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for Camellia.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years ago  Add OCB bulk mode for AES SSSE3 implementation
Jussi Kivilinna [Sun, 5 Jul 2015 17:58:56 +0000 (20:58 +0300)]
Add OCB bulk mode for AES SSSE3 implementation

* cipher/rijndael-ssse3-amd64.c (SSSE3_STATE_SIZE): New.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (vpaes_ssse3_prepare): Use
'ssse3_state' for storing current SSSE3 state.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]
(vpaes_ssse3_cleanup): Restore SSSE3 state from 'ssse3_state'.
(_gcry_aes_ssse3_do_setkey, _gcry_aes_ssse3_prepare_decryption)
(_gcry_aes_ssse3_encrypt, _gcry_aes_ssse3_cfb_enc)
(_gcry_aes_ssse3_cbc_enc, _gcry_aes_ssse3_ctr_enc)
(_gcry_aes_ssse3_decrypt, _gcry_aes_ssse3_cfb_dec)
(_gcry_aes_ssse3_cbc_dec): Add 'ssse3_state'
array.
(get_l, ssse3_ocb_enc, ssse3_ocb_dec, _gcry_aes_ssse3_ocb_crypt)
(_gcry_aes_ssse3_ocb_auth): New.
* cipher/rijndael.c (_gcry_aes_ssse3_ocb_crypt)
(_gcry_aes_ssse3_ocb_auth): New.
(_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth) [USE_SSSE3]: Use SSSE3
implementation for OCB.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years ago  Fix undefined behavior wrt memcpy
Peter Wu [Sun, 26 Jul 2015 13:50:33 +0000 (16:50 +0300)]
Fix undefined behavior wrt memcpy

* cipher/cipher-gcm.c: Do not copy zero bytes from an empty buffer. Let
the function continue to add padding as needed though.
* cipher/mac-poly1305.c: If the caller requested to finish the hash
function without a copy of the result, return immediately.
--
Caught by UndefinedBehaviorSanitizer.
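
A minimal sketch of the kind of guard described above, assuming a
hypothetical helper buf_append (it is not part of libgcrypt).  In C,
passing a null pointer to memcpy is undefined even when the length is
zero, so the copy is skipped entirely when there is nothing to copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append LEN bytes from SRC to BUF, which holds
   BUFSIZE bytes of which *USED are already filled.  Returns 0 on
   success and -1 if the data does not fit.  */
static int
buf_append (unsigned char *buf, size_t bufsize, size_t *used,
            const void *src, size_t len)
{
  assert (buf && used && *used <= bufsize);

  if (len > bufsize - *used)
    return -1;

  /* An empty input may come with SRC == NULL.  Calling memcpy with a
     null pointer is undefined behavior even for a zero length, so do
     not call it at all in that case.  */
  if (len)
    {
      memcpy (buf + *used, src, len);
      *used += len;
    }
  return 0;
}
```

Calling buf_append (buf, sizeof buf, &used, NULL, 0) is then a
harmless no-op instead of undefined behavior.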

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
4 years ago  build: ignore scissor line for the commit-msg hook
Peter Wu [Thu, 9 Jul 2015 15:11:33 +0000 (17:11 +0200)]
build: ignore scissor line for the commit-msg hook

* build-aux/git-hooks/commit-msg: Stop processing more lines when the
  scissor line is encountered.
--
This allows the command `git commit -v` to work even if lines of the
included diff are longer than 72 characters. Note that comment lines
are already ignored by the preceding line of the hook.
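
The check described above can be sketched in C as follows.  This is
not the hook itself, only an illustration under assumptions: the
function name check_line_lengths is hypothetical, and SCISSORS is
assumed to be the marker line that `git commit -v` places above the
diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINE_LEN 72
/* Assumed form of the scissor line written by `git commit -v`.  */
#define SCISSORS "# ------------------------ >8 ------------------------"

/* Return 0 if every checked line of MSG has at most MAX_LINE_LEN
   characters, else the 1-based number of the first line which is
   too long.  Comment lines are skipped and checking stops at the
   scissor line, so the diff below it is never examined.  */
static int
check_line_lengths (const char *msg)
{
  int lineno = 0;

  assert (msg);
  while (*msg)
    {
      size_t len = strcspn (msg, "\n");

      lineno++;
      if (len == strlen (SCISSORS) && !strncmp (msg, SCISSORS, len))
        break;                  /* Everything below is the diff.  */
      if (*msg != '#' && len > MAX_LINE_LEN)
        return lineno;
      msg += len;
      if (*msg == '\n')
        msg++;
    }
  return 0;
}
```

A message whose only over-long lines sit below the scissor line
passes; the same lines above it are reported.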

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
4 years ago  Register DCO for Peter Wu.
Werner Koch [Thu, 23 Jul 2015 12:38:49 +0000 (14:38 +0200)]
Register DCO for Peter Wu.

--

4 years ago  rsa: Fix error in comments.
Peter Wu [Thu, 16 Jul 2015 04:59:44 +0000 (13:59 +0900)]
rsa: Fix error in comments.

* cipher/rsa.c: Fix.

--

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
4 years ago  sexp: Fix invalid deallocation in error path.
Peter Wu [Tue, 14 Jul 2015 00:53:38 +0000 (09:53 +0900)]
sexp: Fix invalid deallocation in error path.

* src/sexp.c: Fix wrong condition.

--

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
4 years ago  ecc: fix memory leak.
Peter Wu [Fri, 10 Jul 2015 01:15:26 +0000 (10:15 +0900)]
ecc: fix memory leak.

* cipher/ecc.c (ecc_verify): Release memory which was allocated before
by _gcry_pk_util_preparse_sigval.
(ecc_decrypt_raw): Likewise.

--

Caught by LeakSanitizer (LSan). Now the test suite (make check) passes
with no memleaks.
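
The pattern behind such a fix can be sketched as a single exit path
which releases everything that was allocated before the error.  The
code below is a generic illustration, not the code from
cipher/ecc.c: demo_alloc, demo_free, demo_verify and the FAIL_AT
parameter are hypothetical and exist only to make the leak
observable.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;   /* Outstanding allocations (demo only).  */

static void *
demo_alloc (size_t n)
{
  void *p = malloc (n);

  if (p)
    live_allocs++;
  return p;
}

static void
demo_free (void *p)
{
  if (p)
    {
      assert (live_allocs > 0);
      free (p);
      live_allocs--;
    }
}

/* Hypothetical verify function.  FAIL_AT simulates an error after
   the first (1) or second (2) allocation; 0 means success.  Every
   path leaves through the same label, so nothing allocated earlier
   is lost when a later step fails.  */
static int
demo_verify (int fail_at)
{
  int rc = 0;
  void *sig = NULL;
  void *key = NULL;

  sig = demo_alloc (32);
  if (!sig || fail_at == 1)
    {
      rc = -1;
      goto leave;
    }

  key = demo_alloc (32);
  if (!key || fail_at == 2)
    {
      rc = -1;
      goto leave;
    }

  /* The actual verification would happen here.  */

 leave:
  demo_free (key);
  demo_free (sig);
  return rc;
}
```

After any call to demo_verify, successful or not, live_allocs is back
at zero, which is the property a leak checker tests for.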

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
The last commit (0a7547e487a8bc4e7ac9599c55579eb2e4a13f06) includes
wrong fixes for sexp_release.

ecc_decrypt_raw fix added by gniibe.

4 years ago  ecc: fix memory leaks.
NIIBE Yutaka [Mon, 6 Jul 2015 03:01:00 +0000 (12:01 +0900)]
ecc: fix memory leaks.

* cipher/ecc.c (ecc_generate): Fix memory leak on error of
_gcry_pk_util_parse_flaglist and _gcry_ecc_eddsa_encodepoint.
(ecc_check_secret_key): Fix memory leak on error of
_gcry_ecc_update_curve_param.
(ecc_sign, ecc_verify, ecc_encrypt_raw, ecc_decrypt_raw): Remove
unnecessary sexp_release and fix memory leak on error of
_gcry_ecc_fill_in_curve.
(ecc_decrypt_raw): Fix double free of the point kG and memory leak
on error of _gcry_ecc_os2ec.

4 years ago  mpi: Support FreeBSD 10 or later.
NIIBE Yutaka [Thu, 11 Jun 2015 07:19:49 +0000 (16:19 +0900)]
mpi: Support FreeBSD 10 or later.

* mpi/config.links: Include FreeBSD 10 to 29.

--

Thanks to Yuta SATOH.

GnuPG-bug-id: 1936, 1974

4 years ago  ecc: Add key generation flag "no-keytest".
Werner Koch [Thu, 21 May 2015 14:24:36 +0000 (16:24 +0200)]
ecc: Add key generation flag "no-keytest".

* src/cipher.h (PUBKEY_FLAG_NO_KEYTEST): New.
* cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Add flag
"no-keytest".  Return an error for invalid flags of length 10.

* cipher/ecc.c (nist_generate_key): Replace arg random_level by flags
and set random level depending on flags.
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_genkey): Ditto.
* cipher/ecc.c (ecc_generate): Pass flags to generate function and
remove var random_level.
(nist_generate_key): Implement "no-keytest" flag.

* tests/keygen.c (check_ecc_keys): Add tests for transient-key and
no-keytest.
--

After key creation we usually run a test to check whether the keys
really work.  However for transient keys this might be too time
consuming and given that a failed test would anyway abort the process
the optional use of a flag to skip the test is appropriate.

Using Ed25519 for EdDSA and the "no-keytest" flag halves the time to
create such a key.  This was measured by looping the last test from
check_ecc_keys() 1000 times with and without the flag.

Due to a bug in the flags parser, unknown flags with a length of 10
characters were not detected.  Thus the "no-keytest" flag can be
passed by all software, even to library versions before this one,
which do not reject it.  That bug is fixed in this version.
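
How such a parser bug can arise is sketched below.  This is a
hypothetical miniature, not the code from cipher/pubkey-util.c: the
function parse_flag and the FLAG_* constants are made up for the
illustration.  The parser dispatches on the length of the flag
string, so each length case needs its own "unknown flag" branch; if
one case lacks it, unknown flags of exactly that length are accepted
without an error.

```c
#include <assert.h>
#include <string.h>

#define FLAG_TRANSIENT_KEY 1   /* Hypothetical flag bits.  */
#define FLAG_NO_KEYTEST    2

/* Parse the single flag S and set the matching bit in *FLAGS.
   Returns 0 on success and -1 for an unknown flag.  */
static int
parse_flag (const char *s, unsigned int *flags)
{
  size_t n;

  assert (s && flags);
  n = strlen (s);
  switch (n)
    {
    case 10:
      if (!memcmp (s, "no-keytest", n))
        *flags |= FLAG_NO_KEYTEST;
      else
        return -1;  /* Without this branch any unknown flag of
                       length 10 would be silently accepted.  */
      break;

    case 13:
      if (!memcmp (s, "transient-key", n))
        *flags |= FLAG_TRANSIENT_KEY;
      else
        return -1;
      break;

    default:
      return -1;
    }
  return 0;
}
```

In this sketch "no-keytest" and "transient-key" are accepted, while
an unknown 10 character string such as "abcdefghij" is rejected.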

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years ago  ecc: Avoid double conversion to affine coordinates in keygen.
Werner Koch [Thu, 21 May 2015 09:12:42 +0000 (11:12 +0200)]
ecc: Avoid double conversion to affine coordinates in keygen.

* cipher/ecc.c (nist_generate_key): Add args r_x and r_y.
(ecc_generate): Rename vars.  Convert to affine coordinates only if
not returned by the lower level generation function.
--

nist_generate_key already needs to convert to affine coordinates to
implement Jivsov's trick.  Thus we can return them and avoid doing
the conversion again in ecc_generate.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years ago  random: Change initial extra seeding from 2400 bits to 128 bits.
Werner Koch [Mon, 4 May 2015 14:46:02 +0000 (16:46 +0200)]
random: Change initial extra seeding from 2400 bits to 128 bits.

* random/random-csprng.c (read_pool): Reduce initial seeding.
--

See discussion starting at
 https://lists.gnupg.org/pipermail/gnupg-devel/2015-April/029750.html
and also in May.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years ago  Enable AMD64 Twofish implementation on WIN64
Jussi Kivilinna [Thu, 14 May 2015 10:07:34 +0000 (13:07 +0300)]
Enable AMD64 Twofish implementation on WIN64

* cipher/twofish-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/twofish.c (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (call_sysv_fn): New.
(twofish_amd64_encrypt_block, twofish_amd64_decrypt_block)
(twofish_amd64_ctr_enc, twofish_amd64_cbc_dec)
(twofish_amd64_cfb_dec): New wrapper functions for AMD64
assembly functions.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>