libgcrypt.git
NIIBE Yutaka [Wed, 25 Nov 2015 01:52:57 +0000 (10:52 +0900)]
mpi: Fix mpi_set_cond and mpi_swap_cond .

* mpi/mpiutil.c (_gcry_mpi_set_cond, _gcry_mpi_swap_cond): Don't use
the !! operator; assume SET/SWAP is 0 or 1.

--

If the code generated for !! includes a branch, it defeats the purpose
of mpi_set_cond/mpi_swap_cond entirely.  It is better to make sure that
these functions are always called with 0 or 1 for SET/SWAP.  Note that
this requirement is met when SET/SWAP is the result of mpi_test_bit.
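A minimal sketch of the idea, not the libgcrypt code itself: the names
limb_t, limbs_swap_cond and test_bit are hypothetical.  The swap flag
is turned into an all-zero or all-one mask by arithmetic, so the swap
loop contains no data-dependent branch.  The assert is only a debugging
aid for the sketch and is compiled out with NDEBUG.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long limb_t;   /* hypothetical limb type for this sketch */

/* Swap a[0..n) and b[0..n) when swap == 1; leave both untouched when
   swap == 0.  The caller guarantees 0 or 1, so no "!!" is needed.  */
void
limbs_swap_cond (limb_t *a, limb_t *b, size_t n, unsigned long swap)
{
  limb_t mask;
  size_t i;

  assert (swap <= 1);
  mask = (limb_t)0 - (limb_t)swap;      /* 0 -> 0, 1 -> all ones */
  for (i = 0; i < n; i++)
    {
      limb_t x = mask & (a[i] ^ b[i]);  /* zero when swap == 0 */
      a[i] ^= x;
      b[i] ^= x;
    }
}

/* Return bit N of W as exactly 0 or 1, using only a shift and a mask.  */
unsigned long
test_bit (limb_t w, unsigned int n)
{
  return (w >> n) & 1;
}
```

A flag produced this way can be passed straight to the swap routine
without any normalization.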

Reported-by: Taylor R Campbell.
NIIBE Yutaka [Wed, 25 Nov 2015 01:42:47 +0000 (10:42 +0900)]
ecc: multiplication of Edwards curve to be constant-time.

* mpi/ec.c (_gcry_mpi_ec_mul_point): Use point_swap_cond.

--

Reported-by: Taylor R Campbell.
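The loop structure this commit aims for can be sketched as follows.
This is an illustration under stated assumptions, not the code in
mpi/ec.c: the additive group Z_m stands in for the curve group, and
the names u32_swap_cond and scalar_mul_ct are hypothetical.  Every
iteration performs one "double" and one "add", and the scalar bit only
selects the result through a masked swap.  The % operator is not
guaranteed to be constant-time; only the control flow is the point.

```c
#include <assert.h>
#include <stdint.h>

/* Swap *a and *b when swap == 1, using a mask instead of a branch.  */
void
u32_swap_cond (uint32_t *a, uint32_t *b, uint32_t swap)
{
  uint32_t mask, x;

  assert (swap <= 1);
  mask = (uint32_t)0 - swap;
  x = mask & (*a ^ *b);
  *a ^= x;
  *b ^= x;
}

/* Compute (scalar * point) mod m in Z_m.  Requires point < m and
   m < 2^31 so that the intermediate sums cannot overflow.  */
uint32_t
scalar_mul_ct (uint32_t scalar, uint32_t point, uint32_t m)
{
  uint32_t result = 0;          /* neutral element */
  uint32_t tmp;
  int j;

  assert (m > 0 && m < 0x80000000u && point < m);
  for (j = 31; j >= 0; j--)
    {
      result = (result + result) % m;   /* always "double" */
      tmp = (result + point) % m;       /* always "add" */
      /* Keep the sum only if bit j of the scalar is set.  */
      u32_swap_cond (&result, &tmp, (scalar >> j) & 1);
    }
  return result;
}
```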
NIIBE Yutaka [Wed, 25 Nov 2015 01:19:39 +0000 (10:19 +0900)]
ecc: Add point_resize and point_swap_cond.

* mpi/ec.c (point_resize, point_swap_cond): New.
(_gcry_mpi_ec_mul_point): Use point_resize and point_swap_cond.

--

Thanks to Taylor R Campbell for the suggestion.

Justus Winter [Tue, 17 Nov 2015 15:00:16 +0000 (16:00 +0100)]
cipher: Fix error handling.

* cipher/cipher.c (_gcry_cipher_ctl): Fix error handling.
--
Found using the Clang Static Analyzer.

Signed-off-by: Justus Winter <justus@g10code.com>
Jussi Kivilinna [Wed, 18 Nov 2015 07:44:18 +0000 (09:44 +0200)]
Tweak Keccak for small speed-up

* cipher/keccak_permute_32.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Track
rounds with round constant pointer instead of separate round counter.
* cipher/keccak_permute_64.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Ditto.
(KECCAK_F1600_ABSORB_FUNC_NAME): Tweak lanes pointer increment for bulk
absorb loops.
--

Patch makes small tweaks to improve performance.
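The round-tracking tweak can be illustrated with a toy loop.  This is
a sketch only: the table below holds placeholder values rather than
the Keccak round constants, the round body is a stand-in rotation, and
all names are hypothetical.  Both functions compute the same result;
the second one needs no separate counter because the constant pointer
itself tells when the last round has been done.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define N_ROUNDS 4   /* toy value; Keccak-f[1600] uses 24 rounds */

/* Placeholder constants for illustration only.  */
static const uint64_t toy_round_consts[N_ROUNDS] = {
  0x11, 0x22, 0x33, 0x44
};

/* Variant with a separate round counter.  */
uint64_t
permute_with_counter (uint64_t lane)
{
  size_t round;

  for (round = 0; round < N_ROUNDS; round++)
    {
      lane = (lane << 1) | (lane >> 63);   /* stand-in for the round body */
      lane ^= toy_round_consts[round];
    }
  return lane;
}

/* Variant that tracks progress with the round constant pointer.  */
uint64_t
permute_with_pointer (uint64_t lane)
{
  const uint64_t *rc = toy_round_consts;
  const uint64_t *rc_end = toy_round_consts + N_ROUNDS;

  do
    {
      lane = (lane << 1) | (lane >> 63);   /* same stand-in round body */
      lane ^= *rc++;
    }
  while (rc < rc_end);

  assert (rc == rc_end);   /* every constant was consumed exactly once */
  return lane;
}
```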

Benchmark on Intel Haswell @ 3.2 GHz:

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |      2.27 ns/B     420.5 MiB/s      7.26 c/B
 SHAKE256       |      2.79 ns/B     341.4 MiB/s      8.94 c/B
 SHA3-224       |      2.64 ns/B     361.7 MiB/s      8.44 c/B
 SHA3-256       |      2.79 ns/B     341.4 MiB/s      8.94 c/B
 SHA3-384       |      3.65 ns/B     261.3 MiB/s     11.68 c/B
 SHA3-512       |      5.27 ns/B     181.0 MiB/s     16.86 c/B

After:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |      2.25 ns/B     423.5 MiB/s      7.21 c/B
 SHAKE256       |      2.77 ns/B     343.9 MiB/s      8.88 c/B
 SHA3-224       |      2.62 ns/B     364.1 MiB/s      8.38 c/B
 SHA3-256       |      2.77 ns/B     343.8 MiB/s      8.88 c/B
 SHA3-384       |      3.63 ns/B     262.6 MiB/s     11.63 c/B
 SHA3-512       |      5.23 ns/B     182.3 MiB/s     16.75 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 18 Nov 2015 07:44:18 +0000 (09:44 +0200)]
Update license information for CRC

* LICENSES: Remove 'Simple permissive' and 'IETF permissive' licenses
for 'cipher/crc.c' as result of rewrite of CRC implementations.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Justus Winter [Mon, 16 Nov 2015 11:18:47 +0000 (12:18 +0100)]
Fix typos found using codespell

* cipher/cipher-ocb.c: Fix typos.
* cipher/des.c: Likewise.
* cipher/dsa-common.c: Likewise.
* cipher/ecc.c: Likewise.
* cipher/pubkey.c: Likewise.
* cipher/rsa-common.c: Likewise.
* cipher/scrypt.c: Likewise.
* random/random-csprng.c: Likewise.
* random/random-fips.c: Likewise.
* random/rndw32.c: Likewise.
* src/cipher-proto.h: Likewise.
* src/context.c: Likewise.
* src/fips.c: Likewise.
* src/gcrypt.h.in: Likewise.
* src/global.c: Likewise.
* src/sexp.c: Likewise.
* tests/mpitests.c: Likewise.
* tests/t-lock.c: Likewise.

Signed-off-by: Justus Winter <justus@g10code.com>
Jussi Kivilinna [Sun, 1 Nov 2015 18:44:09 +0000 (20:44 +0200)]
Improve performance of Tiger hash algorithms

* cipher/tiger.c (tiger_round, pass, key_schedule): Convert functions
to macros.
(transform_blk): Pass variable names instead of pointers to 'pass'.
--

Benchmark results on Intel Haswell @ 3.2 GHz:

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 TIGER          |      3.25 ns/B     293.5 MiB/s     10.40 c/B

After (1.75x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 TIGER          |      1.85 ns/B     515.3 MiB/s      5.92 c/B

Benchmark results on Cortex-A8 @ 1008 MHz:

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 TIGER          |     63.42 ns/B     15.04 MiB/s     63.93 c/B

After (1.26x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 TIGER          |     49.99 ns/B     19.08 MiB/s     50.39 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 1 Nov 2015 14:06:26 +0000 (16:06 +0200)]
Add ARMv7/NEON implementation of Keccak

* cipher/Makefile.am: Add 'keccak-armv7-neon.S'.
* cipher/keccak-armv7-neon.S: New.
* cipher/keccak.c (USE_64BIT_ARM_NEON): New.
(NEED_COMMON64): Select if USE_64BIT_ARM_NEON.
[NEED_COMMON64] (round_consts_64bit): Rename to...
[NEED_COMMON64] (_gcry_keccak_round_consts_64bit): ...this; Add
terminator at end.
[USE_64BIT_ARM_NEON] (_gcry_keccak_permute_armv7_neon)
(_gcry_keccak_absorb_lanes64_armv7_neon, keccak_permute64_armv7_neon)
(keccak_absorb_lanes64_armv7_neon, keccak_armv7_neon_64_ops): New.
(keccak_init) [USE_64BIT_ARM_NEON]: Select ARM/NEON implementation
if supported by HW.
* cipher/keccak_permute_64.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Update
to use new round constant table.
* configure.ac: Add 'keccak-armv7-neon.lo'.
--

Patch adds ARMv7/NEON implementation of Keccak (SHAKE/SHA3). Patch
is based on public-domain implementation by Ronny Van Keer from
SUPERCOP package:
 https://github.com/floodyberry/supercop/blob/master/crypto_hash/\
keccakc1024/inplace-armv7a-neon/keccak2.s

Benchmark results on Cortex-A8 @ 1008 MHz:

Before (generic 32-bit bit-interleaved impl.):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |     83.00 ns/B     11.49 MiB/s     83.67 c/B
 SHAKE256       |     101.7 ns/B      9.38 MiB/s     102.5 c/B
 SHA3-224       |     96.13 ns/B      9.92 MiB/s     96.90 c/B
 SHA3-256       |     101.5 ns/B      9.40 MiB/s     102.3 c/B
 SHA3-384       |     131.4 ns/B      7.26 MiB/s     132.5 c/B
 SHA3-512       |     189.1 ns/B      5.04 MiB/s     190.6 c/B

After (ARM/NEON, ~3.2x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |     25.09 ns/B     38.01 MiB/s     25.29 c/B
 SHAKE256       |     30.95 ns/B     30.82 MiB/s     31.19 c/B
 SHA3-224       |     29.24 ns/B     32.61 MiB/s     29.48 c/B
 SHA3-256       |     30.95 ns/B     30.82 MiB/s     31.19 c/B
 SHA3-384       |     40.42 ns/B     23.59 MiB/s     40.74 c/B
 SHA3-512       |     58.37 ns/B     16.34 MiB/s     58.84 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 31 Oct 2015 19:29:56 +0000 (21:29 +0200)]
Optimize Keccak 64-bit absorb functions

* cipher/keccak.c [USE_64BIT] [__x86_64__] (absorb_lanes64_8)
(absorb_lanes64_4, absorb_lanes64_2, absorb_lanes64_1): New.
* cipher/keccak.c [USE_64BIT] [!__x86_64__] (absorb_lanes64_8)
(absorb_lanes64_4, absorb_lanes64_2, absorb_lanes64_1): New.
[USE_64BIT] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT] (keccak_absorb_lanes64): Remove.
[USE_64BIT_SHLD] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT_SHLD] (keccak_absorb_lanes64_shld): Remove.
[USE_64BIT_BMI2] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT_BMI2] (keccak_absorb_lanes64_bmi2): Remove.
* cipher/keccak_permute_64.h (KECCAK_F1600_ABSORB_FUNC_NAME): New.
--

Optimize 64-bit absorb functions for small speed-up. After this
change, 64-bit BMI2 implementation matches speed of fastest results
from SUPERCOP for Intel Haswell CPUs (long messages).

Benchmark on Intel Haswell @ 3.2 GHz:

Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |      2.32 ns/B     411.7 MiB/s      7.41 c/B
 SHAKE256       |      2.84 ns/B     336.2 MiB/s      9.08 c/B
 SHA3-224       |      2.69 ns/B     354.9 MiB/s      8.60 c/B
 SHA3-256       |      2.84 ns/B     336.0 MiB/s      9.08 c/B
 SHA3-384       |      3.69 ns/B     258.4 MiB/s     11.81 c/B
 SHA3-512       |      5.30 ns/B     179.9 MiB/s     16.97 c/B

After:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 SHAKE128       |      2.27 ns/B     420.6 MiB/s      7.26 c/B
 SHAKE256       |      2.79 ns/B     341.4 MiB/s      8.94 c/B
 SHA3-224       |      2.64 ns/B     361.7 MiB/s      8.44 c/B
 SHA3-256       |      2.79 ns/B     341.5 MiB/s      8.94 c/B
 SHA3-384       |      3.65 ns/B     261.4 MiB/s     11.68 c/B
 SHA3-512       |      5.27 ns/B     181.0 MiB/s     16.87 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 31 Oct 2015 18:19:59 +0000 (20:19 +0200)]
Enable CRC test vectors with zero bytes

* tests/basic.c (check_digests): Enable CRC test-vectors with zero
bytes.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 18:34:50 +0000 (20:34 +0200)]
Keccak: Add SHAKE Extendable-Output Functions

* src/hash-common.c (_gcry_hash_selftest_check_one): Add handling for
XOFs.
* cipher/keccak.c (keccak_ops_t): Rename 'extract_inplace' to 'extract'
and add 'pos' argument.
(KECCAK_CONTEXT): Add 'suffix'.
(keccak_extract_inplace64): Rename to...
(keccak_extract64): ...this; Add handling for 'pos' argument.
(keccak_extract_inplace32bi): Rename to...
(keccak_extract32bi): ...this; Add handling for 'pos' argument.
(keccak_extract_inplace32bi_bmi2): Rename to...
(keccak_extract32bi_bmi2): ...this; Add handling for 'pos' argument.
(keccak_init): Setup 'suffix'; add SHAKE128 & SHAKE256.
(shake128_init, shake256_init): New.
(keccak_final): Do not initial permute for SHAKE output; use correct
suffix for SHAKE.
(keccak_extract): New.
(keccak_selftests_keccak): Add SHAKE128 & SHAKE256 test-vectors.
(run_selftests): Add SHAKE128 & SHAKE256.
(shake128_asn, oid_spec_shake128, shake256_asn, oid_spec_shake256)
(_gcry_digest_spec_shake128, _gcry_digest_spec_shake256): New.
* cipher/md.c (digest_list): Add SHAKE128 & SHAKE256.
* doc/gcrypt.texi: Ditto.
* src/cipher.h (_gcry_digest_spec_shake128)
(_gcry_digest_spec_shake256): New.
* src/gcrypt.h.in (GCRY_MD_SHAKE128, GCRY_MD_SHAKE256): New.
* tests/basic.c (check_one_md): Add XOF check; Add 'elen' argument.
(check_one_md_multi): Skip if algo is XOF.
(check_digests): Add SHAKE128 & SHAKE256 test vectors.
* tests/bench-slope.c (kdf_bench_one): Skip XOFs.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 16:57:15 +0000 (18:57 +0200)]
Few updates to documentation

* doc/gcrypt.texi: Add mention of new 'intel-fast-shld' hw feature
flag; Add mention of x86 RDRAND support in rndhw.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 15:59:33 +0000 (17:59 +0200)]
Add HMAC-SHA3 test vectors

* tests/basic.c (check_mac): Add HMAC_SHA3 test vectors.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 12:50:41 +0000 (14:50 +0200)]
md: add variable length output interface

* cipher/crc.c (_gcry_digest_spec_crc32)
(_gcry_digest_spec_crc32_rfc1510, _gcry_digest_spec_crc24_rfc2440): Set
'extract' NULL.
* cipher/gostr3411-94.c (_gcry_digest_spec_gost3411_94)
(_gcry_digest_spec_gost3411_cp): Ditto.
* cipher/keccak.c (_gcry_digest_spec_sha3_224)
(_gcry_digest_spec_sha3_256, _gcry_digest_spec_sha3_384)
(_gcry_digest_spec_sha3_512): Ditto.
* cipher/md2.c (_gcry_digest_spec_md2): Ditto.
* cipher/md4.c (_gcry_digest_spec_md4): Ditto.
* cipher/md5.c (_gcry_digest_spec_md5): Ditto.
* cipher/rmd160.c (_gcry_digest_spec_rmd160): Ditto.
* cipher/sha1.c (_gcry_digest_spec_sha1): Ditto.
* cipher/sha256.c (_gcry_digest_spec_sha224)
(_gcry_digest_spec_sha256): Ditto.
* cipher/sha512.c (_gcry_digest_spec_sha384)
(_gcry_digest_spec_sha512): Ditto.
* cipher/stribog.c (_gcry_digest_spec_stribog_256)
(_gcry_digest_spec_stribog_512): Ditto.
* cipher/tiger.c (_gcry_digest_spec_tiger)
(_gcry_digest_spec_tiger1, _gcry_digest_spec_tiger2): Ditto.
* cipher/whirlpool.c (_gcry_digest_spec_whirlpool): Ditto.
* cipher/md.c (md_enable): Do not allow combination of HMAC and
'extendable-output function'.
(md_final): Check if spec->read is NULL before calling.
(md_read): Ditto.
(md_extract, _gcry_md_extract): New.
* doc/gcrypt.texi: Add SHA3 algorithms and gcry_md_extract.
* src/cipher-proto.h (gcry_md_extract_t): New.
(gcry_md_spec_t): Add 'extract'.
* src/gcrypt-int.h (_gcry_md_extract): New.
* src/gcrypt.h.in (gcry_md_extract): New.
* src/libgcrypt.def: Add gcry_md_extract.
* src/libgcrypt.vers: Add gcry_md_extract.
* src/visibility.c (gcry_md_extract): New.
* src/visibility.h (gcry_md_extract): New.
--

Patch adds a new interface for reading output from 'extendable-output
function' (XOF) MD algorithms that can give variable-length output (i.e.
the SHAKE algorithms from FIPS-202).  The new function to read output is

 gpg_error_t gcry_md_extract(gcry_md_hd_t md, int algo,
     void *buffer, size_t length);

The function implicitly finalizes the algorithm so that no new input can
be given.  Subsequent calls of the function return more output
bytes from the algorithm.
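The "more output on each call" behaviour can be sketched with a toy
output stage.  This is not the Keccak code: toy_permute is a
placeholder mixing step with no cryptographic value, TOY_RATE is an
arbitrary size, and all names are hypothetical.  The state keeps a
read position, so extracting 5, then 7, then 8 bytes yields the same
stream as extracting 20 bytes in one call.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define TOY_RATE 8   /* output bytes available per permutation (toy value) */

typedef struct
{
  uint8_t state[TOY_RATE];
  size_t pos;               /* index of the next unread state byte */
} toy_xof_t;

/* Toy mixing step; NOT Keccak-f and not cryptographic.  */
void
toy_permute (uint8_t *s)
{
  size_t i;

  for (i = 0; i < TOY_RATE; i++)
    s[i] = (uint8_t)(s[i] * 5 + s[(i + 1) % TOY_RATE] + 1);
}

/* Set up a finalized state that is ready for output.  */
void
toy_xof_init (toy_xof_t *x, uint8_t seed)
{
  size_t i;

  for (i = 0; i < TOY_RATE; i++)
    x->state[i] = (uint8_t)(seed + i);
  toy_permute (x->state);
  x->pos = 0;
}

/* Copy LEN more output bytes to OUT, continuing where the previous
   call stopped and permuting whenever the state is used up.  */
void
toy_xof_extract (toy_xof_t *x, uint8_t *out, size_t len)
{
  assert (x->pos <= TOY_RATE);
  while (len--)
    {
      if (x->pos == TOY_RATE)
        {
          toy_permute (x->state);
          x->pos = 0;
        }
      *out++ = x->state[x->pos++];
    }
}
```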

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sun, 25 Oct 2015 13:11:14 +0000 (15:11 +0200)]
md: check hmac flag in prepare_macpads

* cipher/md.c (prepare_macpads): Check hmac flag.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 23 Oct 2015 19:30:48 +0000 (22:30 +0300)]
keccak: rewrite for improved performance

* cipher/Makefile.am: Add 'keccak_permute_32.h' and
'keccak_permute_64.h'.
* cipher/hash-common.h [USE_SHA3] (MD_BLOCK_MAX_BLOCKSIZE): Remove.
* cipher/keccak.c (USE_64BIT, USE_32BIT, USE_64BIT_BMI2)
(USE_64BIT_SHLD, USE_32BIT_BMI2, NEED_COMMON64, NEED_COMMON32BI)
(keccak_ops_t): New.
(KECCAK_STATE): Add 'state64' and 'state32bi' members.
(KECCAK_CONTEXT): Remove 'bctx'; add 'blocksize', 'count' and 'ops'.
(rol64, keccak_f1600_state_permute): Remove.
[NEED_COMMON64] (round_consts_64bit, keccak_extract_inplace64): New.
[NEED_COMMON32BI] (round_consts_32bit, keccak_extract_inplace32bi)
(keccak_absorb_lane32bi): New.
[USE_64BIT] (ANDN64, ROL64, keccak_f1600_state_permute64)
(keccak_absorb_lanes64, keccak_generic64_ops): New.
[USE_64BIT_SHLD] (ANDN64, ROL64, keccak_f1600_state_permute64_shld)
(keccak_absorb_lanes64_shld, keccak_shld_64_ops): New.
[USE_64BIT_BMI2] (ANDN64, ROL64, keccak_f1600_state_permute64_bmi2)
(keccak_absorb_lanes64_bmi2, keccak_bmi2_64_ops): New.
[USE_32BIT] (ANDN64, ROL64, keccak_f1600_state_permute32bi)
(keccak_absorb_lanes32bi, keccak_generic32bi_ops): New.
[USE_32BIT_BMI2] (ANDN64, ROL64, keccak_f1600_state_permute32bi_bmi2)
(pext, pdep, keccak_absorb_lane32bi_bmi2, keccak_absorb_lanes32bi_bmi2)
(keccak_extract_inplace32bi_bmi2, keccak_bmi2_32bi_ops): New.
(keccak_write): New.
(keccak_init): Adjust to KECCAK_CONTEXT changes; add implementation
selection based on HWF features.
(keccak_final): Adjust to KECCAK_CONTEXT changes; use selected 'ops'
for state manipulation.
(keccak_read): Adjust to KECCAK_CONTEXT changes.
(_gcry_digest_spec_sha3_224, _gcry_digest_spec_sha3_256)
(_gcry_digest_spec_sha3_384, _gcry_digest_spec_sha3_512): Use
'keccak_write' instead of '_gcry_md_block_write'.
* cipher/keccak_permute_32.h: New.
* cipher/keccak_permute_64.h: New.
--

Patch adds new generic 64-bit and 32-bit implementations and
optimized implementations for SHA3:
 - Generic 64-bit implementation based on 'simple' implementation
   from SUPERCOP package.
 - Generic 32-bit bit-interleaved implementation based on
   'simple32bi' implementation from SUPERCOP package.
 - Intel BMI2 optimized variants of 64-bit and 32-bit BI
   implementations.
 - Intel SHLD optimized variant of 64-bit implementation.
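The bit-interleaved representation mentioned in the list above can be
sketched as follows.  This is a slow, loop-based illustration with
hypothetical names, not the optimized code: a 64-bit lane is split
into one 32-bit word holding the even-numbered bits and one holding
the odd-numbered bits, so that a 64-bit rotation becomes two 32-bit
rotations (with the words exchanged for odd rotation counts).

```c
#include <assert.h>
#include <stdint.h>

/* Split LANE into its even-numbered and odd-numbered bits.  */
void
bi_split (uint64_t lane, uint32_t *even, uint32_t *odd)
{
  uint32_t e = 0, o = 0;
  int j;

  for (j = 0; j < 32; j++)
    {
      e |= (uint32_t)((lane >> (2 * j)) & 1) << j;
      o |= (uint32_t)((lane >> (2 * j + 1)) & 1) << j;
    }
  *even = e;
  *odd = o;
}

/* Inverse of bi_split.  */
uint64_t
bi_join (uint32_t even, uint32_t odd)
{
  uint64_t lane = 0;
  int j;

  for (j = 0; j < 32; j++)
    {
      lane |= (uint64_t)((even >> j) & 1) << (2 * j);
      lane |= (uint64_t)((odd >> j) & 1) << (2 * j + 1);
    }
  return lane;
}

/* 32-bit rotate left; a count of 0 or 32 returns X unchanged.  */
uint32_t
rol32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Rotate the interleaved lane left by N bits (0 <= N < 64).  */
void
bi_rol64 (uint32_t *even, uint32_t *odd, unsigned int n)
{
  uint32_t e = *even, o = *odd;

  assert (n < 64);
  if (n & 1)
    {
      /* Odd count: even and odd bits trade places.  */
      *even = rol32 (o, n / 2 + 1);
      *odd = rol32 (e, n / 2);
    }
  else
    {
      *even = rol32 (e, n / 2);
      *odd = rol32 (o, n / 2);
    }
}
```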

Patch also makes proper use of the sponge construction to avoid
use of an additional input buffer.
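A minimal sketch of absorbing without an extra input buffer, assuming
a toy state size and a placeholder permutation (toy_permute is not
Keccak-f, and all names are hypothetical): input bytes are XORed
straight into the state at a running offset, and the state is permuted
whenever a full block has been absorbed.  Partial writes therefore
need no staging buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define TOY_RATE 8   /* bytes absorbed per permutation (toy value) */

typedef struct
{
  uint8_t state[TOY_RATE];
  size_t count;             /* bytes absorbed into the current block */
} toy_sponge_t;

/* Toy mixing step; NOT Keccak-f and not cryptographic.  */
void
toy_permute (uint8_t *s)
{
  size_t i;

  for (i = 0; i < TOY_RATE; i++)
    s[i] = (uint8_t)(s[i] * 5 + s[(i + 1) % TOY_RATE] + 1);
}

void
toy_sponge_init (toy_sponge_t *c)
{
  size_t i;

  for (i = 0; i < TOY_RATE; i++)
    c->state[i] = 0;
  c->count = 0;
}

/* Absorb LEN bytes by XORing them directly into the state.  */
void
toy_sponge_write (toy_sponge_t *c, const uint8_t *in, size_t len)
{
  assert (c->count < TOY_RATE);
  while (len--)
    {
      c->state[c->count++] ^= *in++;
      if (c->count == TOY_RATE)
        {
          toy_permute (c->state);   /* block complete */
          c->count = 0;
        }
    }
}
```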

Below are bench-slope benchmarks for new 64-bit implementations
made on Intel Core i5-4570 (no turbo, 3.2 GHz, gcc-4.9.2).

Before (amd64):

 SHA3-224       |      3.92 ns/B     243.2 MiB/s     12.55 c/B
 SHA3-256       |      4.15 ns/B     230.0 MiB/s     13.27 c/B
 SHA3-384       |      5.40 ns/B     176.6 MiB/s     17.29 c/B
 SHA3-512       |      7.77 ns/B     122.7 MiB/s     24.87 c/B

After (generic 64-bit, amd64, 1.10x faster):

 SHA3-224       |      3.57 ns/B     267.4 MiB/s     11.42 c/B
 SHA3-256       |      3.77 ns/B     252.8 MiB/s     12.07 c/B
 SHA3-384       |      4.91 ns/B     194.1 MiB/s     15.72 c/B
 SHA3-512       |      7.06 ns/B     135.0 MiB/s     22.61 c/B

After (Intel SHLD 64-bit, amd64, 1.13x faster):

 SHA3-224       |      3.48 ns/B     273.7 MiB/s     11.15 c/B
 SHA3-256       |      3.68 ns/B     258.9 MiB/s     11.79 c/B
 SHA3-384       |      4.80 ns/B     198.7 MiB/s     15.36 c/B
 SHA3-512       |      6.89 ns/B     138.4 MiB/s     22.05 c/B

After (Intel BMI2 64-bit, amd64, 1.45x faster):

 SHA3-224       |      2.71 ns/B     352.1 MiB/s      8.67 c/B
 SHA3-256       |      2.86 ns/B     333.2 MiB/s      9.16 c/B
 SHA3-384       |      3.72 ns/B     256.2 MiB/s     11.91 c/B
 SHA3-512       |      5.34 ns/B     178.5 MiB/s     17.10 c/B

Benchmarks of new 32-bit implementations on Intel Core i5-4570
(no turbo, 3.2 GHz, gcc-4.9.2):

Before (win32):

 SHA3-224       |     12.05 ns/B     79.16 MiB/s     38.56 c/B
 SHA3-256       |     12.75 ns/B     74.78 MiB/s     40.82 c/B
 SHA3-384       |     16.63 ns/B     57.36 MiB/s     53.22 c/B
 SHA3-512       |     23.97 ns/B     39.79 MiB/s     76.72 c/B

After (generic 32-bit BI, win32, 1.23x to 1.29x faster):

 SHA3-224       |      9.76 ns/B     97.69 MiB/s     31.25 c/B
 SHA3-256       |     10.27 ns/B     92.82 MiB/s     32.89 c/B
 SHA3-384       |     13.22 ns/B     72.16 MiB/s     42.31 c/B
 SHA3-512       |     18.65 ns/B     51.13 MiB/s     59.70 c/B

After (Intel BMI2 32-bit BI, win32, 1.66x to 1.70x faster):

 SHA3-224       |      7.26 ns/B     131.4 MiB/s     23.23 c/B
 SHA3-256       |      7.65 ns/B     124.7 MiB/s     24.47 c/B
 SHA3-384       |      9.87 ns/B     96.67 MiB/s     31.58 c/B
 SHA3-512       |     14.05 ns/B     67.85 MiB/s     44.99 c/B

Benchmarks of new 32-bit implementation on ARM Cortex-A8
(1008 MHz, gcc-4.9.1):

Before:

 SHA3-224       |     148.6 ns/B      6.42 MiB/s     149.8 c/B
 SHA3-256       |     157.2 ns/B      6.07 MiB/s     158.4 c/B
 SHA3-384       |     205.3 ns/B      4.65 MiB/s     206.9 c/B
 SHA3-512       |     296.3 ns/B      3.22 MiB/s     298.6 c/B

After (1.56x faster):

 SHA3-224       |     96.12 ns/B      9.92 MiB/s     96.89 c/B
 SHA3-256       |     101.5 ns/B      9.40 MiB/s     102.3 c/B
 SHA3-384       |     131.4 ns/B      7.26 MiB/s     132.5 c/B
 SHA3-512       |     188.2 ns/B      5.07 MiB/s     189.7 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 23 Oct 2015 19:39:47 +0000 (22:39 +0300)]
hwf-x86: add detection for Intel CPUs with fast SHLD instruction

* cipher/sha1.c (sha1_init): Use HWF_INTEL_FAST_SHLD instead of
HWF_INTEL_CPU.
* cipher/sha256.c (sha256_init, sha224_init): Ditto.
* cipher/sha512.c (sha512_init, sha384_init): Ditto.
* src/g10lib.h (HWF_INTEL_FAST_SHLD): New.
(HWF_INTEL_BMI2, HWF_INTEL_SSSE3, HWF_INTEL_PCLMUL, HWF_INTEL_AESNI)
(HWF_INTEL_RDRAND, HWF_INTEL_AVX, HWF_INTEL_AVX2)
(HWF_ARM_NEON): Update.
* src/hwf-x86.c (detect_x86_gnuc): Add detection of Intel Core
CPUs with fast SHLD/SHRD instruction.
* src/hwfeatures.c (hwflist): Add "intel-fast-shld".
--

Intel Core CPUs since the Sandy Bridge generation have been able to
execute SHLD/SHRD instructions faster than the rotate instructions
ROL/ROR.  Since SHLD/SHRD can be used to do rotation, some
optimized implementations (SHA1/SHA256/SHA512) use SHLD/SHRD
instructions in place of ROL/ROR.

This patch provides more accurate detection of CPUs with
fast SHLD implementation.
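Why SHLD can stand in for a rotate can be shown with a portable model
of the instruction.  This is a sketch in plain C, not inline assembly;
shld64_model is a hypothetical name that only models the 64-bit
register form for shift counts 1 to 63.  When both operands are the
same value, the double-precision shift is exactly a left rotation.

```c
#include <assert.h>
#include <stdint.h>

/* Plain 64-bit rotate left for 0 < n < 64.  */
uint64_t
rol64 (uint64_t x, unsigned int n)
{
  assert (n > 0 && n < 64);
  return (x << n) | (x >> (64 - n));
}

/* Model of "SHLD dst, src, n": shift DST left by N and fill the
   vacated low bits with the top N bits of SRC.  */
uint64_t
shld64_model (uint64_t dst, uint64_t src, unsigned int n)
{
  assert (n > 0 && n < 64);
  return (dst << n) | (src >> (64 - n));
}
```

With dst and src set to the same value, shld64_model (x, x, n) equals
rol64 (x, n) for every valid count.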

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 24 Oct 2015 09:41:23 +0000 (12:41 +0300)]
Fix OCB amd64 assembly implementations for x32

* cipher/camellia-glue.c (_gcry_camellia_aesni_avx_ocb_enc)
(_gcry_camellia_aesni_avx_ocb_dec, _gcry_camellia_aesni_avx_ocb_auth)
(_gcry_camellia_aesni_avx2_ocb_enc, _gcry_camellia_aesni_avx2_ocb_dec)
(_gcry_camellia_aesni_avx2_ocb_auth, _gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): Change 'Ls' from pointer array to u64 array.
* cipher/serpent.c (_gcry_serpent_sse2_ocb_enc)
(_gcry_serpent_sse2_ocb_dec, _gcry_serpent_sse2_ocb_auth)
(_gcry_serpent_avx2_ocb_enc, _gcry_serpent_avx2_ocb_dec)
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Ditto.
* cipher/twofish.c (_gcry_twofish_amd64_ocb_enc)
(_gcry_twofish_amd64_ocb_dec, _gcry_twofish_amd64_ocb_auth)
(twofish_amd64_ocb_enc, twofish_amd64_ocb_dec, twofish_amd64_ocb_auth)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Ditto.
--

Pointers on x32 are 32-bit, but the amd64 assembly implementations
expect 64-bit pointers.  Pass the 'Ls' array as 64-bit integers so
that the input array has the correct format for the assembly functions.
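The conversion can be sketched as below, assuming hypothetical helper
names (fill_ls_u64, ls_u64_to_ptr) that are not part of libgcrypt.
Each pointer is widened through uintptr_t into a 64-bit slot, so every
array element is 8 bytes wide regardless of the native pointer size.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Store each pointer as a zero-extended 64-bit integer so that the
   array layout does not depend on sizeof (void *).  */
void
fill_ls_u64 (uint64_t *Ls, const unsigned char *const *ptrs, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    Ls[i] = (uint64_t)(uintptr_t)(const void *)ptrs[i];
}

/* Convert a slot back to a pointer (for checking the round trip).  */
const unsigned char *
ls_u64_to_ptr (uint64_t v)
{
  assert (v <= UINTPTR_MAX);
  return (const unsigned char *)(uintptr_t)v;
}
```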

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 23 Oct 2015 19:24:47 +0000 (22:24 +0300)]
bench-slope: add KDF/PBKDF2 benchmark

* tests/bench-slope.c (bench_kdf_mode, bench_kdf_init, bench_kdf_free)
(bench_kdf_do_bench, kdf_ops, kdf_bench_one, kdf_bench): New.
(print_help): Add 'kdf'.
(main): Add KDF benchmarks.
--

Introduce KDF benchmarking to bench-slope.  Output is given as
nanosecs/iter (and cycles/iter if --cpu-mhz is used).  Only PBKDF2
is supported by this initial patch.
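The relation between the two output columns appears to be a plain unit
conversion.  The helper below is a hypothetical illustration, not
code from bench-slope.  Because it starts from the already rounded
nanosecond value, the last digit can differ slightly from the printed
cycles column.

```c
#include <assert.h>

/* Convert a time in nanoseconds to CPU cycles for a clock given in MHz:
   cycles = ns * (MHz * 1e6 cycles/s) * (1e-9 s/ns) = ns * MHz / 1000.  */
double
ns_to_cycles (double nanosecs, double cpu_mhz)
{
  assert (cpu_mhz > 0.0);
  return nanosecs * cpu_mhz / 1000.0;
}
```

For example, 882.4 ns at 3201 MHz gives about 2824.6 cycles, close to
the 2824.7 cycles/iter reported for PBKDF2-HMAC-MD5.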

For example, the output of KDF bench-slope before and after commit
"md: keep contexts for HMAC in GcryDigestEntry", on Intel Core
i5-4570 @ 3.2 GHz, is shown below:

Before:

$ tests/bench-slope --cpu-mhz 3201 kdf
KDF:
                          |  nanosecs/iter   cycles/iter
 PBKDF2-HMAC-MD5          |          882.4        2824.7
 PBKDF2-HMAC-SHA1         |          832.6        2665.0
 PBKDF2-HMAC-RIPEMD160    |         1148.3        3675.6
 PBKDF2-HMAC-TIGER192     |         1339.6        4288.2
 PBKDF2-HMAC-SHA256       |         1460.5        4675.1
 PBKDF2-HMAC-SHA384       |         1723.2        5515.8
 PBKDF2-HMAC-SHA512       |         1729.1        5534.7
 PBKDF2-HMAC-SHA224       |         1424.0        4558.3
 PBKDF2-HMAC-WHIRLPOOL    |         2459.7        7873.5
 PBKDF2-HMAC-TIGER        |         1350.2        4322.1
 PBKDF2-HMAC-TIGER2       |         1348.7        4317.3
 PBKDF2-HMAC-GOSTR3411_94 |         7374.1       23604.4
 PBKDF2-HMAC-STRIBOG256   |         6060.0       19398.1
 PBKDF2-HMAC-STRIBOG512   |         7512.8       24048.3
 PBKDF2-HMAC-GOSTR3411_CP |         7378.3       23618.0
 PBKDF2-HMAC-SHA3-224     |         2789.6        8929.5
 PBKDF2-HMAC-SHA3-256     |         2785.1        8915.0
 PBKDF2-HMAC-SHA3-384     |         2955.5        9460.5
 PBKDF2-HMAC-SHA3-512     |         2859.7        9153.9
                          =

After:

$ tests/bench-slope --cpu-mhz 3201 kdf
KDF:
                          |  nanosecs/iter   cycles/iter
 PBKDF2-HMAC-MD5          |          405.9        1299.2
 PBKDF2-HMAC-SHA1         |          392.1        1255.0
 PBKDF2-HMAC-RIPEMD160    |          540.9        1731.5
 PBKDF2-HMAC-TIGER192     |          637.1        2039.4
 PBKDF2-HMAC-SHA256       |          691.8        2214.3
 PBKDF2-HMAC-SHA384       |          848.0        2714.3
 PBKDF2-HMAC-SHA512       |          875.7        2803.1
 PBKDF2-HMAC-SHA224       |          689.2        2206.0
 PBKDF2-HMAC-WHIRLPOOL    |         1535.6        4915.5
 PBKDF2-HMAC-TIGER        |          636.3        2036.7
 PBKDF2-HMAC-TIGER2       |          636.6        2037.7
 PBKDF2-HMAC-GOSTR3411_94 |         5311.5       17002.2
 PBKDF2-HMAC-STRIBOG256   |         4308.0       13790.0
 PBKDF2-HMAC-STRIBOG512   |         5767.4       18461.4
 PBKDF2-HMAC-GOSTR3411_CP |         5309.4       16995.4
 PBKDF2-HMAC-SHA3-224     |         1333.1        4267.2
 PBKDF2-HMAC-SHA3-256     |         1327.8        4250.4
 PBKDF2-HMAC-SHA3-384     |         1392.8        4458.3
 PBKDF2-HMAC-SHA3-512     |         1428.5        4572.7
                          =

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
NIIBE Yutaka [Thu, 22 Oct 2015 00:58:24 +0000 (09:58 +0900)]
md: keep contexts for HMAC in GcryDigestEntry.

* cipher/md.c (struct gcry_md_context): Add flags.hmac.
Remove macpads and macpads_Bsize.
(md_open): Initialize flags.hmac.  Remove macpads initialization.
(md_enable): Allocate contexts when flags.hmac is enabled.
(md_copy): Remove macpads copying.  Add copying contexts.
(_gcry_md_reset): When flags.hmac is enabled, restore precomputed
context with input pad.
(md_close): Remove macpads wiping.
(md_final): When flags.hmac is enabled, compute hmac by precomputed
context with output pad.
(prepare_macpads): Prepare precomputed contexts with input pad and
output pad for each registered digest entry.
(_gcry_md_setkey): Just call prepare_macpads.

--

This change straightens out the HMAC computation and allows HMAC to
work with multiple algorithms in the future.

Libgcrypt's code has the potential to compute digests for multiple
algorithms at once (currently, this is not enabled).  The HMAC code did
not work well with multiple algorithms because the macpads were
allocated for only one algorithm.  Now they are allocated for each
algorithm.

We now precompute hash contexts instead of keeping the input pad and
output pad.  This can improve performance, as described in RFC 2104.
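The precomputation can be sketched with a toy hash.  Everything here
is an assumption for illustration: the "hash" is 64-bit FNV-1a (not a
cryptographic hash), the block size is 8 bytes, keys longer than a
block are not handled, and the names are hypothetical.  The key is
hashed into two saved contexts once; each MAC computation then starts
from copies of those contexts instead of hashing the pads again.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define BLOCK 8   /* toy block size in bytes */

typedef struct { uint64_t h; } toy_md_t;   /* FNV-1a state, stand-in hash */

void
toy_init (toy_md_t *c)
{
  c->h = 0xcbf29ce484222325ULL;
}

void
toy_write (toy_md_t *c, const uint8_t *p, size_t n)
{
  while (n--)
    {
      c->h ^= *p++;
      c->h *= 0x100000001b3ULL;
    }
}

void
toy_read (const toy_md_t *c, uint8_t out[8])
{
  int i;

  for (i = 0; i < 8; i++)
    out[i] = (uint8_t)(c->h >> (8 * i));
}

typedef struct { toy_md_t inner, outer; } toy_hmac_t;

/* Hash (key ^ ipad) and (key ^ opad) once and keep both contexts.  */
void
toy_hmac_setkey (toy_hmac_t *hm, const uint8_t *key, size_t keylen)
{
  uint8_t pad[BLOCK];
  size_t i;

  assert (keylen <= BLOCK);   /* longer keys are out of scope here */
  for (i = 0; i < BLOCK; i++)
    pad[i] = (uint8_t)((i < keylen ? key[i] : 0) ^ 0x36);
  toy_init (&hm->inner);
  toy_write (&hm->inner, pad, BLOCK);
  for (i = 0; i < BLOCK; i++)
    pad[i] ^= 0x36 ^ 0x5c;    /* turn the input pad into the output pad */
  toy_init (&hm->outer);
  toy_write (&hm->outer, pad, BLOCK);
}

/* Compute the MAC starting from copies of the precomputed contexts.  */
void
toy_hmac (const toy_hmac_t *hm, const uint8_t *msg, size_t len,
          uint8_t mac[8])
{
  toy_md_t c = hm->inner;     /* restore context with input pad */
  uint8_t digest[8];

  toy_write (&c, msg, len);
  toy_read (&c, digest);
  c = hm->outer;              /* restore context with output pad */
  toy_write (&c, digest, 8);
  toy_read (&c, mac);
}
```

Since PBKDF2 calls HMAC with the same key for every iteration, saving
the two pad blocks per call is where a speed-up of this kind would be
expected to come from.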

Thanks to:

   Andrea Visconti, Simone Bossi, Hany Ragab and Alexandro Calò

for the discussion and their CANS 2015 paper, which is titled:

   On the weaknesses of PBKDF2

NIIBE Yutaka [Thu, 15 Oct 2015 02:28:54 +0000 (11:28 +0900)]
Fix double free on error.

* src/hmac256.c (_gcry_hmac256_finalize): Don't free HD.

NIIBE Yutaka [Wed, 14 Oct 2015 02:52:40 +0000 (11:52 +0900)]
Fix gpg_error_t and gpg_err_code_t confusion.

* src/gcrypt-int.h (_gcry_sexp_extract_param): Revert the change.
* cipher/dsa.c (dsa_check_secret_key): Ditto.
* src/sexp.c (_gcry_sexp_extract_param): Return gpg_err_code_t.

* src/gcrypt-int.h (_gcry_err_make_from_errno)
(_gcry_error_from_errno): Return gpg_error_t.
* cipher/cipher.c (_gcry_cipher_open_internal)
(_gcry_cipher_ctl, _gcry_cipher_ctl): Don't use gcry_error.
* src/global.c (_gcry_vcontrol): Likewise.
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_genkey): Use
 gpg_err_code_from_syserror.
* cipher/mac.c (mac_reset, mac_setkey, mac_setiv, mac_write)
(mac_read, mac_verify): Return gcry_err_code_t.
* cipher/rsa-common.c (mgf1): Use gcry_err_code_t for ERR.
* src/visibility.c (gcry_error_from_errno): Return gpg_error_t.

--

Revert a part of 73374fdd and fix the return type of
_gcry_sexp_extract_param instead.

Fix similar coding mistakes throughout.

Jussi Kivilinna [Tue, 13 Oct 2015 05:33:00 +0000 (08:33 +0300)]
Fix compiling AES/AES-NI implementation on linux-i386

* cipher/rijndael-aesni.c (do_aesni_ctr_4): Split assembly block in
two parts to reduce number of register constraints needed.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
NIIBE Yutaka [Tue, 13 Oct 2015 03:28:00 +0000 (12:28 +0900)]
Fix declaration of return type.

* src/gcrypt-int.h (_gcry_sexp_extract_param): Return gpg_error_t.
* cipher/dsa.c (dsa_generate): Fix call to _gcry_sexp_extract_param.
* src/g10lib.h (_gcry_vcontrol): Return gcry_err_code_t.
* src/visibility.c (gcry_mpi_snatch): Fix call to _gcry_mpi_snatch.

--

GnuPG-bug-id: 2074

Werner Koch [Mon, 7 Sep 2015 12:02:09 +0000 (14:02 +0200)]
Improve GCRYCTL_DISABLE_PRIV_DROP by also disabling cap_ calls.

* src/secmem.c (lock_pool, secmem_init): Do not call any cap_
functions if NO_PRIV_DROP is set.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Fri, 4 Sep 2015 10:39:56 +0000 (12:39 +0200)]
w32: Avoid a few compiler warnings.

* cipher/cipher-selftest.c (_gcry_selftest_helper_cbc)
(_gcry_selftest_helper_cfb, _gcry_selftest_helper_ctr): Mark variable
as unused.
* random/rndw32.c (slow_gatherer): Avoid signed pointer mismatch
warning.
* src/secmem.c (init_pool): Avoid unused variable warning.
* tests/random.c (writen, readn): Include only if needed.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Fri, 4 Sep 2015 10:32:16 +0000 (12:32 +0200)]
w32: Fix alignment problem with AESNI on Windows >= 8

* cipher/cipher-selftest.c (_gcry_cipher_selftest_alloc_ctx): New.
* cipher/rijndael.c (selftest_basic_128, selftest_basic_192)
(selftest_basic_256): Allocate context on the heap.
--

The stack alignment on Windows changed, and because ld seems to limit
stack variables to an 8-byte alignment (we request 16), we get bus
errors from the selftests if AESNI is in use.
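One way to obtain a 16-byte aligned context from the heap is sketched
below.  This is an assumption-laden illustration, not the code of
_gcry_cipher_selftest_alloc_ctx: the name alloc_ctx_aligned16 and the
use of calloc are choices made for this example.  The idea is to
over-allocate by 15 bytes and round the address up, returning the raw
pointer separately so that it can be freed later.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate SIZE zeroed bytes aligned to 16 bytes.  *R_MEM receives the
   pointer that must later be passed to free().  Returns NULL on
   allocation failure.  */
void *
alloc_ctx_aligned16 (size_t size, void **r_mem)
{
  unsigned char *raw;
  uintptr_t addr;

  assert (r_mem != NULL);
  raw = calloc (1, size + 15);
  if (!raw)
    {
      *r_mem = NULL;
      return NULL;
    }
  *r_mem = raw;
  /* Round the address up to the next multiple of 16.  */
  addr = ((uintptr_t)raw + 15) & ~(uintptr_t)15;
  return raw + (addr - (uintptr_t)raw);
}
```

The caller uses the returned pointer as the context and frees the
pointer stored in *R_MEM when done.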

GnuPG-bug-id: 2085
Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 31 Aug 2015 21:13:27 +0000 (23:13 +0200)]
rsa: Add verify after sign to avoid Lenstra's CRT attack.

* cipher/rsa.c (rsa_sign): Check the CRT.
--

Failures in the computation of the CRT (e.g. due to faulty hardware) can
lead to a leak of the private key.  The standard precaution against
this is to verify the signature after signing.  GnuPG does this itself
and even has an option to disable it.  However, the low performance
impact of this extra precaution suggests that it should always be done,
and Libgcrypt is the right place for it.  For decryption this is not
done because the application will detect the failure due to garbled
plaintext and in any case no key-derived material will be sent to the
user.
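The precaution can be sketched with a textbook-sized toy key (p = 61,
q = 53, n = 3233, e = 17, d = 2753).  This is an illustration only:
the key is far too small for real use, there is no padding, and the
names modpow and rsa_sign_checked as well as the fault parameter are
hypothetical.  The signature is computed via the CRT and released only
if raising it to the public exponent gives back the input.

```c
#include <assert.h>
#include <stdint.h>

/* Modular exponentiation for moduli below 2^32.  */
uint64_t
modpow (uint64_t base, uint64_t exp, uint64_t mod)
{
  uint64_t r = 1;

  assert (mod > 1 && mod < ((uint64_t)1 << 32));
  base %= mod;
  while (exp)
    {
      if (exp & 1)
        r = r * base % mod;
      base = base * base % mod;
      exp >>= 1;
    }
  return r;
}

/* Toy key: DP = d mod (p-1), DQ = d mod (q-1), QINV = q^-1 mod p.  */
enum { P = 61, Q = 53, N = 3233, E = 17, DP = 53, DQ = 49, QINV = 38 };

/* Sign M with the CRT.  A nonzero FAULT corrupts one CRT half to
   simulate a hardware error.  Returns 0 and stores the signature in
   *SIG, or returns -1 without releasing anything if the check fails.  */
int
rsa_sign_checked (uint64_t m, int fault, uint64_t *sig)
{
  uint64_t m1, m2, h, s;

  assert (m < N);
  m1 = modpow (m, DP, P);
  m2 = modpow (m, DQ, Q);
  if (fault)
    m1 = (m1 + 1) % P;                        /* simulated glitch */
  h = (QINV * ((m1 + P - m2 % P) % P)) % P;   /* Garner recombination */
  s = m2 + h * Q;
  if (modpow (s, E, N) != m)                  /* verify after sign */
    return -1;
  *sig = s;
  return 0;
}
```

With the simulated fault the check fails, so the faulty value, which
could otherwise help an attacker factor n, is never returned.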

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 31 Aug 2015 20:41:12 +0000 (22:41 +0200)]
Add pubkey algo id for EdDSA.

* src/gcrypt.h.in (GCRY_PK_EDDSA): New.
--

These ids are not actually used by Libgcrypt but other software makes
use of such algorithm ids.  Thus we provide them here.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Tue, 25 Aug 2015 19:11:05 +0000 (21:11 +0200)]
Add configure option --enable-build-timestamp.

* configure.ac (BUILD_TIMESTAMP): Set to "<none>" by default.
--

This is based on
libgpg-error commit d620005fd1a655d591fccb44639e22ea445e4554
but changed to be disabled by default.  Check there for some
background.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sun, 23 Aug 2015 15:20:18 +0000 (17:20 +0200)]
tests: Add missing files for the make distcheck target.

* tests/Makefile.am (EXTRA_DIST): Add sha3-x test vector files.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoChange SHA-3 algorithm ids
Werner Koch [Wed, 19 Aug 2015 10:43:43 +0000 (12:43 +0200)]
Change SHA-3 algorithm ids

* src/gcrypt.h.in (GCRY_MD_SHA3_224, GCRY_MD_SHA3_256)
(GCRY_MD_SHA3_384, GCRY_MD_SHA3_512): Change values.
--

By using algorithm ids outside of the RFC-4880 range we make debugging
of GnuPG easier.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoKeccak: Fix array indexes in θ step
Jussi Kivilinna [Wed, 12 Aug 2015 15:17:01 +0000 (18:17 +0300)]
Keccak: Fix array indexes in θ step

* cipher/keccak.c (keccak_f1600_state_permute): Fix indexes for D[5].
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoSimplify OCB offset calculation for parallel implementations
Jussi Kivilinna [Tue, 11 Aug 2015 04:22:16 +0000 (07:22 +0300)]
Simplify OCB offset calculation for parallel implementations

* cipher/camellia-glue.c (_gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): Precalculate Ls array always, instead of
just if 'blkn % <parallel blocks> == 0'.
* cipher/serpent.c (_gcry_serpent_ocb_crypt)
(_gcry_serpent_ocb_auth): Ditto.
* cipher/rijndael-aesni.c (get_l): Remove low-bit checks.
(aes_ocb_enc, aes_ocb_dec, _gcry_aes_aesni_ocb_auth): Handle leading
blocks until block counter is multiple of 4, so that parallel block
processing loop can use 'c->u_mode.ocb.L' array directly.
* tests/basic.c (check_ocb_cipher_largebuf): Rename to...
(check_ocb_cipher_largebuf_split): ...this and add option to process
large buffer as two split buffers.
(check_ocb_cipher_largebuf): New.
--

Patch simplifies source and reduces object size.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd carryless 8-bit addition fast-path for AES-NI CTR mode
Jussi Kivilinna [Mon, 10 Aug 2015 17:48:02 +0000 (20:48 +0300)]
Add carryless 8-bit addition fast-path for AES-NI CTR mode

* cipher/rijndael-aesni.c (do_aesni_ctr_4): Do addition using
CTR in big-endian form, if least-significant byte does not overflow.
--

Patch improves AES-NI CTR speed by 20%.

Benchmark on Intel Haswell (3.2 GHz):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        CTR enc |     0.273 ns/B    3489.8 MiB/s     0.875 c/B
        CTR dec |     0.273 ns/B    3491.0 MiB/s     0.874 c/B

After:
        CTR enc |     0.228 ns/B    4190.0 MiB/s     0.729 c/B
        CTR dec |     0.228 ns/B    4190.2 MiB/s     0.729 c/B
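
The fast path can be pictured with a scalar sketch.  The real change is
SSE inline assembly working on four blocks at once; the function
ctr_add below is hypothetical and only shows the idea that a big-endian
counter needs no carry propagation as long as the least-significant
byte does not overflow.

```c
#include <stdint.h>

/* Add N to a 128-bit big-endian counter (illustrative helper). */
static void ctr_add(uint8_t ctr[16], uint8_t n)
{
  unsigned int sum = ctr[15] + n;
  int i;

  ctr[15] = (uint8_t)sum;
  if (sum <= 0xff)
    return;  /* Fast path: no carry out of the last byte. */

  /* Slow path: propagate the carry towards the first byte. */
  for (i = 14; i >= 0; i--)
    if (++ctr[i] != 0)
      break;
}
```

When the counter advances by 4 per iteration the slow path is needed
only once every 64 iterations.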

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd additional SHA3 test-vectors
Jussi Kivilinna [Sun, 9 Aug 2015 15:33:35 +0000 (18:33 +0300)]
Add additional SHA3 test-vectors

* tests/basic.c (check_digests): Allow datalen to be specified so that
input data can have byte with value 0x00; Include sha3-*.h header files
to test-vector structure.
* tests/sha3-224.h: New.
* tests/sha3-256.h: New.
* tests/sha3-384.h: New.
* tests/sha3-512.h: New.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd generic SHA3 implementation
Jussi Kivilinna [Mon, 10 Aug 2015 19:09:56 +0000 (22:09 +0300)]
Add generic SHA3 implementation

* cipher/hash-common.h (MD_BLOCK_MAX_BLOCKSIZE): Increase blocksize
when USE_SHA3 is enabled.
* cipher/keccak.c (SHA3_DELIMITED_SUFFIX, SHAKE_DELIMITED_SUFFIX): New.
(KECCAK_STATE): Add proper state.
(KECCAK_CONTEXT): Add 'outlen'.
(rol64, keccak_f1600_state_permute, transform_blk, transform): New.
(keccak_init): Add proper initialization.
(keccak_final): Add proper finalization.
(selftests_keccak): Add selftests.
(oid_spec_sha3_224, oid_spec_sha3_256, oid_spec_sha3_384)
(oid_spec_sha3_512): Add OID.
(_gcry_digest_spec_sha3_224, _gcry_digest_spec_sha3_256)
(_gcry_digest_spec_sha3_384, _gcry_digest_spec_sha3_512): Fix output
length.
* cipher/mac-hmac.c (map_mac_algo_to_md): Fix mapping for SHA3-512.
(hmac_get_keylen): Return proper blocksizes for SHA3 algorithms.
[USE_SHA3] (_gcry_mac_type_spec_hmac_sha3_224)
(_gcry_mac_type_spec_hmac_sha3_256, _gcry_mac_type_spec_hmac_sha3_384)
(_gcry_mac_type_spec_hmac_sha3_512): New.
* cipher/mac-internal [USE_SHA3] (_gcry_mac_type_spec_hmac_sha3_224)
(_gcry_mac_type_spec_hmac_sha3_256, _gcry_mac_type_spec_hmac_sha3_384)
(_gcry_mac_type_spec_hmac_sha3_512): New.
* cipher/mac.c (mac_list) [USE_SHA3]: Add SHA3 algorithms.
* cipher/md.c (md_open): Use proper SHA-3 blocksizes for HMAC macpads.
* tests/basic.c (check_digests): Add SHA3 test vectors.
--

Patch adds generic implementation for SHA3. Currently missing with this
patch:
 - HMAC SHA3 test vectors, not available from NIST (yet?)
 - ASNs

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoOptimize OCB offset calculation
Jussi Kivilinna [Mon, 10 Aug 2015 19:09:56 +0000 (22:09 +0300)]
Optimize OCB offset calculation

* cipher/cipher-internal.h (ocb_get_l): New.
* cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate)
(ocb_crypt): Use 'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/camellia-glue.c (get_l): Remove.
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/rijndael-aesni.c (get_l): Add fast path for 75% most common
offsets.
(aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Precalculate
offset array when block count matches parallel operation size.
* cipher/rijndael-ssse3-amd64.c (get_l): Add fast path for 75% most
common offsets.
* cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Use
'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/serpent.c (get_l): Remove.
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/twofish.c (get_l): Remove.
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Use 'ocb_get_l'
instead of 'get_l'.
--

Patch optimizes OCB offset calculation for generic code and
assembly implementations with parallel block processing.

Benchmark of OCB AES-NI on Intel Haswell:

 $ tests/bench-slope --cpu-mhz 3201 cipher aes

 Before:
  AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
         CTR enc |     0.274 ns/B    3483.9 MiB/s     0.876 c/B
         CTR dec |     0.273 ns/B    3490.0 MiB/s     0.875 c/B
         OCB enc |     0.289 ns/B    3296.1 MiB/s     0.926 c/B
         OCB dec |     0.299 ns/B    3189.9 MiB/s     0.957 c/B
        OCB auth |     0.260 ns/B    3670.0 MiB/s     0.832 c/B

 After:
  AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
         CTR enc |     0.273 ns/B    3489.4 MiB/s     0.875 c/B
         CTR dec |     0.273 ns/B    3487.5 MiB/s     0.875 c/B
         OCB enc |     0.248 ns/B    3852.8 MiB/s     0.792 c/B
         OCB dec |     0.261 ns/B    3659.5 MiB/s     0.834 c/B
        OCB auth |     0.227 ns/B    4205.5 MiB/s     0.726 c/B
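
For background: OCB (RFC 7253) selects the offset for block i from a
precomputed table as L[ntz(i)], where ntz is the number of trailing
zero bits.  The sketch below shows why a two-test fast path covers 75%
of all blocks.  The helper names are hypothetical and the exact shape
of the libgcrypt fast path is an assumption.

```c
#include <stdint.h>

/* Number of trailing zero bits of I; I must be non-zero. */
static unsigned int ntz(uint64_t i)
{
  unsigned int n = 0;

  while (!(i & 1))
    {
      i >>= 1;
      n++;
    }
  return n;
}

/* Index into the L table for the 1-based block number I. */
static unsigned int l_index(uint64_t i)
{
  if (i & 1)
    return 0;      /* Odd block numbers: 50% of all blocks. */
  if (i & 2)
    return 1;      /* i = 2 mod 4: another 25%. */
  return ntz(i);   /* Remaining 25% take the general path. */
}
```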

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoecc: fix Montgomery curve bugs.
NIIBE Yutaka [Mon, 10 Aug 2015 10:09:16 +0000 (19:09 +0900)]
ecc: fix Montgomery curve bugs.

* cipher/ecc.c (check_secret_key): Y1 should not be NULL when check.
(ecc_check_secret_key): Support Montgomery curve.
* mpi/ec.c (_gcry_mpi_ec_curve_point): Fix condition.

4 years agoAdd framework to eventually support SHA3.
Werner Koch [Sat, 8 Aug 2015 08:47:55 +0000 (10:47 +0200)]
Add framework to eventually support SHA3.

* src/gcrypt.h.in (GCRY_MD_SHA3_224, GCRY_MD_SHA3_256)
(GCRY_MD_SHA3_384, GCRY_MD_SHA3_512): New.
(GCRY_MAC_HMAC_SHA3_224, GCRY_MAC_HMAC_SHA3_256)
(GCRY_MAC_HMAC_SHA3_384, GCRY_MAC_HMAC_SHA3_512): New.
* cipher/keccak.c: New with stub functions.
* cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add keccak.c.
* configure.ac (available_digests): Add sha3.
(USE_SHA3): New.
* src/fips.c (run_hmac_selftests): Add SHA3 to the required selftests.
* cipher/md.c (digest_list) [USE_SHA3]: Add standard SHA3 algos.
(md_open): Ditto for hmac processing.
* cipher/mac-hmac.c (map_mac_algo_to_md): Add mapping.
* cipher/hmac-tests.c (run_selftests): Prepare for tests.
* cipher/pubkey-util.c (get_hash_algo): Add "sha3-xxx".
--

Note that the algo GCRY_MD_SHA3_xxx are preliminary.  We should try to
sync them with OpenPGP.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agotools: Fix memory leak for functions "I" and "G".
Werner Koch [Thu, 6 Aug 2015 12:57:44 +0000 (14:57 +0200)]
tools: Fix memory leak for functions "I" and "G".

* src/mpicalc.c (do_inv, do_gcd): Init A after stack check.
--

Reported-by: Ismo Puustinen <ismo.puustinen@intel.com>
Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoecc: Free memory also when in error branch.
Ismo Puustinen [Wed, 5 Aug 2015 12:27:43 +0000 (15:27 +0300)]
ecc: Free memory also when in error branch.

* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_sign): Init DIGEST and goto
leave on error.
--

Fixing an issue found by static analysis.

Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Added DIGEST init and wrote Changelog.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoAdd Curve25519 support.
NIIBE Yutaka [Thu, 6 Aug 2015 08:31:41 +0000 (17:31 +0900)]
Add Curve25519 support.

* cipher/ecc-curves.c (curve_aliases, domain_parms): Add Curve25519.
* tests/curves.c (N_CURVES): It's 22 now.
* src/cipher.h (PUBKEY_FLAG_DJB_TWEAK): New.
* cipher/ecc-common.h (_gcry_ecc_mont_decodepoint): New.
* cipher/ecc-misc.c (_gcry_ecc_mont_decodepoint): New.
* cipher/ecc.c (nist_generate_key): Handle the case of
PUBKEY_FLAG_DJB_TWEAK and Montgomery curve.
(test_ecdh_only_keys, check_secret_key): Likewise.
(ecc_generate): Support Curve25519 which is Montgomery curve with flag
PUBKEY_FLAG_DJB_TWEAK and PUBKEY_FLAG_COMP.
(ecc_encrypt_raw): Get flags from KEYPARMS and handle
PUBKEY_FLAG_DJB_TWEAK and Montgomery curve.
(ecc_decrypt_raw): Likewise.
(compute_keygrip): Handle the case of PUBKEY_FLAG_DJB_TWEAK.
* cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist):
PUBKEY_FLAG_EDDSA implies PUBKEY_FLAG_DJB_TWEAK.
Parse "djb-tweak" for PUBKEY_FLAG_DJB_TWEAK.

--

With PUBKEY_FLAG_DJB_TWEAK, the secret key has its msb set and it
should always be a multiple of the cofactor.
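
At the byte level this tweak is conventionally written as the scalar
clamping from RFC 7748.  The sketch below assumes a 32-byte
little-endian scalar; libgcrypt itself operates on MPIs, and
clamp_scalar is a hypothetical name.

```c
#include <stdint.h>

/* Clamp a 32-byte little-endian Curve25519 scalar. */
static void clamp_scalar(uint8_t k[32])
{
  k[0]  &= 0xf8;  /* Clear the low 3 bits: multiple of cofactor 8. */
  k[31] &= 0x7f;  /* Clear bit 255, which is outside the field. */
  k[31] |= 0x40;  /* Set bit 254, the msb of the 255-bit scalar. */
}
```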

4 years agoReduce code size for Twofish key-setup and remove key dependent branch
Jussi Kivilinna [Mon, 13 Jul 2015 13:16:13 +0000 (16:16 +0300)]
Reduce code size for Twofish key-setup and remove key dependent branch

* cipher/twofish.c (poly_to_exp): Increase size by one, change type
from byte to u16 and insert '492' to index 0.
(exp_to_poly): Increase size by 256, let new cells have zero value.
(CALC_S): Execute unconditionally with help of modified tables.
(do_twofish_setkey): Change type for 'tmp' to 'unsigned int'; Un-unroll
CALC_K256 and CALC_K phases to reduce generated object size.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoReduce amount of duplicated code in OCB bulk implementations
Jussi Kivilinna [Sun, 26 Jul 2015 20:39:51 +0000 (23:39 +0300)]
Reduce amount of duplicated code in OCB bulk implementations

* cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate)
(ocb_crypt): Change bulk function to return number of unprocessed
blocks.
* src/cipher.h (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth)
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth)
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Change return type
to 'size_t'.
* cipher/camellia-glue.c (get_l): Only if USE_AESNI_AVX or
USE_AESNI_AVX2 defined.
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_AESNI_AVX or
USE_AESNI_AVX2 defined; Remove unaccelerated common code.
* cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Change
return type to 'size_t' and return zero.
* cipher/serpent.c (get_l): Only if USE_SSE2, USE_AVX2 or USE_NEON
defined.
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_SSE2, USE_AVX2 or
USE_NEON defined; Remove unaccelerated common code.
* cipher/twofish.c (get_l): Only if USE_AMD64_ASM defined.
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_AMD64_ASM defined;
Remove unaccelerated common code.
--
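
The calling convention introduced here can be sketched as follows.  All
names and the group width are hypothetical, and the counters stand in
for the actual cipher work: the accelerated function handles whole
groups and returns what is left, so only one generic loop is needed in
the caller.

```c
#include <stddef.h>

#define PARALLEL_BLOCKS 8  /* Hypothetical width of the accelerated path. */

static size_t blocks_bulk, blocks_generic;  /* Demonstration counters. */

/* Accelerated path: processes whole groups only and returns the
   number of blocks it did not handle. */
static size_t bulk_crypt(size_t nblocks)
{
  while (nblocks >= PARALLEL_BLOCKS)
    {
      blocks_bulk += PARALLEL_BLOCKS;
      nblocks -= PARALLEL_BLOCKS;
    }
  return nblocks;
}

/* Caller: the single generic loop finishes whatever is left over. */
static void crypt_all(size_t nblocks)
{
  nblocks = bulk_crypt(nblocks);
  while (nblocks--)
    blocks_generic++;
}
```

An implementation without an accelerated path can simply return the
full count and let the caller do all the work.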

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd bulk OCB for Serpent SSE2, AVX2 and NEON implementations
Jussi Kivilinna [Sun, 26 Jul 2015 14:17:20 +0000 (17:17 +0300)]
Add bulk OCB for Serpent SSE2, AVX2 and NEON implementations

* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Serpent.
* cipher/serpent-armv7-neon.S: Add OCB assembly functions.
* cipher/serpent-avx2-amd64.S: Add OCB assembly functions.
* cipher/serpent-sse2-amd64.S: Add OCB assembly functions.
* cipher/serpent.c (_gcry_serpent_sse2_ocb_enc)
(_gcry_serpent_sse2_ocb_dec, _gcry_serpent_sse2_ocb_auth)
(_gcry_serpent_neon_ocb_enc, _gcry_serpent_neon_ocb_dec)
(_gcry_serpent_neon_ocb_auth, _gcry_serpent_avx2_ocb_enc)
(_gcry_serpent_avx2_ocb_dec, _gcry_serpent_avx2_ocb_auth): New
prototypes.
(get_l, _gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): New.
* src/cipher.h (_gcry_serpent_ocb_crypt)
(_gcry_serpent_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for serpent.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd bulk OCB for Twofish AMD64 implementation
Jussi Kivilinna [Tue, 7 Jul 2015 18:52:34 +0000 (21:52 +0300)]
Add bulk OCB for Twofish AMD64 implementation

* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Twofish.
* cipher/twofish-amd64.S: Add OCB assembly functions.
* cipher/twofish.c (_gcry_twofish_amd64_ocb_enc)
(_gcry_twofish_amd64_ocb_dec, _gcry_twofish_amd64_ocb_auth): New
prototypes.
(call_sysv_fn5, call_sysv_fn6, twofish_amd64_ocb_enc)
(twofish_amd64_ocb_dec, twofish_amd64_ocb_auth, get_l)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): New.
* src/cipher.h (_gcry_twofish_ocb_crypt)
(_gcry_twofish_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for Twofish.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd bulk OCB for Camellia AES-NI/AVX and AES-NI/AVX2 implementations
Jussi Kivilinna [Tue, 7 Jul 2015 18:49:57 +0000 (21:49 +0300)]
Add bulk OCB for Camellia AES-NI/AVX and AES-NI/AVX2 implementations

* cipher/camellia-aesni-avx-amd64.S: Add OCB assembly functions.
* cipher/camellia-aesni-avx2-amd64.S: Add OCB assembly functions.
* cipher/camellia-glue.c (_gcry_camellia_aesni_avx_ocb_enc)
(_gcry_camellia_aesni_avx_ocb_dec, _gcry_camellia_aesni_avx_ocb_auth)
(_gcry_camellia_aesni_avx2_ocb_enc, _gcry_camellia_aesni_avx2_ocb_dec)
(_gcry_camellia_aesni_avx2_ocb_auth): New prototypes.
(get_l, _gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): New.
* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Camellia.
* src/cipher.h (_gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for Camellia.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd OCB bulk mode for AES SSSE3 implementation
Jussi Kivilinna [Sun, 5 Jul 2015 17:58:56 +0000 (20:58 +0300)]
Add OCB bulk mode for AES SSSE3 implementation

* cipher/rijndael-ssse3-amd64.c (SSSE3_STATE_SIZE): New.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (vpaes_ssse3_prepare): Use
'ssse3_state' for storing current SSSE3 state.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]
(vpaes_ssse3_cleanup): Restore SSSE3 state from 'ssse3_state'.
(_gcry_aes_ssse3_do_setkey, _gcry_aes_ssse3_prepare_decryption)
(_gcry_aes_ssse3_encrypt, _gcry_aes_ssse3_cfb_enc)
(_gcry_aes_ssse3_cbc_enc, _gcry_aes_ssse3_ctr_enc)
(_gcry_aes_ssse3_decrypt, _gcry_aes_ssse3_cfb_dec)
(_gcry_aes_ssse3_cbc_dec, _gcry_aes_ssse3_cbc_dec): Add 'ssse3_state'
array.
(get_l, ssse3_ocb_enc, ssse3_ocb_dec, _gcry_aes_ssse3_ocb_crypt)
(_gcry_aes_ssse3_ocb_auth): New.
* cipher/rijndael.c (_gcry_aes_ssse3_ocb_crypt)
(_gcry_aes_ssse3_ocb_auth): New.
(_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth) [USE_SSSE3]: Use SSSE3
implementation for OCB.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoFix undefined behavior wrt memcpy
Peter Wu [Sun, 26 Jul 2015 13:50:33 +0000 (16:50 +0300)]
Fix undefined behavior wrt memcpy

* cipher/cipher-gcm.c: Do not copy zero bytes from an empty buffer. Let
the function continue to add padding as needed though.
* cipher/mac-poly1305.c: If the caller requested to finish the hash
function without a copy of the result, return immediately.
--
Caught by UndefinedBehaviorSanitizer.
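
A minimal sketch of the guard pattern, with a hypothetical helper name:
under C99/C11, passing a null pointer to memcpy is undefined even when
the length is zero, so the call is skipped for empty input while the
bookkeeping still runs.

```c
#include <stddef.h>
#include <string.h>

/* Append LEN bytes from SRC to BUF at *POS.  SRC may be NULL if and
   only if LEN is zero. */
static void buf_append(unsigned char *buf, size_t *pos,
                       const void *src, size_t len)
{
  if (len)  /* Do not call memcpy for an empty, possibly NULL, buffer. */
    memcpy(buf + *pos, src, len);
  *pos += len;
}
```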

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
4 years agobuild: ignore scissor line for the commit-msg hook
Peter Wu [Thu, 9 Jul 2015 15:11:33 +0000 (17:11 +0200)]
build: ignore scissor line for the commit-msg hook

* build-aux/git-hooks/commit-msg: Stop processing more lines when the
  scissor line is encountered.
--
This allows the command `git commit -v` to work even if the code is
longer than 72 characters. Note that comments are already ignored by the
previous line.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
4 years agoRegister DCO for Peter Wu.
Werner Koch [Thu, 23 Jul 2015 12:38:49 +0000 (14:38 +0200)]
Register DCO for Peter Wu.

--

4 years agorsa: Fix error in comments.
Peter Wu [Thu, 16 Jul 2015 04:59:44 +0000 (13:59 +0900)]
rsa: Fix error in comments.

* cipher/rsa.c: Fix.

--

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
4 years agosexp: Fix invalid deallocation in error path.
Peter Wu [Tue, 14 Jul 2015 00:53:38 +0000 (09:53 +0900)]
sexp: Fix invalid deallocation in error path.

* src/sexp.c: Fix wrong condition.

--

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
4 years agoecc: fix memory leak.
Peter Wu [Fri, 10 Jul 2015 01:15:26 +0000 (10:15 +0900)]
ecc: fix memory leak.

* cipher/ecc.c (ecc_verify): Release memory which was allocated before
by _gcry_pk_util_preparse_sigval.
(ecc_decrypt_raw): Likewise.

--

Caught by LeakSanitizer (LSan). Now the test suite (make check) passes
with no memleaks.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
The last commit (0a7547e487a8bc4e7ac9599c55579eb2e4a13f06) includes
wrong fixes for sexp_release.

ecc_decrypt_raw fix added by gniibe.

4 years agoecc: fix memory leaks.
NIIBE Yutaka [Mon, 6 Jul 2015 03:01:00 +0000 (12:01 +0900)]
ecc: fix memory leaks.

* cipher/ecc.c (ecc_generate): Fix memory leak on error of
_gcry_pk_util_parse_flaglist and _gcry_ecc_eddsa_encodepoint.
(ecc_check_secret_key): Fix memory leak on error of
_gcry_ecc_update_curve_param.
(ecc_sign, ecc_verify, ecc_encrypt_raw, ecc_decrypt_raw): Remove
unnecessary sexp_release and fix memory leak on error of
_gcry_ecc_fill_in_curve.
(ecc_decrypt_raw): Fix double free of the point kG and memory leak
on error of _gcry_ecc_os2ec.

4 years agompi: Support FreeBSD 10 or later.
NIIBE Yutaka [Thu, 11 Jun 2015 07:19:49 +0000 (16:19 +0900)]
mpi: Support FreeBSD 10 or later.

* mpi/config.links: Include FreeBSD 10 to 29.

--

Thanks to Yuta SATOH.

GnuPG-bug-id: 1936, 1974

4 years agoecc: Add key generation flag "no-keytest".
Werner Koch [Thu, 21 May 2015 14:24:36 +0000 (16:24 +0200)]
ecc: Add key generation flag "no-keytest".

* src/cipher.h (PUBKEY_FLAG_NO_KEYTEST): New.
* cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Add flag
"no-keytest".  Return an error for invalid flags of length 10.

* cipher/ecc.c (nist_generate_key): Replace arg random_level by flags
and set random level depending on flags.
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_genkey): Ditto.
* cipher/ecc.c (ecc_generate): Pass flags to generate function and
remove var random_level.
(nist_generate_key): Implement "no-keytest" flag.

* tests/keygen.c (check_ecc_keys): Add tests for transient-key and
no-keytest.
--

After key creation we usually run a test to check whether the keys
really work.  However for transient keys this might be too time
consuming and given that a failed test would anyway abort the process
the optional use of a flag to skip the test is appropriate.

Using Ed25519 for EdDSA and the "no-keytest" flag halves the time to
create such a key.  This was measured by looping the last test from
check_ecc_keys() 1000 times with and without the flag.

Due to a bug in the flags parser unknown flags with a length of 10
characters were not detected.  Thus the "no-keytest" flag can be
employed by all software even with library versions before this.  That bug is
however solved with this version.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoecc: Avoid double conversion to affine coordinates in keygen.
Werner Koch [Thu, 21 May 2015 09:12:42 +0000 (11:12 +0200)]
ecc: Avoid double conversion to affine coordinates in keygen.

* cipher/ecc.c (nist_generate_key): Add args r_x and r_y.
(ecc_generate): Rename vars.  Convert to affine coordinates only if
not returned by the lower level generation function.
--

nist_generate_key already needs to convert to affine coordinates to
implement Jivsov's trick.  Thus we can return them and avoid calling
it in ecc_generate again.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agorandom: Change initial extra seeding from 2400 bits to 128 bits.
Werner Koch [Mon, 4 May 2015 14:46:02 +0000 (16:46 +0200)]
random: Change initial extra seeding from 2400 bits to 128 bits.

* random/random-csprng.c (read_pool): Reduce initial seeding.
--

See discussion starting at
 https://lists.gnupg.org/pipermail/gnupg-devel/2015-April/029750.html
and also in May.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoEnable AMD64 Twofish implementation on WIN64
Jussi Kivilinna [Thu, 14 May 2015 10:07:34 +0000 (13:07 +0300)]
Enable AMD64 Twofish implementation on WIN64

* cipher/twofish-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/twofish.c (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (call_sysv_fn): New.
(twofish_amd64_encrypt_block, twofish_amd64_decrypt_block)
(twofish_amd64_ctr_enc, twofish_amd64_cbc_dec)
(twofish_amd64_cfb_dec): New wrapper functions for AMD64
assembly functions.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 Serpent implementations on WIN64
Jussi Kivilinna [Thu, 14 May 2015 10:07:48 +0000 (13:07 +0300)]
Enable AMD64 Serpent implementations on WIN64

* cipher/serpent-avx2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/serpent-sse2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/serpent.c (USE_SSE2, USE_AVX2): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_SSE2 || USE_AVX2] (ASM_FUNC_ABI): New.
(_gcry_serpent_sse2_ctr_enc, _gcry_serpent_sse2_cbc_dec)
(_gcry_serpent_sse2_cfb_dec, _gcry_serpent_avx2_ctr_enc)
(_gcry_serpent_avx2_cbc_dec, _gcry_serpent_avx2_cfb_dec): Add
ASM_FUNC_ABI.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 Salsa20 implementation on WIN64
Jussi Kivilinna [Thu, 14 May 2015 09:37:21 +0000 (12:37 +0300)]
Enable AMD64 Salsa20 implementation on WIN64

* cipher/salsa20-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/salsa20.c (USE_AMD64): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_AMD64] (ASM_FUNC_ABI, ASM_EXTRA_STACK): New.
(_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup)
(_gcry_salsa20_amd64_encrypt_blocks): Add ASM_FUNC_ABI.
[USE_AMD64] (salsa20_core): Add ASM_EXTRA_STACK.
(salsa20_do_encrypt_stream) [USE_AMD64]: Add ASM_EXTRA_STACK.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 Poly1305 implementations on WIN64
Jussi Kivilinna [Thu, 14 May 2015 09:39:39 +0000 (12:39 +0300)]
Enable AMD64 Poly1305 implementations on WIN64

* cipher/poly1305-avx2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/poly1305-sse2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/poly1305-internal.h (POLY1305_SYSV_FUNC_ABI): New.
(POLY1305_USE_SSE2, POLY1305_USE_AVX2): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(OPS_FUNC_ABI): New.
(poly1305_ops_t): Use OPS_FUNC_ABI.
* cipher/poly1305.c (_gcry_poly1305_amd64_sse2_init_ext)
(_gcry_poly1305_amd64_sse2_finish_ext)
(_gcry_poly1305_amd64_sse2_blocks, _gcry_poly1305_amd64_avx2_init_ext)
(_gcry_poly1305_amd64_avx2_finish_ext)
(_gcry_poly1305_amd64_avx2_blocks, _gcry_poly1305_armv7_neon_init_ext)
(_gcry_poly1305_armv7_neon_finish_ext)
(_gcry_poly1305_armv7_neon_blocks, poly1305_init_ext_ref32)
(poly1305_blocks_ref32, poly1305_finish_ext_ref32)
(poly1305_init_ext_ref8, poly1305_blocks_ref8)
(poly1305_finish_ext_ref8): Use OPS_FUNC_ABI.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 3DES implementation on WIN64
Jussi Kivilinna [Thu, 14 May 2015 07:31:18 +0000 (10:31 +0300)]
Enable AMD64 3DES implementation on WIN64

* cipher/des-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/des.c (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (call_sysv_fn): New.
(tripledes_ecb_crypt) [HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]: Call
assembly function through 'call_sysv_fn'.
(tripledes_amd64_ctr_enc, tripledes_amd64_cbc_dec)
(tripledes_amd64_cfb_dec): New wrapper functions for bulk
assembly functions.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 ChaCha20 implementations on WIN64
Jussi Kivilinna [Tue, 5 May 2015 18:02:43 +0000 (21:02 +0300)]
Enable AMD64 ChaCha20 implementations on WIN64

* cipher/chacha20-avx2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/chacha20-sse2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/chacha20-ssse3-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/chacha20.c (USE_SSE2, USE_SSSE3, USE_AVX2): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ASM_FUNC_ABI, ASM_EXTRA_STACK): New.
(chacha20_blocks_t, _gcry_chacha20_amd64_sse2_blocks)
(_gcry_chacha20_amd64_ssse3_blocks, _gcry_chacha20_amd64_avx2_blocks)
(_gcry_chacha20_armv7_neon_blocks, chacha20_blocks): Add ASM_FUNC_ABI.
(chacha20_core): Add ASM_EXTRA_STACK.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 CAST5 implementation on WIN64
Jussi Kivilinna [Tue, 5 May 2015 17:46:10 +0000 (20:46 +0300)]
Enable AMD64 CAST5 implementation on WIN64

* cipher/cast5-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(RIP): Remove.
(GET_EXTERN_POINTER): Use 'leaq' version on WIN64.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/cast5.c (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (call_sysv_fn): New.
(do_encrypt_block, do_decrypt_block)
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]: Call assembly
function through 'call_sysv_fn'.
(cast5_amd64_ctr_enc, cast5_amd64_cbc_dec)
(cast5_amd64_cfb_dec): New wrapper functions for bulk
assembly functions.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 Camellia implementations on WIN64
Jussi Kivilinna [Thu, 14 May 2015 10:33:07 +0000 (13:33 +0300)]
Enable AMD64 Camellia implementations on WIN64

* cipher/camellia-aesni-avx-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/camellia-aesni-avx2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/camellia-glue.c (USE_AESNI_AVX, USE_AESNI_AVX2): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_AESNI_AVX || USE_AESNI_AVX2] (ASM_FUNC_ABI, ASM_EXTRA_STACK): New.
(_gcry_camellia_aesni_avx_ctr_enc, _gcry_camellia_aesni_avx_cbc_dec)
(_gcry_camellia_aesni_avx_cfb_dec, _gcry_camellia_aesni_avx_keygen)
(_gcry_camellia_aesni_avx2_ctr_enc, _gcry_camellia_aesni_avx2_cbc_dec)
(_gcry_camellia_aesni_avx2_cfb_dec): Add ASM_FUNC_ABI.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 Blowfish implementation on WIN64
Jussi Kivilinna [Sun, 3 May 2015 14:28:40 +0000 (17:28 +0300)]
Enable AMD64 Blowfish implementation on WIN64

* cipher/blowfish-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/blowfish.c (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (call_sysv_fn): New.
(do_encrypt, do_encrypt_block, do_decrypt_block)
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]: Call assembly
function through 'call_sysv_fn'.
(blowfish_amd64_ctr_enc, blowfish_amd64_cbc_dec)
(blowfish_amd64_cfb_dec): New wrapper functions for bulk
assembly functions.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 arcfour implementation on WIN64
Jussi Kivilinna [Sun, 3 May 2015 14:06:56 +0000 (17:06 +0300)]
Enable AMD64 arcfour implementation on WIN64

* cipher/arcfour-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/arcfour.c (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(do_encrypt, do_decrypt) [HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]: Use
assembly block to call AMD64 assembly function.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoUpdate documentation for Poly1305-ChaCha20 AEAD, RFC-7539
Jussi Kivilinna [Thu, 14 May 2015 07:02:51 +0000 (10:02 +0300)]
Update documentation for Poly1305-ChaCha20 AEAD, RFC-7539

* cipher/cipher-poly1305.c: Add RFC-7539 to header.
* doc/gcrypt.texi: Update Poly1305 AEAD documentation with mention of
RFC-7539; Drop Salsa from supported stream ciphers for Poly1305 AEAD.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agohwf-x86: use edi for passing value to ebx for i386 cpuid
Jussi Kivilinna [Fri, 8 May 2015 15:07:51 +0000 (18:07 +0300)]
hwf-x86: use edi for passing value to ebx for i386 cpuid

* src/hwf-x86.c [__i386__] (get_cpuid): Use '=D' for regs[1] instead
of '=r'.
--

On Win32, %ebx can be assigned for '=r' (regs[1]). This results in invalid
assembly:
pushl %ebx
movl %ebx, %ebx
cpuid
movl %ebx, %ebx
popl %ebx

So use '=D' (%edi) for regs[1] instead.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agohwf-x86: add EDX as output register for xgetbv asm block
Jussi Kivilinna [Mon, 4 May 2015 17:09:51 +0000 (20:09 +0300)]
hwf-x86: add EDX as output register for xgetbv asm block

* src/hwf-x86.c (get_xgetbv): Add EDX as output.
--

XGETBV instruction modifies EAX:EDX register pair, so we need to mark
EDX as output to let compiler know that contents in this register are
lost.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agobuild: Update build-aux files.
Werner Koch [Mon, 4 May 2015 08:29:22 +0000 (10:29 +0200)]
build: Update build-aux files.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoFix possible regression on old 32 bit mingw compilers.
Werner Koch [Mon, 4 May 2015 08:22:24 +0000 (10:22 +0200)]
Fix possible regression on old 32 bit mingw compilers.

* acinclude.m4: Add new pattern for mingw32.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agobuild: Add new file.
Werner Koch [Mon, 4 May 2015 08:23:12 +0000 (10:23 +0200)]
build: Add new file.

* mpi/amd64/distfiles: Add func_abi.h.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoFix WIN64 assembly glue for AES
Jussi Kivilinna [Sun, 3 May 2015 14:16:08 +0000 (17:16 +0300)]
Fix WIN64 assembly glue for AES

* cipher/rijndael.c (do_encrypt, do_decrypt)
[!HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS]: Change input operands to
input+output to mark volatile nature of the used registers.
--

Function arguments cannot be passed to assembly block as input operands
as target function modifies those input registers.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd '1 million a characters' test vectors
Jussi Kivilinna [Sat, 2 May 2015 22:24:50 +0000 (01:24 +0300)]
Add '1 million a characters' test vectors

* tests/basic.c (check_digests): Add "!" test vectors for MD5, SHA-384,
SHA-512, RIPEMD160 and CRC32.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoMore optimized CRC implementations
Jussi Kivilinna [Sat, 2 May 2015 21:34:34 +0000 (00:34 +0300)]
More optimized CRC implementations

* cipher/crc.c (crc32_table, crc24_table): Replace with new table
contents.
(update_crc32, CRC24_INIT, CRC24_POLY): Remove.
(crc32_next, crc32_next4, crc24_init, crc24_next, crc24_next4)
(crc24_final): New.
(crc24rfc2440_init): Use crc24_init.
(crc32_write): Rewrite to use crc32_next & crc32_next4.
(crc24_write): Rewrite to use crc24_next & crc24_next4.
(crc32_final, crc32rfc1510_final): Use buf_put_be32.
(crc24rfc2440_final): Use crc24_final & buf_put_le32.
* tests/basic.c (check_digests): Add CRC "123456789" tests.
--

Patch adds more optimized CRC implementations generated with universal_crc
tool by Danjel McGougan: http://www.mcgougan.se/universal_crc/

Benchmark on Intel Haswell (no-turbo, 3200 MHz):

Before:
 CRC32          |      2.52 ns/B     378.3 MiB/s      8.07 c/B
 CRC32RFC1510   |      2.52 ns/B     378.1 MiB/s      8.07 c/B
 CRC24RFC2440   |     46.62 ns/B     20.46 MiB/s     149.2 c/B

After:
 CRC32          |     0.918 ns/B    1039.3 MiB/s      2.94 c/B
 CRC32RFC1510   |     0.918 ns/B    1039.0 MiB/s      2.94 c/B
 CRC24RFC2440   |     0.918 ns/B    1039.4 MiB/s      2.94 c/B
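
The large CRC24 gain is what one would expect when a bit-at-a-time loop is
replaced by table lookups. The sketch below is not the universal_crc output
used by the patch; it is a plain one-byte-per-step table variant, shown next
to the bitwise reference loop from the OpenPGP specification. All function
and table names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC24_INIT 0xB704CEu
#define CRC24_POLY 0x1864CFBu

/* Reference: processes one bit per inner iteration (8 per input byte). */
static uint32_t crc24_bitwise(const unsigned char *buf, size_t len)
{
  uint32_t crc = CRC24_INIT;
  while (len--) {
    crc ^= (uint32_t)(*buf++) << 16;
    for (int i = 0; i < 8; i++) {
      crc <<= 1;
      if (crc & 0x1000000u)
        crc ^= CRC24_POLY;
    }
  }
  return crc & 0xFFFFFFu;
}

/* Hypothetical 256-entry table; entry b is the CRC state after shifting
   byte b through eight bit steps starting from a zero register. */
static uint32_t crc24_tab[256];

static void crc24_build_table(void)
{
  for (uint32_t b = 0; b < 256; b++) {
    uint32_t crc = b << 16;
    for (int i = 0; i < 8; i++) {
      crc <<= 1;
      if (crc & 0x1000000u)
        crc ^= CRC24_POLY;
    }
    crc24_tab[b] = crc & 0xFFFFFFu;
  }
}

/* Table-driven: one lookup per input byte instead of eight bit steps. */
static uint32_t crc24_bytewise(const unsigned char *buf, size_t len)
{
  uint32_t crc = CRC24_INIT;
  while (len--)
    crc = ((crc << 8) ^ crc24_tab[((crc >> 16) ^ *buf++) & 0xFFu]) & 0xFFFFFFu;
  return crc;
}
```

After calling crc24_build_table() once, both functions should agree on any
input; for the "123456789" test string the commonly published CRC-24/OpenPGP
check value is 0x21CF02. The crc24_next4 name in the ChangeLog suggests the
real code consumes four bytes per step, which this sketch does not attempt.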

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 AES implementation for WIN64
Jussi Kivilinna [Sat, 2 May 2015 10:27:06 +0000 (13:27 +0300)]
Enable AMD64 AES implementation for WIN64

* cipher/rijndael-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/rijndael-internal.h (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(do_encrypt, do_decrypt)
[USE_AMD64_ASM && !HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS]: Use
assembly block to call AMD64 assembly encrypt/decrypt function.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 Whirlpool implementation for WIN64
Jussi Kivilinna [Sat, 2 May 2015 10:26:46 +0000 (13:26 +0300)]
Enable AMD64 Whirlpool implementation for WIN64

* cipher/whirlpool-sse2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/whirlpool.c (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_AMD64_ASM] (ASM_FUNC_ABI, ASM_EXTRA_STACK): New.
[USE_AMD64_ASM] (_gcry_whirlpool_transform_amd64): Add ASM_FUNC_ABI to
prototype.
[USE_AMD64_ASM] (whirlpool_transform): Add ASM_EXTRA_STACK to stack
burn value.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 SHA512 implementations for WIN64
Jussi Kivilinna [Sat, 2 May 2015 10:05:12 +0000 (13:05 +0300)]
Enable AMD64 SHA512 implementations for WIN64

* cipher/sha512-avx-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/sha512-avx-bmi2-amd64.S: Ditto.
* cipher/sha512-ssse3-amd64.S: Ditto.
* cipher/sha512.c (USE_SSSE3, USE_AVX, USE_AVX2): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_SSSE3 || USE_AVX || USE_AVX2] (ASM_FUNC_ABI)
(ASM_EXTRA_STACK): New.
(_gcry_sha512_transform_amd64_ssse3, _gcry_sha512_transform_amd64_avx)
(_gcry_sha512_transform_amd64_avx_bmi2): Add ASM_FUNC_ABI to
prototypes.
(transform): Add ASM_EXTRA_STACK to stack burn value.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 SHA256 implementations for WIN64
Jussi Kivilinna [Sat, 2 May 2015 10:05:02 +0000 (13:05 +0300)]
Enable AMD64 SHA256 implementations for WIN64

* cipher/sha256-avx-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/sha256-avx2-bmi2-amd64.S: Ditto.
* cipher/sha256-ssse3-amd64.S: Ditto.
* cipher/sha256.c (USE_SSSE3, USE_AVX, USE_AVX2): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_SSSE3 || USE_AVX || USE_AVX2] (ASM_FUNC_ABI)
(ASM_EXTRA_STACK): New.
(_gcry_sha256_transform_amd64_ssse3, _gcry_sha256_transform_amd64_avx)
(_gcry_sha256_transform_amd64_avx2): Add ASM_FUNC_ABI to prototypes.
(transform): Add ASM_EXTRA_STACK to stack burn value.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AMD64 SHA1 implementations for WIN64
Jussi Kivilinna [Sat, 2 May 2015 09:57:07 +0000 (12:57 +0300)]
Enable AMD64 SHA1 implementations for WIN64

* cipher/sha1-avx-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/sha1-avx-bmi2-amd64.S: Ditto.
* cipher/sha1-ssse3-amd64.S: Ditto.
* cipher/sha1.c (USE_SSSE3, USE_AVX, USE_BMI2): Enable
when HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_SSSE3 || USE_AVX || USE_BMI2] (ASM_FUNC_ABI)
(ASM_EXTRA_STACK): New.
(_gcry_sha1_transform_amd64_ssse3, _gcry_sha1_transform_amd64_avx)
(_gcry_sha1_transform_amd64_avx_bmi2): Add ASM_FUNC_ABI to
prototypes.
(transform): Add ASM_EXTRA_STACK to stack burn value.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable AES/AES-NI, AES/SSSE3 and GCM/PCLMUL implementations on WIN64
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Enable AES/AES-NI, AES/SSSE3 and GCM/PCLMUL implementations on WIN64

* cipher/cipher-gcm-intel-pclmul.c (_gcry_ghash_intel_pclmul)
( _gcry_ghash_intel_pclmul) [__WIN64__]: Store non-volatile vector
registers before use and restore after.
* cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): Remove dependency
on !defined(__WIN64__).
* cipher/rijndael-aesni.c [__WIN64__] (aesni_prepare_2_6_variable,
aesni_prepare, aesni_prepare_2_6, aesni_cleanup)
(aesni_cleanup_2_6): New.
[!__WIN64__] (aesni_prepare_2_6_variable, aesni_prepare_2_6): New.
(_gcry_aes_aesni_do_setkey, _gcry_aes_aesni_cbc_enc)
(_gcry_aesni_ctr_enc, _gcry_aesni_cfb_dec, _gcry_aesni_cbc_dec)
(_gcry_aesni_ocb_crypt, _gcry_aesni_ocb_auth): Use
'aesni_prepare_2_6'.
* cipher/rijndael-internal.h (USE_SSSE3): Enable if
HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS or
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS.
(USE_AESNI): Remove dependency on !defined(__WIN64__)
* cipher/rijndael-ssse3-amd64.c [HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]
(vpaes_ssse3_prepare, vpaes_ssse3_cleanup): New.
[!HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (vpaes_ssse3_prepare): New.
(vpaes_ssse3_prepare_enc, vpaes_ssse3_prepare_dec): Use
'vpaes_ssse3_prepare'.
(_gcry_aes_ssse3_do_setkey, _gcry_aes_ssse3_prepare_decryption): Use
'vpaes_ssse3_prepare' and 'vpaes_ssse3_cleanup'.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (X): Add masking macro to
exclude '.type' and '.size' markers from assembly code, as they are
not supported on WIN64/COFF objects.
* configure.ac (gcry_cv_gcc_attribute_ms_abi)
(gcry_cv_gcc_attribute_sysv_abi, gcry_cv_gcc_default_abi_is_ms_abi)
(gcry_cv_gcc_default_abi_is_sysv_abi)
(gcry_cv_gcc_win64_platform_as_ok): New checks.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd W64 support for mpi amd64 assembly
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Add W64 support for mpi amd64 assembly

* acinclude.m4 (GNUPG_SYS_SYMBOL_UNDERSCORE): Set
'ac_cv_sys_symbol_underscore=no' on MingW-W64.
* mpi/amd64/func_abi.h: New.
* mpi/amd64/mpih-add1.S (_gcry_mpih_add_n): Add FUNC_ENTRY and FUNC_EXIT.
* mpi/amd64/mpih-lshift.S (_gcry_mpih_lshift): Ditto.
* mpi/amd64/mpih-mul1.S (_gcry_mpih_mul_1): Ditto.
* mpi/amd64/mpih-mul2.S (_gcry_mpih_addmul_1): Ditto.
* mpi/amd64/mpih-mul3.S (_gcry_mpih_submul_1): Ditto.
* mpi/amd64/mpih-rshift.S (_gcry_mpih_rshift): Ditto.
* mpi/amd64/mpih-sub1.S (_gcry_mpih_sub_n): Ditto.
* mpi/config.links [host=x86_64-*mingw*]: Enable assembly modules.
[host=x86_64-*-*]: Append mpi/amd64/func_abi.h to mpi/asm-syntax.h.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoDES: Silence compiler warnings on Windows
Jussi Kivilinna [Fri, 1 May 2015 16:15:34 +0000 (19:15 +0300)]
DES: Silence compiler warnings on Windows

* cipher/des.c (working_memcmp): Make pointer arguments 'const void *'.
--

The following warnings are seen on a Windows target build:

des.c: In function 'is_weak_key':
des.c:1019:40: warning: pointer targets in passing argument 1 of 'working_memcmp' differ in signedness [-Wpointer-sign]
       if ( !(cmp_result=working_memcmp(work, weak_keys[middle], 8)) )
                                        ^
des.c:149:1: note: expected 'const char *' but argument is of type 'unsigned char *'
 working_memcmp( const char *a, const char *b, size_t n )
 ^
des.c:1019:46: warning: pointer targets in passing argument 2 of 'working_memcmp' differ in signedness [-Wpointer-sign]
       if ( !(cmp_result=working_memcmp(work, weak_keys[middle], 8)) )
                                              ^
des.c:149:1: note: expected 'const char *' but argument is of type 'unsigned char *'
 working_memcmp( const char *a, const char *b, size_t n )
 ^
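
A comparison helper that takes 'const void *' accepts both 'char *' and
'unsigned char *' callers without a signedness warning. The sketch below
illustrates that signature only; the body and the name are assumptions, not
the code in cipher/des.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for working_memcmp: the void pointers are
   converted once to unsigned char pointers inside the function, so
   callers need no casts. */
static int working_memcmp_sketch(const void *_a, const void *_b, size_t n)
{
  const unsigned char *a = _a;
  const unsigned char *b = _b;

  for (; n; n--, a++, b++)
    if (*a != *b)
      return (int)*a - (int)*b;
  return 0;
}
```

Comparing as unsigned char also gives the byte ordering that memcmp is
specified to use, independent of whether plain char is signed on the target.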

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoCast pointers to integers using uintptr_t instead of long
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Cast pointers to integers using uintptr_t instead of long

4 years agoFix rndhw for 64-bit Windows build
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Fix rndhw for 64-bit Windows build

* configure.ac: Add sizeof check for 'void *'.
* random/rndhw.c (poll_padlock): Check for SIZEOF_VOID_P == 8
instead of defined(__LP64__).
(RDRAND_LONG): Check for SIZEOF_UNSIGNED_LONG == 8 instead of
defined(__LP64__).
--

__LP64__ is not predefined for 64-bit mingw64-gcc, which caused wrong
assembly code selections. Do selection based on type sizes instead,
to support x86_64, x32 and win64 properly.
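
A minimal sketch of size-based selection follows. It uses UINTPTR_MAX and
ULONG_MAX from the standard headers as stand-ins for the configure results
SIZEOF_VOID_P and SIZEOF_UNSIGNED_LONG; the SKETCH_* macros and the helper
function are hypothetical and do not appear in random/rndhw.c.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for "SIZEOF_VOID_P == 8". */
#if UINTPTR_MAX > 0xFFFFFFFFu
# define SKETCH_PTR_IS_64BIT 1
#else
# define SKETCH_PTR_IS_64BIT 0
#endif

/* Stand-in for "SIZEOF_UNSIGNED_LONG == 8". */
#if ULONG_MAX > 0xFFFFFFFFu
# define SKETCH_LONG_IS_64BIT 1
#else
# define SKETCH_LONG_IS_64BIT 0
#endif

/* Name the data model that the two size checks imply. */
static const char *sketch_data_model(void)
{
  if (SKETCH_PTR_IS_64BIT && SKETCH_LONG_IS_64BIT)
    return "LP64 (e.g. x86_64 SysV)";
  if (SKETCH_PTR_IS_64BIT && !SKETCH_LONG_IS_64BIT)
    return "LLP64 (e.g. win64)";
  if (!SKETCH_PTR_IS_64BIT && !SKETCH_LONG_IS_64BIT)
    return "ILP32 (e.g. i386, x32)";
  return "other";
}
```

Testing the pointer size and the 'unsigned long' size separately is what
distinguishes win64 (64-bit pointers, 32-bit long) from x86_64 SysV, a
distinction a single __LP64__ test cannot express.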

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoPrepare random/win32.c fast poll for 64-bit Windows
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Prepare random/win32.c fast poll for 64-bit Windows

* random/win32.c (_gcry_rndw32_gather_random_fast) [ADD]: Rename to
ADDINT.
(_gcry_rndw32_gather_random_fast): Add ADDPTR.
(_gcry_rndw32_gather_random_fast): Disable entropy gathering from
GetQueueStatus(QS_ALLEVENTS).
(_gcry_rndw32_gather_random_fast): Change minimumWorkingSetSize and
maximumWorkingSetSize to SIZE_T from DWORD.
(_gcry_rndw32_gather_random_fast): Only add lower 32-bits of
minimumWorkingSetSize and maximumWorkingSetSize to random poll.
(_gcry_rndw32_gather_random_fast) [__WIN64__]: Read TSC directly
using intrinsic.
--

Introduce entropy gatherer changes related to 64-bit Windows platform as done
in cryptlib fast poll:
 - Change ADD macro to ADDPTR/ADDINT to handle pointer values. ADDPTR
   discards high 32-bits of 64-bit pointer values.
 - minimum/maximumWorkingSetSize changed to SIZE_T type to avoid stack
   corruption on 64-bit; only low 32-bits are used for entropy.
 - Use __rdtsc() intrinsic on 64-bit (as TSC is always available).
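
The ADDINT/ADDPTR split can be sketched as below. The buffer type and the
add_u32 helper are hypothetical; only the macro names and the rule that
ADDPTR keeps the low 32 bits of a pointer come from the description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical poll buffer that collects 32-bit values. */
typedef struct {
  unsigned char buf[64];
  size_t used;
} poll_buf;

/* Append one 32-bit value if there is room left. */
static void add_u32(poll_buf *p, uint32_t v)
{
  if (p->used + sizeof v <= sizeof p->buf) {
    memcpy(p->buf + p->used, &v, sizeof v);
    p->used += sizeof v;
  }
}

/* Integer values are added as-is (truncated to 32 bits). */
#define ADDINT(p, x)  add_u32((p), (uint32_t)(x))

/* Pointers go through uintptr_t first, so the conversion is well defined
   on 64-bit targets; the high 32 bits are then discarded. */
#define ADDPTR(p, x)  add_u32((p), (uint32_t)(uintptr_t)(x))
```

Casting a pointer directly to a 32-bit integer type draws a warning on
64-bit targets, which is why the intermediate uintptr_t step is used.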

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoDisable GCM and AES-NI assembly implementations for WIN64
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Disable GCM and AES-NI assembly implementations for WIN64

* cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): Do not enable when
__WIN64__ defined.
* cipher/rijndael-internal.h (USE_AESNI): Ditto.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoDisable building mpi assembly routines on WIN64
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Disable building mpi assembly routines on WIN64

* mpi/config.links: Disable assembly for host 'x86_64-*mingw32*'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoFix packed attribute check for Windows targets
Jussi Kivilinna [Fri, 1 May 2015 16:07:07 +0000 (19:07 +0300)]
Fix packed attribute check for Windows targets

* configure.ac (gcry_cv_gcc_attribute_packed): Move 'long b' to its
own packed structure.
--

Change packed attribute test so that it works with both MS ABI and SYSV ABI.
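
A sketch of what such a check might look like, assuming a GCC-compatible
compiler. The struct names and the exact form of the compile-time assertion
are assumptions; the only detail taken from the ChangeLog entry is that the
'long' member lives in its own packed structure.

```c
#include <stddef.h>

/* The 'long' member is wrapped in its own packed structure ... */
struct sketch_long_s {
  long b;
} __attribute__ ((packed));

/* ... which is then embedded in the packed structure under test. */
struct sketch_foo_s {
  char a;
  struct sketch_long_s b;
} __attribute__ ((packed));

/* Compile-time check: the array size becomes -1, and compilation fails,
   if the compiler inserted padding despite the packed attribute. */
typedef char sketch_packed_ok[
  (sizeof (struct sketch_foo_s) == sizeof (char) + sizeof (long)) ? 1 : -1];
```

A configure test of this kind only needs to compile; nothing is executed on
the target, which matters for cross builds such as MinGW.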

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoFix tail handling in buf_xor_1
Jussi Kivilinna [Fri, 1 May 2015 15:50:34 +0000 (18:50 +0300)]
Fix tail handling in buf_xor_1

* cipher/bufhelp.h (buf_xor_1): Increment source pointer at tail
handling.
--
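
The kind of bug fixed here can be illustrated with a small sketch of an
in-place XOR helper. This is not the code from cipher/bufhelp.h; the name
and the memcpy-based word access are assumptions chosen for portability.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR len bytes of src into dst: whole words first, then a byte tail. */
static void buf_xor_1_sketch(void *_dst, const void *_src, size_t len)
{
  unsigned char *dst = _dst;
  const unsigned char *src = _src;

  /* Bulk loop: one machine word per step; memcpy avoids any alignment
     assumption about the two buffers. */
  while (len >= sizeof (uintptr_t)) {
    uintptr_t d, s;
    memcpy(&d, dst, sizeof d);
    memcpy(&s, src, sizeof s);
    d ^= s;
    memcpy(dst, &d, sizeof d);
    dst += sizeof d;
    src += sizeof s;
    len -= sizeof d;
  }

  /* Tail loop: both pointers must advance. If src stayed fixed, every
     tail byte would be XORed with the same source byte. */
  for (; len; len--)
    *dst++ ^= *src++;
}
```

A test only exposes such a tail bug when the length is not a multiple of the
word size and the tail bytes of the source differ from each other.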

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd --disable-hwf for basic tests
Jussi Kivilinna [Fri, 1 May 2015 12:03:38 +0000 (15:03 +0300)]
Add --disable-hwf for basic tests

* tests/basic.c (main): Add handling for '--disable-hwf'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoUse more odd chunk sizes for check_one_md
Jussi Kivilinna [Fri, 1 May 2015 11:55:58 +0000 (14:55 +0300)]
Use more odd chunk sizes for check_one_md

* tests/basic.c (check_one_md): Make chunk size vary oddly, instead
of using fixed length of 1000 bytes.
--
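
The idea of feeding a digest in irregular pieces can be sketched as follows.
A toy FNV-1a hash stands in for a real digest, and the chunk schedule
(1, 4, 13, 40, ...) is a hypothetical choice; neither is taken from
tests/basic.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy streaming hash (32-bit FNV-1a) used as a stand-in for a digest. */
typedef struct { uint32_t h; } fnv_ctx;

static void fnv_init(fnv_ctx *c) { c->h = 2166136261u; }

static void fnv_write(fnv_ctx *c, const unsigned char *p, size_t n)
{
  while (n--) {
    c->h ^= *p++;
    c->h *= 16777619u;
  }
}

/* Hash the whole buffer with a single write call. */
static uint32_t hash_one_shot(const unsigned char *data, size_t len)
{
  fnv_ctx c;
  fnv_init(&c);
  fnv_write(&c, data, len);
  return c.h;
}

/* Hash the same buffer in pieces whose size changes on every call. */
static uint32_t hash_in_odd_chunks(const unsigned char *data, size_t len)
{
  fnv_ctx c;
  size_t off = 0, chunk = 1;

  fnv_init(&c);
  while (off < len) {
    size_t n = (len - off < chunk) ? len - off : chunk;
    fnv_write(&c, data + off, n);
    off += n;
    chunk = chunk * 3 + 1;  /* hypothetical schedule: 1, 4, 13, 40, ... */
  }
  return c.h;
}
```

Both functions should return the same value for any input. With a real
block-based digest, varying sizes exercise the code paths that buffer
partial blocks, which a fixed 1000-byte chunk covers less thoroughly.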

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoEnable more modes in basic ciphers test
Jussi Kivilinna [Fri, 1 May 2015 11:33:29 +0000 (14:33 +0300)]
Enable more modes in basic ciphers test

* src/gcrypt.h.in (GCRY_OCB_BLOCK_LEN): New.
* tests/basic.c (check_one_cipher_core_reset): New.
(check_one_cipher_core): Use check_one_cipher_core_reset in place of
gcry_cipher_reset.
(check_ciphers): Add CCM and OCB modes for block cipher tests.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoFix resetting cipher in OCB mode
Jussi Kivilinna [Fri, 1 May 2015 11:32:36 +0000 (14:32 +0300)]
Fix resetting cipher in OCB mode

* cipher/cipher.c (cipher_reset): Setup default taglen for OCB after
clearing state.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoFix buggy RC4 AMD64 assembly and add test to notice similar issues
Jussi Kivilinna [Thu, 30 Apr 2015 13:57:57 +0000 (16:57 +0300)]
Fix buggy RC4 AMD64 assembly and add test to notice similar issues

* cipher/arcfour-amd64.S (_gcry_arcfour_amd64): Fix swapped store of
'x' and 'y'.
* tests/basic.c (get_algo_mode_blklen): New.
(check_one_cipher_core): Add new tests for split buffer input on
encryption and decryption.
--
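
Why a split-buffer test catches this class of bug can be seen in a plain C
sketch of RC4. This is not the AMD64 assembly; the type and function names
are hypothetical. The two indices 'x' and 'y' are written back to the
context after each call, so storing them swapped leaves the first call
correct and corrupts only the calls that follow.

```c
#include <stddef.h>

/* Hypothetical RC4 context: permutation plus the two running indices. */
typedef struct {
  unsigned char s[256];
  unsigned char x, y;
} rc4_ctx;

/* Standard RC4 key schedule. */
static void rc4_setkey(rc4_ctx *c, const unsigned char *key, size_t keylen)
{
  unsigned int i, j = 0;

  for (i = 0; i < 256; i++)
    c->s[i] = (unsigned char)i;
  for (i = 0; i < 256; i++) {
    unsigned char t;
    j = (j + c->s[i] + key[i % keylen]) & 0xFFu;
    t = c->s[i]; c->s[i] = c->s[j]; c->s[j] = t;
  }
  c->x = 0;
  c->y = 0;
}

/* Encrypt or decrypt len bytes, continuing from the saved stream state. */
static void rc4_crypt(rc4_ctx *c, unsigned char *out,
                      const unsigned char *in, size_t len)
{
  unsigned char x = c->x, y = c->y;

  while (len--) {
    unsigned char t;
    x = (unsigned char)(x + 1);
    y = (unsigned char)(y + c->s[x]);
    t = c->s[x]; c->s[x] = c->s[y]; c->s[y] = t;
    *out++ = *in++ ^ c->s[(unsigned char)(c->s[x] + c->s[y])];
  }

  /* Write the indices back in the right slots; swapping these two
     stores would only show up on the next call with this context. */
  c->x = x;
  c->y = y;
}
```

Encrypting a message in one call and then again in two pieces with a freshly
keyed context should give identical output. With key "Key" and plaintext
"Plaintext" the widely published ciphertext is BBF316E8D940AF0AD3.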

Reported-by: Dima Kukulniak <dima.ky@gmail.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>