libgcrypt.git
Jussi Kivilinna [Sat, 2 May 2015 10:27:06 +0000 (13:27 +0300)]
Enable AMD64 AES implementation for WIN64

* cipher/rijndael-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/rijndael-internal.h (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(do_encrypt, do_decrypt)
[USE_AMD64_ASM && !HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS]: Use
assembly block to call AMD64 assembly encrypt/decrypt function.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 2 May 2015 10:26:46 +0000 (13:26 +0300)]
Enable AMD64 Whirlpool implementation for WIN64

* cipher/whirlpool-sse2-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/whirlpool.c (USE_AMD64_ASM): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_AMD64_ASM] (ASM_FUNC_ABI, ASM_EXTRA_STACK): New.
[USE_AMD64_ASM] (_gcry_whirlpool_transform_amd64): Add ASM_FUNC_ABI to
prototype.
[USE_AMD64_ASM] (whirlpool_transform): Add ASM_EXTRA_STACK to stack
burn value.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 2 May 2015 10:05:12 +0000 (13:05 +0300)]
Enable AMD64 SHA512 implementations for WIN64

* cipher/sha512-avx-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/sha512-avx-bmi2-amd64.S: Ditto.
* cipher/sha512-ssse3-amd64.S: Ditto.
* cipher/sha512.c (USE_SSSE3, USE_AVX, USE_AVX2): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_SSSE3 || USE_AVX || USE_AVX2] (ASM_FUNC_ABI)
(ASM_EXTRA_STACK): New.
(_gcry_sha512_transform_amd64_ssse3, _gcry_sha512_transform_amd64_avx)
(_gcry_sha512_transform_amd64_avx_bmi2): Add ASM_FUNC_ABI to
prototypes.
(transform): Add ASM_EXTRA_STACK to stack burn value.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 2 May 2015 10:05:02 +0000 (13:05 +0300)]
Enable AMD64 SHA256 implementations for WIN64

* cipher/sha256-avx-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/sha256-avx2-bmi2-amd64.S: Ditto.
* cipher/sha256-ssse3-amd64.S: Ditto.
* cipher/sha256.c (USE_SSSE3, USE_AVX, USE_AVX2): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_SSSE3 || USE_AVX || USE_AVX2] (ASM_FUNC_ABI)
(ASM_EXTRA_STACK): New.
(_gcry_sha256_transform_amd64_ssse3, _gcry_sha256_transform_amd64_avx)
(_gcry_sha256_transform_amd64_avx2): Add ASM_FUNC_ABI to prototypes.
(transform): Add ASM_EXTRA_STACK to stack burn value.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 2 May 2015 09:57:07 +0000 (12:57 +0300)]
Enable AMD64 SHA1 implementations for WIN64

* cipher/sha1-avx-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/sha1-avx-bmi2-amd64.S: Ditto.
* cipher/sha1-ssse3-amd64.S: Ditto.
* cipher/sha1.c (USE_SSSE3, USE_AVX, USE_BMI2): Enable
when HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_SSSE3 || USE_AVX || USE_BMI2] (ASM_FUNC_ABI)
(ASM_EXTRA_STACK): New.
(_gcry_sha1_transform_amd64_ssse3, _gcry_sha1_transform_amd64_avx)
(_gcry_sha1_transform_amd64_avx_bmi2): Add ASM_FUNC_ABI to
prototypes.
(transform): Add ASM_EXTRA_STACK to stack burn value.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Enable AES/AES-NI, AES/SSSE3 and GCM/PCLMUL implementations on WIN64

* cipher/cipher-gcm-intel-pclmul.c (_gcry_ghash_intel_pclmul)
( _gcry_ghash_intel_pclmul) [__WIN64__]: Store non-volatile vector
registers before use and restore after.
* cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): Remove dependency
on !defined(__WIN64__).
* cipher/rijndael-aesni.c [__WIN64__] (aesni_prepare_2_6_variable,
aesni_prepare, aesni_prepare_2_6, aesni_cleanup)
( aesni_cleanup_2_6): New.
[!__WIN64__] (aesni_prepare_2_6_variable, aesni_prepare_2_6): New.
(_gcry_aes_aesni_do_setkey, _gcry_aes_aesni_cbc_enc)
(_gcry_aesni_ctr_enc, _gcry_aesni_cfb_dec, _gcry_aesni_cbc_dec)
(_gcry_aesni_ocb_crypt, _gcry_aesni_ocb_auth): Use
'aesni_prepare_2_6'.
* cipher/rijndael-internal.h (USE_SSSE3): Enable if
HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS or
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS.
(USE_AESNI): Remove dependency on !defined(__WIN64__)
* cipher/rijndael-ssse3-amd64.c [HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]
(vpaes_ssse3_prepare, vpaes_ssse3_cleanup): New.
[!HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (vpaes_ssse3_prepare): New.
(vpaes_ssse3_prepare_enc, vpaes_ssse3_prepare_dec): Use
'vpaes_ssse3_prepare'.
(_gcry_aes_ssse3_do_setkey, _gcry_aes_ssse3_prepare_decryption): Use
'vpaes_ssse3_prepare' and 'vpaes_ssse3_cleanup'.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (X): Add masking macro to
exclude '.type' and '.size' markers from assembly code, as they are
not supported on WIN64/COFF objects.
* configure.ac (gcry_cv_gcc_attribute_ms_abi)
(gcry_cv_gcc_attribute_sysv_abi, gcry_cv_gcc_default_abi_is_ms_abi)
(gcry_cv_gcc_default_abi_is_sysv_abi)
(gcry_cv_gcc_win64_platform_as_ok): New checks.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Add W64 support for mpi amd64 assembly

* acinclude.m4 (GNUPG_SYS_SYMBOL_UNDERSCORE): Set
'ac_cv_sys_symbol_underscore=no' on MingW-W64.
* mpi/amd64/func_abi.h: New.
* mpi/amd64/mpih-add1.S (_gcry_mpih_add_n): Add FUNC_ENTRY and FUNC_EXIT.
* mpi/amd64/mpih-lshift.S (_gcry_mpih_lshift): Ditto.
* mpi/amd64/mpih-mul1.S (_gcry_mpih_mul_1): Ditto.
* mpi/amd64/mpih-mul2.S (_gcry_mpih_addmul_1): Ditto.
* mpi/amd64/mpih-mul3.S (_gcry_mpih_submul_1): Ditto.
* mpi/amd64/mpih-rshift.S (_gcry_mpih_rshift): Ditto.
* mpi/amd64/mpih-sub1.S (_gcry_mpih_sub_n): Ditto.
* mpi/config.links [host=x86_64-*mingw*]: Enable assembly modules.
[host=x86_64-*-*]: Append mpi/amd64/func_abi.h to mpi/asm-syntax.h.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 1 May 2015 16:15:34 +0000 (19:15 +0300)]
DES: Silence compiler warnings on Windows

* cipher/des.c (working_memcmp): Make pointer arguments 'const void *'.
--

The following warnings were seen on a Windows target build:

des.c: In function 'is_weak_key':
des.c:1019:40: warning: pointer targets in passing argument 1 of 'working_memcmp' differ in signedness [-Wpointer-sign]
       if ( !(cmp_result=working_memcmp(work, weak_keys[middle], 8)) )
                                        ^
des.c:149:1: note: expected 'const char *' but argument is of type 'unsigned char *'
 working_memcmp( const char *a, const char *b, size_t n )
 ^
des.c:1019:46: warning: pointer targets in passing argument 2 of 'working_memcmp' differ in signedness [-Wpointer-sign]
       if ( !(cmp_result=working_memcmp(work, weak_keys[middle], 8)) )
                                              ^
des.c:149:1: note: expected 'const char *' but argument is of type 'unsigned char *'
 working_memcmp( const char *a, const char *b, size_t n )
 ^

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Cast pointers to integers using uintptr_t instead of long

Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Fix rndhw for 64-bit Windows build

* configure.ac: Add sizeof check for 'void *'.
* random/rndhw.c (poll_padlock): Check for SIZEOF_VOID_P == 8
instead of defined(__LP64__).
(RDRAND_LONG): Check for SIZEOF_UNSIGNED_LONG == 8 instead of
defined(__LP64__).
--

__LP64__ is not predefined for 64-bit mingw64-gcc, which caused wrong
assembly code selections. Do selection based on type sizes instead,
to support x86_64, x32 and win64 properly.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
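
The size-based selection described in the rndhw commit above can be
sketched as follows.  This is only an illustration, not the patch
itself: PTR_IS_64BIT, ULONG_IS_64BIT and data_model are hypothetical
names, and the sketch derives the sizes from <stdint.h>/<limits.h>
limits instead of the SIZEOF_VOID_P and SIZEOF_UNSIGNED_LONG values
that configure provides.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for configure's SIZEOF_VOID_P and
   SIZEOF_UNSIGNED_LONG results, derived from standard limits. */
#if UINTPTR_MAX == 0xffffffffffffffffULL
# define PTR_IS_64BIT 1
#else
# define PTR_IS_64BIT 0
#endif

#if ULONG_MAX == 0xffffffffffffffffULL
# define ULONG_IS_64BIT 1
#else
# define ULONG_IS_64BIT 0
#endif

/* Describe the data model the way the size checks see it. */
static const char *
data_model (void)
{
  if (PTR_IS_64BIT && ULONG_IS_64BIT)
    return "LP64";    /* e.g. x86_64 Linux */
  if (PTR_IS_64BIT)
    return "LLP64";   /* e.g. 64-bit Windows: 64-bit pointers, 32-bit long */
  if (ULONG_IS_64BIT)
    return "64-bit long, 32-bit pointers";
  return "ILP32";     /* 32-bit targets, including x32 */
}
```

A test on __LP64__ alone cannot tell the LLP64 case from ILP32, which
is why the two sizes are checked separately here.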
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Prepare random/win32.c fast poll for 64-bit Windows

* random/win32.c (_gcry_rndw32_gather_random_fast) [ADD]: Rename to
ADDINT.
(_gcry_rndw32_gather_random_fast): Add ADDPTR.
(_gcry_rndw32_gather_random_fast): Disable entropy gathering from
GetQueueStatus(QS_ALLEVENTS).
(_gcry_rndw32_gather_random_fast): Change minimumWorkingSetSize and
maximumWorkingSetSize to SIZE_T from DWORD.
(_gcry_rndw32_gather_random_fast): Only add lower 32-bits of
minimumWorkingSetSize and maximumWorkingSetSize to random poll.
(_gcry_rndw32_gather_random_fast) [__WIN64__]: Read TSC directly
using intrinsic.
--

Introduce entropy gatherer changes for the 64-bit Windows platform, as
done in the cryptlib fast poll:
 - Change ADD macro to ADDPTR/ADDINT to handle pointer values. ADDPTR
   discards high 32-bits of 64-bit pointer values.
 - minimum/maximumWorkingSetSize changed to SIZE_T type to avoid stack
   corruption on 64-bit; only low 32-bits are used for entropy.
 - Use __rdtsc() intrinsic on 64-bit (as TSC is always available).

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
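
A minimal sketch of the ADDINT/ADDPTR split described in the fast poll
commit above.  Only the two macro names come from the commit message;
struct poll_buf, add_u32 and the macro signatures are hypothetical and
do not reflect the actual code in random/win32.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size pool standing in for the fast-poll buffer. */
struct poll_buf
{
  unsigned char data[64];
  size_t len;
};

/* Append one 32-bit value; silently drop it if the buffer is full. */
static void
add_u32 (struct poll_buf *b, uint32_t v)
{
  if (b->len + sizeof v <= sizeof b->data)
    {
      memcpy (b->data + b->len, &v, sizeof v);
      b->len += sizeof v;
    }
}

/* ADDINT: add an integer-sized value. */
#define ADDINT(b, v)  add_u32 ((b), (uint32_t)(v))
/* ADDPTR: add a pointer, keeping only its low 32 bits. */
#define ADDPTR(b, p)  add_u32 ((b), (uint32_t)(uintptr_t)(p))
```

Casting through uintptr_t first keeps the pointer-to-integer
conversion well defined on targets where a pointer is wider than
long.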
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Disable GCM and AES-NI assembly implementations for WIN64

* cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): Do not enable when
__WIN64__ defined.
* cipher/rijndael-internal.h (USE_AESNI): Ditto.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Wed, 29 Apr 2015 15:18:07 +0000 (18:18 +0300)]
Disable building mpi assembly routines on WIN64

* mpi/config.links: Disable assembly for host 'x86_64-*mingw32*'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 1 May 2015 16:07:07 +0000 (19:07 +0300)]
Fix packed attribute check for Windows targets

* configure.ac (gcry_cv_gcc_attribute_packed): Move 'long b' to its
own packed structure.
--

Change packed attribute test so that it works with both MS ABI and SYSV ABI.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 1 May 2015 15:50:34 +0000 (18:50 +0300)]
Fix tail handling in buf_xor_1

* cipher/bufhelp.h (buf_xor_1): Increment source pointer at tail
handling.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
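
To illustrate the kind of tail-handling bug fixed in buf_xor_1 above,
here is a sketch of an in-place XOR helper (dst ^= src).  The name
xor_into and the memcpy-based word loop are assumptions made for this
example; the real helper in cipher/bufhelp.h is written differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR LEN bytes of SRC into DST (dst ^= src). */
static void
xor_into (void *dst_v, const void *src_v, size_t len)
{
  unsigned char *dst = dst_v;
  const unsigned char *src = src_v;

  /* Bulk part: one machine word at a time, via memcpy to stay
     alignment-safe. */
  while (len >= sizeof (uintptr_t))
    {
      uintptr_t d, s;

      memcpy (&d, dst, sizeof d);
      memcpy (&s, src, sizeof s);
      d ^= s;
      memcpy (dst, &d, sizeof d);
      dst += sizeof d;
      src += sizeof s;
      len -= sizeof d;
    }

  /* Tail: both pointers must advance; forgetting src++ here would XOR
     every tail byte with the same source byte. */
  for (; len; len--)
    *dst++ ^= *src++;
}
```

A length that is not a multiple of the word size, such as 11, is what
exercises the tail loop.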
Jussi Kivilinna [Fri, 1 May 2015 12:03:38 +0000 (15:03 +0300)]
Add --disable-hwf for basic tests

* tests/basic.c (main): Add handling for '--disable-hwf'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 1 May 2015 11:55:58 +0000 (14:55 +0300)]
Use more odd chunk sizes for check_one_md

* tests/basic.c (check_one_md): Make chunk size vary oddly, instead
of using a fixed length of 1000 bytes.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 1 May 2015 11:33:29 +0000 (14:33 +0300)]
Enable more modes in basic ciphers test

* src/gcrypt.h.in (GCRY_OCB_BLOCK_LEN): New.
* tests/basic.c (check_one_cipher_core_reset): New.
(check_one_cipher_core): Use check_one_cipher_core_reset in place of
gcry_cipher_reset.
(check_ciphers): Add CCM and OCB modes for block cipher tests.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Fri, 1 May 2015 11:32:36 +0000 (14:32 +0300)]
Fix resetting cipher in OCB mode

* cipher/cipher.c (cipher_reset): Setup default taglen for OCB after
clearing state.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Thu, 30 Apr 2015 13:57:57 +0000 (16:57 +0300)]
Fix buggy RC4 AMD64 assembly and add test to notice similar issues

* cipher/arcfour-amd64.S (_gcry_arcfour_amd64): Fix swapped store of
'x' and 'y'.
* tests/basic.c (get_algo_mode_blklen): New.
(check_one_cipher_core): Add new tests for split buffer input on
encryption and decryption.
--

Reported-by: Dima Kukulniak <dima.ky@gmail.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
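
Why a split-buffer test catches a swapped store of 'x' and 'y' can be
seen from a plain C sketch of RC4.  This is a textbook formulation
written for this note, not the assembly from cipher/arcfour-amd64.S;
struct rc4, rc4_setkey and rc4_crypt are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal RC4 state: the permutation plus the two running indices. */
struct rc4
{
  uint8_t s[256];
  uint8_t x, y;
};

static void
rc4_setkey (struct rc4 *c, const uint8_t *key, size_t keylen)
{
  unsigned int i, j = 0;

  for (i = 0; i < 256; i++)
    c->s[i] = (uint8_t)i;
  for (i = 0; i < 256; i++)
    {
      uint8_t t;

      j = (j + c->s[i] + key[i % keylen]) & 255;
      t = c->s[i]; c->s[i] = c->s[j]; c->s[j] = t;
    }
  c->x = c->y = 0;
}

static void
rc4_crypt (struct rc4 *c, uint8_t *out, const uint8_t *in, size_t len)
{
  uint8_t x = c->x, y = c->y;

  while (len--)
    {
      uint8_t t;

      x = (uint8_t)(x + 1);
      y = (uint8_t)(y + c->s[x]);
      t = c->s[x]; c->s[x] = c->s[y]; c->s[y] = t;
      *out++ = *in++ ^ c->s[(uint8_t)(c->s[x] + c->s[y])];
    }

  /* Save the indices back in the right slots.  Swapping these two
     stores would go unnoticed in one-shot use but would corrupt the
     keystream of any following call. */
  c->x = x;
  c->y = y;
}
```

Encrypting a message in one call and again in two pieces with a fresh
key setup should give identical output; a mismatch points at state
that is not carried over correctly between calls.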
Jussi Kivilinna [Wed, 22 Apr 2015 17:29:05 +0000 (20:29 +0300)]
Disallow compiler from generating SSE instructions in mixed C+asm source

* cipher/cipher-gcm-intel-pclmul.c [gcc-version >= 4.4]: Add GCC target
pragma to disable compiler use of SSE.
* cipher/rijndael-aesni.c [gcc-version >= 4.4]: Ditto.
* cipher/rijndael-ssse3-amd64.c [gcc-version >= 4.4]: Ditto.
--

These implementations assume that the compiler does not use XMM
registers between assembly blocks.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
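
A hedged sketch of the GCC target pragma mentioned in the commit above.
The commit message suggests the pragma is applied to the whole source
file; this example instead scopes it to one function with
push_options/pop_options, which is a choice made for this note.  The
guard macros and the function sum_bytes are likewise assumptions.

```c
#include <stdint.h>

#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
/* Until pop_options, GCC may not emit SSE instructions itself, so XMM
   registers stay untouched between hand-written asm blocks. */
# pragma GCC push_options
# pragma GCC target("no-sse")
#endif

/* Plain integer code compiles with or without the pragma. */
static uint32_t
sum_bytes (const unsigned char *buf, unsigned int len)
{
  uint32_t sum = 0;

  while (len--)
    sum += *buf++;
  return sum;
}

#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
# pragma GCC pop_options
#endif
```

Code in the affected region must avoid float and double, since on
x86_64 those are passed and returned in SSE registers.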
Jussi Kivilinna [Sat, 18 Apr 2015 14:41:34 +0000 (17:41 +0300)]
Add OCB bulk crypt/auth functions for AES/AES-NI

* cipher/cipher-internal.h (gcry_cipher_handle): Add bulk.ocb_crypt
and bulk.ocb_auth.
(_gcry_cipher_ocb_get_l): New prototype.
* cipher/cipher-ocb.c (get_l): Rename to ...
(_gcry_cipher_ocb_get_l): ... this.
(_gcry_cipher_ocb_authenticate, ocb_crypt): Use bulk function when
available.
* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for AES.
* cipher/rijndael-aesni.c (get_l, aesni_ocb_enc, aes_ocb_dec)
(_gcry_aes_aesni_ocb_crypt, _gcry_aes_aesni_ocb_auth): New.
* cipher/rijndael.c [USE_AESNI] (_gcry_aes_aesni_ocb_crypt)
(_gcry_aes_aesni_ocb_auth): New prototypes.
(_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): New.
* src/cipher.h (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): New
prototypes.
* tests/basic.c (check_ocb_cipher_largebuf): New.
(check_ocb_cipher): Add large buffer encryption/decryption test.
--

Patch adds bulk encryption/decryption/authentication code for AES-NI
accelerated AES.

Benchmark on Intel i5-4570 (3200 MHz, turbo off):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        OCB enc |      2.12 ns/B     449.7 MiB/s      6.79 c/B
        OCB dec |      2.12 ns/B     449.6 MiB/s      6.79 c/B
       OCB auth |      2.07 ns/B     459.9 MiB/s      6.64 c/B

After:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        OCB enc |     0.292 ns/B    3262.5 MiB/s     0.935 c/B
        OCB dec |     0.297 ns/B    3212.2 MiB/s     0.950 c/B
       OCB auth |     0.260 ns/B    3666.1 MiB/s     0.832 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Wed, 15 Apr 2015 10:34:38 +0000 (12:34 +0200)]
tests: Add option to time the S2K function.

* tests/t-kdf.c: Include stopwatch.h.
(dummy_consumer): New.
(bench_s2k): New.
(main): Add option parser and option --s2k.
--

For example:

  $ ./t-kdf --s2k 17659904
  88.0ms
  $ ./t-kdf --s2k 65536
  0.3ms

This test is similar to the code used by gpg-agent to calibrate the
S2K count.

Werner Koch [Wed, 15 Apr 2015 10:30:50 +0000 (12:30 +0200)]
tests: Improve stopwatch.h

* tests/stopwatch.h (elapsed_time): Add arg divisor.

Werner Koch [Mon, 13 Apr 2015 09:48:33 +0000 (11:48 +0200)]
mpi: Fix gcry_mpi_copy for NULL opaque data.

* mpi/mpiutil.c (_gcry_mpi_copy): Copy opaque only if needed.
--

gcry_mpi_set_opaque allows storing NULL as opaque data.  Thus we also
need to take care when copying such data.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sun, 12 Apr 2015 17:50:49 +0000 (19:50 +0200)]
Add git url to AUTHORS

--

Jussi Kivilinna [Sat, 21 Mar 2015 11:01:38 +0000 (13:01 +0200)]
wipememory: use one-byte aligned type for unaligned memory accesses

* src/g10lib.h (fast_wipememory2_unaligned_head): Enable unaligned
access only when HAVE_GCC_ATTRIBUTE_PACKED and
HAVE_GCC_ATTRIBUTE_ALIGNED defined.
(fast_wipememory_t): New.
(fast_wipememory2): Use 'fast_wipememory_t'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Mar 2015 11:01:38 +0000 (13:01 +0200)]
bufhelp: use one-byte aligned type for unaligned memory accesses

* cipher/bufhelp.h (BUFHELP_FAST_UNALIGNED_ACCESS): Enable only when
HAVE_GCC_ATTRIBUTE_PACKED and HAVE_GCC_ATTRIBUTE_ALIGNED are defined.
(bufhelp_int_t): New type.
(buf_cpy, buf_xor, buf_xor_1, buf_xor_2dst, buf_xor_n_copy_2): Use
'bufhelp_int_t'.
[BUFHELP_FAST_UNALIGNED_ACCESS] (bufhelp_u32_t, bufhelp_u64_t): New.
[BUFHELP_FAST_UNALIGNED_ACCESS] (buf_get_be32, buf_get_le32)
(buf_put_be32, buf_put_le32, buf_get_be64, buf_get_le64)
(buf_put_be64, buf_put_le64): Use 'bufhelp_uXX_t'.
* configure.ac (gcry_cv_gcc_attribute_packed): New.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
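
The "one-byte aligned type" idea from the two commits above can be
sketched like this.  The type and function names are hypothetical, the
attribute syntax is the GCC/Clang one, and the may_alias attribute is
an addition made for this example that the commit message does not
mention.

```c
#include <stdint.h>

/* Hypothetical one-byte-aligned wrapper around a 32-bit word.  Because
   the struct is packed and aligned(1), the compiler may not assume
   natural alignment when accessing it. */
typedef struct
{
  uint32_t a;
} __attribute__ ((packed, aligned (1), may_alias)) unaligned_u32_t;

/* Read a host-endian 32-bit value from any address. */
static uint32_t
load_u32 (const void *p)
{
  return ((const unaligned_u32_t *)p)->a;
}

/* Write a host-endian 32-bit value to any address. */
static void
store_u32 (void *p, uint32_t v)
{
  ((unaligned_u32_t *)p)->a = v;
}
```

On targets that allow unaligned loads this typically compiles to a
single move, while stricter targets get byte-wise accesses from the
compiler instead of a fault at run time.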
Jussi Kivilinna [Sat, 21 Mar 2015 11:01:38 +0000 (13:01 +0200)]
tests/bench-slope: fix memory-leak and use-after-free bugs

* tests/bench-slope.c (do_slope_benchmark): Free 'measurements' at end.
(bench_mac_init): Move 'key' free at end of function.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Thu, 19 Mar 2015 09:43:55 +0000 (10:43 +0100)]
Fix two pedantic warnings.

* src/gcrypt.h.in (gcry_mpi_flag, gcry_mac_algos): Remove trailing
comma.
--

Reported-by: Opal Raava <opalraava@hushmail.com>
Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 16 Mar 2015 10:50:23 +0000 (11:50 +0100)]
Use well defined type instead of size_t in secmem.c

* src/secmem.c (ptr_into_pool_p): Replace size_t by uintptr_t.
--

This is more or less cosmetic.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 16 Mar 2015 10:32:07 +0000 (11:32 +0100)]
Make uintptr_t globally available.

* cipher/bufhelp.h: Move include for uintptr_t to ...
* src/types.h: here.  Check that config.h has been included.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 16 Mar 2015 08:32:44 +0000 (09:32 +0100)]
Indentation fix.

--

Werner Koch [Mon, 16 Mar 2015 08:29:27 +0000 (09:29 +0100)]
mpi: Remove useless condition.

* mpi/mpi-pow.c: Remove condition rp==mp.
--

MP has already been allocated and thus can't match RP.  The following
assert would have been triggered anyway due to the prior allocation.

Detected by Stack 0.3.

Werner Koch [Mon, 16 Mar 2015 08:01:24 +0000 (09:01 +0100)]
cipher: Remove useless NULL check.

* cipher/hash-common.c (_gcry_md_block_write): Remove NULL check for
hd->buf.
--

HD->BUF is not allocated but part of the struct.  HD has already been
dereferenced twice, thus the check does not make sense.  Detected by
Stack 0.3:

  bug: anti-simplify
  model: |
    %cmp4 = icmp eq i8* %arraydecay, null, !dbg !29
    -->  false
  stack:
    - /home/wk/s/libgcrypt/cipher/hash-common.c:114:0
  ncore: 1
  core:
    - /home/wk/s/libgcrypt/cipher/hash-common.c:108:0
      - null pointer dereference

Signed-off-by: Werner Koch <wk@gnupg.org>
Jussi Kivilinna [Sat, 28 Feb 2015 16:04:34 +0000 (18:04 +0200)]
Fix in-place encryption for OCB mode

* cipher/cipher-ocb.c (ocb_checksum): New.
(ocb_crypt): Move checksum calculation outside main crypt loop, do
checksum calculation for encryption before inbuf is overwritten.
* tests/basic.c (check_ocb_cipher): Rename to ...
(do_check_ocb_cipher): ... to this and add argument for testing
in-place encryption/decryption.
(check_ocb_cipher): New.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
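
The ordering issue behind the in-place OCB fix above can be shown with
a toy example: the checksum covers the plaintext, so for encryption it
has to be taken before the input buffer is overwritten.  Everything in
this sketch is hypothetical, and toy_crypt is a placeholder transform,
not OCB or any real cipher.

```c
#include <stddef.h>

/* XOR-fold LEN bytes of plaintext into a one-byte toy checksum. */
static unsigned char
toy_checksum (unsigned char sum, const unsigned char *buf, size_t len)
{
  while (len--)
    sum ^= *buf++;
  return sum;
}

/* Placeholder transform (NOT a cipher): XOR every byte with 0x5a. */
static void
toy_crypt (unsigned char *out, const unsigned char *in, size_t len)
{
  while (len--)
    *out++ = (unsigned char)(*in++ ^ 0x5a);
}

/* Process LEN bytes and return the checksum of the plaintext.
   Safe for out == in. */
static unsigned char
crypt_with_checksum (int encrypt, unsigned char *out,
                     const unsigned char *in, size_t len)
{
  unsigned char sum = 0;

  if (encrypt)
    sum = toy_checksum (sum, in, len);   /* plaintext is the input */
  toy_crypt (out, in, len);
  if (!encrypt)
    sum = toy_checksum (sum, out, len);  /* plaintext is the output */
  return sum;
}
```

Computing the checksum after toy_crypt in the encrypt case would read
ciphertext whenever out and in are the same buffer.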
NIIBE Yutaka [Fri, 27 Feb 2015 08:24:49 +0000 (17:24 +0900)]
tests: fix t-sexp.c.

* tests/t-sexp.c (bug_1594): Free N and PUBKEY.

NIIBE Yutaka [Thu, 26 Feb 2015 12:07:01 +0000 (21:07 +0900)]
mpi: Avoid data-dependent timing variations in mpi_powm.

* mpi/mpi-pow.c (mpi_powm): Access all data in the table by
mpi_set_cond.

--

Access to the precomputed table was indexed by a portion of EXPO,
which could be exploited by a side-channel attack.  This change fixes
this particular data-dependent access pattern.

Cherry-picked from commit 5e72b6c76ebee720f69b8a5c212f52d38eb50287
in LIBGCRYPT-1-6-BRANCH.
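
A sketch of the access pattern described above: every table entry is
read and the wanted one is picked with a mask, so the addresses touched
do not depend on the secret index.  The names, the table dimensions and
the flat table layout are hypothetical; the real change uses
mpi_set_cond on MPI values.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_ENTRIES 8   /* hypothetical table size */
#define ENTRY_LIMBS   4   /* hypothetical entry width in limbs */

/* Copy entry IDX of the flat TABLE into OUT while reading every entry,
   so the memory access pattern does not depend on IDX. */
static void
table_select (uint64_t *out, const uint64_t *table, unsigned int idx)
{
  size_t i, k;

  for (k = 0; k < ENTRY_LIMBS; k++)
    out[k] = 0;

  for (i = 0; i < TABLE_ENTRIES; i++)
    {
      /* mask is all ones when i == idx, all zeros otherwise. */
      uint64_t diff = (uint64_t)i ^ (uint64_t)idx;
      uint64_t mask = ((diff | (0 - diff)) >> 63) - 1;

      for (k = 0; k < ENTRY_LIMBS; k++)
        out[k] |= table[i * ENTRY_LIMBS + k] & mask;
    }
}
```

This only illustrates the idea; whether a given compiler keeps the
code branch-free has to be checked on the generated assembly.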

NIIBE Yutaka [Wed, 11 Feb 2015 13:30:02 +0000 (22:30 +0900)]
mpi: Revise mpi_powm.

* mpi/mpi-pow.c (_gcry_mpi_powm): Rename the table to PRECOMP.

--

The precomputed table was named b_2i3, which stands for BASE^(2*I+3).
That name is too cryptic, so the table is renamed.  Besides, the old
layout forced us to distinguish the case of I==0, which was not good.
Since it is OK to increase the size of the table by one, it now holds
BASE^(2*I+1).

Werner Koch [Mon, 23 Feb 2015 10:39:58 +0000 (11:39 +0100)]
cipher: Use ciphertext blinding for Elgamal decryption.

* cipher/elgamal.c (USE_BLINDING): New.
(decrypt): Rewrite to use ciphertext blinding.
--

CVE-id: CVE-2014-3591

As a countermeasure to new side-channel attacks on sliding-window
exponentiation we blind the ciphertext for Elgamal decryption.  This
is similar to what we are doing with RSA.  This patch is a backport of
the GnuPG 1.4 commit ff53cf06e966dce0daba5f2c84e03ab9db2c3c8b.

Unfortunately, the performance impact of Elgamal blinding is quite
noticeable (i5-2410M CPU @ 2.30GHz TP 220):

Before (without blinding):

  Algorithm         generate  100*priv  100*public
  ------------------------------------------------
  ELG 1024 bit             -     100ms        90ms
  ELG 2048 bit             -     330ms       350ms
  ELG 3072 bit             -     660ms       790ms

After (with blinding):

  Algorithm         generate  100*priv  100*public
  ------------------------------------------------
  ELG 1024 bit             -     150ms        90ms
  ELG 2048 bit             -     520ms       360ms
  ELG 3072 bit             -    1100ms       800ms

Signed-off-by: Werner Koch <wk@gnupg.org>
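
A toy illustration of ciphertext blinding for Elgamal decryption, using
64-bit integers and a small prime.  The commit message does not spell
out the exact formula, so the identity used here,
c1^-x = r^x * (c1*r)^-x (mod p), is an assumption about one possible
way to blind; all function names are hypothetical and none of this is
constant-time.

```c
#include <stdint.h>

/* (a * b) mod m for small moduli (m < 2^32). */
static uint64_t
mulm (uint64_t a, uint64_t b, uint64_t m)
{
  return (a * b) % m;
}

/* Square-and-multiply b^e mod m.  Toy code, not constant-time. */
static uint64_t
powm (uint64_t b, uint64_t e, uint64_t m)
{
  uint64_t r = 1;

  for (b %= m; e; e >>= 1)
    {
      if (e & 1)
        r = mulm (r, b, m);
      b = mulm (b, b, m);
    }
  return r;
}

/* Inverse modulo a prime P via Fermat's little theorem. */
static uint64_t
invm (uint64_t a, uint64_t p)
{
  return powm (a, p - 2, p);
}

/* Plain decryption: m = c2 * (c1^x)^-1 mod p. */
static uint64_t
elg_decrypt (uint64_t c1, uint64_t c2, uint64_t x, uint64_t p)
{
  return mulm (c2, invm (powm (c1, x, p), p), p);
}

/* Blinded decryption: the secret exponent X is only applied to c1*r
   and to r, never to the attacker-chosen c1 itself.  R must be random,
   nonzero mod p, and fresh for every call. */
static uint64_t
elg_decrypt_blinded (uint64_t c1, uint64_t c2, uint64_t x, uint64_t p,
                     uint64_t r)
{
  uint64_t t1 = powm (r, x, p);                          /* r^x */
  uint64_t t2 = invm (powm (mulm (c1, r, p), x, p), p);  /* (c1*r)^-x */

  return mulm (c2, mulm (t1, t2, p), p);
}
```

The blinded variant needs one more modular exponentiation with the
secret exponent than the plain one, which fits the slowdown in the
100*priv column above.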
NIIBE Yutaka [Wed, 11 Feb 2015 12:42:22 +0000 (21:42 +0900)]
mpi: Add mpi_set_cond.

* mpi/mpiutil.c (_gcry_mpi_set_cond): New.
(_gcry_mpi_swap_cond): Fix types.
* src/mpi.h (mpi_set_cond): New.

Werner Koch [Fri, 30 Jan 2015 15:58:02 +0000 (16:58 +0100)]
w32: Use -static-libgcc to avoid linking to libgcc_s_sjlj-1.dll.

* src/Makefile.am (extra_ltoptions): New.
(libgcrypt_la_LDFLAGS): Use it.
--

Since gcc 4.8 there is a regression in that plain C programs may link
to libgcc_s.a which has a dependency on libgcc_s_sjlj.dll.  This is
for example triggered by using long long arithmetic on 32-bit
Windows (e.g. symbol __udivdi3).

As usual the gcc maintainers don't care about backward compatibility
and declare that as some kind of compatibility fix and not as
regression from 4.7 and all earlier versions.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Wed, 28 Jan 2015 14:13:50 +0000 (15:13 +0100)]
Fix building of GOST s-boxes when cross-compiling.

* cipher/Makefile.am (gost-s-box): Use CC_FOR_BUILD.
(noinst_PROGRAMS): Remove.
(EXTRA_DIST): New.
(CLEANFILES): New.

Signed-off-by: Werner Koch <wk@gnupg.org>
Jussi Kivilinna [Tue, 20 Jan 2015 16:54:13 +0000 (18:54 +0200)]
rijndael: fix wrong ifdef for SSSE3 setkey

* cipher/rijndael.c (do_setkey): Use USE_SSSE3 instead of USE_AESNI
around SSSE3 setkey selection.
--

Reported-by: Richard H Lee <ricardohenrylee@gmail.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Fri, 16 Jan 2015 13:55:03 +0000 (14:55 +0100)]
Add OCB cipher mode

* cipher/cipher-ocb.c: New.
* cipher/Makefile.am (libcipher_la_SOURCES): Add cipher-ocb.c
* cipher/cipher-internal.h (OCB_BLOCK_LEN, OCB_L_TABLE_SIZE): New.
(gcry_cipher_handle): Add fields marks.finalize and u_mode.ocb.
* cipher/cipher.c (_gcry_cipher_open_internal): Add OCB mode.
(_gcry_cipher_open_internal): Setup default taglen of OCB.
(cipher_reset): Clear OCB specific data.
(cipher_encrypt, cipher_decrypt, _gcry_cipher_authenticate)
(_gcry_cipher_gettag, _gcry_cipher_checktag): Call OCB functions.
(_gcry_cipher_setiv): Add OCB specific nonce setting.
(_gcry_cipher_ctl): Add GCRYCTL_FINALIZE and GCRYCTL_SET_TAGLEN

* src/gcrypt.h.in (GCRYCTL_SET_TAGLEN): New.
(gcry_cipher_final): New.

* cipher/bufhelp.h (buf_xor_1): New.

* tests/basic.c (hex2buffer): New.
(check_ocb_cipher): New.
(main): Call it here.  Add option --cipher-modes.
* tests/bench-slope.c (bench_aead_encrypt_do_bench): Call
gcry_cipher_final.
(bench_aead_decrypt_do_bench): Ditto.
(bench_aead_authenticate_do_bench): Ditto.  Check error code.
(bench_ocb_encrypt_do_bench): New.
(bench_ocb_decrypt_do_bench): New.
(bench_ocb_authenticate_do_bench): New.
(ocb_encrypt_ops): New.
(ocb_decrypt_ops): New.
(ocb_authenticate_ops): New.
(cipher_modes): Add them.
(cipher_bench_one): Skip wrong block length for OCB.
* tests/benchmark.c (cipher_bench): Add field noncelen to MODES.  Add
OCB support.

--

See the comments on top of cipher/cipher-ocb.c for the patent status
of the OCB mode.

The implementation has not yet been optimized and as such is not faster
than the other AEAD modes.  A first candidate for optimization is the
double_block function.  Large improvements can be expected by writing
an AES ECB function to work on multiple blocks.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Thu, 15 Jan 2015 09:04:43 +0000 (10:04 +0100)]
Add functions to count trailing zero bits in a word.

* cipher/bithelp.h (_gcry_ctz, _gcry_ctz64): New.
* configure.ac (HAVE_BUILTIN_CTZ): Add new test.
--

Note that these functions return the number of bits in the word when
passing 0.

Signed-off-by: Werner Koch <wk@gnupg.org>
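
The note about passing 0 can be illustrated with a small sketch.  The
function ctz_uint is a hypothetical stand-in and not the _gcry_ctz
from cipher/bithelp.h; it only shows a zero-input convention layered on
top of GCC's __builtin_ctz, whose result is undefined for 0.

```c
#include <limits.h>

/* Count trailing zero bits; returns the word's bit width for x == 0,
   matching the behaviour described above. */
static unsigned int
ctz_uint (unsigned int x)
{
  if (!x)
    return (unsigned int)(sizeof x * CHAR_BIT);
#if defined(__GNUC__)
  /* __builtin_ctz is undefined for 0, hence the explicit check above. */
  return (unsigned int)__builtin_ctz (x);
#else
  {
    unsigned int n = 0;

    /* Portable fallback: shift until the lowest set bit is reached. */
    while (!(x & 1))
      {
        x >>= 1;
        n++;
      }
    return n;
  }
#endif
}
```

Returning the bit width for zero gives callers a value that no nonzero
input can produce.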
Werner Koch [Thu, 15 Jan 2015 09:02:28 +0000 (10:02 +0100)]
Re-indent types.h for easier reading.

--

Werner Koch [Thu, 8 Jan 2015 08:07:09 +0000 (09:07 +0100)]
cipher: Prepare for OCB mode.

* src/gcrypt.h.in (GCRY_CIPHER_MODE_OCB): New.
--

This is merely a claim that I am working on OCB mode.

Werner Koch [Tue, 6 Jan 2015 19:30:37 +0000 (20:30 +0100)]
Make make distcheck work again.

* Makefile.am (DISTCHECK_CONFIGURE_FLAGS): Remove --enable-ciphers.
* cipher/Makefile.am (DISTCLEANFILES): Add gost-sb.h.

Werner Koch [Tue, 6 Jan 2015 17:54:24 +0000 (18:54 +0100)]
Remove the old Manifest files

--

The Manifest files were part of an experiment a long time ago to
implement source level integrity.  They have not been maintained for
more than a decade, and with the advent of git they are superfluous
anyway.

Dmitry Eremin-Solenikov [Sun, 28 Dec 2014 09:15:33 +0000 (12:15 +0300)]
stribog: Reduce table size to the needed one.

* cipher/stribog.c (C16): Avoid allocating superfluous space.

--

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Dmitry Eremin-Solenikov [Sun, 28 Dec 2014 09:05:43 +0000 (12:05 +0300)]
gostr3411-94: Fix the iteration count for length filling loop.

* cipher/gostr3411-94.c (gost3411_final): Fix loop
--

The maximum iteration count for filling the l (bit length) array was
incorrectly set to 32 (missed that in the u8->u32 refactoring).  This
did not result in stack corruption, since the nblocks variable would be
exhausted earlier compared to 8 32-bit values (the size of the array).

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Werner Koch [Tue, 6 Jan 2015 13:51:39 +0000 (14:51 +0100)]
build: Add a commit-msg git-hook script.

--

This is the same script as used by GnuPG.  It makes sure that lines
are not too long and checks some other basic things.  ./autogen.sh
installs it.

Werner Koch [Mon, 5 Jan 2015 18:38:29 +0000 (19:38 +0100)]
random: Silence warning under NetBSD using rndunix

* random/rndunix.c (STDERR_FILENO): Define if needed.
(start_gatherer): Re-open standard descriptors.  Fix an
unsigned/signed pointer warning.
--

GnuPG-bug-id: 1702

Werner Koch [Mon, 5 Jan 2015 17:58:39 +0000 (18:58 +0100)]
primegen: Fix memory leak for invalid call sequences.

* cipher/primegen.c (prime_generate_internal): Refactor generator code
to not leak memory for non-implemented feature.
(_gcry_prime_group_generator): Refactor to not leak memory for invalid
args.  Also make sure that R_G is set as soon as possible.
--

GnuPG-bug-id: 1705
Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 5 Jan 2015 16:47:26 +0000 (17:47 +0100)]
doc: Update yat2m to current upstream version (GnuPG).

Werner Koch [Mon, 5 Jan 2015 16:46:05 +0000 (17:46 +0100)]
build: Require automake 1.14.

* configure.ac (AM_INIT_AUTOMAKE): Add serial-tests.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 5 Jan 2015 16:16:04 +0000 (17:16 +0100)]
cipher: Add the original PD notice to rijndael-ssse3-amd64.c

--

Werner Koch [Mon, 5 Jan 2015 16:04:10 +0000 (17:04 +0100)]
Replace camel case of internal scrypt functions.

* cipher/scrypt.c (_salsa20_core): Rename to salsa20_core.  Change
callers.
(_scryptBlockMix): Rename to scrypt_block_mix.  Change callers.
(_scryptROMix): Rename to scrypt_ro_mix. Change callers.
--

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sun, 28 Dec 2014 13:26:48 +0000 (14:26 +0100)]
doc: State that gcry_md_write et al may be used after md_read.

--

Werner Koch [Fri, 19 Dec 2014 08:11:08 +0000 (09:11 +0100)]
doc: typo fix

--
GnuPG-bug-id: 1589

Jussi Kivilinna [Fri, 2 Jan 2015 17:07:24 +0000 (19:07 +0200)]
rmd160: restore native-endian store in _gcry_rmd160_mixblock

* cipher/rmd160.c (_gcry_rmd160_mixblock): Store result to buffer in
native-endianness.
--

Commit 4515315f61fbf79413e150fbd1d5f5a2435f2bc5 unintentionally changed
this native-endian store to little-endian.

Reported-by: Yuriy Kaminskiy <yumkam@gmail.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 27 Dec 2014 10:37:16 +0000 (12:37 +0200)]
Add Intel SSSE3 based vector permutation AES implementation

* cipher/Makefile.am: Add 'rijndael-ssse3-amd64.c'.
* cipher/rijndael-internal.h (USE_SSSE3): New.
(RIJNDAEL_context_s) [USE_SSSE3]: Add 'use_ssse3'.
* cipher/rijndael-ssse3-amd64.c: New.
* cipher/rijndael.c [USE_SSSE3] (_gcry_aes_ssse3_do_setkey)
(_gcry_aes_ssse3_prepare_decryption, _gcry_aes_ssse3_encrypt)
(_gcry_aes_ssse3_decrypt, _gcry_aes_ssse3_cfb_enc)
(_gcry_aes_ssse3_cbc_enc, _gcry_aes_ssse3_ctr_enc)
(_gcry_aes_ssse3_cfb_dec, _gcry_aes_ssse3_cbc_dec): New.
(do_setkey): Add HWF check for SSSE3 and setup for SSSE3
implementation.
(prepare_decryption, _gcry_aes_cfb_enc, _gcry_aes_cbc_enc)
(_gcry_aes_ctr_enc, _gcry_aes_cfb_dec, _gcry_aes_cbc_dec): Add
selection for SSSE3 implementation.
* configure.ac [host=x86_64]: Add 'rijndael-ssse3-amd64.lo'.
--

This patch adds the "AES with vector permutations" implementation by
Mike Hamburg. The public-domain source code is available at:
  http://crypto.stanford.edu/vpaes/

Benchmark on Intel Core2 T8100 (2.1 GHz, no turbo):

Old (AMD64 asm):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      8.79 ns/B     108.5 MiB/s     18.46 c/B
        ECB dec |      9.07 ns/B     105.1 MiB/s     19.05 c/B
        CBC enc |      7.77 ns/B     122.7 MiB/s     16.33 c/B
        CBC dec |      7.74 ns/B     123.2 MiB/s     16.26 c/B
        CFB enc |      7.88 ns/B     121.0 MiB/s     16.54 c/B
        CFB dec |      7.56 ns/B     126.1 MiB/s     15.88 c/B
        OFB enc |      9.02 ns/B     105.8 MiB/s     18.94 c/B
        OFB dec |      9.07 ns/B     105.1 MiB/s     19.05 c/B
        CTR enc |      7.80 ns/B     122.2 MiB/s     16.38 c/B
        CTR dec |      7.81 ns/B     122.2 MiB/s     16.39 c/B

New (ssse3):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      5.77 ns/B     165.2 MiB/s     12.13 c/B
        ECB dec |      7.13 ns/B     133.7 MiB/s     14.98 c/B
        CBC enc |      5.27 ns/B     181.0 MiB/s     11.06 c/B
        CBC dec |      6.39 ns/B     149.3 MiB/s     13.42 c/B
        CFB enc |      5.27 ns/B     180.9 MiB/s     11.07 c/B
        CFB dec |      5.28 ns/B     180.7 MiB/s     11.08 c/B
        OFB enc |      6.11 ns/B     156.1 MiB/s     12.83 c/B
        OFB dec |      6.13 ns/B     155.5 MiB/s     12.88 c/B
        CTR enc |      5.26 ns/B     181.5 MiB/s     11.04 c/B
        CTR dec |      5.24 ns/B     182.0 MiB/s     11.00 c/B

Benchmark on Intel i5-2450M (2.5 GHz, no turbo, aes-ni disabled):

Old (AMD64 asm):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      8.06 ns/B     118.3 MiB/s     20.15 c/B
        ECB dec |      8.21 ns/B     116.1 MiB/s     20.53 c/B
        CBC enc |      7.88 ns/B     121.1 MiB/s     19.69 c/B
        CBC dec |      7.57 ns/B     126.0 MiB/s     18.92 c/B
        CFB enc |      7.87 ns/B     121.2 MiB/s     19.67 c/B
        CFB dec |      7.56 ns/B     126.2 MiB/s     18.89 c/B
        OFB enc |      8.27 ns/B     115.3 MiB/s     20.67 c/B
        OFB dec |      8.28 ns/B     115.1 MiB/s     20.71 c/B
        CTR enc |      8.02 ns/B     119.0 MiB/s     20.04 c/B
        CTR dec |      8.02 ns/B     118.9 MiB/s     20.05 c/B

New (ssse3):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      4.03 ns/B     236.6 MiB/s     10.07 c/B
        ECB dec |      5.28 ns/B     180.8 MiB/s     13.19 c/B
        CBC enc |      3.77 ns/B     252.7 MiB/s      9.43 c/B
        CBC dec |      4.69 ns/B     203.3 MiB/s     11.73 c/B
        CFB enc |      3.75 ns/B     254.3 MiB/s      9.37 c/B
        CFB dec |      3.69 ns/B     258.6 MiB/s      9.22 c/B
        OFB enc |      4.17 ns/B     228.7 MiB/s     10.43 c/B
        OFB dec |      4.17 ns/B     228.7 MiB/s     10.42 c/B
        CTR enc |      3.72 ns/B     256.5 MiB/s      9.30 c/B
        CTR dec |      3.72 ns/B     256.1 MiB/s      9.31 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agorandom-csprng: fix compiler warnings on ARM
Jussi Kivilinna [Tue, 23 Dec 2014 11:33:12 +0000 (13:33 +0200)]
random-csprng: fix compiler warnings on ARM

* random/random-csprng.c (_gcry_rngcsprng_update_seed_file)
(read_pool): Cast keypool and rndpool to 'unsigned long *' through
'void *'.
--

Patch fixes 'cast increases required alignment' warnings seen on GCC:

random-csprng.c: In function '_gcry_rngcsprng_update_seed_file':
random-csprng.c:867:15: warning: cast increases required alignment of target type [-Wcast-align]
   for (i=0,dp=(unsigned long*)keypool, sp=(unsigned long*)rndpool;
               ^
random-csprng.c:867:43: warning: cast increases required alignment of target type [-Wcast-align]
   for (i=0,dp=(unsigned long*)keypool, sp=(unsigned long*)rndpool;
                                           ^
random-csprng.c: In function 'read_pool':
random-csprng.c:1023:14: warning: cast increases required alignment of target type [-Wcast-align]
   for(i=0,dp=(unsigned long*)keypool, sp=(unsigned long*)rndpool;
              ^
random-csprng.c:1023:42: warning: cast increases required alignment of target type [-Wcast-align]
   for(i=0,dp=(unsigned long*)keypool, sp=(unsigned long*)rndpool;
                                          ^
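
A minimal sketch of the cast-through-'void *' pattern, assuming a
hypothetical helper xor_words (not a function from random-csprng.c).
The intermediate cast only silences -Wcast-align; the caller still has
to guarantee that the buffers really are aligned for 'unsigned long'.

```c
#include <stddef.h>

/* Hypothetical helper: XOR nwords words of src into dst.
   Both buffers must actually be aligned for unsigned long; the cast
   through void * documents that promise and avoids the warning. */
static void xor_words(unsigned char *dst, const unsigned char *src,
                      size_t nwords)
{
  unsigned long *dp = (unsigned long *)(void *)dst;
  const unsigned long *sp = (const unsigned long *)(const void *)src;
  size_t i;

  for (i = 0; i < nwords; i++)
    dp[i] ^= sp[i];
}
```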

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoscrypt: fix compiler warnings on ARM
Jussi Kivilinna [Tue, 23 Dec 2014 11:31:58 +0000 (13:31 +0200)]
scrypt: fix compiler warnings on ARM

* cipher/scrypt.c (_scryptBlockMix): Cast X to 'u32 *' through 'void *'.
--

Patch fixes 'cast increases required alignment' warnings seen on GCC:

scrypt.c: In function '_scryptBlockMix':
scrypt.c:145:22: warning: cast increases required alignment of target type [-Wcast-align]
       _salsa20_core ((u32*)X, (u32*)X, 8);
                      ^
scrypt.c:145:31: warning: cast increases required alignment of target type [-Wcast-align]
       _salsa20_core ((u32*)X, (u32*)X, 8);
                               ^

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agosecmem: fix compiler warnings on ARM
Jussi Kivilinna [Tue, 23 Dec 2014 11:31:09 +0000 (13:31 +0200)]
secmem: fix compiler warnings on ARM

* src/secmem.c (ADDR_TO_BLOCK, mb_get_next, mb_get_new): Cast pointer
from 'char *' to 'memblock_t *' through 'void *'.
(MB_WIPE_OUT): Remove unneeded cast to 'memblock_t *'.
--

Patch fixes 'cast increases required alignment' warnings seen on GCC:

secmem.c: In function 'mb_get_next':
secmem.c:140:13: warning: cast increases required alignment of target type [-Wcast-align]
   mb_next = (memblock_t *) ((char *) mb + BLOCK_HEAD_SIZE + mb->size);
             ^
secmem.c: In function 'mb_get_new':
secmem.c:208:17: warning: cast increases required alignment of target type [-Wcast-align]
      mb_split = (memblock_t *) (((char *) mb) + BLOCK_HEAD_SIZE + size);
                 ^
secmem.c: In function '_gcry_secmem_free_internal':
secmem.c:101:3: warning: cast increases required alignment of target type [-Wcast-align]
   (memblock_t *) ((char *) addr - BLOCK_HEAD_SIZE)
   ^
secmem.c:603:8: note: in expansion of macro 'ADDR_TO_BLOCK'
   mb = ADDR_TO_BLOCK (a);
        ^
In file included from secmem.c:40:0:
secmem.c:609:16: warning: cast increases required alignment of target type [-Wcast-align]
   wipememory2 ((memblock_t *) ((char *) mb + BLOCK_HEAD_SIZE), (byte), size);
                ^
g10lib.h:309:54: note: in definition of macro 'wipememory2'
               volatile char *_vptr=(volatile char *)(_ptr); \
                                                      ^
secmem.c:611:3: note: in expansion of macro 'MB_WIPE_OUT'
   MB_WIPE_OUT (0xff);
   ^
secmem.c:609:16: warning: cast increases required alignment of target type [-Wcast-align]
   wipememory2 ((memblock_t *) ((char *) mb + BLOCK_HEAD_SIZE), (byte), size);
                ^
g10lib.h:309:54: note: in definition of macro 'wipememory2'
               volatile char *_vptr=(volatile char *)(_ptr); \
                                                      ^
secmem.c:612:3: note: in expansion of macro 'MB_WIPE_OUT'
   MB_WIPE_OUT (0xaa);
   ^
secmem.c:609:16: warning: cast increases required alignment of target type [-Wcast-align]
   wipememory2 ((memblock_t *) ((char *) mb + BLOCK_HEAD_SIZE), (byte), size);
                ^
g10lib.h:309:54: note: in definition of macro 'wipememory2'
               volatile char *_vptr=(volatile char *)(_ptr); \
                                                      ^
secmem.c:613:3: note: in expansion of macro 'MB_WIPE_OUT'
   MB_WIPE_OUT (0x55);
   ^
secmem.c:609:16: warning: cast increases required alignment of target type [-Wcast-align]
   wipememory2 ((memblock_t *) ((char *) mb + BLOCK_HEAD_SIZE), (byte), size);
                ^
g10lib.h:309:54: note: in definition of macro 'wipememory2'
               volatile char *_vptr=(volatile char *)(_ptr); \
                                                      ^
secmem.c:614:3: note: in expansion of macro 'MB_WIPE_OUT'
   MB_WIPE_OUT (0x00);
   ^
secmem.c: In function '_gcry_secmem_realloc':
secmem.c:644:8: warning: cast increases required alignment of target type [-Wcast-align]
   mb = (memblock_t *) ((char *) p - ((size_t) &((memblock_t *) 0)->aligned.c));
        ^
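
For illustration, a sketch of an ADDR_TO_BLOCK-style macro that goes
through 'void *'. The structure below is a simplified stand-in for the
real memblock_t, and BLOCK_HEAD_SIZE is defined here with offsetof as an
assumption; only the shape of the cast is the point.

```c
#include <stddef.h>

/* Simplified stand-in for the secure memory block header. */
typedef struct memblock
{
  unsigned size;
  int flags;
  union { long a; char c[1]; } aligned;  /* start of the user data */
} memblock_t;

#define BLOCK_HEAD_SIZE offsetof (memblock_t, aligned)

/* Step back from the user pointer to the header.  The cast through
   void * tells the compiler the result is known to be aligned. */
#define ADDR_TO_BLOCK(addr) \
  ((memblock_t *)(void *)((char *)(addr) - BLOCK_HEAD_SIZE))
```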

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agohash: fix compiler warning on ARM
Jussi Kivilinna [Tue, 23 Dec 2014 11:01:33 +0000 (13:01 +0200)]
hash: fix compiler warning on ARM

* cipher/md.c (md_open, md_copy): Cast 'char *' to ctx through
'void *'.
* cipher/md4.c (md4_final): Use buf_put_* helper instead of
converting 'char *' to 'u32 *'.
* cipher/md5.c (md5_final): Ditto.
* cipher/rmd160.c (_gcry_rmd160_mixblock, rmd160_final): Ditto.
* cipher/sha1.c (sha1_final): Ditto.
* cipher/sha256.c (sha256_final): Ditto.
* cipher/sha512.c (sha512_final): Ditto.
* cipher/tiger.c (tiger_final): Ditto.
--

Patch fixes 'cast increases required alignment' warnings seen on GCC:

md.c: In function 'md_open':
md.c:318:23: warning: cast increases required alignment of target type [-Wcast-align]
       hd->ctx = ctx = (struct gcry_md_context *) ((char *) hd + n);
                       ^
md.c: In function 'md_copy':
md.c:491:22: warning: cast increases required alignment of target type [-Wcast-align]
       bhd->ctx = b = (struct gcry_md_context *) ((char *) bhd + n);
                      ^
md4.c: In function 'md4_final':
md4.c:258:20: warning: cast increases required alignment of target type [-Wcast-align]
 #define X(a) do { *(u32*)p = le_bswap32((*hd).a) ; p += 4; } while(0)
                    ^
md4.c:259:3: note: in expansion of macro 'X'
   X(A);
   ^
md4.c:258:20: warning: cast increases required alignment of target type [-Wcast-align]
 #define X(a) do { *(u32*)p = le_bswap32((*hd).a) ; p += 4; } while(0)
                    ^
md4.c:260:3: note: in expansion of macro 'X'
   X(B);
   ^
[removed the rest]
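
A sketch of what a buf_put_*-style helper does instead of the pointer
cast, using a hypothetical put_be32 (the library's own helpers may be
implemented differently). Writing byte by byte works for any alignment
of the destination, so no 'u32 *' cast is needed.

```c
#include <stdint.h>

/* Hypothetical helper: store v most significant byte first at buf,
   which may be at any alignment. */
static void put_be32(void *buf, uint32_t v)
{
  unsigned char *out = buf;

  out[0] = (unsigned char)(v >> 24);
  out[1] = (unsigned char)(v >> 16);
  out[2] = (unsigned char)(v >> 8);
  out[3] = (unsigned char)v;
}

/* Write nwords state words to a digest buffer, as a *_final function
   for a big-endian hash might do. */
static void store_state_be(unsigned char *digest, const uint32_t *state,
                           unsigned nwords)
{
  unsigned i;

  for (i = 0; i < nwords; i++)
    put_be32(digest + 4 * i, state[i]);
}
```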

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agorijndael: fix compiler warnings on ARM
Jussi Kivilinna [Tue, 23 Dec 2014 10:13:50 +0000 (12:13 +0200)]
rijndael: fix compiler warnings on ARM

* cipher/rijndael-internal.h (RIJNDAEL_context_s): Add u32 variants of
keyschedule arrays to unions u1 and u2.
(keyschedenc32, keyscheddec32): New.
* cipher/rijndael.c (u32_a_t): Remove.
(do_setkey): Add and use tkk[].data32, k_u32, tk_u32 and W_u32; Remove
casting byte arrays to u32_a_t.
(prepare_decryption, do_encrypt_fn, do_decrypt_fn): Use keyschedenc32
and keyscheddec32; Remove casting byte arrays to u32_a_t.
--

Patch fixes 'cast increases required alignment' compiler warnings that GCC was showing:

rijndael.c: In function 'do_setkey':
rijndael.c:310:13: warning: cast increases required alignment of target type [-Wcast-align]
           *((u32_a_t*)tk[j]) = *((u32_a_t*)k[j]);
             ^
rijndael.c:310:34: warning: cast increases required alignment of target type [-Wcast-align]
           *((u32_a_t*)tk[j]) = *((u32_a_t*)k[j]);
[removed the rest]
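
The idea of adding u32 variants to a union can be sketched as follows.
The type and function names are hypothetical and far smaller than the
real key schedule; the union gives the same storage a byte view and a
word view, so no cast from a byte array to a word pointer is needed.

```c
#include <stdint.h>

/* Hypothetical miniature of a key schedule row: the same 16 bytes
   seen either as 4x4 bytes or as 4 words. */
typedef union
{
  unsigned char bytes[4][4];
  uint32_t words[4];
} row_view_t;

/* Copy column by column through the word view instead of casting
   bytes[j] to a 32-bit pointer. */
static void copy_columns(row_view_t *dst, const row_view_t *src)
{
  int j;

  for (j = 0; j < 4; j++)
    dst->words[j] = src->words[j];
}
```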

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoPoly1305-AEAD: updated implementation to match draft-irtf-cfrg-chacha20-poly1305-03
Jussi Kivilinna [Sun, 21 Dec 2014 15:36:59 +0000 (17:36 +0200)]
Poly1305-AEAD: updated implementation to match draft-irtf-cfrg-chacha20-poly1305-03

* cipher/cipher-internal.h (gcry_cipher_handle): Use separate byte
counters for AAD and data in Poly1305.
* cipher/cipher-poly1305.c (poly1305_fill_bytecount): Remove.
(poly1305_fill_bytecounts, poly1305_do_padding): New.
(poly1305_aad_finish): Fill padding to Poly1305 and do not fill AAD
length.
(_gcry_cipher_poly1305_authenticate, _gcry_cipher_poly1305_encrypt)
(_gcry_cipher_poly1305_decrypt): Update AAD and data length separately.
(_gcry_cipher_poly1305_tag): Fill padding and bytecounts to Poly1305.
(_gcry_cipher_poly1305_setkey, _gcry_cipher_poly1305_setiv): Reset
AAD and data byte counts; only allow 96-bit IV.
* cipher/cipher.c (_gcry_cipher_open_internal): Limit Poly1305-AEAD to
ChaCha20 cipher.
* tests/basic.c (_check_poly1305_cipher): Update test-vectors.
(check_ciphers): Limit Poly1305-AEAD checks to ChaCha20.
* tests/bench-slope.c (cipher_bench_one): Ditto.
--

The latest Internet-Draft of "ChaCha20 and Poly1305 for IETF protocols"
adds padding to Poly1305-AEAD and limits the supported IV size to
96 bits:
 https://www.ietf.org/rfcdiff?url1=draft-nir-cfrg-chacha20-poly1305-03&difftype=--html&submit=Go!&url2=draft-irtf-cfrg-chacha20-poly1305-03

This patch makes the Poly1305-AEAD implementation match those changes and
limits Poly1305-AEAD to ChaCha20 only.
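
As a rough illustration of the padded layout, the sketch below builds the
Poly1305 input as one flat buffer: AAD, zero padding to a 16-byte
boundary, ciphertext, zero padding again, then both lengths as 64-bit
little-endian values. The function names are hypothetical, and the real
code feeds Poly1305 incrementally rather than building a buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of zero bytes needed to reach the next multiple of 16. */
static size_t pad16_len(uint64_t n)
{
  return (size_t)((16 - (n % 16)) % 16);
}

static void put_le64(unsigned char *out, uint64_t v)
{
  int i;

  for (i = 0; i < 8; i++)
    out[i] = (unsigned char)(v >> (8 * i));
}

/* Hypothetical helper: write the MAC input to out and return its size.
   out must have room for aadlen + ctlen + 46 bytes. */
static size_t build_mac_input(unsigned char *out,
                              const unsigned char *aad, size_t aadlen,
                              const unsigned char *ct, size_t ctlen)
{
  size_t pos = 0, pad;

  memcpy(out + pos, aad, aadlen);  pos += aadlen;
  pad = pad16_len(aadlen);
  memset(out + pos, 0, pad);       pos += pad;

  memcpy(out + pos, ct, ctlen);    pos += ctlen;
  pad = pad16_len(ctlen);
  memset(out + pos, 0, pad);       pos += pad;

  put_le64(out + pos, aadlen);     pos += 8;  /* AAD byte count */
  put_le64(out + pos, ctlen);      pos += 8;  /* data byte count */
  return pos;
}
```

Keeping separate AAD and data counters, as the patch does, is what makes
the two trailing length fields available at tag time.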

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agochacha20: allow setting counter for stream random access
Jussi Kivilinna [Sun, 21 Dec 2014 15:36:59 +0000 (17:36 +0200)]
chacha20: allow setting counter for stream random access

* cipher/chacha20.c (CHACHA20_CTR_SIZE): New.
(chacha20_ivsetup): Add setup for full counter.
(chacha20_setiv): Allow ivlen == CHACHA20_CTR_SIZE.
--
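
Random access into the key stream relies on ChaCha20 producing 64-byte
blocks indexed by the counter. A sketch of the arithmetic, with
hypothetical names; how the counter is laid out in the buffer passed to
the IV setup is not shown here and is not assumed.

```c
#include <stdint.h>

#define CHACHA_BLOCK_BYTES 64u  /* ChaCha20 key stream block size */

typedef struct
{
  uint64_t block;   /* counter value to start from */
  unsigned skip;    /* key stream bytes to discard in that block */
} stream_pos_t;

/* Hypothetical helper: map a byte offset in the stream to a block
   counter plus an offset inside that block. */
static stream_pos_t seek_pos(uint64_t byte_offset)
{
  stream_pos_t pos;

  pos.block = byte_offset / CHACHA_BLOCK_BYTES;
  pos.skip = (unsigned)(byte_offset % CHACHA_BLOCK_BYTES);
  return pos;
}
```

For example, byte offset 130 falls in block 2, two bytes into the block.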

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agogcm: do not pass extra key pointer for setupM/fillM
Jussi Kivilinna [Tue, 23 Dec 2014 10:35:37 +0000 (12:35 +0200)]
gcm: do not pass extra key pointer for setupM/fillM

* cipher/cipher-gcm-intel-pclmul.c
(_gcry_ghash_setup_intel_pclmul): Remove 'h' parameter.
* cipher/cipher-gcm.c (_gcry_ghash_setup_intel_pclmul): Ditto.
(fillM): Get 'h' pointer from 'c'.
(setupM): Remove 'h' parameter.
(_gcry_cipher_gcm_setkey): Only pass 'c' to setupM.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agorijndael: use more compact look-up tables and add table prefetching
Jussi Kivilinna [Tue, 23 Dec 2014 10:35:28 +0000 (12:35 +0200)]
rijndael: use more compact look-up tables and add table prefetching

* cipher/rijndael-internal.h (rijndael_prefetchfn_t): New.
(RIJNDAEL_context): Add 'prefetch_enc_fn' and 'prefetch_dec_fn'.
* cipher/rijndael-tables.h (S, T1, T2, T3, T4, T5, T6, T7, T8, S5, U1)
(U2, U3, U4): Remove.
(encT, dec_tables, decT, inv_sbox): Add.
* cipher/rijndael.c (_gcry_aes_amd64_encrypt_block)
(_gcry_aes_amd64_decrypt_block, _gcry_aes_arm_encrypt_block)
(_gcry_aes_arm_decrypt_block): Add parameter for passing table pointer
to assembly implementation.
(prefetch_table, prefetch_enc, prefetch_dec): New.
(do_setkey): Setup context prefetch functions depending on selected
rijndael implementation; Use new tables for key setup.
(prepare_decryption): Use new tables for decryption key setup.
(do_encrypt_aligned): Rename to...
(do_encrypt_fn): ... to this, change to use new compact tables,
make it handle unaligned input and unroll rounds loop by two.
(do_encrypt): Remove handling of unaligned input/output; pass table
pointer to assembly implementations.
(rijndael_encrypt, _gcry_aes_cfb_enc, _gcry_aes_cbc_enc)
(_gcry_aes_ctr_enc, _gcry_aes_cfb_dec): Prefetch encryption tables
before encryption.
(do_decrypt_aligned): Rename to...
(do_decrypt_fn): ... to this, change to use new compact tables,
make it handle unaligned input and unroll rounds loop by two.
(do_decrypt): Remove handling of unaligned input/output; pass table
pointer to assembly implementations.
(rijndael_decrypt, _gcry_aes_cbc_dec): Prefetch decryption tables
before decryption.
* cipher/rijndael-amd64.S: Use 1+1.25 KiB tables for
encryption+decryption; remove tables from assembly file.
* cipher/rijndael-arm.S: Ditto.
--

The patch replaces the 4+4.25 KiB look-up tables in the generic
implementation, the 8+8 KiB tables in the AMD64 implementation, and the
2+2 KiB tables in the ARM implementation with 1+1.25 KiB look-up tables,
and adds prefetching of the look-up tables.

AMD64 assembly is slower than before because of additional rotation
instructions. The generic C implementation is now better optimized and
actually faster than before.
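
The two ideas can be sketched as below, under stated assumptions: a
32-byte step between touched bytes, and a single 256-entry table of
32-bit words (256 x 4 bytes = 1 KiB) from which the other columns are
derived by rotation. All names here are hypothetical and the real table
contents are not reproduced.

```c
#include <stddef.h>
#include <stdint.h>

/* Touch one byte per assumed 32-byte line so that the whole table is
   pulled into cache before the data-dependent look-ups start. */
static unsigned prefetch_table(const volatile unsigned char *tab, size_t len)
{
  unsigned acc = 0;
  size_t i;

  for (i = 0; i < len; i += 32)
    acc |= tab[i];
  if (len)
    acc |= tab[len - 1];
  return acc;
}

/* Rotate left; n must be in the range 1..31. */
static uint32_t rol32(uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* Look up column 0..3 using only one compact table: columns 1..3 are
   the column 0 entry rotated by 8, 16 or 24 bits. */
static uint32_t table_lookup(const uint32_t *enc_t, unsigned char idx,
                             unsigned column)
{
  uint32_t v = enc_t[idx];

  return column ? rol32(v, 8 * column) : v;
}
```

The extra rotation per look-up is the kind of cost the note above
mentions for the AMD64 assembly.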

Benchmark results on Intel i5-4570 (turbo off) (64-bit, AMD64 assembly):

tests/bench-slope --disable-hwf intel-aesni --cpu-mhz 3200 cipher aes

Old:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      3.10 ns/B     307.5 MiB/s      9.92 c/B
        ECB dec |      3.15 ns/B     302.5 MiB/s     10.09 c/B
        CBC enc |      3.46 ns/B     275.5 MiB/s     11.08 c/B
        CBC dec |      3.19 ns/B     299.2 MiB/s     10.20 c/B
        CFB enc |      3.48 ns/B     274.4 MiB/s     11.12 c/B
        CFB dec |      3.23 ns/B     294.8 MiB/s     10.35 c/B
        OFB enc |      3.29 ns/B     290.2 MiB/s     10.52 c/B
        OFB dec |      3.31 ns/B     288.3 MiB/s     10.58 c/B
        CTR enc |      3.64 ns/B     261.7 MiB/s     11.66 c/B
        CTR dec |      3.65 ns/B     261.6 MiB/s     11.67 c/B

New:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      4.21 ns/B     226.7 MiB/s     13.46 c/B
        ECB dec |      4.27 ns/B     223.2 MiB/s     13.67 c/B
        CBC enc |      4.15 ns/B     229.8 MiB/s     13.28 c/B
        CBC dec |      3.85 ns/B     247.8 MiB/s     12.31 c/B
        CFB enc |      4.16 ns/B     229.1 MiB/s     13.32 c/B
        CFB dec |      3.88 ns/B     245.9 MiB/s     12.41 c/B
        OFB enc |      4.38 ns/B     217.8 MiB/s     14.01 c/B
        OFB dec |      4.36 ns/B     218.6 MiB/s     13.96 c/B
        CTR enc |      4.30 ns/B     221.6 MiB/s     13.77 c/B
        CTR dec |      4.30 ns/B     221.7 MiB/s     13.76 c/B

Benchmark on Intel i5-4570 (turbo off) (32-bit mingw, generic C):

tests/bench-slope.exe --disable-hwf intel-aesni --cpu-mhz 3200 cipher aes

Old:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      6.03 ns/B     158.2 MiB/s     19.29 c/B
        ECB dec |      5.81 ns/B     164.1 MiB/s     18.60 c/B
        CBC enc |      6.22 ns/B     153.4 MiB/s     19.90 c/B
        CBC dec |      5.91 ns/B     161.3 MiB/s     18.92 c/B
        CFB enc |      6.25 ns/B     152.7 MiB/s     19.99 c/B
        CFB dec |      6.24 ns/B     152.8 MiB/s     19.97 c/B
        OFB enc |      6.33 ns/B     150.6 MiB/s     20.27 c/B
        OFB dec |      6.33 ns/B     150.7 MiB/s     20.25 c/B
        CTR enc |      6.28 ns/B     152.0 MiB/s     20.08 c/B
        CTR dec |      6.28 ns/B     151.7 MiB/s     20.11 c/B

New:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |      5.02 ns/B     190.0 MiB/s     16.06 c/B
        ECB dec |      5.33 ns/B     178.8 MiB/s     17.07 c/B
        CBC enc |      4.64 ns/B     205.4 MiB/s     14.86 c/B
        CBC dec |      4.95 ns/B     192.7 MiB/s     15.84 c/B
        CFB enc |      4.75 ns/B     200.7 MiB/s     15.20 c/B
        CFB dec |      4.74 ns/B     201.1 MiB/s     15.18 c/B
        OFB enc |      5.29 ns/B     180.3 MiB/s     16.93 c/B
        OFB dec |      5.29 ns/B     180.3 MiB/s     16.93 c/B
        CTR enc |      4.77 ns/B     200.0 MiB/s     15.26 c/B
        CTR dec |      4.77 ns/B     199.8 MiB/s     15.27 c/B

Benchmark on Cortex-A8 (ARM assembly):

tests/bench-slope --cpu-mhz 1008 cipher aes

Old:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     21.84 ns/B     43.66 MiB/s     22.02 c/B
        ECB dec |     22.35 ns/B     42.67 MiB/s     22.53 c/B
        CBC enc |     22.97 ns/B     41.53 MiB/s     23.15 c/B
        CBC dec |     23.48 ns/B     40.61 MiB/s     23.67 c/B
        CFB enc |     22.72 ns/B     41.97 MiB/s     22.90 c/B
        CFB dec |     23.41 ns/B     40.74 MiB/s     23.59 c/B
        OFB enc |     23.65 ns/B     40.32 MiB/s     23.84 c/B
        OFB dec |     23.67 ns/B     40.29 MiB/s     23.86 c/B
        CTR enc |     23.24 ns/B     41.03 MiB/s     23.43 c/B
        CTR dec |     23.23 ns/B     41.05 MiB/s     23.42 c/B

New:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        ECB enc |     26.03 ns/B     36.64 MiB/s     26.24 c/B
        ECB dec |     26.97 ns/B     35.36 MiB/s     27.18 c/B
        CBC enc |     23.21 ns/B     41.09 MiB/s     23.39 c/B
        CBC dec |     23.36 ns/B     40.83 MiB/s     23.54 c/B
        CFB enc |     23.02 ns/B     41.42 MiB/s     23.21 c/B
        CFB dec |     23.67 ns/B     40.28 MiB/s     23.86 c/B
        OFB enc |     27.86 ns/B     34.24 MiB/s     28.08 c/B
        OFB dec |     27.87 ns/B     34.21 MiB/s     28.10 c/B
        CTR enc |     23.47 ns/B     40.63 MiB/s     23.66 c/B
        CTR dec |     23.49 ns/B     40.61 MiB/s     23.67 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agobuild: Add configure option --disable-doc.
Werner Koch [Mon, 15 Dec 2014 11:05:32 +0000 (12:05 +0100)]
build: Add configure option --disable-doc.

* Makefile.am (AUTOMAKE_OPTIONS): Remove.
(doc) [!BUILD_DOC]: Do not recurse into the dir.
* configure.ac (AM_INIT_AUTOMAKE): Add option formerly in Makefile.am.
(BUILD_DOC): Add new am_conditional.

4 years agorijndael: further optimizations for AES-NI accelerated CBC and CFB bulk modes
Jussi Kivilinna [Sat, 6 Dec 2014 13:09:13 +0000 (15:09 +0200)]
rijndael: further optimizations for AES-NI accelerated CBC and CFB bulk modes

* cipher/rijndael-aesni.c (do_aesni_enc, do_aesni_dec): Pass
input/output through SSE register XMM0.
(do_aesni_cfb): Remove.
(_gcry_aes_aesni_encrypt, _gcry_aes_aesni_decrypt): Add loading/storing
input/output to/from XMM0.
(_gcry_aes_aesni_cfb_enc, _gcry_aes_aesni_cbc_enc)
(_gcry_aes_aesni_cfb_dec): Update to use renewed 'do_aesni_enc' and
move IV loading/storing outside loop.
(_gcry_aes_aesni_cbc_dec): Update to use renewed 'do_aesni_dec'.
--

CBC encryption speed is improved ~16% on Intel Haswell and CFB encryption ~8%.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoGCM: move Intel PCLMUL accelerated implementation to separate file
Jussi Kivilinna [Sat, 6 Dec 2014 08:38:36 +0000 (10:38 +0200)]
GCM: move Intel PCLMUL accelerated implementation to separate file

* cipher/Makefile.am: Add 'cipher-gcm-intel-pclmul.c'.
* cipher/cipher-gcm-intel-pclmul.c: New.
* cipher/cipher-gcm.c [GCM_USE_INTEL_PCLMUL]
(_gcry_ghash_setup_intel_pclmul, _gcry_ghash_intel_pclmul): New
prototypes.
[GCM_USE_INTEL_PCLMUL] (gfmul_pclmul, gfmul_pclmul_aggr4): Move
to 'cipher-gcm-intel-pclmul.c'.
(ghash): Rename to...
(ghash_internal): ...this and move GCM_USE_INTEL_PCLMUL part to new
function in 'cipher-gcm-intel-pclmul.c'.
(setupM): Move GCM_USE_INTEL_PCLMUL part to new function in
'cipher-gcm-intel-pclmul.c'; Add selection of ghash function based
on available HW acceleration.
(do_ghash_buf): Change use of 'ghash' to 'c->u_mode.gcm.ghash_fn'.
* cipher/internal.h (ghash_fn_t): New.
(gcry_cipher_handle): Remove 'use_intel_pclmul'; Add 'ghash_fn'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agorijndael: split Padlock part to separate file
Jussi Kivilinna [Mon, 1 Dec 2014 19:10:19 +0000 (21:10 +0200)]
rijndael: split Padlock part to separate file

* cipher/Makefile.am: Add 'rijndael-padlock.c'.
* cipher/rijndael-padlock.c: New.
* cipher/rijndael.c (do_padlock, do_padlock_encrypt)
(do_padlock_decrypt): Move to 'rijndael-padlock.c'.
* configure.ac [mpi_cpu_arch=x86]: Add 'rijndael-padlock.lo'.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agorijndael: refactor to reduce number of #ifdefs and branches
Jussi Kivilinna [Mon, 1 Dec 2014 19:10:19 +0000 (21:10 +0200)]
rijndael: refactor to reduce number of #ifdefs and branches

* cipher/rijndael-aesni.c (_gcry_aes_aesni_encrypt)
(_gcry_aes_aesni_decrypt): Return the stack burn depth.
* cipher/rijndael-amd64.S (_gcry_aes_amd64_encrypt_block)
(_gcry_aes_amd64_decrypt_block): Ditto.
* cipher/rijndael-arm.S (_gcry_aes_arm_encrypt_block)
(_gcry_aes_arm_decrypt_block): Ditto.
* cipher/rijndael-internal.h (RIJNDAEL_context_s)
(rijndael_cryptfn_t): New.
(RIJNDAEL_context): New members 'encrypt_fn' and 'decrypt_fn'.
* cipher/rijndael.c (_gcry_aes_amd64_encrypt_block)
(_gcry_aes_amd64_decrypt_block, _gcry_aes_aesni_encrypt)
(_gcry_aes_aesni_decrypt, _gcry_aes_arm_encrypt_block)
(_gcry_aes_arm_decrypt_block): Change prototypes.
(do_padlock_encrypt, do_padlock_decrypt): New.
(do_setkey): Separate key-length to rounds conversion from
HW features check; Add selection for ctx->encrypt_fn and
ctx->decrypt_fn.
(do_encrypt_aligned, do_decrypt_aligned): Move inside
'[!USE_AMD64_ASM && !USE_ARM_ASM]'; Move USE_AMD64_ASM and
USE_ARM_ASM to...
(do_encrypt, do_decrypt): ...here; Return stack depth; Remove second
temporary buffer from non-aligned input/output case.
(do_padlock): Move decrypt_flag to last argument; Return stack depth.
(rijndael_encrypt): Remove #ifdefs, just call ctx->encrypt_fn.
(_gcry_aes_cfb_enc, _gcry_aes_cbc_enc): Remove USE_PADLOCK; Call
ctx->encrypt_fn in place of do_encrypt/do_encrypt_aligned.
(_gcry_aes_ctr_enc): Call ctx->encrypt_fn in place of
do_encrypt_aligned; Make tmp buffer 16-byte aligned and wipe buffer
after use.
(rijndael_decrypt): Remove #ifdefs, just call ctx->decrypt_fn.
(_gcry_aes_cfb_dec): Remove USE_PADLOCK; Call ctx->decrypt_fn in place
of do_decrypt/do_decrypt_aligned.
(_gcry_aes_cbc_dec): Ditto; Make savebuf buffer 16-byte aligned.
--
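
A minimal sketch of the dispatch pattern, with a toy XOR "cipher" and
hypothetical names throughout. The implementation is chosen once at key
setup and stored as function pointers; each block function returns how
many bytes of stack it may have used, so the caller can wipe that much.

```c
#include <stddef.h>

#define TOY_BLOCK 16

typedef struct toy_ctx toy_ctx_t;

/* Block function type: returns the stack burn depth in bytes. */
typedef unsigned int (*toy_cryptfn_t)(const toy_ctx_t *ctx,
                                      unsigned char *dst,
                                      const unsigned char *src);

struct toy_ctx
{
  toy_cryptfn_t encrypt_fn;
  toy_cryptfn_t decrypt_fn;
  unsigned char key;
};

/* Stand-in for a generic C implementation. */
static unsigned int generic_crypt(const toy_ctx_t *ctx, unsigned char *dst,
                                  const unsigned char *src)
{
  size_t i;

  for (i = 0; i < TOY_BLOCK; i++)
    dst[i] = src[i] ^ ctx->key;
  return 64;   /* pretend this path uses 64 bytes of stack */
}

/* Stand-in for an accelerated implementation with the same result. */
static unsigned int accel_crypt(const toy_ctx_t *ctx, unsigned char *dst,
                                const unsigned char *src)
{
  size_t i;

  for (i = 0; i < TOY_BLOCK; i++)
    dst[i] = src[i] ^ ctx->key;
  return 0;    /* pretend this path keeps everything in registers */
}

/* Select the implementation once; callers never test feature flags. */
static void toy_setkey(toy_ctx_t *ctx, unsigned char key, int have_accel)
{
  ctx->key = key;
  ctx->encrypt_fn = have_accel ? accel_crypt : generic_crypt;
  ctx->decrypt_fn = have_accel ? accel_crypt : generic_crypt;
}
```

A caller then writes 'depth = ctx->encrypt_fn (ctx, out, in);' with no
#ifdef at the call site.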

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agorijndael: move AES-NI blocks before Padlock
Jussi Kivilinna [Mon, 1 Dec 2014 19:10:19 +0000 (21:10 +0200)]
rijndael: move AES-NI blocks before Padlock

* cipher/rijndael.c (do_setkey, rijndael_encrypt, _gcry_aes_cfb_enc)
(rijndael_decrypt, _gcry_aes_cfb_dec): Move USE_AESNI before
USE_PADLOCK.
(check_decryption_preparation) [USE_PADLOCK]: Move to...
(prepare_decryption) [USE_PADLOCK]: ...here.
--

Make order of AES-NI and Padlock #ifdefs consistent.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agorijndael: split AES-NI functions to separate file
Jussi Kivilinna [Mon, 1 Dec 2014 19:10:19 +0000 (21:10 +0200)]
rijndael: split AES-NI functions to separate file

* cipher/Makefile.in: Add 'rijndael-aesni.c'.
* cipher/rijndael-aesni.c: New.
* cipher/rijndael-internal.h: New.
* cipher/rijndael.c (MAXKC, MAXROUNDS, BLOCKSIZE, ATTR_ALIGNED_16)
(USE_AMD64_ASM, USE_ARM_ASM, USE_PADLOCK, USE_AESNI, RIJNDAEL_context)
(keyschenc, keyschdec, padlockkey): Move to 'rijndael-internal.h'.
(u128_s, aesni_prepare, aesni_cleanup, aesni_cleanup_2_6)
(aesni_do_setkey, do_aesni_enc, do_aesni_dec, do_aesni_enc_vec4)
(do_aesni_dec_vec4, do_aesni_cfb, do_aesni_ctr, do_aesni_ctr_4): Move
to 'rijndael-aesni.c'.
(prepare_decryption, rijndael_encrypt, _gcry_aes_cfb_enc)
(_gcry_aes_cbc_enc, _gcry_aes_ctr_enc, rijndael_decrypt)
(_gcry_aes_cfb_dec, _gcry_aes_cbc_dec) [USE_AESNI]: Move to functions
in 'rijndael-aesni.c'.
* configure.ac [mpi_cpu_arch=x86]: Add 'rijndael-aesni.lo'.
--

Clean up rijndael.c before new hardware acceleration support gets added.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoRemove duplicated prototypes.
Werner Koch [Mon, 24 Nov 2014 11:28:33 +0000 (12:28 +0100)]
Remove duplicated prototypes.

* src/gcrypt-int.h (_gcry_mpi_ec_new, _gcry_mpi_ec_set_mpi)
(gcry_mpi_ec_set_point): Remove.
--

Those used gpg_error_t instead of gpg_err_code_t and the picky AIX
compiler takes this as a severe error.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agotests: Add a prime mode to benchmark.
Werner Koch [Tue, 14 Oct 2014 19:29:33 +0000 (21:29 +0200)]
tests: Add a prime mode to benchmark.

* tests/benchmark.c (progress_cb): Add a single char mode.
(prime_bench): New.
(main): Add a "prime" mode.  Factor with_progress out to file scope.

Signed-off-by: Werner Koch <wk@gnupg.org>
4 years agoecc: Improve Montgomery curve implementation.
NIIBE Yutaka [Wed, 19 Nov 2014 06:48:12 +0000 (15:48 +0900)]
ecc: Improve Montgomery curve implementation.

* cipher/ecc-curves.c (_gcry_ecc_fill_in_curve): Support
MPI_EC_MONTGOMERY.
* cipher/ecc.c (test_ecdh_only_keys): New.
(nist_generate_key): Call test_ecdh_only_keys for MPI_EC_MONTGOMERY.
(check_secret_key): Handle Montgomery curve of x-coordinate only.
* mpi/ec.c (_gcry_mpi_ec_mul_point): Resize points before the loop.
Simplify, using pointers of Q1, Q2, PRD, and SUM.
--

4 years agoDisable NEON for CPUs that are known to have broken NEON implementation
Jussi Kivilinna [Sun, 2 Nov 2014 15:45:35 +0000 (17:45 +0200)]
Disable NEON for CPUs that are known to have broken NEON implementation

* src/hwf-arm.c (detect_arm_proc_cpuinfo): Add parsing for CPU version
information and check if CPU is known to have broken NEON
implementation.
(_gcry_hwf_detect_arm): Filter out broken HW features.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoAdd ARM/NEON implementation of Poly1305
Jussi Kivilinna [Sun, 2 Nov 2014 14:01:11 +0000 (16:01 +0200)]
Add ARM/NEON implementation of Poly1305

* cipher/Makefile.am: Add 'poly1305-armv7-neon.S'.
* cipher/poly1305-armv7-neon.S: New.
* cipher/poly1305-internal.h (POLY1305_USE_NEON)
(POLY1305_NEON_BLOCKSIZE, POLY1305_NEON_STATESIZE)
(POLY1305_NEON_ALIGNMENT): New.
* cipher/poly1305.c [POLY1305_USE_NEON]
(_gcry_poly1305_armv7_neon_init_ext)
(_gcry_poly1305_armv7_neon_finish_ext)
(_gcry_poly1305_armv7_neon_blocks, poly1305_armv7_neon_ops): New.
(_gcry_poly1305_init) [POLY1305_USE_NEON]: Select NEON implementation
if HWF_ARM_NEON set.
* configure.ac [neonsupport=yes]: Add 'poly1305-armv7-neon.lo'.
--

Add Andrew Moon's public domain NEON implementation of Poly1305. Original
source is available at: https://github.com/floodyberry/poly1305-opt

Benchmark on Cortex-A8 (--cpu-mhz 1008):

Old:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |     12.34 ns/B     77.27 MiB/s     12.44 c/B

New:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |      2.12 ns/B     450.7 MiB/s      2.13 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agochacha20: add ARMv7/NEON implementation
Jussi Kivilinna [Wed, 6 Aug 2014 17:05:16 +0000 (20:05 +0300)]
chacha20: add ARMv7/NEON implementation

* cipher/Makefile.am: Add 'chacha20-armv7-neon.S'.
* cipher/chacha20-armv7-neon.S: New.
* cipher/chacha20.c (USE_NEON): New.
[USE_NEON] (_gcry_chacha20_armv7_neon_blocks): New.
(chacha20_do_setkey) [USE_NEON]: Use Neon implementation if
HWF_ARM_NEON flag set.
(selftest): Self-test encrypting buffer byte by byte.
* configure.ac [neonsupport=yes]: Add 'chacha20-armv7-neon.lo'.
--

Add Andrew Moon's public domain ARMv7/NEON implementation of ChaCha20. Original
source is available at: https://github.com/floodyberry/chacha-opt

Benchmark on Cortex-A8 (--cpu-mhz 1008):

Old:
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     13.45 ns/B     70.92 MiB/s     13.56 c/B
     STREAM dec |     13.45 ns/B     70.90 MiB/s     13.56 c/B

New:
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      6.20 ns/B     153.9 MiB/s      6.25 c/B
     STREAM dec |      6.20 ns/B     153.9 MiB/s      6.25 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoRegister DCO for Markus Teich
Werner Koch [Thu, 9 Oct 2014 06:31:35 +0000 (08:31 +0200)]
Register DCO for Markus Teich

--

4 years agompi: Add gcry_mpi_ec_sub.
Markus Teich [Tue, 7 Oct 2014 16:24:27 +0000 (18:24 +0200)]
mpi: Add gcry_mpi_ec_sub.

* NEWS (gcry_mpi_ec_sub): New.
* doc/gcrypt.texi (gcry_mpi_ec_sub): New.
* mpi/ec.c (_gcry_mpi_ec_sub, sub_points_edwards): New.
(sub_points_montgomery, sub_points_weierstrass): New stubs.
* src/gcrypt-int.h (_gcry_mpi_ec_sub): New.
* src/gcrypt.h.in (gcry_mpi_ec_sub): New.
* src/libgcrypt.def (gcry_mpi_ec_sub): New.
* src/libgcrypt.vers (gcry_mpi_ec_sub): New.
* src/mpi.h (_gcry_mpi_ec_sub_points): New.
* src/visibility.c (gcry_mpi_ec_sub): New.
* src/visibility.h (gcry_mpi_ec_sub): New.
--

This function subtracts two points on the curve. Only Twisted Edwards
curves are supported with this change.
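
On a Twisted Edwards curve a*x^2 + y^2 = 1 + d*x^2*y^2 the negative of
a point (x, y) is (-x, y), so subtraction can be expressed as addition
of the negated point.  The sketch below shows this in affine
coordinates on a toy curve over GF(13) with a = 1 and d = 2.  The
parameters and all toy_* names are assumptions for illustration; this
is not the code in mpi/ec.c, which works on MPIs and may use a
different coordinate system.

```c
#include <stdint.h>

/* Toy parameters, far too small for any real use. */
#define TOY_P 13
#define TOY_A 1
#define TOY_D 2

typedef struct { int64_t x, y; } toy_point;

/* Reduce V into the range 0 .. TOY_P-1. */
static int64_t mod_p (int64_t v)
{
  v %= TOY_P;
  return v < 0 ? v + TOY_P : v;
}

/* Square-and-multiply exponentiation modulo TOY_P. */
static int64_t pow_mod (int64_t base, int64_t exp)
{
  int64_t r = 1;

  base = mod_p (base);
  while (exp > 0)
    {
      if (exp & 1)
        r = mod_p (r * base);
      base = mod_p (base * base);
      exp >>= 1;
    }
  return r;
}

/* Modular inverse via Fermat's little theorem (TOY_P is prime). */
static int64_t inv_mod (int64_t v)
{
  return pow_mod (v, TOY_P - 2);
}

/* Affine addition law for Twisted Edwards curves. */
static toy_point toy_add (toy_point p, toy_point q)
{
  int64_t t = mod_p (TOY_D * mod_p (p.x * q.x) * mod_p (p.y * q.y));
  toy_point r;

  r.x = mod_p (mod_p (p.x * q.y + p.y * q.x) * inv_mod (1 + t));
  r.y = mod_p (mod_p (p.y * q.y - TOY_A * p.x * q.x) * inv_mod (1 - t));
  return r;
}

/* Negation: (x, y) -> (-x, y). */
static toy_point toy_neg (toy_point p)
{
  p.x = mod_p (-p.x);
  return p;
}

/* Subtraction: P - Q = P + (-Q). */
static toy_point toy_sub (toy_point p, toy_point q)
{
  return toy_add (p, toy_neg (q));
}
```

Subtracting a point from itself yields the neutral element (0, 1), and
adding Q back to P - Q returns P.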

Signed-off-by: Markus Teich <markus dot teich at stusta dot mhn dot de>
4 years agodoc: Fix a configure option name.
Werner Koch [Wed, 8 Oct 2014 12:42:36 +0000 (14:42 +0200)]
doc: Fix a configure option name.

--

4 years agoFix prime test for 2 and lower and add check command to mpicalc.
Werner Koch [Wed, 8 Oct 2014 12:41:21 +0000 (14:41 +0200)]
Fix prime test for 2 and lower and add check command to mpicalc.

* cipher/primegen.c (check_prime): Return true for the small primes.
(_gcry_prime_check): Return correct values for 2 and lower numbers.

* src/mpicalc.c (do_primecheck): New.
(main): Add command 'P'.
(main): Allow for larger input data.
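
The kind of edge case addressed here can be sketched with plain
integers.  This is an assumption about how such a bug can arise, not a
copy of primegen.c: a test that first sieves with a table of small
primes has to accept a candidate that is itself one of the table
entries, and has to reject everything below 2.  The name toy_is_prime
and the table are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static const int64_t small_primes[] = { 2, 3, 5, 7, 11, 13 };

static bool toy_is_prime (int64_t n)
{
  size_t i;
  int64_t f;

  if (n < 2)
    return false;  /* negative numbers, 0 and 1 are not prime */

  for (i = 0; i < sizeof small_primes / sizeof small_primes[0]; i++)
    if (n % small_primes[i] == 0)
      return n == small_primes[i];  /* a small prime divides itself */

  /* Plain trial division by odd numbers for the remaining candidates. */
  for (f = 17; f <= n / f; f += 2)
    if (n % f == 0)
      return false;

  return true;
}
```

Without the comparison against the table entry, 2, 3, 5 and so on
would be reported as composite because they are divisible by a small
prime.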

4 years agoAdd Whirlpool AMD64/SSE2 assembly implementation
Jussi Kivilinna [Sun, 31 Aug 2014 10:17:24 +0000 (13:17 +0300)]
Add Whirlpool AMD64/SSE2 assembly implementation

* cipher/Makefile.am: Add 'whirlpool-sse2-amd64.S'.
* cipher/whirlpool-sse2-amd64.S: New.
* cipher/whirlpool.c (USE_AMD64_ASM): New.
(whirlpool_tables_s): New.
(rc, C0, C1, C2, C3, C4, C5, C6, C7): Combine these tables into single
structure and replace old tables with macros of same name.
(tab): New structure containing above tables.
[USE_AMD64_ASM] (_gcry_whirlpool_transform_amd64)
(whirlpool_transform): New.
* configure.ac [host=x86_64]: Add 'whirlpool-sse2-amd64.lo'.
--

Benchmark results:

On Intel Core i5-4570 (3.2 GHz):
After:
 WHIRLPOOL      |      4.82 ns/B     197.8 MiB/s     15.43 c/B
Before:
 WHIRLPOOL      |      9.10 ns/B     104.8 MiB/s     29.13 c/B

On Intel Core i5-2450M (2.5 GHz):
After:
 WHIRLPOOL      |      8.43 ns/B     113.1 MiB/s     21.09 c/B
Before:
 WHIRLPOOL      |     13.45 ns/B     70.92 MiB/s     33.62 c/B

On Intel Core2 T8100 (2.1 GHz):
After:
 WHIRLPOOL      |     10.22 ns/B     93.30 MiB/s     21.47 c/B
Before:
 WHIRLPOOL      |     19.87 ns/B     48.00 MiB/s     41.72 c/B

Summary, old vs new ratio:

 Intel Core i5-4570: 1.88x
 Intel Core i5-2450M: 1.59x
 Intel Core2 T8100: 1.94x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoImproved ripemd160 performance
Andrei Scherer [Thu, 28 Aug 2014 17:45:35 +0000 (09:45 -0800)]
Improved ripemd160 performance

* cipher/rmd160.c (transform): Interleave the left and right lane
rounds to introduce more instruction level parallelism.
--
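
The interleaving idea can be shown with a toy example.  The step
function, constants and round count below are assumptions chosen for
illustration and are not the RIPEMD-160 round functions.  Both variants
compute the same two independent chains; in the interleaved loop a step
of one lane does not depend on the preceding step of the other lane, so
the CPU has a chance to overlap them.

```c
#include <stdint.h>

#define TOY_ROUNDS 16

static uint32_t rol32 (uint32_t v, unsigned int s)
{
  return (v << s) | (v >> (32 - s));  /* valid for s in 1..31 */
}

/* One toy step; each call depends on the previous state of its lane. */
static uint32_t toy_step (uint32_t state, uint32_t word, uint32_t k)
{
  return rol32 (state + word + k, 7) + (state ^ k);
}

/* All left-lane steps first, then all right-lane steps. */
static void toy_lanes_sequential (const uint32_t w[TOY_ROUNDS],
                                  uint32_t out[2])
{
  uint32_t left = 0x67452301u, right = 0x67452301u;
  int i;

  for (i = 0; i < TOY_ROUNDS; i++)
    left = toy_step (left, w[i], 0x5a827999u);
  for (i = 0; i < TOY_ROUNDS; i++)
    right = toy_step (right, w[TOY_ROUNDS - 1 - i], 0x50a28be6u);
  out[0] = left;
  out[1] = right;
}

/* Same computation with the steps of both lanes alternating. */
static void toy_lanes_interleaved (const uint32_t w[TOY_ROUNDS],
                                   uint32_t out[2])
{
  uint32_t left = 0x67452301u, right = 0x67452301u;
  int i;

  for (i = 0; i < TOY_ROUNDS; i++)
    {
      left = toy_step (left, w[i], 0x5a827999u);
      right = toy_step (right, w[TOY_ROUNDS - 1 - i], 0x50a28be6u);
    }
  out[0] = left;
  out[1] = right;
}
```

An optimizing compiler may reorder such simple loops on its own; the
sketch only shows that the reordering does not change the result.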

The benchmarks on different systems:

Intel(R) Atom(TM) CPU N570   @ 1.66GHz
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |     13.07 ns/B     72.97 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |     11.37 ns/B     83.84 MiB/s         - c/B

Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |      3.31 ns/B     288.0 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |      2.08 ns/B     458.5 MiB/s         - c/B

Signed-off-by: Andrei Scherer <andsch@inbox.com>
4 years agobuild: Document SYSROOT.
Werner Koch [Thu, 2 Oct 2014 12:49:31 +0000 (14:49 +0200)]
build: Document SYSROOT.

* configure.ac: Mark SYSROOT as arg var.

4 years agobuild: Support SYSROOT based config script finding.
Werner Koch [Thu, 2 Oct 2014 10:51:49 +0000 (12:51 +0200)]
build: Support SYSROOT based config script finding.

* src/libgcrypt.m4: Add support for SYSROOT and set
gpg_config_script_warn.  Use AC_PATH_PROG instead of AC_PATH_TOOL
because the config script is not expected to be installed with a
prefix for its name.
* configure.ac: Print a library mismatch warning.
* m4/gpg-error.m4: Update from git master.
--

Also fixed the false copyright notice in libgcrypt.m4.

4 years agomac: Fix gcry_mac_close to allow for a NULL handle.
Werner Koch [Mon, 29 Sep 2014 15:34:28 +0000 (17:34 +0200)]
mac: Fix gcry_mac_close to allow for a NULL handle.

* cipher/mac.c (_gcry_mac_close): Check for NULL.
--

We always allow this for easier cleanup.  Actually, the docs already
state that this is allowed.
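
Why a NULL-tolerant close makes cleanup easier can be seen in the
following sketch.  The toy_mac_* names and the handle layout are
assumptions for illustration and not the libgcrypt implementation.

```c
#include <stdlib.h>

struct toy_mac_handle
{
  unsigned char *state;
  size_t state_len;
};

/* Allocate a handle; returns NULL on failure. */
static struct toy_mac_handle *toy_mac_open (size_t state_len)
{
  struct toy_mac_handle *h = calloc (1, sizeof *h);

  if (!h)
    return NULL;
  h->state = calloc (1, state_len);
  if (!h->state)
    {
      free (h);
      return NULL;
    }
  h->state_len = state_len;
  return h;
}

/* Release a handle; a NULL handle is silently ignored. */
static void toy_mac_close (struct toy_mac_handle *h)
{
  if (!h)
    return;
  free (h->state);
  free (h);
}

/* Single exit path: both handles are closed unconditionally, whether
 * or not they were ever opened.  FAIL_SECOND simulates a failed open. */
static int toy_mac_demo (int fail_second)
{
  struct toy_mac_handle *a = NULL, *b = NULL;
  int rc = -1;

  a = toy_mac_open (16);
  if (!a)
    goto leave;
  b = fail_second ? NULL : toy_mac_open (32);
  if (!b)
    goto leave;
  rc = 0;

 leave:
  toy_mac_close (b);
  toy_mac_close (a);
  return rc;
}
```

The caller does not need to track which handles were successfully
opened before jumping to the cleanup label.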

4 years agoAdd a constant for a forthcoming new RNG.
Werner Koch [Wed, 3 Sep 2014 06:53:43 +0000 (08:53 +0200)]
Add a constant for a forthcoming new RNG.

* src/gcrypt.h.in (GCRYCTL_DRBG_REINIT): New constant.

4 years agoAdd new Poly1305 MAC test vectors
Jussi Kivilinna [Tue, 2 Sep 2014 17:40:07 +0000 (20:40 +0300)]
Add new Poly1305 MAC test vectors

* tests/basic.c (check_mac): Add new test vectors for Poly1305 MAC.
--

Patch adds new test vectors for Poly1305 MAC from Internet Draft
draft-irtf-cfrg-chacha20-poly1305-01.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
4 years agoasm: Allow building x86 and amd64 using old compilers.
Werner Koch [Tue, 2 Sep 2014 07:25:20 +0000 (09:25 +0200)]
asm: Allow building x86 and amd64 using old compilers.

* src/hwf-x86.c (get_xgetbv): Build only if AVX support is enabled.
--

Old as(1) versions do not support the xgetbv instruction.  Thus build
this function only if asm support has been requested.

GnuPG-bug-id: 1708

4 years agoAdd DCO entries for Andrei Scherer and Stefan Mueller.
Werner Koch [Mon, 1 Sep 2014 09:40:31 +0000 (11:40 +0200)]
Add DCO entries for Andrei Scherer and Stefan Mueller.

--

4 years agompi: Re-indent longlong.h.
Werner Koch [Fri, 29 Aug 2014 12:54:11 +0000 (14:54 +0200)]
mpi: Re-indent longlong.h.

--
Indenting the cpp statements should make longlong.h more readable.

5 years agosexp: Check args of gcry_sexp_build.
Werner Koch [Thu, 21 Aug 2014 12:12:55 +0000 (14:12 +0200)]
sexp: Check args of gcry_sexp_build.

* src/sexp.c (do_vsexp_sscan): Return error for invalid args.
--

This helps to catch usage errors such as passing NULL for the return
variable or the format string.
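
A minimal sketch of this kind of argument check follows.  The names
toy_sexp_build and TOY_ERR_INV_ARG are hypothetical, and the "build"
step merely stores the format pointer; the actual check lives in
do_vsexp_sscan and uses the libgpg-error codes.

```c
#include <stddef.h>

enum toy_err { TOY_OK = 0, TOY_ERR_INV_ARG = 1 };

/* Store a "built" object in *RESULT.  Invalid arguments are reported
 * as an error instead of being dereferenced. */
static enum toy_err toy_sexp_build (const char **result, const char *format)
{
  if (!result)
    return TOY_ERR_INV_ARG;
  *result = NULL;  /* well-defined output even on later failure */
  if (!format)
    return TOY_ERR_INV_ARG;

  *result = format;
  return TOY_OK;
}
```

Clearing the return variable before the second check means a caller
that ignores the error code still sees NULL instead of an
uninitialized pointer.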