libgcrypt.git
Werner Koch [Wed, 2 Oct 2013 11:39:47 +0000 (13:39 +0200)]
md: Simplify the message digest dispatcher md.c.

* src/gcrypt-module.h (gcry_md_spec_t):  Move to ...
* src/cipher-proto.h: here.  Merge with md_extra_spec_t.  Add fields
ALGO and FLAGS.  Set these fields in all digest modules.
* cipher/md.c: Change most code to replace the former module
system by a simpler system to gain information about the algorithms.
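
As a rough illustration of what such a table-based lookup can look like,
the sketch below finds an algorithm through a static, NULL-terminated list
of spec structures that carry ALGO and FLAGS fields.  All names here
(demo_digest_spec_t, demo_digest_list, demo_spec_from_algo, the DEMO_ ids)
are hypothetical and simplified; this is not the code in cipher/md.c.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for a digest spec: only the two
   fields named in the ChangeLog (ALGO, FLAGS) plus a name.  */
typedef struct
{
  int algo;                               /* Algorithm identifier.  */
  struct { unsigned int disabled:1; } flags;
  const char *name;
} demo_digest_spec_t;

/* Illustrative ids only.  */
enum { DEMO_MD_SHA1 = 2, DEMO_MD_SHA256 = 8 };

static demo_digest_spec_t demo_spec_sha1   = { DEMO_MD_SHA1,   { 0 }, "SHA1" };
static demo_digest_spec_t demo_spec_sha256 = { DEMO_MD_SHA256, { 0 }, "SHA256" };

/* A static table takes the place of run-time module registration.  */
static demo_digest_spec_t *demo_digest_list[] =
  { &demo_spec_sha1, &demo_spec_sha256, NULL };

/* Return the spec for ALGO, or NULL if it is unknown or disabled.  */
demo_digest_spec_t *
demo_spec_from_algo (int algo)
{
  size_t i;

  for (i = 0; demo_digest_list[i]; i++)
    if (demo_digest_list[i]->algo == algo)
      return demo_digest_list[i]->flags.disabled ? NULL : demo_digest_list[i];
  return NULL;
}
```

With this layout, disabling an algorithm is a matter of setting a flag in
its spec rather than unregistering a module.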

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Tue, 1 Oct 2013 20:00:50 +0000 (22:00 +0200)]
cipher: Simplify the cipher dispatcher cipher.c.

* src/gcrypt-module.h (gcry_cipher_spec_t):  Move to ...
* src/cipher-proto.h (gcry_cipher_spec_t): here.  Merge with
cipher_extra_spec_t.  Add fields ALGO and FLAGS.  Set these fields in
all cipher modules.
* cipher/cipher.c: Change most code to replace the former module
system by a simpler system to gain information about the algorithms.
(disable_pubkey_algo): Simplified.  No longer thread-safe, though.

* cipher/md.c (_gcry_md_selftest): Use correct structure.  Not a real
problem because both define the same function as their first field.

* cipher/pubkey.c (_gcry_pk_selftest): Take care of the disabled flag.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Tue, 1 Oct 2013 15:47:27 +0000 (17:47 +0200)]
mpi: Fix gcry_mpi_neg.

* mpi/mpiutil.c (_gcry_mpi_neg): Copy U to W.

Signed-off-by: Werner Koch <wk@gnupg.org>
Peter Wu [Thu, 26 Sep 2013 21:20:32 +0000 (23:20 +0200)]
cipher: Add support for 128-bit keys in RC2

* cipher/rfc2268.c (oids_rfc2268_128): New
(_gcry_cipher_spec_rfc2268_128): New.
* cipher/cipher.c (cipher_table_entry): Add GCRY_CIPHER_RFC2268_128.
--

This patch adds support for decrypting (and encrypting) with 128-bit
keys using the RC2 algorithm.

Signed-off-by: Peter Wu <lekensteyn@gmail.com>
Actually this is merely enabling that extra ID for 128 bit RFC2268.
We should have used one id for that algorithm only, because a second
identifier merely for having the OID in the code is a bad idea.  My
initial fault and thus I better apply this patch to make the id not
entirely useless.  -wk

Werner Koch [Mon, 30 Sep 2013 19:14:11 +0000 (21:14 +0200)]
ecc: Use faster b parameter for Ed25519.

* cipher/ecc-curves.c (domain_parms): Replace b.
* tests/t-mpi-point.c (test_curve): Ditto.
--

This change has been suggested by NIIBE Yutaka:

  Here,

    0x98412DFC9311D490018C7338BF8688861767FF8FF5B2BEBE27548A14B235EC8FEDA4

  is: (121666^-1 mod q)*121665.

  (121666^-1) * 121665 mod q is:

    0x2DFC9311D490018C7338BF8688861767FF8FF5B2BEBE27548A14B235ECA6874A

  While it works for both, I think that shorter is better.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 30 Sep 2013 18:32:20 +0000 (20:32 +0200)]
ecc: Prepare for future Ed25519 optimization.

* mpi/ec-ed25519.c: New but empty file.
* mpi/ec-internal.h: New.
* mpi/ec.c: Include ec-internal.h.
(ec_mod): New.
(ec_addm): Use ec_mod.
(ec_mulm): Remove commented code.  Use ec_mod.
(ec_subm): Call simple sub.
(ec_pow2): Use ec_mulm.
(ec_mul2): New.
(dup_point_weierstrass): Use ec_mul2.
(dup_point_twistededwards): Add special case for a == -1.  Use
ec_mul2.
(add_points_weierstrass): Use ec_mul2.
(add_points_twistededwards): Add special case for a == -1.
(_gcry_mpi_ec_curve_point): Ditto.
(ec_p_init): Add hack to test Barrett functions.
* src/ec-context.h (mpi_ec_ctx_s): Add P_BARRETT.

* mpi/mpi-mod.c (_gcry_mpi_mod_barrett): Fix sign problem.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 30 Sep 2013 18:17:05 +0000 (20:17 +0200)]
ecc: Fix recomputing of Q for Ed25519.

* cipher/ecc-misc.c (reverse_buffer): New.
(_gcry_ecc_compute_public): Add Ed25519 specific code.
* cipher/ecc.c (sign_eddsa): Allocate DIGEST in secure memory.  Get
rid of HASH_D.
* tests/t-mpi-point.c (context_param): Test recomputing of Q for
Ed25519.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Mon, 30 Sep 2013 11:20:06 +0000 (13:20 +0200)]
log: Try to print s-expressions in a more compact format.

* src/misc.c (count_closing_parens): New.
(_gcry_log_printsxp): Use new function.
* mpi/ec.c (_gcry_mpi_point_log): Take care of a NULL point.

Signed-off-by: Werner Koch <wk@gnupg.org>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
Make Whirlpool use the _gcry_md_block_write helper

* cipher/whirlpool.c (whirlpool_context_t): Add 'bctx', remove
'buffer', 'count' and 'nblocks'.
(whirlpool_init): Initialize 'bctx'.
(whirlpool_transform): Adjust context argument type and burn stack
depth.
(whirlpool_add): Remove.
(whirlpool_write): Use _gcry_md_block_write.
(whirlpool_final, whirlpool_read): Adjust for 'bctx' usage.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
whirlpool: add stack burning after transform

* cipher/whirlpool.c (whirlpool_transform): Return burn stack depth.
(whirlpool_add): Do burn_stack.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
whirlpool: do bitcount calculation in finalization part

* cipher/whirlpool.c (whirlpool_context_t): Remove 'length', add
'nblocks'.
(whirlpool_add): Update 'nblocks' instead of 'length', and add early
return at one spot.
(whirlpool_write): Check for 'nblocks' overflow.
(whirlpool_final): Convert 'nblocks' to bit-counter, and use
whirlpool_write instead of whirlpool_add.
--

Currently Whirlpool uses a large 256-bit counter that is increased in the
'write' function. However, we could do the bit counter calculation as is
done in all the other hash algorithms: use a 64-bit block counter that is
converted to a bit counter in the finalization function. This change does
limit the amount of bytes Whirlpool can process before overflowing the bit
counter. With the 256-bit counter, overflow happens after ~1.3e67 gigabytes.
With the 64-bit block counter, overflow happens just after ~1.1e12 gigabytes.
The patch keeps the old behaviour of halting if the counter overflows.

The main benefit of this patch is that, after this change, we can use the
_gcry_md_block_write helper for Whirlpool too.
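
The figures follow from simple arithmetic: 2^64 blocks of 64 bytes are
2^70 bytes, i.e. 2^40 (about 1.1e12) gigabytes, while a 256-bit bit counter
covers 2^253 bytes, i.e. 2^223 (about 1.3e67) gigabytes.  The conversion
done at finalization time can be sketched as below.  The function name,
the signature and the buffer layout are assumptions made for illustration;
they are not taken from cipher/whirlpool.c.

```c
#include <stdint.h>
#include <string.h>

/* Store the message length in bits as a 256-bit big-endian number in
   OUT.  NBLOCKS is the number of full 64-byte blocks hashed so far and
   COUNT the number of bytes still pending in the buffer (COUNT < 64).  */
void
demo_bitcount (unsigned char out[32], uint64_t nblocks, unsigned int count)
{
  /* bits = nblocks * 512 + count * 8.  The product can need up to 73
     bits, so it is split into a high and a low 64-bit word.  The low
     9 bits of (nblocks << 9) are zero and count * 8 is below 512, so
     the addition cannot carry.  */
  uint64_t hi = nblocks >> 55;
  uint64_t lo = (nblocks << 9) + (uint64_t)count * 8;
  int i;

  memset (out, 0, 32);
  for (i = 0; i < 8; i++)
    {
      out[31 - i] = (unsigned char)(lo >> (8 * i));  /* Bytes 24..31.  */
      out[23 - i] = (unsigned char)(hi >> (8 * i));  /* Bytes 16..23.  */
    }
}
```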

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Mon, 30 Sep 2013 08:18:25 +0000 (10:18 +0200)]
Add logging functions to the API.

* src/gcrypt.h.in (_GCRY_GCC_ATTR_PRINTF): New.
(gcry_log_debug, gcry_log_debughex, gcry_log_debugmpi): New.
(gcry_log_debugpnt, gcry_log_debugsxp): New.
* src/visibility.c (gcry_log_debug): New.
(gcry_log_debughex, gcry_log_debugmpi, gcry_log_debugpnt): New.
(gcry_log_debugsxp): New.
* src/libgcrypt.def, src/libgcrypt.vers: Add new functions.
* src/misc.c (_gcry_logv): Make public.
(_gcry_log_printsxp): New.
* src/g10lib.h (log_printsxp): New macro.
--

For debugging applications it is often required to dump certain data
structures.  Libgcrypt uses several internal functions for this.  To
avoid re-implementing everything in the caller, we now provide access
to some of those functions.
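
The new _GCRY_GCC_ATTR_PRINTF macro suggests that the printf-style logger
gets compile-time format checking.  A minimal, self-contained sketch of
that technique follows.  DEMO_ATTR_PRINTF and demo_log_debug are
hypothetical stand-ins that write into a caller-supplied buffer; they are
not the Libgcrypt definitions.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical counterpart of a printf attribute macro: lets
   GCC-compatible compilers check the format string against the
   arguments and expands to nothing elsewhere.  */
#if defined(__GNUC__)
# define DEMO_ATTR_PRINTF(f, a) __attribute__ ((format (printf, f, a)))
#else
# define DEMO_ATTR_PRINTF(f, a)
#endif

int demo_log_debug (char *buf, size_t buflen, const char *fmt, ...)
  DEMO_ATTR_PRINTF (3, 4);

/* Format a debug line with a "DBG: " prefix into BUF.  Returns the
   length the complete line needs, or -1 if not even the prefix fits
   or formatting fails.  */
int
demo_log_debug (char *buf, size_t buflen, const char *fmt, ...)
{
  va_list ap;
  int n, m;

  n = snprintf (buf, buflen, "DBG: ");
  if (n < 0 || (size_t)n >= buflen)
    return -1;
  va_start (ap, fmt);
  m = vsnprintf (buf + n, buflen - n, fmt, ap);
  va_end (ap);
  return m < 0 ? -1 : n + m;
}
```

With the attribute in place, a call such as demo_log_debug (buf, len,
"%s", 42) is diagnosed at compile time when format warnings are enabled.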

Signed-off-by: Werner Koch <wk@gnupg.org>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
Make libgcrypt build with Clang on i386

* mpi/longlong.h [__i386__] (add_ssaaaa, sub_ddmmss)
(umul_ppmm, udiv_qrnnd): Do not cast asm output to USItype.
--

Clang defines __GNUC__ even when it's not GCC compatible. As a result, Clang
enables GCC-only assembly code in mpi/longlong.h and fails to build.

However, since the changes needed to make libgcrypt build with Clang are
smallish and do not cause problems with GCC, the patch just makes them.
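
The point about __GNUC__ can be checked with a few lines of preprocessor
code.  Note that this is only an illustration of the macro behaviour; the
patch itself does not add such a test but adjusts the asm operands
instead.  The function name is made up for the example.

```c
/* Report which compiler family the preprocessor identifies.  Clang
   defines __GNUC__ as well, so __clang__ must be tested first.  */
const char *
demo_compiler_family (void)
{
#if defined(__clang__)
  return "clang";
#elif defined(__GNUC__)
  return "gcc-compatible";
#else
  return "other";
#endif
}
```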

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
mpi: Change not yet used _gcry_mpi_set_opaque_copy.

* mpi/mpiutil.c (_gcry_mpi_set_opaque_copy): Change prototype.
(_gcry_mpi_get_opaque_copy): Take care of gcry_malloc failure.

Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
sexp: Improve printing of data with a leading zero.

* src/sexp.c (suitable_encoding): Detect leading zero byte.

Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
ecc: Allow the name "q@eddsa" to get/set the public key.

* cipher/ecc-curves.c (_gcry_ecc_get_mpi): Support "q@eddsa".
(_gcry_ecc_set_mpi): Support "q".
* cipher/ecc.c (eddsa_encodepoint): Rename to ...
(_gcry_ecc_eddsa_encodepoint): this and make global.  Remove arg
MINLEN and take from context.
(eddsa_decodepoint): Rename to
(_gcry_ecc_eddsa_decodepoint): this and make global. Remove arg LEN
and take from context.
(sign_eddsa, verify_eddsa): Take B from context.
(ecc_sign, ecc_verify): Add hack to set DIALECT.
(_gcry_pk_ecc_get_sexp): Use _gcry_ecc_compute_public.  Handle EdDSA.
* src/ec-context.h (mpi_ec_ctx_s): Add field NBITS.
* mpi/ec.c (ec_p_init): Init NBITS.
* tests/t-mpi-point.c (test_curve): Add Ed25519.
(sample_ed25519_q): New.
(context_param): Check new sample key.
(hex2buffer, hex2mpiopa): New.
(cmp_mpihex): Take care of opaque MPIs.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
mpicalc: Add statement to compute the number of bits.

* src/mpicalc.c (do_nbits): New.
(main): Add statement 'b'.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
ecc: Refactor low-level access functions.

* mpi/ec.c (point_copy): Move to cipher/ecc-curves.c.
(ec_get_reset): Rename to _gcry_mpi_ec_get_reset and make global.
(_gcry_mpi_ec_get_mpi): Factor most code out to _gcry_ecc_get_mpi.
(_gcry_mpi_ec_get_point): Factor most code out to _gcry_ecc_get_point.
(_gcry_mpi_ec_set_mpi): Factor most code out to _gcry_ecc_set_mpi.
(_gcry_mpi_ec_set_point): Factor most code out to _gcry_ecc_set_point.
* cipher/ecc-curves.c (_gcry_ecc_get_mpi): New.
(_gcry_ecc_get_point, _gcry_ecc_set_mpi, _gcry_ecc_set_point): New.
* cipher/ecc-misc.c (_gcry_ecc_compute_public): New.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
ecc: Fix highly unlikely endless loop in sign_ecdsa.

* cipher/ecc.c (sign_ecdsa): Turn while-do into do-while loops.
--

Reported-by: Dmitry Eremin-Solenikov
Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
ecc: Allow the use of an uncompressed public key.

* cipher/ecc.c (eddsa_encodepoint): Factor most code out to ...
(eddsa_encode_x_y): new function.
(eddsa_decodepoint): Allow use of an uncompressed public key.
* tests/t-ed25519.c (N_TESTS): Adjust.
* tests/t-ed25519.inp: Add test 1025.

Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
pk: Add algo id GCRY_PK_ECC and deprecate ECDSA and ECDH.

* src/gcrypt.h.in (GCRY_PK_ECC): New.
* cipher/pubkey.c (map_algo): New.
(spec_from_algo, gcry_pk_get_param, _gcry_pk_selftest): Use it.
* cipher/ecc.c (selftests_ecdsa): Report using GCRY_PK_ECC.
(run_selftests): Simplify.
(ecdh_names, ecdsa_names): Merge into a new ecc_names.
(_gcry_pubkey_spec_ecdh, _gcry_pubkey_spec_ecdsa): Merge into new
_gcry_pubkey_spec_ecc.
--

The algo ids are actually a relic from Libgcrypt's former life as
GnuPG's crypto code.  They don't make much sense anymore and are often
not needed.
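
The idea behind map_algo can be sketched as a small switch that folds the
deprecated, usage-specific ids into the generic one.  The numeric values
below are what gcrypt.h is assumed to define and should be checked against
the header in use; the DEMO_ names and the function are hypothetical.

```c
/* Ids as assumed from gcrypt.h (GCRY_PK_RSA, GCRY_PK_ECC,
   GCRY_PK_ECDSA, GCRY_PK_ECDH); illustrative only.  */
enum
  {
    DEMO_PK_RSA   = 1,
    DEMO_PK_ECC   = 18,
    DEMO_PK_ECDSA = 301,
    DEMO_PK_ECDH  = 302
  };

/* Map a deprecated algorithm id to the generic id; all other ids are
   returned unchanged.  */
int
demo_map_algo (int algo)
{
  switch (algo)
    {
    case DEMO_PK_ECDSA:
    case DEMO_PK_ECDH:
      return DEMO_PK_ECC;
    default:
      return algo;
    }
}
```

Callers that still pass an old id then end up at the same spec as callers
using the new one.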

This patch requires some changes to the GnuPG 2.1 code (which has
still not been released).  For example the secret key transfer between
gpg and gpg-agent (gpg --export and gpg --import).  Fortunately this
will also require adding usage flags to the secret key storage of
gpg-agent, which is something we should have done a long time ago.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
ec: Use mpi_mulm instead of mpi_powm.

* mpi/ec.c (ec_pow2): New.
(ec_powm): Remove call to mpi_abs.
(dup_point_weierstrass, dup_point_twistededwards)
(add_points_weierstrass, add_points_twistededwards)
(_gcry_mpi_ec_curve_point): Use ec_pow2.

Signed-off-by: Werner Koch <wk@gnupg.org>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
bufhelp: enable fast unaligned memory accesses on powerpc

* cipher/bufhelp.h [__powerpc__] (BUFHELP_FAST_UNALIGNED_ACCESS): Set
macro enabled.
[__powerpc64__] (BUFHELP_FAST_UNALIGNED_ACCESS): Ditto.
--

PowerPC can handle unaligned memory accesses fast, so enable fast
buffer handling in bufhelp.h.
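
To show what word-wise buffer handling buys, here is a sketch of an XOR
helper that works one machine word at a time.  It uses memcpy for the word
loads and stores so that it stays well defined on every target; this is an
assumption-laden stand-in and not the bufhelp.h code, and the name
demo_buf_xor is made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR LEN bytes of SRC1 and SRC2 into DST.  Full words are handled
   first; on targets with fast unaligned access compilers typically
   turn each memcpy into a single load or store.  */
void
demo_buf_xor (void *dst, const void *src1, const void *src2, size_t len)
{
  unsigned char *d = dst;
  const unsigned char *a = src1;
  const unsigned char *b = src2;
  uintptr_t x, y;

  for (; len >= sizeof x; len -= sizeof x)
    {
      memcpy (&x, a, sizeof x);
      memcpy (&y, b, sizeof y);
      x ^= y;
      memcpy (d, &x, sizeof x);
      d += sizeof x;
      a += sizeof x;
      b += sizeof x;
    }
  /* Handle the remaining tail bytes one by one.  */
  for (; len; len--)
    *d++ = *a++ ^ *b++;
}
```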

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
Remove i386 inline assembly version of rotation functions

* cipher/bithelp.h (rol, ror): Remove i386 version, change
macros to inline functions.
* src/hmac256.c (ror): Ditto.
--

Current compilers can optimize '(x << c) | (x >> (32-c))' to a rotation
instruction. So remove the i386 specific assembly for manually doing this.
Furthermore, the compiler can generate faster code in the case where 'c' is
constant, because it can use rotate with an immediate value rather than
rotate with the %cl register.
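
A portable form of such inline rotation functions might look like the
sketch below.  The masking of the shift count is an addition made here so
that a count of 0 is also well defined in C; the demo_ names are
hypothetical and the functions in cipher/bithelp.h may differ.

```c
#include <stdint.h>

/* Rotate X left by N bits.  Masking keeps both shift counts in the
   range 0..31, so the expression is defined even for N == 0.  */
static inline uint32_t
demo_rol (uint32_t x, unsigned int n)
{
  return (x << (n & 31)) | (x >> ((32 - n) & 31));
}

/* Rotate X right by N bits.  */
static inline uint32_t
demo_ror (uint32_t x, unsigned int n)
{
  return (x >> (n & 31)) | (x << ((32 - n) & 31));
}
```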

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
Optimize and clean up 32-bit and 64-bit endianness transforms

* cipher/bithelp.h (bswap32, bswap64, le_bswap32, be_bswap32)
(le_bswap64, be_bswap64): New.
* cipher/bufhelp.h (buf_get_be32, buf_get_le32, buf_put_le32)
(buf_put_be32, buf_get_be64, buf_get_le64, buf_put_be64)
(buf_put_le64): New.
* cipher/blowfish.c (do_encrypt_block, do_decrypt_block): Use new
endian conversion helpers.
(do_bf_setkey): Turn endian specific code to generic.
* cipher/camellia.c (GETU32, PUTU32): Use new endian conversion
helpers.
* cipher/cast5.c (rol): Remove, use rol from bithelp.
(F1, F2, F3): Fix to use rol from bithelp.
(do_encrypt_block, do_decrypt_block, do_cast_setkey): Use new endian
conversion helpers.
* cipher/des.c (READ_64BIT_DATA, WRITE_64BIT_DATA): Ditto.
* cipher/md4.c (transform, md4_final): Ditto.
* cipher/md5.c (transform, md5_final): Ditto.
* cipher/rmd160.c (transform, rmd160_final): Ditto.
* cipher/salsa20.c (LE_SWAP32, LE_READ_UINT32): Ditto.
* cipher/scrypt.c (READ_UINT64, LE_READ_UINT64, LE_SWAP32): Ditto.
* cipher/seed.c (GETU32, PUTU32): Ditto.
* cipher/serpent.c (byte_swap_32): Remove.
(serpent_key_prepare, serpent_encrypt_internal)
(serpent_decrypt_internal): Use new endian conversion helpers.
* cipher/sha1.c (transform, sha1_final): Ditto.
* cipher/sha256.c (transform, sha256_final): Ditto.
* cipher/sha512.c (__transform, sha512_final): Ditto.
* cipher/stribog.c (transform, stribog_final): Ditto.
* cipher/tiger.c (transform, tiger_final): Ditto.
* cipher/twofish.c (INPACK, OUTUNPACK): Ditto.
* cipher/whirlpool.c (buffer_to_block, block_to_buffer): Ditto.
* configure.ac (gcry_cv_have_builtin_bswap32): Check for compiler
provided __builtin_bswap32.
(gcry_cv_have_builtin_bswap64): Check for compiler provided
__builtin_bswap64.
--

Patch adds helper functions that provide conversions to/from integers and
buffers of different endianness. Benefits are code cleanup and optimization
for architectures that have byte-swapping instructions and/or can do fast
unaligned memory accesses.
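
A byte-wise, fully portable sketch of such helpers is given below.  It
shows only the semantics; the actual helpers are described as using
__builtin_bswap32 and fast unaligned loads where available, which this
sketch leaves out.  The demo_ prefix marks the names as hypothetical.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value.  */
uint32_t
demo_bswap32 (uint32_t x)
{
  return ((x << 24) | ((x & 0x0000ff00u) << 8)
          | ((x >> 8) & 0x0000ff00u) | (x >> 24));
}

/* Read a 32-bit big-endian value from BUF (any alignment).  */
uint32_t
demo_buf_get_be32 (const void *buf)
{
  const unsigned char *p = buf;

  return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
          | ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* Read a 32-bit little-endian value from BUF (any alignment).  */
uint32_t
demo_buf_get_le32 (const void *buf)
{
  const unsigned char *p = buf;

  return (((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16)
          | ((uint32_t)p[1] << 8) | (uint32_t)p[0]);
}

/* Store VAL at BUF in big-endian byte order.  */
void
demo_buf_put_be32 (void *buf, uint32_t val)
{
  unsigned char *p = buf;

  p[0] = (unsigned char)(val >> 24);
  p[1] = (unsigned char)(val >> 16);
  p[2] = (unsigned char)(val >> 8);
  p[3] = (unsigned char)val;
}
```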

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
gostr3411_94: set better burn stack depth estimate

* cipher/gost28147.c (_gcry_gost_enc_one): Account function stack to
burn stack depth.
* cipher/gostr3411-94.c (max): New macro.
(do_hash_step, transform): Return stack burn depth.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
Use hash transform function return type for passing burn stack depth

* cipher/gostr3411-94.c (transform): Return stack burn depth.
* cipher/hash-common.c (_gcry_md_block_write): Use stack burn depth
returned by 'hd->bwrite'.
* cipher/hash-common.h (_gcry_md_block_write_t): Change return type to
'unsigned int'.
(gcry_md_block_ctx_t): Remove 'stack_burn'.
* cipher/md4.c (transform): Return stack burn depth.
(md4_final): Use stack burn depth from transform.
* cipher/md5.c (transform): Return stack burn depth.
(md5_final): Use stack burn depth from transform.
* cipher/rmd160.c (transform): Return stack burn depth.
(rmd160_final): Use stack burn depth from transform.
* cipher/sha1.c (transform): Return stack burn depth.
(sha1_final): Use stack burn depth from transform.
* cipher/sha256.c (transform): Return stack burn depth.
(sha256_final): Use stack burn depth from transform.
* cipher/sha512.c (__transform, transform): Return stack burn depth.
(sha512_final): Use stack burn depth from transform.
* cipher/stribog.c (transform64): Return stack burn depth.
* cipher/tiger.c (transform): Return stack burn depth.
(tiger_final): Use stack burn depth from transform.
--

A transform function might want a different stack burn depth depending on
detected CPU features (as in SHA-512 on ARM with NEON). So return the
stack burn depth from the transform functions as a request or a hint to
the calling function.
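
The calling convention can be sketched as follows: a generic block-write
helper buffers input, calls the transform callback for each full block and
keeps the largest burn depth that was requested.  This is a simplified,
hypothetical model (fixed 64-byte blocks, demo_ names, the depth is handed
back to the caller rather than burned in place) and not the code in
cipher/hash-common.c.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BLOCKSIZE 64

/* Transform one block; the return value is the number of stack bytes
   the caller should wipe afterwards.  */
typedef unsigned int (*demo_transform_t) (void *ctx,
                                          const unsigned char *blk);

typedef struct
{
  unsigned char buf[DEMO_BLOCKSIZE];  /* Pending partial block.  */
  size_t count;                       /* Bytes used in BUF.  */
  unsigned long nblocks;              /* Full blocks processed.  */
  demo_transform_t bwrite;            /* Per-algorithm transform.  */
} demo_block_ctx_t;

/* Example transform that does no work and requests 96 bytes.  */
unsigned int
demo_transform (void *ctx, const unsigned char *blk)
{
  (void)ctx;
  (void)blk;
  return 96;
}

/* Feed INLEN bytes into the context.  Returns the largest burn depth
   requested by any transform call, or 0 if no block was completed.  */
unsigned int
demo_block_write (demo_block_ctx_t *hd, const unsigned char *in,
                  size_t inlen)
{
  unsigned int burn = 0, n;

  while (inlen)
    {
      size_t room = DEMO_BLOCKSIZE - hd->count;
      size_t take = inlen < room ? inlen : room;

      memcpy (hd->buf + hd->count, in, take);
      hd->count += take;
      in += take;
      inlen -= take;

      if (hd->count == DEMO_BLOCKSIZE)
        {
          n = hd->bwrite (hd, hd->buf);
          if (n > burn)
            burn = n;
          hd->count = 0;
          hd->nblocks++;
        }
    }
  return burn;
}
```

Because the depth comes from the transform itself, an implementation that
picks a different code path at run time can report a different value
without the helper knowing about it.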

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
Make STRIBOG use the new _gcry_md_block_write helper

* cipher/stribog.c (STRIBOG_STRUCT): Add 'bctx' and remove 'buf' and
'count'.
(stribog_init_512): Initialize 'bctx'.
(transform64): New function.
(stribog_write): Remove.
(stribog_final): Use _gcry_md_block_write and bctx.
(_gcry_digest_spec_stribog_256, _gcry_digest_spec_stribog_512): Use
_gcry_md_block_write.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Jussi Kivilinna [Sat, 21 Sep 2013 10:54:38 +0000 (13:54 +0300)]
Make SHA-512 use the new _gcry_md_block_write helper

* cipher/hash-common.c (_gcry_md_block_write): Check that hd->buf is
large enough.
* cipher/hash-common.h (MD_BLOCK_MAX_BLOCKSIZE, MD_NBLOCKS_TYPE): New
macros.
(gcry_md_block_ctx_t): Use above macros for 'nblocks' and 'buf'.
* cipher/sha512.c (SHA512_STATE): New struct.
(SHA512_CONTEXT): Add 'bctx' and 'state'.
(sha512_init, sha384_init): Initialize 'bctx'.
(__transform, _gcry_sha512_transform_armv7_neon): Use SHA512_STATE for
'hd'.
(transform): For now, do not return burn stack.
(sha512_write): Remove.
(sha512_final): Use _gcry_md_block_write and bctx.
(_gcry_digest_spec_sha512, _gcry_digest_spec_sha384): Use
_gcry_md_block_write.
--

The patch changes the 'nblocks' counter to 64 bits when SHA-512 is enabled.
This does not cause problems with other algorithms; they already cast
'nblocks' to a u32 variable in their finalization functions. It also moves
the 'buf' member to the head of 'gcry_md_block_ctx_t' to ensure proper
alignment, because some algorithms cast the buffer pointer to (u64*) in the
final endian conversion.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
sexp: Change internal versions to always use gpg_err_code_t.

* src/sexp.c (gcry_sexp_new, gcry_sexp_create, gcry_sexp_build)
(gcry_sexp_build_array, gcry_sexp_canon_len): Change error return type
from gpg_error_t to gpg_err_code_t.  Remove all calls to gpg_error.
* src/visibility.c (gcry_sexp_new, gcry_sexp_create, gcry_sexp_sscan)
(gcry_sexp_build, gcry_sexp_build_array, gcry_sexp_canon_len): Map
error codes via gpg_error.
* cipher/dsa.c, cipher/ecc.c, cipher/elgamal.c, cipher/rsa.c: Remove
use of gpg_err_code wrappers.

--

We should make the same change for all other uses of internal functions.
It just does not make sense to use gpg_error in the internal interface
because the error source is always Libgcrypt.
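
The split between internal and public return types can be modelled like
this: internal functions return a bare error code, and only the public
wrapper combines it with the error source.  The types, constants and bit
layout below are simplified stand-ins chosen for illustration; they are
assumptions and not the libgpg-error definitions.

```c
/* Stand-ins for the two error types.  */
typedef unsigned int demo_err_code_t;   /* Bare code only.  */
typedef unsigned int demo_error_t;      /* Source combined with code.  */

enum { DEMO_ERR_NO_ERROR = 0, DEMO_ERR_INV_ARG = 45 };

#define DEMO_SOURCE_GCRYPT 1u
#define DEMO_SOURCE_SHIFT  24
#define DEMO_CODE_MASK     0xFFFFu

/* Combine SOURCE and CODE; success stays 0 regardless of the source.  */
demo_error_t
demo_make_error (unsigned int source, demo_err_code_t code)
{
  return code ? ((source << DEMO_SOURCE_SHIFT) | (code & DEMO_CODE_MASK)) : 0;
}

/* Internal function: returns only an error code.  */
demo_err_code_t
demo_internal_check (int value)
{
  return value < 0 ? DEMO_ERR_INV_ARG : DEMO_ERR_NO_ERROR;
}

/* Public wrapper: the single place where the source is attached.  */
demo_error_t
demo_public_check (int value)
{
  return demo_make_error (DEMO_SOURCE_GCRYPT, demo_internal_check (value));
}
```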

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
pk: Move s-exp creation for gcry_pk_decrypt to the modules.

* cipher/pubkey.c (sexp_to_enc): Remove RET_MODERN arg and merge it
into FLAGS.
(gcry_pk_decrypt): Move result s-exp building into the modules.
* src/cipher-proto.h (gcry_pk_decrypt_t): Add some args.
* cipher/ecc.c (ecc_decrypt_raw): Change to return an s-exp.
* cipher/elgamal.c (elg_decrypt): Ditto.
* cipher/rsa.c (rsa_decrypt): Ditto.
(rsa_blind, rsa_unblind): Merge into rsa_decrypt.  This saves several
extra MPI allocations.

--

The extra args added to gcry_pk_decrypt_t are a temporary solution
until we also move the input s-exp parsing into the modules.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
pk: Remove unused function.

* cipher/pubkey.c (_gcry_pk_aliased_algo_name): Remove

Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
Beautify debug output of the prime generator.

* cipher/primegen.c: Adjust output of log_mpidump to the recent
log_mpidump code changes.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
pk: Move s-expr creation for genkey to the modules.

* cipher/pubkey.c (pubkey_generate): Fold into gcry_pk_genkey
(gcry_pk_genkey): Move result s-exp creation into the modules.
* cipher/dsa.c (dsa_generate): Create result as s-exp.
* cipher/elgamal.c (elg_generate): Ditto.
* cipher/rsa.c (rsa_generate): Ditto.
* cipher/ecc.c (ecc_generate): Ditto.
* src/cipher-proto.h (pk_ext_generate_t): Remove type
(gcry_pk_spec): and remove from struct.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
tests: Beautify some diagnostics.

* tests/benchmark.c (ecc_bench): Print the key sexp in very verbose
mode.
(main): Add option --pk-count.
* tests/keygen.c: Add Elgamal generation and improved diagnostics.
* tests/t-ed25519.c (check_ed25519): Print running number of tests
done.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
sexp: Improve printing data representing a negative number.

* src/sexp.c (suitable_encoding): Detect a negative number.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
pk: Move RSA encoding functions to a new file.

* cipher/rsa-common.c: New.
* cipher/pubkey.c (pkcs1_encode_for_encryption): Move to rsa-common.c
and rename to _gcry_rsa_pkcs1_encode_for_enc.
(pkcs1_decode_for_encryption): Move to rsa-common.c and rename to
_gcry_rsa_pkcs1_decode_for_enc.
(pkcs1_encode_for_signature): Move to rsa-common.c and rename to
_gcry_rsa_pkcs1_encode_for_sig.
(oaep_encode): Move to rsa-common.c and rename to
_gcry_rsa_oaep_encode.
(oaep_decode): Move to rsa-common.c and rename to
_gcry_rsa_oaep_decode.
(pss_encode): Move to rsa-common.c and rename to _gcry_rsa_pss_encode.
(pss_verify): Move to rsa-common.c and rename to _gcry_rsa_pss_verify.
(octet_string_from_mpi, mgf1): Move to rsa-common.c.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
pk: Move s-expr creation for sign and encrypt to the modules.

* cipher/pubkey.c (pubkey_encrypt): Fold into gcry_pk_encrypt.
(pubkey_decrypt): Fold into gcry_pk_decrypt.
(pubkey_sign): Fold into gcry_pk_sign.
(pubkey_verify): Fold into gcry_pk_verify.
(octet_string_from_mpi): Make it a wrapper and factor code out to ...
* mpi/mpicoder.c (_gcry_mpi_to_octet_string): New function.

* src/cipher.h (PUBKEY_FLAG_FIXEDLEN): New.
* cipher/pubkey.c (sexp_data_to_mpi): Set flag for some encodings.
(gcry_pk_encrypt): Simplify by moving the s-expr generation to the modules.
(gcry_pk_sign): Ditto.
* cipher/dsa.c (dsa_sign): Create s-expr.
* cipher/elgamal.c (elg_encrypt, elg_sign): Ditto.
* cipher/rsa.c (rsa_encrypt, rsa_sign): Ditto.
* cipher/ecc.c (ecc_sign, ecc_encrypt_raw): Ditto.
(ecdsa_names): Add "eddsa".
* tests/t-ed25519.c (one_test): Expect "eddsa" token.

Signed-off-by: Werner Koch <wk@gnupg.org>
Dmitry Eremin-Solenikov [Mon, 16 Sep 2013 02:55:13 +0000 (06:55 +0400)]
Fix Stribog digest on bigendian platforms

* cipher/stribog.c (stribog_final): swap bytes in the result of digest
calculations.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
pk: Simplify the public key dispatcher pubkey.c.

* src/cipher-proto.h (gcry_pk_spec_t): Add fields ALGO and FLAGS.
* cipher/dsa.c (_gcry_pubkey_spec_dsa): Set these fields.
* cipher/ecc.c (_gcry_pubkey_spec_ecdsa): Ditto.
(_gcry_pubkey_spec_ecdh): Ditto.
* cipher/rsa.c (_gcry_pubkey_spec_rsa): Ditto.
* cipher/elgamal.c (_gcry_pubkey_spec_elg): Ditto
(_gcry_pubkey_spec_elg_e): New.
* cipher/pubkey.c: Change most code to replace the former module
system by a simpler system to gain information about the algorithms.
(disable_pubkey_algo): Simplified.  No longer thread-safe, though.

Signed-off-by: Werner Koch <wk@gnupg.org>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
pk: Merge extraspecs struct with standard specs struct.

* src/gcrypt-module.h (gcry_pk_spec_t): Move this typedef and the
corresponding function typedefs to ...
* src/cipher-proto.h: here.
(pk_extra_spec_t): Remove typedef and merge fields into
gcry_pk_spec_t.
* cipher/rsa.c, cipher/dsa.c, cipher/elgamal.c, cipher/ecc.c: Ditto.
* cipher/pubkey.c: Change accordingly.
* src/cipher.h (_gcry_pubkey_extraspec_rsa): Remove.
(_gcry_pubkey_extraspec_dsa): Remove.
(_gcry_pubkey_extraspec_elg): Remove.
(_gcry_pubkey_extraspec_ecdsa): Remove.
--

Now that we don't have loadable modules anymore, we don't need to keep
the internal API between the modules and thus can simplify the code.

Signed-off-by: Werner Koch <wk@gnupg.org>
Jussi Kivilinna [Wed, 18 Sep 2013 14:13:53 +0000 (17:13 +0300)]
Fix encryption/decryption return type for GOST28147

* cipher/gost.h (_gcry_gost_enc_one): Change return type to
'unsigned int'.
* cipher/gost28147.c (max): New macro.
(gost_encrypt_block, gost_decrypt_block): Return burn stack depth.
(_gcry_gost_enc_one): Return burn stack depth from gost_encrypt_block.
--

The return type for block cipher functions was recently changed from 'void'
to 'unsigned int' to pass the burn stack depth to the cipher mode code. This
patch fixes gost28147 to return the stack burn value.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
Rename the GOST algorithm identifiers.

--

Dots and dashes in the names are probably not a good idea.  I also
renamed the identifiers to names which are easier to remember.

Signed-off-by: Werner Koch <wk@gnupg.org>
Dmitry Eremin-Solenikov [Mon, 2 Sep 2013 09:28:52 +0000 (13:28 +0400)]
doc: fix building of ps and pdf documentation

* doc/gcrypt.texi, doc/gpl.texi, doc/lgpl.texi: fix texinfo errors.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Dmitry Eremin-Solenikov [Wed, 18 Sep 2013 12:34:18 +0000 (14:34 +0200)]
Add GOST R 34.11-2012 implementation (Stribog)

* src/gcrypt.h.in (GCRY_MD_GOSTR3411_12_256)
(GCRY_MD_GOSTR3411_12_512): New.
* cipher/stribog.c: New.
* configure.ac (available_digests_64): Add stribog.
* src/cipher.h: Add Stribog declarations.
* cipher/md.c: Register Stribog digest.
* tests/basic.c (check_digests): Add 4 testcases for Stribog from
the standard.
* doc/gcrypt.texi: Document new constants.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Dmitry Eremin-Solenikov [Wed, 18 Sep 2013 12:21:13 +0000 (14:21 +0200)]
Add basic implementation of GOST R 34.11-94 message digest

* src/gcrypt.h.in (GCRY_MD_GOSTR3411_94): New.
* cipher/gostr3411-94.c: New.
* configure.ac (available_digests): Add gostr3411-94.
* src/cipher.h: Add gostr3411-94 definitions.
* cipher/md.c: Register GOST R 34.11-94.
* tests/basic.c (check_digests): Add 4 tests for GOST R 34.11-94
  hash algo. Two are defined in the standard itself; the other two are
  more or less common tests: an empty string and an exclamation mark.
* doc/gcrypt.texi: Add an entry describing GOST R 34.11-94 to the MD
  algorithms table.

--

Add a simple implementation of the GOST R 34.11-94 hash function. Currently
there is no way to specify hash parameters (it always uses the GOST R 34.11-94
test parameters).

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Stack burn value in gost3411_init added by wk.

Dmitry Eremin-Solenikov [Wed, 18 Sep 2013 11:50:35 +0000 (13:50 +0200)]
Separate common md block code

* cipher/hash-common.c (_gcry_md_block_write): New function to handle
block md operations.  The current implementation is limited to 64 byte
buffer and u32 block counter.

* cipher/md4.c, cipher/md5.c, cipher/rmd.h, cipher/rmd160.c,
cipher/sha1.c, cipher/sha256.c, cipher/tiger.c: Convert to use
_gcry_md_block_write.
--

Whirlpool and SHA512 are left as before, as SHA512 uses a 128-byte buffer
and a u64 block counter, and Whirlpool does not have a trivial block
handling structure.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Indentation changes, minor edits and adjustment of
_gcry_sha1_hash_buffers by wk.

Dmitry Eremin-Solenikov [Mon, 2 Sep 2013 09:28:48 +0000 (13:28 +0400)]
Add limited implementation of GOST 28147-89 cipher

* src/gcrypt.h.in (GCRY_CIPHER_GOST28147): New.
* cipher/gost.h, cipher/gost28147.c: New.
* configure.ac (available_ciphers): Add gost28147.
* src/cipher.h: Add gost28147 definitions.
* cipher/cipher.c: Register gost28147.
* tests/basic.c (check_ciphers): Enable simple test for gost28147.
* doc/gcrypt.texi: document GCRY_CIPHER_GOST28147.

--

Add a very basic implementation of the GOST 28147-89 cipher: of the modes
defined in the standard, only ECB and CFB are supported, and the sbox is
limited to the "test variant" as provided in GOST R 34.11-94.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
6 years agoecc: Add Ed25519 key generation and prepare for optimizations.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
ecc: Add Ed25519 key generation and prepare for optimizations.

* src/mpi.h (enum ecc_dialects): New.
* src/ec-context.h (mpi_ec_ctx_s): Add field DIALECT.
* cipher/ecc-common.h (elliptic_curve_t): Ditto.
* cipher/ecc-curves.c (ecc_domain_parms_t): Ditto.
(domain_parms): Add dialect values.
(_gcry_ecc_fill_in_curve): Set dialect.
(_gcry_ecc_get_curve): Ditto.
(_gcry_mpi_ec_new): Ditto.
(_gcry_ecc_get_param): Use ECC_DIALECT_STANDARD for now.
* cipher/ecc-misc.c (_gcry_ecc_curve_copy): Copy dialect.
(_gcry_ecc_dialect2str): New.
* mpi/ec.c (ec_p_init): Add arg DIALECT.
(_gcry_mpi_ec_p_internal_new): Ditto.
(_gcry_mpi_ec_p_new): Ditto.

* mpi/mpiutil.c (gcry_mpi_set_opaque): Set the secure flag.
(_gcry_mpi_set_opaque_copy): New.

* cipher/ecc-misc.c (_gcry_ecc_os2ec): Take care of an opaque MPI.
* cipher/ecc.c (eddsa_generate_key): New.
(generate_key): Rename to nist_generate_key and factor some code out
to ...
(ecc_generate_ext): here.  Divert to eddsa_generate_key if desired.
(eddsa_decodepoint): Take care of an opaque MPI.
(ecc_check_secret_key): Ditto.
(ecc_sign): Ditto.
* cipher/pubkey.c (sexp_elements_extract_ecc): Store public and secret
key as opaque MPIs.
(gcry_pk_genkey): Add the curve_name also to the private key part of
the result.

* tests/benchmark.c (ecc_bench): Support Ed25519.
(main): Add option --debug.
* tests/curves.c (sample_key_2): Make sure that P and N are positive.
* tests/keygen.c (show): New.
(check_ecc_keys): Support Ed25519.
--

There are two main purposes of this patch: Add a key generation
feature for Ed25519 and add the "dialect" thingy which will eventually
be used to add curve specific optimization.

Note that the entire way of how we interface between the public key
modules and pubkey.c is overly complex and probably also the cause for
a lot of performance overhead.  Given that we don't have the loadable
module system anymore, we should entirely get rid of the MPI-array
based internal interface and move parts of the s-expression handling
direct into the pubkey modules.  This needs to be fixed or we are
turning Libgcrypt into another software incarnation of Heathrow
Airport.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agompi: Support printing of negative numbers.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
mpi: Support printing of negative numbers.

* mpi/mpicoder.c (twocompl, onecompl): New.
(gcry_mpi_print): Use it for STD and SSH.
(gcry_mpi_scan): Use it for STD and SSH.  Always set NSCANNED.
(gcry_mpi_aprint): Clear the extra allocated byte.
* tests/t-convert.c (showhex, showmpi): New.
(mpi2bitstr_nlz): New.
(check_formats): New.
(main): Call new test.
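
As an illustration of what a two's complement helper for a big-endian
byte buffer may look like, here is a minimal sketch.  It is an
assumption about the general technique and not the twocompl function
from mpicoder.c; the name twos_complement_be is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Replace the N byte big-endian value at P by its two's complement:
   invert all bits and add one, propagating the carry from the least
   significant (last) byte towards the first byte.  */
static void
twos_complement_be (unsigned char *p, size_t n)
{
  unsigned int carry = 1;
  size_t i;

  assert (p || !n);
  for (i = n; i-- > 0; )
    {
      unsigned int v = (unsigned char)~p[i] + carry;

      p[i] = v & 0xff;
      carry = v >> 8;
    }
}
```

Applied to the two byte magnitude 00 01 this yields ff ff, which is
the two's complement encoding of -1.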

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoFix bug in _gcry_mpi_tdiv_q_2exp.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
Fix bug in _gcry_mpi_tdiv_q_2exp.

* mpi/mpi-internal.h (MPN_COPY_INCR): Make it work.
--

This bug has been with us since the version 0.0.0 of GnuPG.
Fortunately it only affects an optimized code path which is rarely
used in practice: if the shift size matches the size of a
limb (i.e. 32 or 64).  The affected caller is is_prime in primegen.c, where
the Rabin-Miller test may fail with a probability of 2^-31 (that is, if the
prime candidate minus 1 has the low 32 bits cleared).  In practice the
probability is much lower still because we first do a Fermat test on the
randomly generated candidates which sorts out the majority of
composite numbers.
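
To illustrate the code path in question: when the shift count is a
multiple of the limb size, a truncating division by 2^n reduces to
moving limbs towards the low end with an incrementing copy.  The
sketch below assumes a least-significant-limb-first layout; the names
limb_t, COPY_INCR and shift_right_limbs are hypothetical and this is
not the code from mpi-internal.h.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long limb_t;

/* Copy N limbs from S to D in increasing index order.  This order is
   safe for overlapping regions as long as D is below S.  */
#define COPY_INCR(d, s, n)                    \
  do {                                        \
    size_t i_;                                \
    for (i_ = 0; i_ < (n); i_++)              \
      (d)[i_] = (s)[i_];                      \
  } while (0)

/* Shift the NLIMBS limbs at A right by LIMB_SHIFT whole limbs in
   place and return the new number of limbs.  */
static size_t
shift_right_limbs (limb_t *a, size_t nlimbs, size_t limb_shift)
{
  assert (a || !nlimbs);
  if (limb_shift >= nlimbs)
    return 0;
  COPY_INCR (a, a + limb_shift, nlimbs - limb_shift);
  return nlimbs - limb_shift;
}
```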

The bug in MPN_COPY_INCR was found by Sven Bjorn.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoecc: Implement Curve Ed25519 signing and verification.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
ecc: Implement Curve Ed25519 signing and verification.

* cipher/ecc-curves.c (domain_parms): Add curve "Ed25519".
* cipher/ecc.c (reverse_buffer): New.
(eddsa_encodempi): New.
(eddsa_encodepoint): New.
(eddsa_decodepoint): New.
(sign_eddsa): Implement.
(verify_eddsa): Implement.
(ecc_sign): Init unused Q.  Pass public key to sign_eddsa.
(ecc_verify): Init pk.Q if not used.  Pass public key verbatim to
verify_eddsa.
* cipher/pubkey.c (sexp_elements_extract): Add arg OPAQUE.  Change all
callers to pass 0.
(sexp_to_sig): Add arg OPAQUE and pass it to sexp_elements_extract.
(sexp_data_to_mpi): Allow for a zero length "value".
(gcry_pk_verify): Reorder parameter processing.  Pass OPAQUE flag as
required.
* mpi/ec.c (ec_invm): Print a warning if the inverse does not exist.
(_gcry_mpi_ec_get_affine): Implement for our Twisted Edwards curve
model.
(dup_point_twistededwards): Implement.
(add_points_twistededwards): Implement.
(_gcry_mpi_ec_mul_point): Support Twisted Edwards.

* mpi/mpicoder.c (do_get_buffer): Add arg FILL_LE.
(_gcry_mpi_get_buffer): Ditto.  Change all callers.
(_gcry_mpi_get_secure_buffer): Ditto.

* src/sexp.c (_gcry_sexp_nth_opaque_mpi): New.

* tests/t-ed25519.c: New.
* tests/t-ed25519.inp: New.
* tests/t-mpi-point.c (basic_ec_math_simplified): Print some output
only in debug mode.
(twistededwards_math): New test.
(main): Call new test.
--

This is a non optimized version which takes far too long.  On my X220
Thinkpad the 1024 test cases take 14 seconds (12 with --sign-with-pk).
There should be a lot of room for improvements.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agompi: Add internal convenience function.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
mpi: Add internal convenience function.

* mpi/mpiutil.c (_gcry_mpi_get_opaque_copy): New.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agompi: Add debug function to print a point.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
mpi: Add debug function to print a point.

* mpi/ec.c (_gcry_mpi_point_log): New.
* src/mpi.h (log_printpnt): new macro.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agotests: Factor time measurement code out.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
tests: Factor time measurement code out.

* tests/benchmark.c (started_at, stopped_at, start_timer, stop_timer)
(elapsed_time): Factor out to ...
* tests/stopwatch.h: new file.

6 years agoFix _gcry_log_printmpi to print 00 instead of a sole sign.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
Fix _gcry_log_printmpi to print 00 instead of a sole sign.

* src/misc.c: Special case an mpi length of 0.

6 years agoStreamline the use of the internal mpi and hex debug functions.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
Streamline the use of the internal mpi and hex debug functions.

* mpi/mpicoder.c (gcry_mpi_dump): Remove.
(_gcry_log_mpidump): Remove.
* src/misc.c (_gcry_log_printhex): Factor all code out to ...
(do_printhex): new.  Add line wrapping and compact printing.
(_gcry_log_printmpi): New.
* src/mpi.h (log_mpidump): Remove macro.
* src/g10lib.h (log_mpidump): Add compatibility macro.
(log_printmpi): New macro
* src/visibility.c (gcry_mpi_dump): Call _gcry_log_printmpi.
* cipher/primegen.c (prime_generate_internal): Replace gcry_mpi_dump
by log_printmpi.
(gcry_prime_group_generator): Ditto.
* cipher/pubkey.c: Remove extra colons from log_mpidump call.
* cipher/rsa.c (stronger_key_check): Use log_printmpi.
--

The values to debug get longer and longer and the different debug
functions made it hard to check them out. Now MPIs and hex buffers are
printed very similarly.  Lines may now wrap with a backslash as
indicator.  MPIs are distinguished from plain buffers in the output by
always using a sign.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agomd: Add function gcry_md_hash_buffers.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
md: Add function gcry_md_hash_buffers.

* src/gcrypt.h.in (gcry_buffer_t): new.
(gcry_md_hash_buffers): New.
* src/visibility.c, src/visibility.h: Add wrapper for new function.
* src/libgcrypt.def, src/libgcrypt.vers: Export new function.
* cipher/md.c (gcry_md_hash_buffers): New.
* cipher/sha1.c (_gcry_sha1_hash_buffers): New.
* tests/basic.c (check_one_md_multi): New.
(check_digests): Run that test.
* tests/hmac.c (check_hmac_multi): New.
(main): Run that test.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agomd: Fix Whirlpool flaw.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
md: Fix Whirlpool flaw.

* cipher/whirlpool.c (whirlpool_add): Remove shortcut return so that
byte counter is always properly updated.
--

Using the forthcoming gcry_md_hash_buffers() and its test suite, I
found that a message of size 62 won't yield the correct hash if it is
fed into Whirlpool in chunks.  The fix is obvious.  The wrong
code was likely due to using similar structure as SHA-1 but neglecting
that bytes and not blocks are counted.

6 years agomd: Update URL of the Whirlpool specs.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
md: Update URL of the Whirlpool specs.

--

6 years agoFix static build on AMD64
Jussi Kivilinna [Sat, 7 Sep 2013 08:55:19 +0000 (11:55 +0300)]
Fix static build on AMD64

* cipher/rijndael-amd64.S: Correct 'RIP' macro for non-PIC build.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoscrypt: fix for big-endian systems
Jussi Kivilinna [Sat, 7 Sep 2013 08:52:05 +0000 (11:52 +0300)]
scrypt: fix for big-endian systems

* cipher/scrypt.c (_salsa20_core): Fix endianness issues.
--

On big-endian systems 'tests/t-kdf' was failing scrypt tests. Patch fixes the
issue.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoUse gcc "unused" attribute only with gcc >= 3.5.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
Use gcc "unused" attribute only with gcc >= 3.5.

* src/g10lib.h (GCC_ATTR_UNUSED): Fix gcc version detection.
--

Reported-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoAdd support for Salsa20/12 - 12 round version of Salsa20
Dmitry Eremin-Solenikov [Thu, 5 Sep 2013 09:42:11 +0000 (13:42 +0400)]
Add support for Salsa20/12 - 12 round version of Salsa20

* src/gcrypt.h.in (GCRY_CIPHER_SALSA20R12): New.
* src/salsa20.c (salsa20_core, salsa20_do_encrypt_stream): Add support
for reduced round versions.
  (salsa20r12_encrypt_stream, _gcry_cipher_spec_salsa20r12): Implement
Salsa20/12 - a 12 round version of Salsa20 selected by eStream.
* src/cipher.h: Declare Salsa20/12 definition.
* cipher/cipher.c: Register Salsa20/12.
* tests/basic.c (check_stream_cipher, check_stream_cipher_large_block):
Populate Salsa20/12 tests with test vectors from ecrypt.
(check_ciphers): Add simple test for Salsa20/12.

--
Salsa20/12 is a reduced round version of Salsa20 that is among the ciphers
selected by eSTREAM for Phase 3 of Profile 1 algorithm. Moreover, it is
one of the proposed ciphers for TLS (draft-josefsson-salsa20-tls-02).
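
For illustration, a Salsa20 core with a configurable number of rounds
may be sketched as below.  The quarter round and the column/row
schedule follow the Salsa20 specification; the function names and the
ROUNDS parameter are hypothetical and this is not the code from
salsa20.c.  With ROUNDS set to 20 this is the regular core, with 12
it is the Salsa20/12 variant.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t
rotl32 (uint32_t x, unsigned int n)
{
  return (x << n) | (x >> (32 - n));
}

/* The Salsa20 quarter round on four words.  */
static void
quarterround (uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
  *b ^= rotl32 (*a + *d, 7);
  *c ^= rotl32 (*b + *a, 9);
  *d ^= rotl32 (*c + *b, 13);
  *a ^= rotl32 (*d + *c, 18);
}

/* Run ROUNDS rounds (an even number) over the 16 word state IN and
   store the result, with the input added back, in OUT.  */
static void
salsa20_core_sketch (uint32_t out[16], const uint32_t in[16],
                     unsigned int rounds)
{
  uint32_t x[16];
  unsigned int i;

  assert (rounds > 0 && rounds % 2 == 0);
  for (i = 0; i < 16; i++)
    x[i] = in[i];

  for (i = 0; i < rounds; i += 2)
    {
      /* Column round.  */
      quarterround (&x[0], &x[4], &x[8], &x[12]);
      quarterround (&x[5], &x[9], &x[13], &x[1]);
      quarterround (&x[10], &x[14], &x[2], &x[6]);
      quarterround (&x[15], &x[3], &x[7], &x[11]);
      /* Row round.  */
      quarterround (&x[0], &x[1], &x[2], &x[3]);
      quarterround (&x[5], &x[6], &x[7], &x[4]);
      quarterround (&x[10], &x[11], &x[8], &x[9]);
      quarterround (&x[15], &x[12], &x[13], &x[14]);
    }

  for (i = 0; i < 16; i++)
    out[i] = x[i] + in[i];
}
```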

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
6 years agoAdd configure option --disable-amd64-as-feature-detection.
Werner Koch [Sat, 7 Sep 2013 07:50:44 +0000 (09:50 +0200)]
Add configure option --disable-amd64-as-feature-detection.

* configure.ac: Implement new disable flag.
--

Doing a static build of Libgcrypt currently throws an as error on my
box.  Adding this configure option as a workaround.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agompi: Improve support for non-Weierstrass support.
Werner Koch [Sat, 7 Sep 2013 08:06:46 +0000 (10:06 +0200)]
mpi: Improve support for non-Weierstrass support.

* mpi/ec.c (ec_p_init): Add args MODEL and P.  Change all callers.
(_gcry_mpi_ec_p_internal_new): Ditto.
(_gcry_mpi_ec_p_new): Ditto.
* cipher/ecc-curves.c (_gcry_ecc_fill_in_curve): Return
GPG_ERR_UNKNOWN_CURVE instead of invalid value.  Init curve model.
* cipher/ecc.c (ecc_verify, ecc_encrypt_raw): Ditto.
* cipher/pubkey.c (sexp_data_to_mpi): Fix EDDSA flag error checking.
--

(fixes commit c26be7a337d0bf98193bc58e043209e46d0769bb)

6 years agompi: Add gcry_mpi_ec_curve_point.
Werner Koch [Fri, 6 Sep 2013 18:07:07 +0000 (20:07 +0200)]
mpi: Add gcry_mpi_ec_curve_point.

* mpi/ec.c (_gcry_mpi_ec_curve_point): New.
(ec_powm): Return the absolute value.
* src/visibility.c, src/visibility.h: Add wrappers.
* src/libgcrypt.def, src/libgcrypt.vers: Export them.

6 years agompi: Add functions to manipulate the sign.
Werner Koch [Fri, 6 Sep 2013 17:58:50 +0000 (19:58 +0200)]
mpi: Add functions to manipulate the sign.

* src/gcrypt.h.in (gcry_mpi_is_neg): New.
(gcry_mpi_neg, gcry_mpi_abs): New.
* mpi/mpiutil.c (_gcry_mpi_is_neg): New.
(_gcry_mpi_neg, _gcry_mpi_abs): New.
* src/visibility.c, src/visibility.h: Add wrappers.
* src/libgcrypt.def, src/libgcrypt.vers: Export them.
* src/mpi.h (mpi_is_neg): New.  Rename old macro to mpi_has_sign.
* mpi/mpi-mod.c (_gcry_mpi_mod_barrett): Use mpi_has_sign.
* mpi/mpi-mpow.c (calc_barrett): Ditto.
* cipher/primegen.c (_gcry_derive_x931_prime): Ditto
* cipher/rsa.c (secret): Ditto.

6 years agoTune armv6 mpi assembly
Jussi Kivilinna [Fri, 6 Sep 2013 08:11:37 +0000 (11:11 +0300)]
Tune armv6 mpi assembly

* mpi/armv6/mpih-mul1.S: Tune assembly for Cortex-A8.
* mpi/armv6/mpih-mul2.S: Ditto.
* mpi/armv6/mpih-mul3.S: Ditto.
--

Little bit of tuning of assembly functions with help of Cortex-A8 profiler.

Old (armhf/Cortex-A8 1 GHz):
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         350ms    2230ms        50ms
RSA 2048 bit        3500ms   11890ms       150ms
RSA 3072 bit       23900ms   32540ms       280ms
RSA 4096 bit       15750ms   69420ms       450ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -     990ms       930ms
DSA 2048/224             -    3840ms      3400ms
DSA 3072/256             -    8280ms      7620ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         60ms    1760ms      3300ms
ECDSA 224 bit         80ms    2240ms      4300ms
ECDSA 256 bit        110ms    2740ms      5420ms
ECDSA 384 bit        230ms    5680ms     11300ms
ECDSA 521 bit        540ms   13590ms     26890ms

New:
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         350ms    2190ms        60ms
RSA 2048 bit        8910ms   11800ms       150ms
RSA 3072 bit       11000ms   31810ms       270ms
RSA 4096 bit       50290ms   68690ms       450ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -     980ms       920ms
DSA 2048/224             -    3780ms      3370ms
DSA 3072/256             -    8100ms      7060ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         70ms    1730ms      3200ms
ECDSA 224 bit         90ms    2180ms      4220ms
ECDSA 256 bit        110ms    2660ms      5200ms
ECDSA 384 bit        220ms    5660ms     10910ms
ECDSA 521 bit        530ms   13420ms     26000ms

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoChange _gcry_burn_stack take burn depth as unsigned integer
Jussi Kivilinna [Thu, 5 Sep 2013 06:34:25 +0000 (09:34 +0300)]
Change _gcry_burn_stack take burn depth as unsigned integer

* src/misc.c (_gcry_burn_stack): Change to handle 'unsigned int' bytes.
--

Unsigned integer is better here for code generation because we can now avoid
possible branching caused by (bytes <= 0) check.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agompicalc: fix building on linux and win32
Jussi Kivilinna [Thu, 5 Sep 2013 06:46:29 +0000 (09:46 +0300)]
mpicalc: fix building on linux and win32

* src/Makefile.am (mpicalc): Adjust CFLAGS and LDADD.
--

Building libgcrypt is now failing on Ubuntu 13.04 machine. Patch changes src/Makefile.am for 'mpicalc' to correct this issue.

$ make distclean; ./configure --enable-maintainer-mode; make
...
libtool: link: gcc -g -O2 -fvisibility=hidden -Wall -Wcast-align -Wshadow -Wstrict-prototypes -Wformat -Wno-format-y2k -Wformat-security -W -Wextra -Wbad-function-cast -Wwrite-strings -Wdeclaration-after-statement -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -o .libs/mpicalc mpicalc-mpicalc.o  ../src/.libs/libgcrypt.so
/usr/bin/ld: mpicalc-mpicalc.o: undefined reference to symbol 'gpg_strerror'
/usr/bin/ld: note: 'gpg_strerror' is defined in DSO /lib/x86_64-linux-gnu/libgpg-error.so.0 so try adding it to the linker command line
/lib/x86_64-linux-gnu/libgpg-error.so.0: could not read symbols: Invalid operation

With win32 target, gpg-error.h is not found.

$ make distclean; ./autogen.sh --build-w32; make
...
i686-w64-mingw32-gcc -DHAVE_CONFIG_H -I. -I..     -g -O2 -Wall -Wcast-align -Wshadow -Wstrict-prototypes -Wformat -Wno-format-y2k -Wformat-security -W -Wextra -Wbad-function-cast -Wwrite-strings -Wdeclaration-after-statement -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -MT mpicalc-mpicalc.o -MD -MP -MF .deps/mpicalc-mpicalc.Tpo -c -o mpicalc-mpicalc.o `test -f 'mpicalc.c' || echo './'`mpicalc.c
In file included from mpicalc.c:36:0:
gcrypt.h:32:23: fatal error: gpg-error.h: No such file or directory

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoChange mpicalc to use Libgcrypt and install it.
Werner Koch [Wed, 4 Sep 2013 15:51:30 +0000 (17:51 +0200)]
Change mpicalc to use Libgcrypt and install it.

* src/mpicalc.c: Make use of gcry_ functions.
(MPICALC_VERSION): New.  Set to 2.0.
(strusage): Remove.
(scan_mpi): New.  Replaces mpi_fromstr.
(print_mpi): New.  Replaces mpi_print.
(my_getc): New.
(print_help): New.
(main): Use simple option parser and print version info.
* src/Makefile.am (bin_PROGRAMS): Add mpicalc.
(mpicalc_SOURCES, mpicalc_CFLAGS, mpicalc_LDADD): New.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoRe-indent mpicalc.c and change license.
Werner Koch [Wed, 4 Sep 2013 14:17:11 +0000 (16:17 +0200)]
Re-indent mpicalc.c and change license.

--

Changed license to LGPLv2.1+.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoAdd mpicalc.c to help with testing.
Werner Koch [Wed, 4 Sep 2013 13:37:01 +0000 (15:37 +0200)]
Add mpicalc.c to help with testing.

* src/mpicalc.c: Take from GnuPG 1.4
--

Taken from GnuPG commit 45efde9557661ea071a01bcb938f1591ed4ec1a3

6 years agoPrepare support for EdDSA.
Werner Koch [Wed, 4 Sep 2013 09:20:57 +0000 (11:20 +0200)]
Prepare support for EdDSA.

* src/cipher.h (PUBKEY_FLAG_EDDSA): New.
* cipher/pubkey.c (pubkey_verify): Replace args CMP and OPAQUEV by
CTX.  Pass flags and hash algo to the verify function.  Change all
verify functions to accept these args.
(sexp_data_to_mpi): Implement new flag "eddsa".
(gcry_pk_verify): Pass CTX instead of the compare function to
pubkey_verify.
* cipher/ecc.c (sign): Rename to sign_ecdsa.  Change all callers.
(verify): Rename to verify_ecdsa.  Change all callers.
(sign_eddsa, verify_eddsa): New stub functions.
(ecc_sign): Divert to sign_ecdsa or sign_eddsa.
(ecc_verify): Divert to verify_ecdsa or verify_eddsa.

6 years agoPrepare support for non-Weierstrass EC equations.
Werner Koch [Tue, 3 Sep 2013 10:01:15 +0000 (12:01 +0200)]
Prepare support for non-Weierstrass EC equations.

* src/mpi.h (gcry_mpi_ec_models): New.
* src/ec-context.h (mpi_ec_ctx_s): Add MODEL.
* cipher/ecc-common.h (elliptic_curve_t): Ditto.
* cipher/ecc-curves.c (ecc_domain_parms_t): Ditto.
(domain_parms): Mark all as Weierstrass.
(_gcry_ecc_fill_in_curve): Check model.
(_gcry_ecc_get_curve): Set model to Weierstrass.
* cipher/ecc-misc.c (_gcry_ecc_model2str): New.
* cipher/ecc.c (generate_key, ecc_generate_ext): Print model in the
debug output.

* mpi/ec.c (_gcry_mpi_ec_dup_point): Switch depending on model.
Factor code out to ...
(dup_point_weierstrass): new.
(dup_point_montgomery, dup_point_twistededwards): New stub functions.
(_gcry_mpi_ec_add_points): Switch depending on model.  Factor code out
to ...
(add_points_weierstrass): new.
(add_points_montgomery, add_points_twistededwards): New stub
functions.

* tests/Makefile.am (TESTS): Reorder tests.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agompi: Suppress newer gcc warnings.
Werner Koch [Fri, 30 Aug 2013 15:56:35 +0000 (17:56 +0200)]
mpi: Suppress newer gcc warnings.

* src/g10lib.h (GCC_ATTR_UNUSED): Define for gcc >= 3.5.
* mpi/mpih-div.c (_gcry_mpih_mod_1, _gcry_mpih_divmod_1): Mark dummy
as unused.
* mpi/mpi-internal.h (UDIV_QRNND_PREINV): Mark _ql as unused.
--

Due to the use of macros and longlong.h, we use variables which are
only used by some architectures.  At least gcc 4.7.2 prints new
warnings about set but not used variables.  This patch silences them.
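
A version-gated attribute macro of this kind may be sketched as
follows.  The macro name ATTR_UNUSED_SKETCH and the helper divmod10
are hypothetical; only the gcc >= 3.5 threshold is taken from the
ChangeLog entry above.

```c
#include <assert.h>

/* Expand to the "unused" attribute on gcc 3.5 or newer and to nothing
   on older or other compilers.  */
#if defined(__GNUC__) \
    && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 5))
# define ATTR_UNUSED_SKETCH __attribute__ ((unused))
#else
# define ATTR_UNUSED_SKETCH
#endif

/* Divide N by 10, store the remainder at REMAINDER and return the
   quotient.  DUMMY is set but never read, which is the situation
   that triggers the warning without the attribute.  */
static unsigned int
divmod10 (unsigned int n, unsigned int *remainder)
{
  unsigned int dummy ATTR_UNUSED_SKETCH;

  assert (remainder);
  dummy = n;
  *remainder = n % 10;
  return n / 10;
}
```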

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoDo not check with cpp for typedefed constants.
Werner Koch [Fri, 30 Aug 2013 15:52:17 +0000 (17:52 +0200)]
Do not check with cpp for typedefed constants.

* src/gcrypt-int.h: Include error code replacements depending on the
version of libgpg-error.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoMake _gcry_burn_stack use variable length array
Jussi Kivilinna [Wed, 4 Sep 2013 07:00:45 +0000 (10:00 +0300)]
Make _gcry_burn_stack use variable length array

* configure.ac (HAVE_VLA): Add check.
* src/misc.c (_gcry_burn_stack) [HAVE_VLA]: Add VLA code.
--

Some gcc versions convert _gcry_burn_stack into a loop that overwrites the same
64-byte stack buffer instead of burning the stack deeper. It is argued at GCC
bugzilla that _gcry_burn_stack is doing the wrong thing here [1] and that this
kind of optimization is allowed.

So let's fix _gcry_burn_stack by using a variable length array when VLAs are
supported by the compiler. This should ensure proper stack burning to the
requested depth and avoid GCC loop optimizations.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52285
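
A VLA based stack burn along these lines may be sketched as follows.
This is a minimal illustration under the assumption that the compiler
supports variable length arrays; the names wipe and burn_stack_sketch
are hypothetical and this is not the code from src/misc.c.  The
function returns the number of bytes wiped only to make the sketch
easy to check.

```c
#include <assert.h>
#include <stddef.h>

/* Zero N bytes at P through a volatile pointer so that the stores
   are not optimized away.  */
static void
wipe (volatile unsigned char *p, size_t n)
{
  while (n--)
    *p++ = 0;
}

/* Overwrite at least BYTES bytes of stack below the caller, rounded
   up to a multiple of 64 (and at least 64).  */
static unsigned int
burn_stack_sketch (unsigned int bytes)
{
  unsigned int buflen = ((bytes ? bytes : 1) + 63u) & ~63u;

  assert (buflen >= 64);  /* Catches wrap-around for huge requests.  */
  {
    /* A single array of the requested depth instead of a loop or a
       recursion over a fixed 64 byte buffer.  */
    volatile unsigned char buf[buflen];

    wipe (buf, sizeof buf);
  }
  return buflen;
}
```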

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoMove stack burning from block ciphers to cipher modes
Jussi Kivilinna [Wed, 4 Sep 2013 07:00:45 +0000 (10:00 +0300)]
Move stack burning from block ciphers to cipher modes

* src/gcrypt-module.h (gcry_cipher_encrypt_t)
(gcry_cipher_decrypt_t): Return 'unsigned int'.
* cipher/cipher.c (dummy_encrypt_block, dummy_decrypt_block): Return
zero.
(do_ecb_encrypt, do_ecb_decrypt): Get largest stack burn depth from
block cipher crypt function and burn stack at end.
* cipher/cipher-aeswrap.c (_gcry_cipher_aeswrap_encrypt)
(_gcry_cipher_aeswrap_decrypt): Ditto.
* cipher/cipher-cbc.c (_gcry_cipher_cbc_encrypt)
(_gcry_cipher_cbc_decrypt): Ditto.
* cipher/cipher-cfb.c (_gcry_cipher_cfb_encrypt)
(_gcry_cipher_cfb_decrypt): Ditto.
* cipher/cipher-ctr.c (_gcry_cipher_ctr_encrypt): Ditto.
* cipher/cipher-ofb.c (_gcry_cipher_ofb_encrypt)
(_gcry_cipher_ofb_decrypt): Ditto.
* cipher/blowfish.c (encrypt_block, decrypt_block): Return burn stack
depth.
* cipher/camellia-glue.c (camellia_encrypt, camellia_decrypt): Ditto.
* cipher/cast5.c (encrypt_block, decrypt_block): Ditto.
* cipher/des.c (do_tripledes_encrypt, do_tripledes_decrypt)
(do_des_encrypt, do_des_decrypt): Ditto.
* cipher/idea.c (idea_encrypt, idea_decrypt): Ditto.
* cipher/rijndael.c (rijndael_encrypt, rijndael_decrypt): Ditto.
* cipher/seed.c (seed_encrypt, seed_decrypt): Ditto.
* cipher/serpent.c (serpent_encrypt, serpent_decrypt): Ditto.
* cipher/twofish.c (twofish_encrypt, twofish_decrypt): Ditto.
* cipher/rfc2268.c (encrypt_block, decrypt_block): New.
(_gcry_cipher_spec_rfc2268_40): Use encrypt_block and decrypt_block.
--

The patch moves stack burning out of the block ciphers and the cipher mode
loops to the end of the cipher mode functions. This greatly reduces the
overall CPU usage of the problematic _gcry_burn_stack. The internal cipher
module API is changed so that encrypt/decrypt functions now return the stack
burn depth as unsigned int to the cipher mode function.

(Note, patch also adds missing burn_stack for RFC2268_40 cipher).
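
The changed calling convention can be sketched as follows: the block
function reports how much stack it may have dirtied, and the mode
function remembers the largest value and burns once after the loop.
All names here (block_fn_t, toy_encrypt_block, burn_stack_stub,
ecb_encrypt_sketch) are hypothetical, the toy cipher is just an XOR
with the key, and the burn function is a stub that only records its
argument.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLOCKSIZE 8

/* A block function returns the stack depth it wants burned.  */
typedef unsigned int (*block_fn_t) (void *ctx, unsigned char *out,
                                    const unsigned char *in);

static unsigned int burn_calls;   /* How often the stub was called.  */
static unsigned int burned_depth; /* Depth passed to the last call.  */

/* Stub standing in for the real stack burning function.  */
static void
burn_stack_stub (unsigned int bytes)
{
  burn_calls++;
  burned_depth = bytes;
}

/* Toy block "cipher": XOR the block with the key stored at CTX.  */
static unsigned int
toy_encrypt_block (void *ctx, unsigned char *out, const unsigned char *in)
{
  const unsigned char *key = ctx;
  size_t i;

  for (i = 0; i < TOY_BLOCKSIZE; i++)
    out[i] = in[i] ^ key[i];
  return 32;  /* Pretended stack usage in bytes.  */
}

/* ECB style loop: collect the largest burn depth, burn once at end.  */
static void
ecb_encrypt_sketch (block_fn_t fn, void *ctx, unsigned char *out,
                    const unsigned char *in, size_t nblocks)
{
  unsigned int burn = 0;
  unsigned int nburn;
  size_t n;

  assert (fn);
  for (n = 0; n < nblocks; n++)
    {
      nburn = fn (ctx, out, in);
      burn = nburn > burn ? nburn : burn;
      in += TOY_BLOCKSIZE;
      out += TOY_BLOCKSIZE;
    }

  if (burn > 0)
    burn_stack_stub (burn);
}
```

With this shape the cost of burning is paid once per mode call rather
than once per block.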

_gcry_burn_stack CPU time (looping tests/benchmark cipher blowfish):

arch     CPU             Old     New
i386     Intel-Haswell   4.1%    0.16%
x86_64   Intel-Haswell   3.4%    0.07%
armhf    Cortex-A8       8.7%    0.14%

New vs. old (armhf/Cortex-A8):
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
IDEA          1.05x   1.05x   1.04x   1.04x   1.04x   1.04x   1.07x   1.05x   1.04x   1.04x
3DES          1.04x   1.03x   1.04x   1.03x   1.04x   1.04x   1.04x   1.04x   1.04x   1.04x
CAST5         1.19x   1.20x   1.15x   1.00x   1.17x   1.00x   1.15x   1.05x   1.00x   1.00x
BLOWFISH      1.21x   1.22x   1.16x   1.00x   1.18x   1.00x   1.16x   1.16x   1.00x   1.00x
AES           1.09x   1.09x   1.00x   1.00x   1.00x   1.00x   1.07x   1.07x   1.00x   1.00x
AES192        1.11x   1.11x   1.00x   1.00x   1.00x   1.00x   1.08x   1.09x   1.01x   1.00x
AES256        1.07x   1.08x   1.01x   .99x    1.00x   1.00x   1.07x   1.06x   1.00x   1.00x
TWOFISH       1.10x   1.09x   1.09x   1.00x   1.09x   1.00x   1.08x   1.09x   1.00x   1.00x
ARCFOUR       1.00x   1.00x
DES           1.07x   1.11x   1.06x   1.08x   1.07x   1.07x   1.06x   1.06x   1.06x   1.06x
TWOFISH128    1.10x   1.10x   1.09x   1.00x   1.09x   1.00x   1.08x   1.08x   1.00x   1.00x
SERPENT128    1.06x   1.07x   1.02x   1.00x   1.06x   1.00x   1.06x   1.05x   1.00x   1.00x
SERPENT192    1.07x   1.06x   1.03x   1.00x   1.06x   1.00x   1.06x   1.05x   1.00x   1.00x
SERPENT256    1.06x   1.07x   1.02x   1.00x   1.06x   1.00x   1.05x   1.06x   1.00x   1.00x
RFC2268_40    0.97x   1.01x   0.99x   0.98x   1.00x   0.97x   0.96x   0.96x   0.97x   0.97x
SEED          1.45x   1.54x   1.53x   1.56x   1.50x   1.51x   1.50x   1.50x   1.42x   1.42x
CAMELLIA128   1.08x   1.07x   1.06x   1.00x   1.07x   1.00x   1.06x   1.06x   1.00x   1.00x
CAMELLIA192   1.08x   1.08x   1.08x   1.00x   1.07x   1.00x   1.07x   1.07x   1.00x   1.00x
CAMELLIA256   1.08x   1.09x   1.07x   1.01x   1.08x   1.00x   1.07x   1.07x   1.00x   1.00x
SALSA20        .99x   1.00x

Raw data:

New (armhf/Cortex-A8):
Running each test 100 times.
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
IDEA          8620ms  8680ms  9640ms 10010ms  9140ms  8960ms  9630ms  9660ms  9180ms  9180ms
3DES         13990ms 14000ms 14780ms 15300ms 14320ms 14370ms 14780ms 14780ms 14480ms 14480ms
CAST5         2980ms  2980ms  3780ms  2300ms  3290ms  2320ms  3770ms  4100ms  2320ms  2320ms
BLOWFISH      2740ms  2660ms  3530ms  2060ms  3050ms  2080ms  3530ms  3530ms  2070ms  2070ms
AES           2200ms  2330ms  2330ms  2450ms  2270ms  2270ms  2700ms  2690ms  2330ms  2320ms
AES192        2550ms  2670ms  2700ms  2910ms  2630ms  2640ms  3060ms  3060ms  2680ms  2690ms
AES256        2920ms  3010ms  3040ms  3190ms  3010ms  3000ms  3380ms  3420ms  3050ms  3050ms
TWOFISH       2790ms  2840ms  3300ms  2950ms  3010ms  2870ms  3310ms  3280ms  2940ms  2940ms
ARCFOUR       2050ms  2050ms
DES           5640ms  5630ms  6440ms  6970ms  5960ms  6000ms  6440ms  6440ms  6120ms  6120ms
TWOFISH128    2790ms  2840ms  3300ms  2950ms  3010ms  2890ms  3310ms  3290ms  2930ms  2930ms
SERPENT128    4530ms  4340ms  5210ms  4470ms  4740ms  4620ms  5020ms  5030ms  4680ms  4680ms
SERPENT192    4510ms  4340ms  5190ms  4460ms  4750ms  4620ms  5020ms  5030ms  4680ms  4680ms
SERPENT256    4540ms  4330ms  5220ms  4460ms  4730ms  4600ms  5030ms  5020ms  4680ms  4680ms
RFC2268_40   10530ms  7790ms 11140ms  9490ms 10650ms 10710ms 11710ms 11690ms 11000ms 11000ms
SEED          4530ms  4540ms  5050ms  5380ms  4760ms  4810ms  5060ms  5060ms  4850ms  4860ms
CAMELLIA128   2660ms  2630ms  3170ms  2750ms  2880ms  2740ms  3170ms  3170ms  2780ms  2780ms
CAMELLIA192   3430ms  3400ms  3930ms  3530ms  3650ms  3500ms  3940ms  3940ms  3570ms  3560ms
CAMELLIA256   3430ms  3390ms  3940ms  3500ms  3650ms  3510ms  3930ms  3940ms  3550ms  3550ms
SALSA20       1910ms  1900ms

Old (armhf/Cortex-A8):
Running each test 100 times.
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
IDEA          9030ms  9100ms 10050ms 10410ms  9540ms  9360ms 10350ms 10190ms  9560ms  9570ms
3DES         14580ms 14460ms 15300ms 15720ms 14880ms 14900ms 15350ms 15330ms 15030ms 15020ms
CAST5         3560ms  3570ms  4350ms  2300ms  3860ms  2330ms  4340ms  4320ms  2330ms  2320ms
BLOWFISH      3320ms  3250ms  4110ms  2060ms  3610ms  2080ms  4100ms  4090ms  2070ms  2070ms
AES           2390ms  2530ms  2320ms  2460ms  2280ms  2270ms  2890ms  2880ms  2330ms  2330ms
AES192        2830ms  2970ms  2690ms  2900ms  2630ms  2650ms  3320ms  3330ms  2700ms  2690ms
AES256        3110ms  3250ms  3060ms  3170ms  3000ms  3000ms  3610ms  3610ms  3050ms  3060ms
TWOFISH       3080ms  3100ms  3600ms  2940ms  3290ms  2880ms  3560ms  3570ms  2940ms  2930ms
ARCFOUR       2060ms  2050ms
DES           6060ms  6230ms  6850ms  7540ms  6380ms  6400ms  6830ms  6840ms  6500ms  6510ms
TWOFISH128    3060ms  3110ms  3600ms  2940ms  3290ms  2890ms  3560ms  3560ms  2940ms  2930ms
SERPENT128    4820ms  4630ms  5330ms  4460ms  5030ms  4620ms  5300ms  5300ms  4680ms  4680ms
SERPENT192    4830ms  4620ms  5320ms  4460ms  5040ms  4620ms  5300ms  5300ms  4680ms  4680ms
SERPENT256    4820ms  4640ms  5330ms  4460ms  5030ms  4620ms  5300ms  5300ms  4680ms  4660ms
RFC2268_40   10260ms  7850ms 11080ms  9270ms 10620ms 10380ms 11250ms 11230ms 10690ms 10710ms
SEED          6580ms  6990ms  7710ms  8370ms  7140ms  7240ms  7600ms  7610ms  6870ms  6900ms
CAMELLIA128   2860ms  2820ms  3360ms  2750ms  3080ms  2740ms  3350ms  3360ms  2790ms  2790ms
CAMELLIA192   3710ms  3680ms  4240ms  3520ms  3910ms  3510ms  4200ms  4210ms  3560ms  3560ms
CAMELLIA256   3700ms  3680ms  4230ms  3520ms  3930ms  3510ms  4200ms  4210ms  3550ms  3560ms
SALSA20       1900ms  1900ms

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocamellia-aesni-avx2-amd64: Move register clearing to assembly functions
Jussi Kivilinna [Sun, 1 Sep 2013 13:50:55 +0000 (16:50 +0300)]
camellia-aesni-avx2-amd64: Move register clearing to assembly functions

* cipher/camellia-aesni-avx2-amd64.S
(_gcry_camellia_aesni_avx2_ctr_enc): Add 'vzeroall'.
(_gcry_camellia_aesni_avx2_cbc_dec)
(_gcry_camellia_aesni_avx2_cfb_dec): Add 'vzeroupper' at head and
'vzeroall' at tail.
* cipher/camellia-glue.c (_gcry_serpent_ctr_enc, _gcry_serpent_cbc_dec)
(_gcry_serpent_avx2_cfb_dec) [USE_AESNI_AVX2]: Remove register
clearing.
--

Patch moves register clearing with 'vzeroall' to assembly functions and
adds missing 'vzeroupper' instructions at head of assembly functions.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocamellia-aesni-avx-amd64: Move register clearing to assembly functions
Jussi Kivilinna [Sun, 1 Sep 2013 13:50:55 +0000 (16:50 +0300)]
camellia-aesni-avx-amd64: Move register clearing to assembly functions

* cipher/camellia-aesni-avx-amd64.S (_gcry_camellia_aesni_avx_ctr_enc)
(_gcry_camellia_aesni_avx_cbc_dec)
(_gcry_camellia_aesni_avx_cfb_dec): Add 'vzeroupper' at head and
'vzeroall' at tail.
* cipher/camellia-glue.c (_gcry_serpent_ctr_enc, _gcry_serpent_cbc_dec)
(_gcry_serpent_avx2_cfb_dec) [USE_AESNI_AVX]: Remove register clearing.
--

Patch moves register clearing with 'vzeroall' to assembly functions and
adds missing 'vzeroupper' instructions at head of assembly functions.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoserpent-avx2-amd64: Move register clearing to assembly
Jussi Kivilinna [Sun, 1 Sep 2013 13:50:55 +0000 (16:50 +0300)]
serpent-avx2-amd64: Move register clearing to assembly

* cipher/serpent-avx2-amd64.S (_gcry_serpent_avx2_ctr_enc)
(_gcry_serpent_avx2_cbc_dec, _gcry_serpent_avx2_cfb_dec): Change last
'vzeroupper' to 'vzeroall'.
* cipher/serpent.c (_gcry_serpent_ctr_enc, _gcry_serpent_cbc_dec)
(_gcry_serpent_avx2_cfb_dec) [USE_AVX2]: Remove register clearing with
'vzeroall'.
--

The AVX2 implementation was already clearing the upper halves of the YMM
registers at the end of assembly functions to prevent the long SSE<->AVX
transition stalls present on Intel CPUs. The patch changes these 'vzeroupper'
instructions to 'vzeroall' to fully clear the YMM registers. After this change,
register clearing in serpent.c is not needed.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoFix building for x32 target
Jussi Kivilinna [Sun, 1 Sep 2013 13:46:32 +0000 (16:46 +0300)]
Fix building for x32 target

* mpi/amd64/mpi-asm-defs.h: New file.
* random/rndhw.c (poll_padlock) [__x86_64__]: Also check if __LP64__ is
defined.
[USE_DRNG, __x86_64__]: Also check if __LP64__ is defined.
--

In short, x32 is a new x86-64 ABI with 32-bit pointers. Adding support is
straightforward: a small fix for mpi and fixes for random/rndhw.c. The AMD64
assembly functions appear to work fine with x32, and 'make check' passes.
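
The extra __LP64__ check can be illustrated with a minimal sketch. The helper
names below (is_lp64_x86_64, abi_name) are hypothetical and not part of
libgcrypt; the sketch only assumes that x32 compilers define __x86_64__
without defining __LP64__:

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 only for the classic 64-bit x86-64 ABI.
   On x32, __x86_64__ is defined but __LP64__ is not, so this returns 0
   and code that assumes 64-bit pointers/longs stays disabled. */
static int is_lp64_x86_64(void)
{
#if defined(__x86_64__) && defined(__LP64__)
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: a short label for the ABI this file was built for. */
static const char *abi_name(void)
{
#if defined(__x86_64__) && defined(__LP64__)
  return "x86-64 (LP64)";
#elif defined(__x86_64__)
  return "x32 (32-bit pointers on x86-64)";
#else
  return "other";
#endif
}
```

Checking both macros, rather than __x86_64__ alone, is what keeps code written
for 64-bit pointers out of an x32 build.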

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agosha512: add ARM/NEON assembly version of transform function
Jussi Kivilinna [Sat, 31 Aug 2013 09:48:31 +0000 (12:48 +0300)]
sha512: add ARM/NEON assembly version of transform function

* cipher/Makefile.am: Add 'sha512-armv7-neon.S'.
* cipher/sha512-armv7-neon.S: New file.
* cipher/sha512.c (USE_ARM_NEON_ASM): New macro.
(SHA512_CONTEXT) [USE_ARM_NEON_ASM]: Add 'use_neon'.
(sha512_init, sha384_init) [USE_ARM_NEON_ASM]: Enable 'use_neon' if
the CPU supports NEON instructions.
(k): Round constant array moved outside of 'transform' function.
(__transform): Renamed from 'transform' function.
[USE_ARM_NEON_ASM] (_gcry_sha512_transform_armv7_neon): New prototype.
(transform): New wrapper function for different transform versions.
(sha512_write, sha512_final): Burn stack by the amount returned by
transform function.
* configure.ac (sha512) [neonsupport]: Add 'sha512-armv7-neon.lo'.
--

Add NEON assembly for the transform function for faster SHA512 on ARM. The
major speed-up is thanks to the 64-bit integer registers and the large register
file, which can hold the full input buffer.

Benchmark results on Cortex-A8, 1 GHz:

Old:
$ tests/benchmark --hash-repetitions 100 md sha512 sha384
SHA512       17050ms 18780ms 29120ms 18040ms 17190ms
SHA384       17130ms 18720ms 29160ms 18090ms 17280ms

New:
$ tests/benchmark --hash-repetitions 100 md sha512 sha384
SHA512        3600ms  5070ms 15330ms  4510ms  3480ms
SHA384        3590ms  5060ms 15350ms  4510ms  3520ms

New vs old:
SHA512        4.74x   3.70x   1.90x   4.00x   4.94x
SHA384        4.77x   3.70x   1.90x   4.01x   4.91x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agosha512: reduce stack use in transform function by 512 bytes
Jussi Kivilinna [Sat, 31 Aug 2013 09:48:30 +0000 (12:48 +0300)]
sha512: reduce stack use in transform function by 512 bytes

* cipher/sha512.c (transform): Change 'u64 w[80]' to 'u64 w[16]' and
inline input expansion to first 64 rounds.
(sha512_write, sha512_final): Reduce burn_stack depth by 512 bytes.
--

The input expansion to the w[] array can be inlined with the rounds, and the
size of the array reduced from u64[80] to u64[16]. On Cortex-A8, this change
gives a small boost, possibly thanks to the reduced burn_stack depth.

New vs old (tests/benchmark md sha512 sha384):
SHA512 1.09x 1.11x 1.06x 1.09x 1.08x
SHA384 1.09x 1.11x 1.06x 1.09x 1.09x
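
The idea of the smaller w[] array can be sketched as follows. This is an
illustration only, not the code from cipher/sha512.c: the function names are
hypothetical, and only the message schedule is shown (the round function
itself is omitted). The 16-word buffer is used as a ring; during the first 64
rounds, each round replaces the word it just consumed with the word needed 16
rounds later.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t u64;

static u64 ror64(u64 x, unsigned n) { return (x >> n) | (x << (64 - n)); }
/* SHA-512 small sigma functions used by the message schedule. */
static u64 s0(u64 x) { return ror64(x, 1) ^ ror64(x, 8) ^ (x >> 7); }
static u64 s1(u64 x) { return ror64(x, 19) ^ ror64(x, 61) ^ (x >> 6); }

/* Reference: expand the 16 input words into the full 80-word schedule. */
static void schedule_full(const u64 in[16], u64 w[80])
{
  int t;

  for (t = 0; t < 16; t++)
    w[t] = in[t];
  for (t = 16; t < 80; t++)
    w[t] = s1(w[t - 2]) + w[t - 7] + s0(w[t - 15]) + w[t - 16];
}

/* Ring-buffer variant: must be called with t = 0, 1, ..., 79 in order.
   Returns W[t]; in the first 64 rounds it also overwrites the slot of
   W[t] with W[t+16], so only 16 words are ever live. */
static u64 schedule_next(u64 w[16], int t)
{
  u64 cur = w[t & 15];

  if (t < 64)
    w[t & 15] = s1(w[(t + 14) & 15]) + w[(t + 9) & 15]
                + s0(w[(t + 1) & 15]) + cur;
  return cur;
}

/* Returns 1 if both variants produce the same 80 schedule words. */
static int schedule_selfcheck(void)
{
  u64 in[16], full[80], ring[16];
  int t;

  for (t = 0; t < 16; t++)
    in[t] = 0x0123456789abcdefULL * (u64)(t + 1);
  schedule_full(in, full);
  memcpy(ring, in, sizeof ring);
  for (t = 0; t < 80; t++)
    if (schedule_next(ring, t) != full[t])
      return 0;
  return 1;
}
```

The ring variant needs 128 bytes for the schedule instead of 640, which matches
the 512-byte reduction named in the subject line.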

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoAdd ARM HW feature detection module and add NEON detection
Jussi Kivilinna [Sat, 31 Aug 2013 09:48:30 +0000 (12:48 +0300)]
Add ARM HW feature detection module and add NEON detection

* configure.ac: Add option --disable-neon-support.
(HAVE_GCC_INLINE_ASM_NEON): New.
(ENABLE_NEON_SUPPORT): New.
[arm]: Add 'hwf-arm.lo' as HW feature module.
* src/Makefile.am: Add 'hwf-arm.c'.
* src/g10lib.h (HWF_ARM_NEON): New macro.
* src/global.c (hwflist): Add HWF_ARM_NEON entry.
* src/hwf-arm.c: New file.
* src/hwf-common.h (_gcry_hwf_detect_arm): New prototype.
* src/hwfeatures.c (_gcry_detect_hw_features) [HAVE_CPU_ARCH_ARM]: Add
call to _gcry_hwf_detect_arm.
--

Add a HW detection module for detecting the ARM NEON instruction set. ARM does
not have a cpuid instruction, so we have to rely on the OS to pass feature set
information to user space. On Linux, NEON support can be detected by parsing
'/proc/self/auxv' for hardware capabilities information. On other OSes, NEON
can be detected by checking whether the platform/compiler only supports
NEON-capable CPUs (that is, whether the __ARM_NEON__ macro is defined).
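
A minimal sketch of this detection scheme is shown below. It is not the code
from src/hwf-arm.c; the function names and the SKETCH_* constants are
hypothetical. The constants assume the values used by Linux on 32-bit ARM
(AT_HWCAP is 16, the NEON capability is bit 12 of the HWCAP word), and the
auxiliary vector is assumed to be a sequence of (type, value) pairs of native
words terminated by a type of 0.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed values, mirroring Linux headers for 32-bit ARM. */
#define SKETCH_AT_NULL     0UL
#define SKETCH_AT_HWCAP    16UL
#define SKETCH_HWCAP_NEON  (1UL << 12)

/* Scan NPAIRS (type, value) pairs for the HWCAP entry.
   Returns 1 and stores the value in *HWCAP if found, else returns 0. */
static int find_hwcap(const unsigned long *words, size_t npairs,
                      unsigned long *hwcap)
{
  size_t i;

  for (i = 0; i < npairs; i++)
    {
      unsigned long type = words[2 * i];
      unsigned long value = words[2 * i + 1];

      if (type == SKETCH_AT_NULL)
        break;                      /* end of the auxiliary vector */
      if (type == SKETCH_AT_HWCAP)
        {
          *hwcap = value;
          return 1;
        }
    }
  return 0;
}

/* Returns 1 if NEON appears to be available, 0 otherwise. */
static int detect_neon(void)
{
#if defined(__arm__) && defined(__linux__)
  unsigned long words[2 * 64];
  unsigned long hwcap = 0;
  FILE *f = fopen("/proc/self/auxv", "rb");

  if (f)
    {
      /* Each record is one (type, value) pair. */
      size_t got = fread(words, 2 * sizeof words[0], 64, f);

      fclose(f);
      if (find_hwcap(words, got, &hwcap))
        return (hwcap & SKETCH_HWCAP_NEON) != 0;
    }
#endif
  /* Fallback: the compiler only targets NEON-capable CPUs. */
#if defined(__ARM_NEON__)
  return 1;
#else
  return 0;
#endif
}
```

Keeping the parsing in a function that takes a buffer makes it possible to
exercise it with synthetic data on any platform.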

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoCorrect mpi_cpu_arch for ARMv6
Jussi Kivilinna [Sat, 31 Aug 2013 09:48:30 +0000 (12:48 +0300)]
Correct mpi_cpu_arch for ARMv6

* mpi/config.links [armv6]: Set mpi_cpu_arch to "arm", instead of
"armv6".
--

Without this change, HAVE_CPU_ARCH_ARM stays undefined.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agompi: Make gcry_mpi_print work with negative zeroes.
Werner Koch [Fri, 30 Aug 2013 15:04:21 +0000 (17:04 +0200)]
mpi: Make gcry_mpi_print work with negative zeroes.

* mpi/mpicoder.c (gcry_mpi_print): Take care of negative zero.
(gcry_mpi_aprint): Allocate at least 1 byte.
* tests/t-convert.c: New.
* tests/Makefile.am (TESTS): Add t-convert.
--

Reported-by: Christian Fuchs
Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoRefactor the ECC code into 3 files.
Werner Koch [Thu, 29 Aug 2013 19:37:30 +0000 (21:37 +0200)]
Refactor the ECC code into 3 files.

* cipher/ecc-common.h, cipher/ecc-curves.c, cipher/ecc-misc.c: New.
* cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add new files.
* configure.ac (GCRYPT_PUBKEY_CIPHERS): Add new .c files.
* cipher/ecc.c (curve_aliases, ecc_domain_parms_t, domain_parms)
(scanval): Move to ecc-curves.c.
(fill_in_curve): Move to ecc-curves.c as _gcry_ecc_fill_in_curve.
(ecc_get_curve): Move to ecc-curves.c as _gcry_ecc_get_curve.
(_gcry_mpi_ec_ec2os): Move to ecc-misc.c.
(ec2os): Move to ecc-misc.c as _gcry_ecc_ec2os.
(os2ec): Move to ecc-misc.c as _gcry_ecc_os2ec.
(point_set): Move as inline function to ecc-common.h.
(_gcry_ecc_curve_free): Move to ecc-misc.c as _gcry_ecc_curve_free.
(_gcry_ecc_curve_copy): Move to ecc-misc.c as _gcry_ecc_curve_copy.
(mpi_from_keyparam, point_from_keyparam): Move to ecc-curves.c.
(_gcry_mpi_ec_new): Move to ecc-curves.c.
(ecc_get_param): Move to ecc-curves.c as _gcry_ecc_get_param.
(ecc_get_param_sexp): Move to ecc-curves.c as _gcry_ecc_get_param_sexp.

Signed-off-by: Werner Koch <wk@gnupg.org>
6 years agoserpent-sse2-amd64: Move register clearing to assembly functions
Jussi Kivilinna [Thu, 22 Aug 2013 12:26:52 +0000 (15:26 +0300)]
serpent-sse2-amd64: Move register clearing to assembly functions

* cipher/serpent-sse2-amd64.S (_gcry_serpent_sse2_ctr_enc)
(_gcry_serpent_sse2_cbc_dec, _gcry_serpent_sse2_cfb_dec): Clear used
XMM registers.
* cipher/serpent.c (_gcry_serpent_ctr_enc, _gcry_serpent_cbc_dec)
(_gcry_serpent_cfb_dec) [USE_SSE2]: Remove XMM register clearing from
bulk functions.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agotwofish-amd64: do not make __twofish_dec_blk3 global
Jussi Kivilinna [Thu, 22 Aug 2013 12:26:52 +0000 (15:26 +0300)]
twofish-amd64: do not make __twofish_dec_blk3 global

* cipher/twofish-amd64.S (__twofish_dec_blk3): Do not export symbol as
global.
(__twofish_dec_blk3): Mark symbol as function.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agompi: add ARMv6 assembly
Jussi Kivilinna [Sat, 17 Aug 2013 10:41:03 +0000 (13:41 +0300)]
mpi: add ARMv6 assembly

* mpi/armv6/mpi-asm-defs.h: New.
* mpi/armv6/mpih-add1.S: New.
* mpi/armv6/mpih-mul1.S: New.
* mpi/armv6/mpih-mul2.S: New.
* mpi/armv6/mpih-mul3.S: New.
* mpi/armv6/mpih-sub1.S: New.
* mpi/config.links [arm]: Enable ARMv6 assembly.
--

Add mpi assembly for ARMv6 (or later). These are partly based on ARM assembly
found in GMP 4.2.1.

Old vs new (Cortex-A8, 1 GHz):

Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit        1.14x     1.10x       1.13x
ECDSA 224 bit        1.11x     1.12x       1.12x
ECDSA 256 bit        1.20x     1.13x       1.14x
ECDSA 384 bit        1.13x     1.21x       1.21x
ECDSA 521 bit        1.17x     1.20x       1.22x
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit             -     1.31x       1.60x
RSA 2048 bit             -     1.41x       1.47x
RSA 3072 bit             -     1.50x       1.63x
RSA 4096 bit             -     1.50x       1.57x
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -     1.39x       1.38x
DSA 2048/224             -     1.50x       1.51x
DSA 3072/256             -     1.59x       1.64x

NEW:

Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         70ms    1750ms      3170ms
ECDSA 224 bit         90ms    2210ms      4250ms
ECDSA 256 bit        100ms    2710ms      5170ms
ECDSA 384 bit        230ms    5670ms     11040ms
ECDSA 521 bit        540ms   13370ms     25870ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         360ms    2200ms        50ms
RSA 2048 bit        2770ms   11900ms       150ms
RSA 3072 bit        6680ms   32530ms       270ms
RSA 4096 bit       10320ms   69440ms       460ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -     990ms       910ms
DSA 2048/224             -    3830ms      3410ms
DSA 3072/256             -    8270ms      7030ms

OLD:

Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         80ms    1920ms      3580ms
ECDSA 224 bit        100ms    2470ms      4760ms
ECDSA 256 bit        120ms    3050ms      5870ms
ECDSA 384 bit        260ms    6840ms     13330ms
ECDSA 521 bit        630ms   16080ms     31500ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         450ms    2890ms        80ms
RSA 2048 bit        2320ms   16760ms       220ms
RSA 3072 bit       26300ms   48650ms       440ms
RSA 4096 bit       15700ms  103910ms       720ms
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -    1380ms      1260ms
DSA 2048/224             -    5740ms      5140ms
DSA 3072/256             -   13130ms     11510ms

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoMove ARMv6 detection to configure.ac
Jussi Kivilinna [Sat, 17 Aug 2013 10:41:46 +0000 (13:41 +0300)]
Move ARMv6 detection to configure.ac

* cipher/blowfish-armv6.S: Replace __ARM_ARCH >= 6 checks with
HAVE_ARM_ARCH_V6.
* cipher/blowfish.c: Ditto.
* cipher/camellia-armv6.S: Ditto.
* cipher/camellia.h: Ditto.
* cipher/cast5-armv6.S: Ditto.
* cipher/cast5.c: Ditto.
* cipher/rijndael-armv6.S: Ditto.
* cipher/rijndael.c: Ditto.
* configure.ac: Add HAVE_ARM_ARCH_V6 check.
--

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoAdd optimized wipememory for ARM
Jussi Kivilinna [Sat, 17 Aug 2013 07:09:33 +0000 (10:09 +0300)]
Add optimized wipememory for ARM

* src/g10lib.h [__arm__] (fast_wipememory2_unaligned_head)
(fast_wipememory2): New macros.
--

The previous patch, which removed the _gcry_burn_stack optimization, causes
burn_stack to take over 30% of CPU time when looping 'benchmark cipher
blowfish' on ARM/Cortex-A8. Optimizing wipememory2 for ARM helps the situation
a lot.

Old vs new (Cortex-A8):
                  ECB/Stream         CBC             CFB             OFB             CTR
               --------------- --------------- --------------- --------------- ---------------
IDEA            1.20x   1.18x   1.16x   1.15x   1.16x   1.18x   1.18x   1.16x   1.16x   1.17x
3DES            1.14x   1.14x   1.12x   1.13x   1.12x   1.13x   1.12x   1.13x   1.13x   1.15x
CAST5           1.66x   1.67x   1.43x   1.00x   1.48x   1.00x   1.44x   1.44x   1.04x   0.96x
BLOWFISH        1.56x   1.66x   1.47x   1.00x   1.54x   1.05x   1.44x   1.47x   1.00x   1.00x
AES             1.52x   1.42x   1.04x   1.00x   1.00x   1.00x   1.38x   1.37x   1.00x   1.00x
AES192          1.36x   1.36x   1.00x   1.00x   1.00x   1.04x   1.26x   1.22x   1.00x   1.04x
AES256          1.32x   1.31x   1.03x   1.00x   1.00x   1.00x   1.24x   1.30x   1.03x   0.97x
TWOFISH         1.31x   1.26x   1.23x   1.00x   1.25x   1.00x   1.24x   1.23x   1.00x   1.03x
ARCFOUR         1.05x   0.96x
DES             1.31x   1.33x   1.26x   1.29x   1.28x   1.29x   1.26x   1.29x   1.27x   1.29x
TWOFISH128      1.27x   1.24x   1.23x   1.00x   1.28x   1.00x   1.21x   1.26x   0.97x   1.06x
SERPENT128      1.19x   1.19x   1.15x   1.00x   1.14x   1.00x   1.17x   1.17x   0.98x   1.00x
SERPENT192      1.19x   1.24x   1.17x   1.00x   1.14x   1.00x   1.15x   1.17x   1.00x   1.00x
SERPENT256      1.16x   1.19x   1.17x   1.00x   1.14x   1.00x   1.15x   1.15x   1.00x   1.00x
RFC2268_40      1.00x   0.99x   1.00x   1.01x   1.00x   1.00x   1.03x   1.00x   1.01x   1.00x
SEED            1.20x   1.20x   1.18x   1.17x   1.17x   1.19x   1.18x   1.16x   1.19x   1.19x
CAMELLIA128     1.38x   1.34x   1.31x   1.00x   1.31x   1.00x   1.29x   1.32x   1.00x   1.00x
CAMELLIA192     1.27x   1.27x   1.23x   1.00x   1.25x   1.03x   1.20x   1.23x   1.00x   1.00x
CAMELLIA256     1.27x   1.27x   1.26x   1.00x   1.25x   1.03x   1.20x   1.23x   1.00x   1.00x
SALSA20         1.04x   1.00x

(Note: bulk encryption/decryption functions call burn_stack once after the
full buffer has been processed, instead of after each block.)
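
The general shape of a word-at-a-time wipe can be sketched as below. This is
not the macro pair added to src/g10lib.h; wipe_sketch and wipe_selfcheck are
hypothetical names, and the sketch assumes that aligned word stores through a
volatile pointer are acceptable on the target. The unaligned head and the tail
are written byte by byte, the aligned middle part with full words.

```c
#include <stddef.h>
#include <stdint.h>

/* Word type used for the bulk stores.  With GCC-compatible compilers the
   may_alias attribute avoids strict-aliasing problems; elsewhere this
   sketch simply assumes such stores are acceptable. */
#if defined(__GNUC__)
typedef unsigned long __attribute__((__may_alias__)) wipe_word_t;
#else
typedef unsigned long wipe_word_t;
#endif

/* Overwrite LEN bytes at PTR with the byte SET. */
static void wipe_sketch(void *ptr, unsigned char set, size_t len)
{
  volatile unsigned char *p = ptr;
  wipe_word_t fill = 0;
  size_t i;

  /* Replicate the fill byte into every byte of a word. */
  for (i = 0; i < sizeof fill; i++)
    fill = (fill << 8) | set;

  /* Head: byte stores until the pointer is word aligned. */
  while (len > 0 && ((uintptr_t)p % sizeof(wipe_word_t)) != 0)
    {
      *p++ = set;
      len--;
    }
  /* Body: aligned word stores. */
  while (len >= sizeof(wipe_word_t))
    {
      *(volatile wipe_word_t *)p = fill;
      p += sizeof(wipe_word_t);
      len -= sizeof(wipe_word_t);
    }
  /* Tail: remaining bytes. */
  while (len > 0)
    {
      *p++ = set;
      len--;
    }
}

/* Returns 1 if exactly the requested range was overwritten. */
static int wipe_selfcheck(void)
{
  unsigned char buf[67];
  size_t i;

  for (i = 0; i < sizeof buf; i++)
    buf[i] = 0xAA;
  wipe_sketch(buf + 3, 0x00, 50);
  for (i = 0; i < sizeof buf; i++)
    {
      unsigned char want = (i >= 3 && i < 53) ? 0x00 : 0xAA;

      if (buf[i] != want)
        return 0;
    }
  return 1;
}
```

The volatile qualifier is what keeps the compiler from dropping the stores as
dead writes; the word-sized body is what reduces the store count.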

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocipher: bufhelp: allow unaligned memory accesses on ARM
Jussi Kivilinna [Fri, 16 Aug 2013 16:44:55 +0000 (19:44 +0300)]
cipher: bufhelp: allow unaligned memory accesses on ARM

* cipher/bufhelp.h [__arm__ && __ARM_FEATURE_UNALIGNED]: Enable
BUFHELP_FAST_UNALIGNED_ACCESS.
--

Newer ARM systems support unaligned memory accesses, and GCC 4.7 and later
indicate this by defining the __ARM_FEATURE_UNALIGNED macro.
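
How such a macro typically gates a fast path can be sketched as follows. The
names SKETCH_FAST_UNALIGNED_ACCESS and xor_sketch are hypothetical and this is
not the code in cipher/bufhelp.h; the sketch uses memcpy for the word accesses
as a portable stand-in for direct unaligned loads and stores, and it only
shows the ARM condition named above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Enable the word-wise path only where unaligned accesses are known to
   be supported (other architectures are omitted from this sketch). */
#if defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)
# define SKETCH_FAST_UNALIGNED_ACCESS 1
#endif

/* dst[i] = a[i] ^ b[i] for LEN bytes; the buffers may be unaligned. */
static void xor_sketch(unsigned char *dst, const unsigned char *a,
                       const unsigned char *b, size_t len)
{
#ifdef SKETCH_FAST_UNALIGNED_ACCESS
  /* Fast path: process 32 bits at a time regardless of alignment. */
  while (len >= sizeof(uint32_t))
    {
      uint32_t x, y;

      memcpy(&x, a, sizeof x);
      memcpy(&y, b, sizeof y);
      x ^= y;
      memcpy(dst, &x, sizeof x);
      dst += sizeof x;
      a += sizeof x;
      b += sizeof x;
      len -= sizeof x;
    }
#endif
  /* Generic path, also used for the remaining tail bytes. */
  for (; len > 0; len--)
    *dst++ = *a++ ^ *b++;
}
```

Both paths give the same result; the macro only decides whether the word-wise
loop is compiled in.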

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoRemove burn_stack optimization
Jussi Kivilinna [Sat, 17 Aug 2013 07:48:36 +0000 (10:48 +0300)]
Remove burn_stack optimization

* src/misc.c (_gcry_burn_stack): Remove SIZEOF_UNSIGNED_LONG == 4 or 8
optimization.
--

At least GCC 4.6 on Debian Wheezy (armhf) generates wrong code for burn_stack:
the recursive structure is transformed into an iterative one without updating
the stack pointer between iterations, so only the first 64 bytes of stack get
zeroed. This appears to be fixed in GCC 4.7, but let's play it safe and
remove this optimization.

A better approach would probably be to add architecture-specific assembly
routine(s) that replace this generic function.
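
The recursive structure referred to above can be sketched like this. It is an
illustration under assumptions, not the source of src/misc.c: the names are
hypothetical, and unlike a real burn_stack function the sketch returns the
number of frames it wiped so that the behaviour can be checked.

```c
#include <stddef.h>

/* Byte-wise wipe through a volatile pointer so that the stores are not
   optimized away. */
static void wipe_bytes(void *ptr, size_t len)
{
  volatile unsigned char *p = ptr;

  while (len-- > 0)
    *p++ = 0;
}

/* Wipe at least BYTES bytes of stack by recursing with a 64-byte local
   buffer per frame.  Returns the number of frames that were wiped. */
static int burn_stack_sketch(int bytes)
{
  char buf[64];

  wipe_bytes(buf, sizeof buf);
  bytes -= (int)sizeof buf;
  if (bytes > 0)
    return 1 + burn_stack_sketch(bytes);  /* each call gets a new frame */
  return 1;
}
```

The scheme depends on every recursion level owning a distinct 64-byte buffer.
If a compiler turns the recursion into a loop that reuses one frame, only the
first 64 bytes are cleared, which is the failure described above.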

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocamellia: add ARMv6 assembly implementation
Jussi Kivilinna [Fri, 16 Aug 2013 11:40:34 +0000 (14:40 +0300)]
camellia: add ARMv6 assembly implementation

* cipher/Makefile.am: Add 'camellia-armv6.S'.
* cipher/camellia-armv6.S: New file.
* cipher/camellia-glue.c [USE_ARMV6_ASM]
(_gcry_camellia_armv6_encrypt_block)
(_gcry_camellia_armv6_decrypt_block): New prototypes.
[USE_ARMV6_ASM] (Camellia_EncryptBlock, Camellia_DecryptBlock)
(camellia_encrypt, camellia_decrypt): New functions.
* cipher/camellia.c [!USE_ARMV6_ASM]: Compile encryption and decryption
routines if USE_ARMV6_ASM macro is _not_ defined.
* cipher/camellia.h (USE_ARMV6_ASM): New macro.
[!USE_ARMV6_ASM] (Camellia_EncryptBlock, Camellia_DecryptBlock): If
USE_ARMV6_ASM is defined, disable these function prototypes.
* configure.ac (camellia) [arm]: Add 'camellia-armv6.lo'.
--

Add an optimized ARMv6 assembly implementation for Camellia. The implementation
is tuned for Cortex-A8. Unaligned access handling is done in the assembly part.

For now, only enable this on little-endian systems, as big-endian correctness
has not been tested yet.

Old vs new (Cortex-A8, Debian Wheezy/armhf):
                ECB/Stream         CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAMELLIA128   1.44x   1.47x   1.35x   1.34x   1.43x   1.39x   1.38x   1.36x   1.38x   1.39x
CAMELLIA192   1.60x   1.62x   1.52x   1.47x   1.56x   1.54x   1.52x   1.53x   1.52x   1.53x
CAMELLIA256   1.59x   1.60x   1.49x   1.47x   1.53x   1.54x   1.51x   1.50x   1.52x   1.53x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agoblowfish: add ARMv6 assembly implementation
Jussi Kivilinna [Fri, 16 Aug 2013 09:51:52 +0000 (12:51 +0300)]
blowfish: add ARMv6 assembly implementation

* cipher/Makefile.am: Add 'blowfish-armv6.S'.
* cipher/blowfish-armv6.S: New file.
* cipher/blowfish.c (USE_ARMV6_ASM): New macro.
[USE_ARMV6_ASM] (_gcry_blowfish_armv6_do_encrypt)
(_gcry_blowfish_armv6_encrypt_block)
(_gcry_blowfish_armv6_decrypt_block, _gcry_blowfish_armv6_ctr_enc)
(_gcry_blowfish_armv6_cbc_dec, _gcry_blowfish_armv6_cfb_dec): New
prototypes.
[USE_ARMV6_ASM] (do_encrypt, do_encrypt_block, do_decrypt_block)
(encrypt_block, decrypt_block): New functions.
(_gcry_blowfish_ctr_enc) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_blowfish_cbc_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_blowfish_cfb_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
* configure.ac (blowfish) [arm]: Add 'blowfish-armv6.lo'.
--

The patch provides non-parallel implementations for a small speed-up and 2-way
parallel implementations that get accelerated on multi-issue CPUs (hand-tuned
for the in-order dual-issue Cortex-A8). Unaligned access handling is done in
assembly.

For now, only enable this on little-endian systems, as big-endian correctness
has not been tested yet.

Old vs new (Cortex-A8, Debian Wheezy/armhf):

             ECB/Stream         CBC             CFB             OFB             CTR
          --------------- --------------- --------------- --------------- ---------------
BLOWFISH   1.28x   1.16x   1.21x   2.16x   1.26x   1.86x   1.21x   1.25x   1.89x   1.96x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
6 years agocast5: add ARMv6 assembly implementation
Jussi Kivilinna [Wed, 14 Aug 2013 18:06:15 +0000 (21:06 +0300)]
cast5: add ARMv6 assembly implementation

* cipher/Makefile.am: Add 'cast5-armv6.S'.
* cipher/cast5-armv6.S: New file.
* cipher/cast5.c (USE_ARMV6_ASM): New macro.
(CAST5_context) [USE_ARMV6_ASM]: New members 'Kr_arm_enc' and
'Kr_arm_dec'.
[USE_ARMV6_ASM] (_gcry_cast5_armv6_encrypt_block)
(_gcry_cast5_armv6_decrypt_block, _gcry_cast5_armv6_ctr_enc)
(_gcry_cast5_armv6_cbc_dec, _gcry_cast5_armv6_cfb_dec): New prototypes.
[USE_ARMV6_ASM] (do_encrypt_block, do_decrypt_block, encrypt_block)
(decrypt_block): New functions.
(_gcry_cast5_ctr_enc) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_cast5_cbc_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_cast5_cfb_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(do_cast_setkey) [USE_ARMV6_ASM]: Initialize 'Kr_arm_enc' and
'Kr_arm_dec'.
* configure.ac (cast5) [arm]: Add 'cast5-armv6.lo'.
--

Provides non-parallel implementations for a small speed-up and 2-way parallel
implementations that get accelerated on multi-issue CPUs (hand-tuned for the
in-order dual-issue Cortex-A8). Unaligned access handling is done in assembly.

For now, only enable this on little-endian systems, as big-endian correctness
has not been tested yet.

Old vs new (Cortex-A8, Debian Wheezy/armhf):

          ECB/Stream         CBC             CFB             OFB             CTR
       --------------- --------------- --------------- --------------- ---------------
CAST5   1.15x   1.12x   1.12x   2.07x   1.14x   1.60x   1.12x   1.13x   1.62x   1.63x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>