struct of_device_id is not implicitly included anymore. Include
<linux/mod_devicetable.h> to fix compilation on Linux 5.15.
Also upstream commit a24d22b225ce15 ("crypto: sha - split sha.h into
sha1.h and sha2.h") from Linux 5.11 moves functionality from sha.h to
sha1.h.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
ARC4 was used for WEP, which is not secure anymore. Therefor it is
disabled in the driver, but the code is not removed for now.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
The lantiq AES hardware does not support the gcm algorithm. But it
can be implemented in the driver as a combination of the aes_ctr
algorithm and the xor plus gfmul operations for the hashing.
Due to the wrapping of the several algorithms and the inefficient
16 byte block by 16 byte block invokation in the kernel
implementations, this driver is about 3 times faster for the larger
block sizes.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
After adding xts and cbcmac the aes algorithm source had three sections
for setting the aes key to the hardware which are identical.
Method aes_set_key_hw was created which is now called from within the
spinlock secured control sections in methods ifx_deu_aes, ifx_deu_aes_xts
and aes_cbcmac_final_impl and reduces the size of ifxmips_aes.c.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
Since commit 53b6783 hostapd is using the kernel api which includes the
cbcmac-aes shash algorithm. The kernels implementation is a wrapper around
the aes encryption algorithm, which encrypts block (16 bytes) by block.
When the ltq-deu driver is present, it uses hardware aes, but every 16 byte
encrypt requires setting the key. This is very inefficient and is a huge
overhead. Since the cbcmac-aes is simply a hash that uses the cbc aes
algorithm starting with an iv set to x'00' with an optional ecb aes
encryption of a possible last incomplete block that is padded with the
positional bytes of the last cbc encrypted block, this algorithm is now
added to the driver. Most of the code is derived from md5-hmac and
tailored for aes. Tested with the kernels crypto testmgr including extra
tests against the kernels generic ccm module implementation.
This patch also fixes the overallocation in the aes_ctx that is caused
by using u32 instead of u8 for the aes keys.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
The lantiq AES hardware does not support the xts algorithm. Apart
from the cipher text stealing (XTS), the AES XTS implementation is
just an XOR with the IV, followed by AES ECB, followed by another
XOR with the IV and as such can be also implemented by using the
lantiq hardware's CBC AES implemention plus one additional XOR with
the IV in the driver. The output IV by CBC AES is also not usable
and the gfmul operation not supported by lantiq hardware. Both need
to be done in the driver too in addition to the IV treatment which is
the initial encryption by the other half of the input key and to
set the IV to the IV registers for every block.
In the generic kernel implementation, the block size for XTS is set
to 16 bytes, although the algorithm is designed to process any size
of input larger than 16 bytes. But since there is no way to
indicate a minimum input length, the block size is used. This leads
to certain issues when the skcipher walk functions are used, e.g.
processing less than block size bytes is not supported by calling
skcipher_walk_done.
The walksize is 2 AES blocks because otherwise for splitted input
or output data, less than blocksize is to be returned in some cases,
which cannot be processed. Another issue was that depending on
possible split of input/output data, just 16 bytes are returned while
less than 16 bytes were remaining, while cipher text stealing
requires 17 bytes or more for processing.
For example, if the input is 60 bytes and the walk is 48, then
processing 48 bytes leads to a return code of -EINVAL for
skcipher_walk_done. Therefor the processed counter is used to
figure out, when the actual cipher text stealing for the remaining
bytes less than blocksize needs to be applied.
Measured with cryptsetup benchmark, this XTS AES implementation is
about 19% faster than the kernels XTS implementation that uses the
hardware ECB AES (ca. 18.6 MiB/s vs. 15.8 MiB/s decryption 256b key).
The implementation was tested with the kernels crypto testmgr against
the kernels generic XTS AES implementation including extended tests.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
The processing in the hmac algorithms depends on the status fields:
count, dbn and started. Not all were initialised in the init method
and after finishing the final method. Added missing fields to init
method and call init method after finishing final.
The memsets have the wrong size in the original driver and did not
clear everything and are not necessary. Since no memset is done in
the kernels generic implementation, memsets were removed.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
Removing hash pointer in _hmac_setkey since its not needed and causes
a compiler warning.
Make the spinlock control sections shorter and move initializations
out of the control sections to free the spinlock faster for allowing
other threads to use the hash engine.
Minor improvements for indentation and removal of blanks and blank
lines in some areas.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
Exceeding the temp array size was not checked and instead storage not
allocated by the driver was used/overwritten which in most cases
resulted in reboots. This patch implements processing the input to the
hash algorithm in tempsize chunks.
The _hmac_final methods were changed to _hmac_final_impl adding a
parameter that indicates intermediate or final processing. The started
variable was added to the context to indicate, if there is an
intermediate result in the context. For sha1_hmac the variable to store
the intermediate hash was added to the context too.
In order to avoid md5_hmac_final_impl being recursively called if the
padding of the input and the resulting last transform during the hmac
algorighms final processing causes the temp array to overflow and to
make sure that there is at least one block in the temp array when the
_hmac_final for final processing is called, the check for exceeding
the temp array in _hmac_transform was moved before copying the block
and incrementing dbn. dbn needs to be at least 1 at final processing
time to let the hash engine apply the opad operation.
To make the hash engine not apply the hmac algorithms final opad
operation, for intermediate processing the dbn in the control register
is set to a higher value than number of dbns are actually processed.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
The hmac algorithms state, that keys larger than the key size should be
hashed with the underlying hash algorithms and then those hashes are to
be used as keys. This patch implements this. In order to avoid allocating
a descriptor during setkey, a shash_desc pointer is added to the context.
Another issue for multithreaded callers is the shared temp array.
The temp array is static and as such would be shared among multithreaded
callers, which obviously would neither work nor produce correct results.
The temp array (4k size) is moved to the context and since the size of
the context is limited, it can only be defined as pointer otherwise the
initialisation of the hash algorithm fails.
The allocations and freeing of both the temp and the desc pointer in the
context are done by implementing cra_init and cra_exit functions for
the hmac algorithms.
Also improved indentation in some areas.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
Error ifxdeu-ctr-rfc3686(aes) (16) doesn't match generic impl (20) occurs
when running the cryptomgr extra tests that compare against the linux
kernels generic implementation.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
The algorithms sha1, sha1_hmac and md5_hmac all use ENDI=1. The md5
algorithm uses ENDI=0 and the endian_swap methods to reverse the
endianess switch by using user CPU time, which is unnecessary overhead.
Danube and AR9 devices do not set endianess for SHA1, so is done for
MD5.
Furthermore the patch replaces endian_swap with le32_to_cpu for md5 and
md5 hmac algorithms and removes endian_swap for them.
The init functions initialize the algorithm in the hardware. The lock is
not used to write to the control register. If another thread calls
another hash algo before update or final, the result will be wrong.
Therefore move the algorithm init to the lock protected sections in the
transform or final methods.
Setting the hw key for the hmac algorithms is now done from within the
lock protected sections in their final methods. The lock protecting is
removed from the _hmac_setkey_hw functions.
In final for md5 and sha1 the lock section is removed, because all the
work was already done in transform (which is called from final). As such
only copying the hash to the output is required.
MD5 and MD5_HMAC produce 16 byte hashes (4 DWORDS) only, therefor
writing register D5R to the hash output is removed for MD5_HMAC.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
All hash algorithms use the same base IFX_HASH_CON to access the hash unit.
Parallel threads should not be able to call different hash algorithms and
therefor a global lock is required.
Fixed linker warning, that md5_hmac_init, md5_hmac_update and
md5_hmac_final are static export symbols. The export symbols are not
required, because the functions are exposed using shash_alg structure.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
The functions ifx_deu_aes_cfg and ifx_deu_aes_ofb have been part of the
driver ever since. But the functions and definitions to make the
algorithms actually usable were missing.
This patch adds the neccessary code for aes_ofb and aes_cfb algorithms.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
When running cryptomgr tests against the driver, there are several
occurences of different errors for even and uneven splitted data in the
underlying scatterlists for the ctr and ctr_rfc3686 algorithms which are
now fixed.
Fixed error in ctr_rfc3686_aes_decrypt function which was introduced with
the previous commit by using CRYPTO_DIR_ENCRYPT in the decrypt function.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
When running cryptomgr tests against the driver, there are several
occurences of different errors for setkey of des and des3-ede
algorithms.
Those key checks are already implemented in the kernels des
implementation, so this is added as dependency and the kernel methods
are called. It also required adding the kernels des/des3 context
definitions to the des_ctx internal structure to be able to call the
kernel methods.
Fixed ifxdeu-des... setkey unexpectedly succeeded on test vector x;
expected_error=-22.
Fixed ifxdeu-des... setkey failed on test vector x; expected_error=0,
actual_error=-22.
Renamed des_ctx internal structure and des_encrypt/des_decrypt methods
because they are already defined in the kernel module.
Fixed wrong DES_xxx constant definitions in crypto_alg definition for
ifxdeu_des3_ede_alg.
Fixed method comment errors.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
The <linux/cryptohash.h> was removed with Linux 5.8, because it only
contained the library implementation of SHA1, which was folded
into <crypto/sha.h>.
So switch this driver away from using <linux/cryptohash.h>.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
Convert blkcipher to skcipher for the synchronous versions of AES,
DES and ARC4.
The Block Cipher API was depracated for a while and was removed with
Linux 5.5. So switch this driver to the skcipher API.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
Some devices initialize AES during boot and AES works out of the box
and the correct endianess is set.
NDC means (No Danube Compatibility Mode) and the endianess setting has
no effect if its set to 0.
NDC 0: OFF ENDI bit cannot be written as in Danube
To make it work for other devices, the NDC control register needs to
be set to 1.
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
OpenSSL with cryptdev support uses the data encryption unit (DEU) driver
for hard accelerated processing of ciphers/digests, if the flag
CRYPTO_ALG_KERN_DRIVER_ONLY is set.
Signed-off-by: Mathias Kresin <dev@kresin.me>
[fix commit title prefix]
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
Even if the minimum blocksize is set to 16 (AES_BLOCK_SIZE), the crypto
manager tests pass 499 bytes of data to the aes-ctr encryption, from
which only 496 bytes are actually encrypted.
Reading the comment regarding the minimum blocksize, it only states that
it's the "smallest possible unit which can be transformed with this
algorithm". Which doesn't necessarily mean, the data have to be a
multiple of the minimal blocksize.
All kernel hardware crypto driver enforce a minimum blocksize of 1,
which perfect fine works for the lantiq data encryption unit as well.
Lower the blocksize limit to 1, to process not padded data as well.
In AES for processing the remaining bytes, uninitialized pointers
were used.
This patch fixes using uninitialized pointers and wrong offsets.
Signed-off-by: Mathias Kresin <dev@kresin.me>
[fix commit title prefix]
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
When handling non-aligned remaining data (not padded to 16 byte
[AES_BLOCK_SIZE]), a full 16 byte block is read from the input buffer
and written to the output buffer after en-/decryption.
While code already assumes that an input buffer could have less than 16
byte remaining, as it can be seen by the code zeroing the remaining
bytes till AES_BLOCK_SIZE, the full AES_BLOCK_SIZE is read.
An output buffer size of a multiple of AES_BLOCK_SIZE is expected but
never validated.
To get rid of the read/write behind buffer, use a temporary buffer when
dealing with not padded data and only write as much bytes to the output
as we read.
Do not memcpy directly to the register, to make used of the endian swap
macro and to trigger the crypto start operator via the ID0R to trigger
the register. Since we might need an endian swap for the output in
future, use a temporary buffer for the output as well.
The issue could not be observed so far, since all caller of ifx_deu_aes
will ignore the padded (remaining) data. Considering that the minimum
blocksize for the algorithm is set to AES_BLOCK_SIZE, the behaviour
could be called expected.
Signed-off-by: Mathias Kresin <dev@kresin.me>
[fix commit title prefix]
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
The crypto algorithms are registered and available to the system before
the chip is actually powered on and the generic parameter for the DEU
behaviour set.
The issue can mainly be observed if the crypto manager tests are enabled
in the kernel config. The crypto manager test run directly after an
algorithm is registered.
Signed-off-by: Mathias Kresin <dev@kresin.me>
[fix commit title prefix]
Signed-off-by: Daniel Kestrel <kestrel1974@t-online.de>
This drops kernel version switches for versions not supported by
OpenWrt master at the moment. This only adjusts local code, but
doesn't touch patches to existing external packages.
Signed-off-by: Adrian Schmutzler <freifunk@adrianschmutzler.de>
Upstream commit 84ede58dfcd1d ("crypto: hash - remove
CRYPTO_ALG_TYPE_DIGEST") drops the CRYPTO_ALG_TYPE_DIGEST define because
it has the same value as CRYPTO_ALG_TYPE_HASH. This was the case for
earlier kernels as well. Switch to CRYPTO_ALG_TYPE_HASH to fix building
against Linux 5.4.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
This is fixing multiple compile problems with kernel 4.14 and updates the
code to take care of changes introduced between kernel 4.9 and 4.14.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Mathias Kresin <dev@kresin.me>
With the default priority of 0, the DEU algos would be overlapped
by the generic algos (if available).
To fix this, set the cra_priority of the hardware algos to the
recommended value of 300/400.
Signed-off-by: Martin Schiller <mschiller@tdt.de>
Remove the "DEU test manager" code which has not been used for more than
two years (as the kernel module is not installed anymore since r38731).
This fixes compilation on kernel 4.3, which removes
aead_request_set_assoc (and newer kernels).
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
SVN-Revision: 48308
Upstream linux 4.2 commit 84be456f883c4685680fba8e5154b5f72e92957e
"remove <asm/scatterlist.h>" requires us to include linux/scatterlist.h
instead. This also works with older kernels (at least 4.1, thanks to
Hauke Mehrtens for testing).
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
SVN-Revision: 48282