As announced on the mailing list, WireGuard will be in Linux 5.6. As a
result, the wg(8) tool, used by OpenWRT in the same manner as ip(8), is
moving to its own wireguard-tools repo. Meanwhile, the out-of-tree
kernel module for kernels 3.10 - 5.5 moved to its own wireguard-linux-
compat repo. Yesterday, releases were cut out of these repos, so this
commit bumps packages to match. Since wg(8) and the compat kernel module
are versioned and released separately, we create a wireguard-tools
Makefile to contain the source for the new tools repo. Later, when
OpenWRT moves permanently to Linux 5.6, we'll drop the original module
package, leaving only the tools. So this commit shuffles the build
definition around a bit but is basically the same idea as before.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
(cherry picked from commit ea980fb9c6)
* curve25519: not all linkers support bmi2 and adx
This should allow WireGuard to build on older toolchains.
* global: switch to coarse ktime
Our prior use of fast ktime before meant that sometimes, depending on how
broken the motherboard was, we'd wind up calling into the HPET slow path. Here
we move to coarse ktime which is always super speedy. In the process we had to
fix the resolution of the clock, as well as introduce a new interface for it,
landing in 5.3. Older kernels fall back to a fast-enough mechanism based on
jiffies.
https://lore.kernel.org/lkml/tip-e3ff9c3678b4d80e22d2557b68726174578eaf52@git.kernel.org/https://lore.kernel.org/lkml/20190621203249.3909-3-Jason@zx2c4.com/
* netlink: cast struct over cb->args for type safety
This follow recent upstream changes such as:
https://lore.kernel.org/lkml/20190628144022.31376-1-Jason@zx2c4.com/
* peer: use LIST_HEAD macro
Style nit.
* receive: queue dead packets to napi queue instead of empty rx_queue
This mitigates a WARN_ON being triggered by the workqueue code. It was quite
hard to trigger, except sporadically, or reliably with a PC Engines ALIX, an
extremely slow board with an AMD LX800 that Ryan Whelan of Axatrax was kind
enough to mail me.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
(cherry picked from commit 7c23f741e9)
There was an issue with the backport compat layer in yesterday's snapshot,
causing issues on certain (mostly Atom) Intel chips on kernels older than
4.2, due to the use of xgetbv without checking cpu flags for xsave support.
This manifested itself simply at module load time. Indeed it's somewhat tricky
to support 33 different kernel versions (3.10+), plus weird distro
frankenkernels.
If OpenWRT doesn't support < 4.2, you probably don't need to apply this.
But it also can't hurt, and probably best to stay updated.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* tools: add wincompat layer to wg(8)
Consistent with a lot of the Windows work we've been doing this last cycle,
wg(8) now supports the WireGuard for Windows app by talking through a named
pipe. You can compile this as `PLATFORM=windows make -C src/tools` with mingw.
Because programming things for Windows is pretty ugly, we've done this via a
separate standalone wincompat layer, so that we don't pollute our pretty *nix
utility.
* compat: udp_tunnel: force cast sk_data_ready
This is a hack to work around broken Android kernel wrapper scripts.
* wg-quick: freebsd: workaround SIOCGIFSTATUS race in FreeBSD kernel
FreeBSD had a number of kernel race conditions, some of which we can vaguely
work around. These are in the process of being fixed upstream, but probably
people won't update for a while.
* wg-quick: make darwin and freebsd path search strict like linux
Correctness.
* socket: set ignore_df=1 on xmit
This was intended from early on but didn't work on IPv6 without the ignore_df
flag. It allows sending fragments over IPv6.
* qemu: use newer iproute2 and kernel
* qemu: build iproute2 with libmnl support
* qemu: do not check for alignment with ubsan
The QEMU build system has been improved to compile newer versions. Linking
against libmnl gives us better error messages. As well, enabling the alignment
check on x86 UBSAN isn't realistic.
* wg-quick: look up existing routes properly
* wg-quick: specify protocol to ip(8), because of inconsistencies
The route inclusion check was wrong prior, and Linux 5.1 made it break
entirely. This makes a better invocation of `ip route show match`.
* netlink: use new strict length types in policy for 5.2
* kbuild: account for recent upstream changes
* zinc: arm64: use cpu_get_elf_hwcap accessor for 5.2
The usual churn of changes required for the upcoming 5.2.
* timers: add jitter on ack failure reinitiation
Correctness tweak in the timer system.
* blake2s,chacha: latency tweak
* blake2s: shorten ssse3 loop
In every odd-numbered round, instead of operating over the state
x00 x01 x02 x03
x05 x06 x07 x04
x10 x11 x08 x09
x15 x12 x13 x14
we operate over the rotated state
x03 x00 x01 x02
x04 x05 x06 x07
x09 x10 x11 x08
x14 x15 x12 x13
The advantage here is that this requires no changes to the 'x04 x05 x06 x07'
row, which is in the critical path. This results in a noticeable latency
improvement of roughly R cycles, for R diagonal rounds in the primitive. As
well, the blake2s AVX implementation is now SSSE3 and considerably shorter.
* tools: allow setting WG_ENDPOINT_RESOLUTION_RETRIES
System integrators can now specify things like
WG_ENDPOINT_RESOLUTION_RETRIES=infinity when building wg(8)-based init
scripts and services, or 0, or any other integer.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
With this change, the file is reduced from 5186 bytes to 4649 bytes that
its approximately 10.5 percent less memory consumption. For small
devices, sometimes every byte counts.
Also, all other protocol handler use tabs instead of spaces.
Signed-off-by: Florian Eckert <fe@dev.tdt.de>
* allowedips: initialize list head when removing intermediate nodes
Fix for an important regression in removing allowed IPs from the last
snapshot. We have new test cases to catch these in the future as well.
* tools: warn if an AllowedIP has a nonzero host part
If you try to run `wg set wg0 peer ... allowed-ips 192.168.1.82/24`, wg(8)
will now print a warning. Even though we mask this automatically down to
192.168.1.0/24, usually when people specify it like this, it's a mistake.
* wg-quick: add 'strip' subcommand
The new strip subcommand prints the config file to stdout after stripping
it of all wg-quick-specific options. This enables tricks such as:
`wg addconf $DEV <(wg-quick strip $DEV)`.
* tools: avoid unneccessary next_peer assignments in sort_peers()
Small C optimization the compiler was probably already doing.
* peerlookup: rename from hashtables
* allowedips: do not use __always_inline
* device: use skb accessor functions where possible
Suggested tweaks from Dave Miller.
* blake2s: simplify
* blake2s: remove outlen parameter from final
The blake2s implementation has been simplified, since we don't use any of the
fancy tree hashing parameters or the like. We also no longer separate the
output length at initialization time from the output length at finalization
time.
* global: the _bh variety of rcu helpers have been unified
* compat: nf_nat_core.h was removed upstream
* compat: backport skb_mark_not_on_list
The usual assortment of compat fixes for Linux 5.1.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Instead of creating host-routes depending on fwmark as (accidentally)
pushed by commit
1e8bb50b93 ("wireguard: do not add host-dependencies if fwmark is set")
use a new config option 'nohostroute' to explicitely prevent creation
of the route to the endpoint.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
The 'fwmark' option is used to define routing traffic to
wireguard endpoints to go through specific routing tables.
In that case it doesn't make sense to setup routes for
host-dependencies in the 'main' table, so skip setting host
dependencies if 'fwmark' is set.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
* wg-quick: freebsd: allow loopback to work
FreeBSD adds a route for point-to-point destination addresses. We don't
really want to specify any destination address, but unfortunately we
have to. Before we tried to cheat by giving our own address as the
destination, but this had the unfortunate effect of preventing
loopback from working on our local ip address. We work around this with
yet another kludge: we set the destination address to 127.0.0.1. Since
127.0.0.1 is already assigned to an interface, this has the same effect
of not specifying a destination address, and therefore we accomplish the
intended behavior. Note that the bad behavior is still present in Darwin,
where such workaround does not exist.
* tools: remove unused check phony declaration
* highlighter: when subtracting char, cast to unsigned
* chacha20: name enums
* tools: fight compiler slightly harder
* tools: c_acc doesn't need to be initialized
* queueing: more reasonable allocator function convention
Usual nits.
* systemd: wg-quick should depend on nss-lookup.target
Since wg-quick(8) calls wg(8) which does hostname lookups, we should
probably only run this after we're allowed to look up hostnames.
* compat: backport ALIGN_DOWN
* noise: whiten the nanoseconds portion of the timestamp
This mitigates unrelated sidechannel attacks that think they can turn
WireGuard into a useful time oracle.
* hashtables: decouple hashtable allocations from the main device allocation
The hashtable allocations are quite large, and cause the device allocation in
the net framework to stall sometimes while it tries to find a contiguous
region that can fit the device struct. To fix the allocation stalls, decouple
the hashtable allocations from the device allocation and allocate the
hashtables with kvmalloc's implicit __GFP_NORETRY so that the allocations fall
back to vmalloc with little resistance.
* chacha20poly1305: permit unaligned strides on certain platforms
The map allocations required to fix this are mostly slower than unaligned
paths.
* noise: store clamped key instead of raw key
This causes `wg show` to now show the right thing. Useful for doing
comparisons.
* compat: ipv6_stub is sometimes null
On ancient kernels, ipv6_stub is sometimes null in cases where IPv6 has
been disabled with a command line flag or other failures.
* Makefile: don't duplicate code in install and modules-install
* Makefile: make the depmod path configurable
* queueing: net-next has changed signature of skb_probe_transport_header
A 5.1 change. This could change again, but for now it allows us to keep this
snapshot aligned with our upstream submissions.
* netlink: don't remove allowed ips for new peers
* peer: only synchronize_rcu_bh and traverse trie once when removing all peers
* allowedips: maintain per-peer list of allowedips
This is a rather big and important change that makes it much much faster to do
operations involving thousands of peers. Batch peer/allowedip addition and
clearing is several orders of magnitude faster now.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* tools: curve25519: handle unaligned loads/stores safely
This should fix sporadic crashes with `wg pubkey` on certain architectures.
* netlink: auth socket changes against namespace of socket
In WireGuard, the underlying UDP socket lives in the namespace where the
interface was created and doesn't move if the interface is moved. This
allows one to create the interface in some privileged place that has
Internet access, and then move it into a container namespace that only
has the WireGuard interface for egress. Consider the following
situation:
1. Interface created in namespace A. Socket therefore lives in namespace A.
2. Interface moved to namespace B. Socket remains in namespace A.
3. Namespace B now has access to the interface and changes the listen
port and/or fwmark of socket. Change is reflected in namespace A.
This behavior is arguably _fine_ and perhaps even expected or
acceptable. But there's also an argument to be made that B should have
A's cred to do so. So, this patch adds a simple ns_capable check.
* ratelimiter: build tests with !IPV6
Should reenable building in debug mode for systems without IPv6.
* noise: replace getnstimeofday64 with ktime_get_real_ts64
* ratelimiter: totalram_pages is now a function
* qemu: enable FP on MIPS
Linux 5.0 support.
* keygen-html: bring back pure javascript implementation
Benoît Viguier has proofs that values will stay well within 2^53. We
also have an improved carry function that's much simpler. Probably more
constant time than emscripten's 64-bit integers.
* contrib: introduce simple highlighter library
This is the highlighter library being used in:
- https://twitter.com/EdgeSecurity/status/1085294681003454465
- https://twitter.com/EdgeSecurity/status/1081953278248796165
It's included here as a contrib example, so that others can paste it into
their own GUI clients for having the same strictly validating highlighting.
* netlink: use __kernel_timespec for handshake time
This readies us for Y2038. See https://lwn.net/Articles/776435/ for more info.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* chacha20,poly1305: fix up for win64
* poly1305: only export neon symbols when in use
* poly1305: cleanup leftover debugging changes
* crypto: resolve target prefix on buggy kernels
* chacha20,poly1305: don't do compiler testing in generator and remove xor helper
* crypto: better path resolution and more specific generated .S
* poly1305: make frame pointers for auxiliary calls
* chacha20,poly1305: do not use xlate
This should fix up the various build errors, warnings, and insertion errors
introduced by the previous snapshot, where we added some significant
refactoring. In short, we're trying to port to using Andy Polyakov's original
perlasm files, and this means quite a lot of work to re-do that had stableized
in our old .S.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* Zinc no longer ships generated assembly code. Rather, we now
bundle in the original perlasm generator for it. The primary purpose
of this snapshot is to get testing of this.
* Clarify the peer removal logic and make lifetimes more precise.
* Use READ_ONCE for is_valid and is_dead.
* No need to use atomic when the recounter is mutex protected.
* Fix up macros and annotations in allowedips.
* Increment drop counter when staged packets are dropped.
* Use static constants instead of enums for 64-bit values in selftest.
* Mark large constants as ULL in poly1305-donna64.
* Fix sparse warnings in allowedips debugging code.
* Do not use wg_peer_get_maybe_zero in timer callbacks, since we now can
carefully control the lifetime of these functions and ensure they never
execute after dropping the last reference.
* Cleanup hashing in ratelimiter.
* Do not guard timer removals, since del_timer is always okay.
* We now check for PM_AUTOSLEEP, which makes the clear*on-suspend decision a
bit more general.
* Set csum_level to ~0, since the poly1305 authenticator certainly means
that no data was modified in transit.
* Use CHECKSUM_PARTIAL check for skb_checksum_help instead of
skb_checksum_setup check.
* wg.8: specify that wg(8) shows runtime info too
* wg.8: AllowedIPs isn't actually required
* keygen-html: add missing glue macro
* wg-quick: android: do not choke on empty allowed-ips
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
ba2ab5d version: bump snapshot
5f59c76 tools: wg-quick: wait for interface to disappear on freebsd
ac7e7a3 tools: don't fail if a netlink interface dump is inconsistent
8432585 main: get rid of unloaded debug message
139e57c tools: compile on gnu99
d65817c tools: use libc's endianness macro if no compiler macro
f985de2 global: give if statements brackets and other cleanups
b3a5d8a main: change module description
296d505 device: use textual error labels always
8bde328 allowedips: swap endianness early on
a650d49 timers: avoid using control statements in macro
db4dd93 allowedips: remove control statement from macro by rewriting
780a597 global: more nits
06b1236 global: rename struct wireguard_ to struct wg_
205dd46 netlink: do not stuff index into nla type
2c6b57b qemu: kill after 20 minutes
6f2953d compat: look in Kbuild and Makefile since they differ based on arch
a93d7e4 create-patch: blacklist instead of whitelist
8d53657 global: prefix functions used in callbacks with wg_
123f85c compat: don't output for grep errors
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
64750c1 version: bump snapshot
f11a2b8 global: style nits
4b34b6a crypto: clean up remaining .h->.c
06d9fc8 allowedips: document additional nobs
c32b5f9 makefile: do more generic wildcard so as to avoid rename issues
20f48d8 crypto: use BIT(i) & bitmap instead of (bitmap >> i) & 1
b6e09f6 crypto: disable broken implementations in selftests
fd50f77 compat: clang cannot handle __builtin_constant_p
bddaca7 compat: make asm/simd.h conditional on its existence
b4ba33e compat: account for ancient ARM assembler
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
* Account for big-endian 2^26 conversion in Poly1305.
* Account for big-endian NEON in Curve25519.
* Fix macros in big-endian AArch64 code so that this will actually run there
at all.
* Prefer if (IS_ENABLED(...)) over ifdef mazes when possible.
* Call simd_relax() within any preempt-disabling glue code every once in a
while so as not to increase latency if folks pass in super long buffers.
* Prefer compiler-defined architecture macros in assembly code, which puts us
in closer alignment with upstream CRYPTOGAMS code, and is cleaner.
* Non-static symbols are prefixed with wg_ to avoid polluting the global
namespace.
* Return a bool from simd_relax() indicating whether or not we were
rescheduled.
* Reflect the proper simd conditions on arm.
* Do not reorder lines in Kbuild files for the simd asm-generic addition,
since we don't want to cause merge conflicts.
* WARN() if the selftests fail in Zinc, since if this is an initcall, it won't
block module loading, so we want to be loud.
* Document some interdependencies beside include statements.
* Add missing static statement to fpu init functions.
* Use union in chacha to access state words as a flat matrix, instead of
casting a struct to a u8 and hoping all goes well. Then, by passing around
that array as a struct for as long as possible, we can update counter[0]
instead of state[12] in the generic blocks, which makes it clearer what's
happening.
* Remove __aligned(32) for chacha20_ctx since we no longer use vmovdqa on x86,
and the other implementations do not require that kind of alignment either.
* Submit patch to ARM tree for adjusting RiscPC's cflags to be -march=armv3 so
that we can build code that uses umull.
* Allow CONFIG_ARM[64] to imply [!]CONFIG_64BIT, and use zinc arch config
variables consistently throughout.
* Document rationale for the 2^26->2^64/32 conversion in code comments.
* Convert all of remaining BUG_ON to WARN_ON.
* Replace `bxeq lr` with `reteq lr` in ARM assembler to be compatible with old
ISAs via the macro in <asm/assembler.h>.
* Do not allow WireGuard to be a built-in if IPv6 is a module.
* Writeback the base register and reorder multiplications in the NEON x25519
implementation.
* Try all combinations of different implementations in selftests, so that
potential bugs are more immediately unearthed.
* Self tests and SIMD glue code work with #include, which lets the compiler
optimize these. Previously these files were .h, because they were included,
but a simple grep of the kernel tree shows 259 other files that carry out
this same pattern. Only they prefer to instead name the files with a .c
instead of a .h, so we now follow the convention.
* Support many more platforms in QEMU, especially big endian ones.
* Kernels < 3.17 don't have read_cpuid_part, so fix building there.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
33523a5 version: bump snapshot
0759480 curve25519-hacl64: reduce stack usage under KASAN
b9ab0fc chacha20: add bounds checking to selftests
2e99d19 chacha20-mips32r2: reduce stack and branches in loop, refactor jumptable handling
d6ac367 qemu: bump musl
28d8b7e crypto: make constant naming scheme consistent
56c4ea9 hchacha20: keep in native endian in words
0c3c0bc chacha20-arm: remove unused preambles
3dcd246 chacha20-arm: updated scalar code from Andy
6b9d5ca poly1305-mips64: remove useless preprocessor error
3ff3990 crypto-arm: rework KERNEL_MODE_NEON handling again
dd2f91e crypto: flatten out makefile
67a3cfb curve25519-fiat32: work around m68k compiler stack frame bug
9aa2943 allowedips: work around kasan stack frame bug in selftest
317b318 chacha20-arm: use new scalar implementation
b715e3b crypto-arm: rework KERNEL_MODE_NEON handling
77b07d9 global: reduce stack frame size
ddc2bd6 chacha20: add chunked selftest and test sliding alignments and hchacha20
2eead02 chacha20-mips32r2: reduce jumptable entry size and stack usage
a0ac620 chacha20-mips32r2: use simpler calling convention
09247c0 chacha20-arm: go with Ard's version to optimize for Cortex-A7
a329e0a chacha20-mips32r2: remove reorder directives
3b22533 chacha20-mips32r2: fix typo to allow reorder again
d4ac6bb poly1305-mips32r2: remove all reorder directives
197a30c global: put SPDX identifier on its own line
305806d ratelimiter: disable selftest with KASAN
4e06236 crypto: do not waste space on selftest items
5e0fd08 netlink: reverse my christmas trees
a61ea8b crypto: explicitly dual license
b161aff poly1305: account for simd being toggled off midway
470a0c5 allowedips: change from BUG_ON to WARN_ON
aa9e090 chacha20: prefer crypto_xor_cpy to avoid memmove
1b0adf5 poly1305: no need to trick gcc 8.1
a849803 blake2s: simplify final function
073f3d1 poly1305: better module description
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
* blake2s-x86_64: fix whitespace errors
* crypto: do not use compound literals in selftests
* crypto: make sure UML is properly disabled
* kconfig: make NEON depend on CPU_V7
* poly1305: rename finish to final
* chacha20: add constant for words in block
* curve25519-x86_64: remove useless define
* poly1305: precompute 5*r in init instead of blocks
* chacha20-arm: swap scalar and neon functions
* simd: add __must_check annotation
* poly1305: do not require simd context for arch
* chacha20-x86_64: cascade down implementations
* crypto: pass simd by reference
* chacha20-x86_64: don't activate simd for small blocks
* poly1305-x86_64: don't activate simd for small blocks
* crypto: do not use -include trick
* crypto: turn Zinc into individual modules
* chacha20poly1305: relax simd between sg chunks
* chacha20-x86_64: more limited cascade
* crypto: allow for disabling simd in zinc modules
* poly1305-x86_64: show full struct for state
* chacha20-x86_64: use correct cut off for avx512-vl
* curve25519-arm: only compile if symbols will be used
* chacha20poly1305: add __init to selftest helper functions
* chacha20: add independent self test
Tons of improvements all around the board to our cryptography library,
including some performance boosts with how we handle SIMD for small packets.
* send/receive: reduce number of sg entries
This quells a powerpc stack usage warning.
* global: remove non-essential inline annotations
We now allow the compiler to determine whether or not to inline certain
functions, while still manually choosing so for a few performance-critical
sections.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* curve25519: arm: do not modify sp directly
* compat: support neon.h on old kernels
* compat: arch-namespace certain includes
* compat: move simd.h from crypto to compat since it's going upstream
This fixes a decent amount of compat breakage and thumb2-mode breakage
introduced by our move to Zinc.
* crypto: use CRYPTOGAMS license
Rather than using code from OpenSSL, use code directly from AndyP.
* poly1305: rewrite self tests from scratch
* poly1305: switch to donna
This makes our C Poly1305 implementation a bit more intensely tested and also
faster, especially on 64-bit systems. It also sets the stage for moving to a
HACL* implementation when that's ready.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* Kconfig: use new-style help marker
* global: run through clang-format
* uapi: reformat
* global: satisfy check_patch.pl errors
* global: prefer sizeof(*pointer) when possible
* global: always find OOM unlikely
Tons of style cleanups.
* crypto: use unaligned helpers
We now avoid unaligned accesses for generic users of the crypto API.
* crypto: import zinc
More style cleanups and a rearrangement of the crypto routines to fit how this
is going to work upstream. This required some fairly big changes to our build
system, so there may be some build errors we'll have to address in subsequent
snapshots.
* compat: rng_is_initialized made it into 4.19
We therefore don't need it in the compat layer anymore.
* curve25519-hacl64: use formally verified C for comparisons
The previous code had been proved in Z3, but this new code from upstream
KreMLin is directly generated from the F*, which is preferable. The
assembly generated is identical.
* curve25519-x86_64: let the compiler decide when/how to load constants
Small performance boost.
* curve25519-arm: reformat
* curve25519-arm: cleanups from lkml
* curve25519-arm: add spaces after commas
* curve25519-arm: use ordinary prolog and epilogue
* curve25519-arm: do not waste 32 bytes of stack
* curve25519-arm: prefix immediates with #
This incorporates ASM nits from upstream review.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* send: switch handshake stamp to an atomic
Rather than abusing the handshake lock, we're much better off just using
a boring atomic64 for this. It's simpler and performs better. Also, while
we're at it, we set the handshake stamp both before and after the
calculations, in case the calculations block for a really long time waiting
for the RNG to initialize.
* compat: better atomic acquire/release backport
This should fix compilation and correctness on several platforms.
* crypto: move simd context to specific type
This was a suggestion from Andy Lutomirski on LKML.
* chacha20poly1305: selftest: use arrays for test vectors
We no longer have lines so long that they're rejected by SMTP servers.
* qemu: add easy git harness
This makes it a bit easier to use our qemu harness for testing our mainline
integration tree.
* curve25519-x86_64: avoid use of r12
This causes problems with RAP and KERNEXEC for PaX, as r12 is a
reserved register.
* chacha20: use memmove in case buffers overlap
A small correctness fix that we never actually hit in WireGuard but is
important especially for moving this into a general purpose library.
* curve25519-hacl64: simplify u64_eq_mask
* curve25519-hacl64: correct u64_gte_mask
Two bitmath fixes from Samuel, which come complete with a z3 script proving
their correctness.
* timers: include header in right file
This fixes compilation in some environments.
* netlink: don't start over iteration on multipart non-first allowedips
Matt Layher found a bug where a netlink dump of peers would never terminate in
some circumstances, causing wg(8) to keep trying forever. We now have a fix as
well as a unit test to mitigate this, and we'll be looking to create a fuzzer
out of Matt's nice library.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Changelog taken from the version announcement
> == Changes ==
>
> * chacha20poly1305: selftest: split up test vector constants
>
> The test vectors are encoded as long strings -- really long strings -- and
> apparently RFC821 doesn't like lines longer than 998.
> https://cr.yp.to/smtp/message.html
>
> * queueing: keep reference to peer after setting atomic state bit
>
> This fixes a regression introduced when preparing the LKML submission.
>
> * allowedips: prevent double read in kref
> * allowedips: avoid window of disappeared peer
> * hashtables: document immediate zeroing semantics
> * peer: ensure resources are freed when creation fails
> * queueing: document double-adding and reference conditions
> * queueing: ensure strictly ordered loads and stores
> * cookie: returned keypair might disappear if rcu lock not held
> * noise: free peer references on failure
> * peer: ensure destruction doesn't race
>
> Various fixes, as well as lots of code comment documentation, for a
> small variety of the less obvious aspects of object lifecycles,
> focused on correctness.
>
> * allowedips: free root inside of RCU callback
> * allowedips: use different macro names so as to avoid confusion
>
> These incorporate two suggestions from LKML.
>
> This snapshot contains commits from: Jason A. Donenfeld and Jann Horn.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
This watchdog script tries to re-resolve hostnames for inactive WireGuard peers.
Use it for peers with a frequently changing dynamic IP.
persistent_keepalive must be set, recommended value is 25 seconds.
Run this script from cron every minute:
echo '* * * * * /usr/bin/wireguard_watchdog' >> /etc/crontabs/root
Signed-off-by: Aleksandr V. Piskunov <aleksandr.v.piskunov@gmail.com>
[bump the package release]
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
80b41cd version: bump snapshot
fe5f0f6 recieve: disable NAPI busy polling
e863f40 device: destroy workqueue before freeing queue
81a2e7e wg-quick: allow link local default gateway
95951af receive: use gro call instead of plain call
d9501f1 receive: account for zero or negative budget
e80799b tools: only error on wg show if all interfaces failk
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
[Added commit log to commit description]
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
* device: print daddr not saddr in missing peer error
* receive: style
Debug messages now make sense again.
* wg-quick: android: support excluding applications
Android now supports excluding certain apps (uids) from the tunnel.
* selftest: ratelimiter: improve chance of success via retry
* qemu: bump default kernel version
* qemu: decide debug kernel based on KERNEL_VERSION
Some improvements to our testing infrastructure.
* receive: use NAPI on the receive path
This is a big change that should both improve preemption latency (by not
disabling it unconditionally) and vastly improve rx performance on most
systems by using NAPI. The main purpose of this snapshot is to test out this
technique.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
dfd9827 version: bump snapshot
88729f0 wg-quick: android: prevent outgoing handshake packets from being dropped
1bb9daf compat: more robust ktime backport
68441fb global: use fast boottime instead of normal boottime
d0bd6dc global: use ktime boottime instead of jiffies
18822b8 tools: fix misspelling of strchrnul in comment
0f8718b manpages: eliminate whitespace at the end of the line
590c410 global: fix a few typos
bb76804 simd: add missing header
7e88174 poly1305: give linker the correct constant data section size
fd8dfd3 main: test poly1305 before chacha20poly1305
c754c59 receive: don't toggle bh
Compile-tested-for: ath79 Archer C7 v2
Run-tested-on: ath79 Archer C7 v2
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
0bc4230 version: bump snapshot
ed04799 poly1305: add missing string.h header
cbd4e34 compat: use stabler lkml links
caa718c ratelimiter: do not allow concurrent init and uninit
894ddae ratelimiter: mitigate reference underflow
0a8a62c receive: drop handshake packets if rng is not initialized
cad9e52 noise: wait for crng before taking locks
83c0690 netlink: maintain static_identity lock over entire private key update
0913f1c noise: take locks for ss precomputation
073f31a qemu: bump default kernel
bec4c48 wg-quick: android: don't forget to free compiled regexes
7ce2ef3 wg-quick: android: disable roaming to v6 networks when v4 is specified
9132be4 dns-hatchet: apply resolv.conf's selinux context to new resolv.conf
41a5747 simd: no need to restore fpu state when no preemption
6d7f0b0 simd: encapsulate fpu amortization into nice functions
f8b57d5 queueing: re-enable preemption periodically to lower latency
b7b193f queueing: remove useless spinlocks on sc
5bb62fe tools: getentropy requires 10.12
4e9f120 chacha20poly1305: use slow crypto on -rt kernels on arm too
Compiled-for: ar71xx, lantiq
Run-tested-on: ar71xx Archer C7 v2 & lantiq HH5a
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
This version bump was made upstream mostly for OpenWRT, and should fix
an issue with a null dst when on the flow offloading path.
While we're at it, Kevin and I are the only people actually taking care
of this package, so trim the maintainer list a bit.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* chacha20poly1305: add mips32 implementation
"The OpenWRT Commit" - this significantly speeds up performance on cheap
plastic MIPS routers, and presumably the remaining MIPS32r2 super computers
out there.
* timers: reinitialize state on init
* timers: round up instead of down in slack_time
* timers: remove slack_time
* timers: clear send_keepalive timer on sending handshake response
* timers: no need to clear keepalive in persistent keepalive
Andrew He and I have helped simplify the timers and remove some old warts,
making the whole system a bit easier to analyze.
* tools: fix errno propagation and messages
Error messages are now more coherent.
* device: remove allowedips before individual peers
This avoids an O(n^2) traversal in favor of an O(n) one. Before systems with
many peers would grind when deleting the interface.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Drop package/network/services/wireguard/patches/100-portability.patch
Instead pass 'PLATFORM=linux' to make since we are always building FOR
linux.
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
This makes it easier to distribute prefixes over a wireguard tunnel
interface, by simply setting the ip6prefix option in uci (just like with
other protocols).
Obviously, routing etc needs to be setup properly for things to work; this
just adds the config option so the prefix can be assigned to other
interfaces.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
6b4a340 version: bump snapshot
faa2103 compat: don't clear header bits on RHEL
4014532 compat: handle RHEL 7.5's recent backports
66589bc queueing: preserve pfmemalloc header bit
37f114a chacha20poly1305: make gcc 8.1 happy
926caae socket: use skb_put_data
724d979 wg-quick: preliminary support for go implementation
c454c26 allowedips: simplify arithmetic
71d44be allowedips: produce better assembly with unsigned arithmetic
5e3532e allowedips: use native endian on lookup
856f105 allowedips: add selftest for allowedips_walk_by_peer
41df6d2 embeddable-wg-library: zero attribute padding
9a1bea6 keygen-html: add zip file example
f182b1a qemu: retry on 404 in wget for kernel.org race
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
7cc2668 version: bump snapshot
860c7c7 poly1305: do not place constants in different sections
5f1e4ca compat: remove unused dev_recursion_level backport
7e4b991 blake2s: remove unused helper
13225fc send: simplify skb_padding with nice macro
a1525bf send: account for route-based MTU
bbb2fde wg-quick: account for specified fwmark in auto routing mode
c452105 qemu: bump default version
dbe5223 version: bump snapshot
1d3ef31 chacha20poly1305: put magic constant behind macro
cdc164c chacha20poly1305: add self tests from wycheproof
1060e54 curve25519: add self tests from wycheproof
0e1e127 wg-quick.8: fix typo
2b06b8e curve25519: precomp const correctness
8102664 curve25519: memzero in batches
1f54c43 curve25519: use cmov instead of xor for cswap
fa5326f curve25519: use precomp implementation instead of sandy2x
9b19328 compat: support OpenSUSE 15
3102d28 compat: silence warning on frankenkernels
8f64c61 compat: stable kernels are now receiving b87b619
62127f9 wg-quick: hide errors on save
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
7c0d711 version: bump snapshot
b6a5cc0 contrib: add extract-handshakes kprobe example
37dc953 wg-quick: if resolvconf/run/iface exists, use it
1f9be19 wg-quick: if resolvconf/interface-order exists, use it
4d2d395 noise: align static_identity keys
14395d2 compat: use correct -include path
38c6d8f noise: fix function prototype
302d0c0 global: in gnu code, use un-underscored asm
ff4e06b messages: MESSAGE_TOTAL is unused
ea81962 crypto: read only after init
e35f409 Kconfig: require DST_CACHE explicitly
9d5baf7 Revert "contrib: keygen-html: rewrite in pure javascript"
6e09a46 contrib: keygen-html: rewrite in pure javascript
e0af0f4 compat: workaround netlink refcount bug
ec65415 contrib: embedded-wg-library: add key generation functions
06099b8 allowedips: fix comment style
ce04251 contrib: embedded-wg-library: add ability to add and del interfaces
7403191 queueing: skb_reset: mark as xnet
Changes:
* queueing: skb_reset: mark as xnet
This allows cgroups to classify packets.
* contrib: embedded-wg-library: add ability to add and del interfaces
* contrib: embedded-wg-library: add key generation functions
The embeddable library gains a few extra tricks, for people implementing
plugins for various network managers.
* crypto: read only after init
* allowedips: fix comment style
* messages: MESSAGE_TOTAL is unused
* global: in gnu code, use un-underscored asm
* noise: fix function prototype
Small cleanups.
* compat: workaround netlink refcount bug
An upstream refcounting bug meant that in certain situations it became
impossible to unload the module. So, we work around it in the compat code. The
problem has been fixed in 4.16.
* contrib: keygen-html: rewrite in pure javascript
* Revert "contrib: keygen-html: rewrite in pure javascript"
We nearly moved away from emscripten'ing the fiat32 code, but the resultant
floating point javascript was just too terrifying.
* Kconfig: require DST_CACHE explicitly
Required for certain frankenkernels.
* compat: use correct -include path
Fixes certain out-of-tree build systems.
* noise: align static_identity keys
Gives us better alignment of private keys.
* wg-quick: if resolvconf/interface-order exists, use it
* wg-quick: if resolvconf/run/iface exists, use it
Better compatibility with Debian's resolvconf.
* contrib: add extract-handshakes kprobe example
Small utility for extracting ephemeral key data from the kernel's memory.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Hans Dedecker <dedeckeh@gmail.com> (git log --oneline description)
Bump to latest WireGuard snapshot release:
44f8e4d version: bump snapshot
bbe2f94 chacha20poly1305: wire up avx512vl for skylake-x
679e53a chacha20: avx512vl implementation
10b1232 poly1305: fix avx512f alignment bug
5fce163 chacha20poly1305: cleaner generic code
63a0031 blake2s-x86_64: fix spacing
d2e13a8 global: add SPDX tags to all files
d94f3dc chacha20-arm: fix with clang -fno-integrated-as.
3004f6b poly1305: update x86-64 kernel to AVX512F only
d452d86 tools: no need to put this on the stack
0ff098f tools: remove undocumented unused syntax
b1aa43c contrib: keygen-html for generating keys in the browser
e35e45a kernel-tree: jury rig is the more common spelling
210845c netlink: rename symbol to avoid clashes
fcf568e device: clear last handshake timer on ifdown
d698467 compat: fix 3.10 backport
5342867 device: do not clear keys during sleep on Android
88624d4 curve25519: explictly depend on AS_AVX
c45ed55 compat: support RAP in assembly
7f29cf9 curve25519: modularize dispatch
Refresh patches.
Compile-test-for: ar71xx
Run-tested-on: ar71xx Archer C7 v2
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
== Changes ==
* compat: support timespec64 on old kernels
* compat: support AVX512BW+VL by lying
* compat: fix typo and ranges
* compat: support 4.15's netlink and barrier changes
* poly1305-avx512: requires AVX512F+VL+BW
Numerous compat fixes which should keep us supporting 3.10-4.15-rc1.
* blake2s: AVX512F+VL implementation
* blake2s: tweak avx512 code
* blake2s: hmac space optimization
Another terrific submission from Samuel Neves: we now have an implementation
of Blake2s using AVX512, which is extremely fast.
* allowedips: optimize
* allowedips: simplify
* chacha20: directly assign constant and initial state
Small performance tweaks.
* tools: fix removing preshared keys
* qemu: use netfilter.org https site
* qemu: take shared lock for untarring
Small bug fixes.
Remove myself from the maintainers list: we have enough and I'm happy to
carry on doing package bumps on ad-hoc basis without the 'official'
title.
Run-tested: ar71xx Archer C7 v2
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Bump to latest WireGuard snapshot release:
ed479fa (tag: 0.0.20171122) version: bump snapshot
efd9db0 chacha20poly1305: poly cleans up its own state
5700b61 poly1305-x86_64: unclobber %rbp
314c172 global: switch from timeval to timespec
9e4aa7a poly1305: import MIPS64 primitive from OpenSSL
7a5ce4e chacha20poly1305: import ARM primitives from OpenSSL
abad6ee chacha20poly1305: import x86_64 primitives from OpenSSL
6507a03 chacha20poly1305: add more test vectors, some of which are weird
6f136a3 compat: new kernels have netlink fixes
e4b3875 compat: stable finally backported fix
cc07250 qemu: use unprefixed strip when not cross-compiling
64f1a6d tools: tighten up strtoul parsing
c3a04fe device: uninitialize socket first in destruction
82e6e3b socket: only free socket after successful creation of new
df318d1 compat: fix compilation with PaX
d911cd9 curve25519-neon: compile in thumb mode
d355e57 compat: 3.16.50 got proper rt6_get_cookie
666ee61 qemu: update kernel
2420e18 allowedips: do not write out of bounds
185c324 selftest: allowedips: randomized test mutex update
3f6ed7e wg-quick: document localhost exception and v6 rule
Compile-tested-for: ar71xx
Run-tested-on: ar71xx Archer C7 v2
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Check if the compiler defines __linux__, instead of assuming that the
host OS is the same as the target OS.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Update wireguard to latest snapshot:
9fc5daf version: bump snapshot
748ca6b compat: unbreak unloading on kernels 4.6 through 4.9
7be9894 timers: switch to kees' new timer_list functions
6be9a66 wg-quick: save all hooks on save
752e7af version: bump snapshot
2cd9642 wg-quick: fsync the temporary file before renaming
b139499 wg-quick: allow for saving existing interface
582c201 contrib: add reresolve-dns
8e04be1 tools: correct type for CTRL_ATTR_FAMILY_ID
c138276 wg-quick: allow for the hatchet, but not by default
d03f2a0 global: use fewer BUG_ONs
6d681ce timers: guard entire setting in block
4bf32ca curve25519: only enable int128 if compiler support is sound
86e06a3 device: expand scope of destruct lock
e3661ab global: get rid of useless forward declarations
bedc77a device: only take reference if netns is different
7c07e22 wg-quick: remember to rewind DNS settings on failure
2352ec0 wg-quick: allow specifiying multiple hooks
573cb19 qemu: test using four cores
e09ec4d global: style nits
4d3deae qemu: work around ccache bugs
7491cd4 global: infuriating kernel iterator style
78e079c peer: store total number of peers instead of iterating
d4e2752 peer: get rid of peer_for_each magic
6cf12d1 compat: be sure to include header before testing
3ea08d8 qemu: allow for cross compilation
d467551 crypto/avx: make sure we can actually use ymm registers
c786c46 blake2: include headers for macros
328e386 global: accept decent check_patch.pl suggestions
a473592 compat: fix up stat calculation for udp tunnel
9d930f5 stats: more robust accounting
311ca62 selftest: initialize mutex in routingtable selftest
8a9a6d3 netns: use time-based test instead of quantity-based
e480068 netns: use read built-in instead of ncat hack for dmesg
Compile-tested-for: ar71xx
Run-tested-on: ar71xx Archer C7 v2
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
This is a simple version bump. Changes:
* noise: handshake constants can be read-only after init
* noise: no need to take the RCU lock if we're not dereferencing
* send: improve dead packet control flow
* receive: improve control flow
* socket: eliminate dead code
* device: our use of queues means this check is worthless
* device: no need to take lock for integer comparison
* blake2s: modernize API and have faster _final
* compat: support READ_ONCE
* compat: just make ro_after_init read_mostly
Assorted cleanups to the module, including nice things like marking our
precomputations as const.
* Makefile: even prettier output
* Makefile: do not clean before cloc
* selftest: better test index for rate limiter
* netns: disable accept_dad for all interfaces
Fixes in our testing and build infrastructure. Now works on the 4.14 rc
series.
* qemu: add build-only target
* qemu: work on ubuntu toolchain
* qemu: add more debugging options to main makefile
* qemu: simplify shutdown
* qemu: open /dev/console if we're started early
* qemu: phase out bitbanging
* qemu: always create directory before untarring
* qemu: newer packages
* qemu: put hvc directive into configuration
This is the beginning of working out a cross building test suite, so we do
several tricks to be less platform independent.
* tools: encoding: be more paranoid
* tools: retry resolution except when fatal
* tools: don't insist on having a private key
* tools: add pass example to wg-quick man page
* tools: style
* tools: newline after warning
* tools: account for padding being in zero attribute
Several important tools fixes, one of which suppresses a needless warning.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Move wireguard from openwrt/packages to base a package.
This follows the pattern of kmod-cake and openvpn. Cake is a fast-moving
experimental kernel module that many find essential and useful. The
other is a VPN client. Both are inside of core. When you combine the two
characteristics, you get WireGuard. Generally speaking, because of the
extremely lightweight nature and "stateless" configuration of WireGuard,
many view it as a core and essential utility, initiated at boot time
and immediately configured by netifd, much like the use of things like
GRE tunnels.
WireGuard has a backwards and forwards compatible Netlink API, which
means the userspace tools should work with both newer and older kernels
as things change. There should be no versioning requirements, therefore,
between kernel bumps and userspace package bumps.
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Jo-Philipp Wich <jo@mein.io>
Acked-by: Felix Fietkau <nbd@nbd.name>