ZeroTier Planetary Switch Users Guide
This manual describes the design and operation of ZeroTier and its associated services, apps, and libraries. Its intended audience includes IT professionals, network administrators, information security experts, and developers.
ZeroTier Central, our enterprise web UI, has its own help and documentation that can be accessed through its interface.
Table of Contents
- Introduction
- How it Works
- ZeroTier One: The Network Virtualization Service
- Common Use Cases
- For Developers: Connecting IoT Devices and Apps
- Licensing
1. Introduction
ZeroTier is a smart Ethernet switch for planet Earth.
We've re-thought networking from first principles to deliver the flat end-to-end simplicity of the original pre-NAT pre-mobility Internet in a way that meets the security and mobility requirements of the 21st century. ZeroTier transforms the world into a unified modern data center where VPN, SDN, SD-WAN, and application peer to peer networking converge and where the distinction between the cloud and the endpoint largely disappears. All the complexity of managing these networking aspects as disparate systems is replaced by the simplicity of a single virtual cloud.
At first some users struggle with this paradigm, finding it difficult to forget the fragmentation and complexity that has accreted around networking over the past decade or two. We urge skeptical users to just try it and see how many networking acronyms vanish before their eyes.
Unlike most networking products it won't take you hours, days, or weeks to test or deploy ZeroTier. Most of the time everything just works with zero configuration, and most users with some level of TCP/IP knowledge can get up and running in minutes. More advanced features like rules, micro-segmentation, capability based security credentials, network monitoring, and clustering are available but you don't need to worry about them until they're needed.
Section 2 of this guide explains ZeroTier's design and operation in detail and is written for users with at least an intermediate knowledge of topics like TCP/IP and Ethernet networking. Reading and understanding everything in it is not mandatory, but we've written it as a deep technical dive because serious IT users typically like to understand the systems they deploy and use. Sections 3 and 4 deal more concretely with the ZeroTier One endpoint service software and how to deploy it for common use cases.
2. How it Works
ZeroTier consists of two closely coupled but conceptually distinct layers in the OSI model sense: a virtual "wire" layer called VL1 that carries data, and a virtual switched Ethernet layer called VL2 that provides devices and apps with a familiar communication paradigm.
2.1. VL1: The ZeroTier Peer to Peer Network
To build a planetary data center we first had to begin with the wiring. Tunneling into the Earth's core and putting a giant wire closet down there wasn't an option, so we decided to use software to build virtual wires over the existing Internet instead.
In conventional networks L1 (OSI layer 1) refers to the actual CAT5/CAT6 cables or wireless radio channels over which data is carried and the physical transceiver chips that modulate and demodulate it. VL1 is a peer to peer network that does the same thing by using encryption, authentication, and a lot of networking tricks to create virtual wires as needed.
2.1.1. Network Topology and Peer Discovery
VL1's persistent structure is a hierarchical tree similar to DNS, but its leaves make direct ephemeral connections to one another on demand. At the base of the tree resides a pool of equal and fully redundant roots whose function is closely analogous to that of DNS root name servers.
Roots run the same software as regular endpoints but reside at fast stable locations on the network and are designated as such by a world definition. There are two kinds of world definitions: a planet and a moon. The ZeroTier protocol contains a secure mechanism allowing world definitions to be updated in band.
There is only one planet. Earth's root servers are operated by ZeroTier, Inc. as a free service. Their presence defines and unifies the global data center where we all reside.
Users can create "moons." These nominate additional roots for redundancy or performance. The most common reasons for doing this are to eliminate hard dependency on ZeroTier's third party infrastructure or to designate local roots inside your building or cloud so ZeroTier can work without a connection to the Internet. Moons are by no means required and most of our users get by just fine without them.
When peers start out they have no direct links to one another, only upstream to roots. Every peer on VL1 possesses a globally unique address, but unlike IP addresses these are opaque cryptographic identifiers that encode no routing information. To communicate, peers first send packets "up" the tree, and as these packets traverse the network they trigger the opportunistic creation of direct links along the way. The tree is constantly trying to "collapse itself" to match the pattern of traffic it is carrying.
In the simplest case using only global roots, an initial connection setup between peers A and B goes like this:
- A wants to send a packet to B, but since it has no direct path it sends it upstream to R (a root).
- R does have a direct link to B so it forwards the packet there.
- R also sends a message called RENDEZVOUS to A containing hints about how it might reach B, and to B informing it how it might reach A. We call this "transport triggered link provisioning."
- A and B get RENDEZVOUS and attempt to send test messages to each other, possibly accomplishing hole punching of any NATs or stateful firewalls that happen to be in the way. If this works a direct link is established and packets no longer need to take the scenic route.
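The steps above can be sketched as a small in-memory simulation. This is only an illustration of the message flow, not ZeroTier's implementation: the `Peer` class, its method names, the example addresses and endpoints, and the `reachable` flag (which stands in for the real outcome of NAT hole punching) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """A VL1 node. Roots and ordinary peers share this one class."""
    address: str   # opaque ZeroTier address (10 hex digits)
    endpoint: str  # hypothetical "ip/port" at which the node is reachable
    direct: dict = field(default_factory=dict)  # address -> Peer (direct links)
    hints: dict = field(default_factory=dict)   # address -> endpoint (from RENDEZVOUS)
    inbox: list = field(default_factory=list)   # payloads delivered to this node

    def send(self, dest: str, payload: str, upstream: "Peer") -> str:
        """Use a direct link if one exists, otherwise send upstream."""
        if dest in self.direct:
            self.direct[dest].inbox.append(payload)
            return "direct"
        upstream.relay(self, dest, payload)
        return "relayed"

    def relay(self, src: "Peer", dest: str, payload: str) -> None:
        """Root behaviour: forward the packet, then hint both sides."""
        target = self.direct[dest]
        target.inbox.append(payload)
        src.hints[dest] = target.endpoint         # RENDEZVOUS to A: how to reach B
        target.hints[src.address] = src.endpoint  # RENDEZVOUS to B: how to reach A

    def try_direct(self, other: "Peer", reachable: bool) -> bool:
        """Attempt a direct link; `reachable` models hole punching success."""
        if other.address in self.hints and reachable:
            self.direct[other.address] = other
            other.direct[self.address] = self
            return True
        return False


# Illustrative values only (documentation IP ranges, made-up addresses).
root = Peer("0000000001", "198.51.100.1/9993")
a = Peer("aaaaaaaaaa", "203.0.113.10/9993")
b = Peer("bbbbbbbbbb", "203.0.113.20/9993")
root.direct = {a.address: a, b.address: b}  # the root has links to both peers

first = a.send(b.address, "hello", upstream=root)   # no direct path yet
a.try_direct(b, reachable=True)                     # act on the RENDEZVOUS hints
second = a.send(b.address, "again", upstream=root)  # direct link now exists
```

In this sketch the first packet travels through the root and the second goes over the direct link. Passing `reachable=False` models the case where hole punching fails and traffic keeps flowing through the root.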
VL1 provides instant always-on virtual L1 connectivity between all devices in the world. Indirect paths are automatically and transparently upgraded to direct paths whenever possible, and if a direct path is lost ZeroTier falls back to indirect communication and the process begins again.
If a direct path can never be established, indirect communication can continue forever with direct connection attempts also continuing indefinitely on a periodic basis. The protocol also contains other facilities for direct connectivity establishment such as LAN peer discovery, port prediction to traverse IPv4 symmetric NATs, and explicit endpoint advertisement that can be coupled with port mapping using UPnP or NAT-PMP if available. VL1 is persistent and determined when it comes to finding the most efficient way to move data.
This is not a wholly unique design. It shares features in common with STUN/ICE, SIP, and other protocols. The most novel aspect may be the simplification achieved through lazy transport triggered link provisioning. This trades the complicated state machines of STUN/ICE for a stateless algorithm with implicit empirical parameters. It also eliminates asymmetry. As mentioned above, roots run the same code as regular nodes.
2.1.2. Addressing
Every device (a "device" can be anything from a laptop to an app) is identified on VL1 by a 40-bit (10 hex digit) unique ZeroTier address. These are the addresses used to address packets in the process described in 2.1.1 above.
These addresses are computed from the public portion of a public/private key pair. An address along with its public key is called an identity. If you look at the home directory of a running ZeroTier instance you will see identity.public and identity.secret.
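As a small illustration of the address format described above (40 bits written as 10 hex digits), the following sketch converts between an integer and its textual form. The function names and the example value are hypothetical, and the sketch does not show how an address is derived from a public key.

```python
import string

ADDRESS_BITS = 40
ADDRESS_HEX_DIGITS = ADDRESS_BITS // 4  # 10 hex digits


def format_address(value: int) -> str:
    """Render a 40-bit integer as a zero-padded, 10-digit hex string."""
    if not 0 <= value < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in 40 bits")
    return f"{value:0{ADDRESS_HEX_DIGITS}x}"


def parse_address(text: str) -> int:
    """Parse a 10-digit hex string back into a 40-bit integer."""
    if len(text) != ADDRESS_HEX_DIGITS or any(c not in string.hexdigits for c in text):
        raise ValueError("expected exactly 10 hex digits")
    return int(text, 16)
```

For example, `format_address(1)` yields `"0000000001"`, and `parse_address` rejects any string that is not exactly 10 hex digits.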
When ZeroTier starts for the first time it generates a new key pair and a new identity. It then attempts to advertise it upstream to the network. In the very unlikely event that the identity's 40-bit unique address is taken, it discards it and generates another.
Identities are claimed on a first-come, first-served basis and currently expire from global roots after 60 days of inactivity. If a long-dormant device returns it may re-claim its identity unless its address has been taken in the meantime (again, highly unlikely).
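To give a rough sense of why an accidental collision is so unlikely, the sketch below estimates the chance that a freshly generated address is already taken. It assumes addresses are spread uniformly over the 40-bit space, which is an assumption of this example rather than something stated here, and the device count is a made-up figure.

```python
ADDRESS_SPACE = 2 ** 40  # 1,099,511,627,776 possible 40-bit addresses


def collision_chance(existing_identities: int) -> float:
    """Chance that one new random address matches an existing one,
    assuming addresses are uniformly distributed (an assumption)."""
    if not 0 <= existing_identities <= ADDRESS_SPACE:
        raise ValueError("identity count outside the address space")
    return existing_identities / ADDRESS_SPACE


# Hypothetical population of one million active identities.
chance = collision_chance(1_000_000)  # roughly 9.1e-07
```

Under these assumptions, fewer than one in a million newly generated identities would need to be discarded and regenerated.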
The address derivation algorithm used to compute addresses from public keys imposes a computational cost barrier against the intentional generation of a collision. Currently it would take approximately 10,000 CPU-years to do so (assuming e.g. a 3 GHz Intel core). This is expensive but not impossible, and it is only the first line of defense. After generating a collision, an attacker would then have to compromise all upstream nodes and replace the address's cached identity, not to mention doing the same for peers that have seen the target identity recently.
In addition to assisting with communication, upstream nodes also act as identity caches. If the identity corresponding to an address is not known a peer may request it by sending a message called WHOIS upstream.
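The caching behaviour described above can be sketched as a lookup that falls back to an upstream node. This is a simplified model, not the wire protocol: the `IdentityCache` class, its `whois` method, and the placeholder identity bytes are hypothetical.

```python
from typing import Optional


class IdentityCache:
    """A node's view of known identities, with an optional upstream node."""

    def __init__(self, upstream: Optional["IdentityCache"] = None):
        self.known = {}  # address (str) -> identity (bytes)
        self.upstream = upstream

    def whois(self, address: str) -> Optional[bytes]:
        """Return the identity for an address, asking upstream if unknown."""
        if address in self.known:
            return self.known[address]
        if self.upstream is None:
            return None  # nobody further up to ask
        identity = self.upstream.whois(address)  # models sending WHOIS upstream
        if identity is not None:
            self.known[address] = identity  # remember the answer locally
        return identity


# Illustrative values only.
root = IdentityCache()
root.known["8056c2e21c"] = b"example-public-key"
peer = IdentityCache(upstream=root)
found = peer.whois("8056c2e21c")
```

After the first lookup the peer holds its own copy of the identity, so repeated lookups for the same address do not need to go upstream again.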
2.1.3. Encryption and Authentication
If you don't know much about cryptography you can safely skip this section. TL;DR: packets are end-to-end encrypted and can't be read by roots or anyone else, and we use modern 256-bit crypto in ways recommended by the professional cryptographers that created it.
Asymmetric public key cryptography uses Curve25519/Ed25519, a 256-bit elliptic curve variant.
Every VL1 packet is encrypted end to end using (as of the current version) 256-bit Salsa20 and authenticated using the Poly1305 message authentication (MAC) algorithm.
MAC is computed after encryption (encrypt-then-MAC) and the cipher/MAC composition used is identical to the NaCl reference implementation.
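The encrypt-then-MAC ordering can be illustrated with Python's standard library. Note the substitutions: the standard library has neither Salsa20 nor Poly1305, so this sketch uses a SHAKE-256 keystream and HMAC-SHA256 as stand-ins. It also takes separate encryption and MAC keys as inputs, which is an assumption of the example. It shows only the order of operations and is neither ZeroTier's packet format nor production-grade cryptography.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 tag length (stand-in for a Poly1305 tag)


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHAKE-256 keystream (stand-in for Salsa20)."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt first, then compute the MAC over the ciphertext."""
    ciphertext = _keystream_xor(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def unseal(enc_key: bytes, mac_key: bytes, nonce: bytes, packet: bytes) -> bytes:
    """Verify the MAC before touching the ciphertext, then decrypt."""
    if len(packet) < TAG_LEN:
        raise ValueError("packet too short")
    ciphertext, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return _keystream_xor(enc_key, nonce, ciphertext)
```

Because the tag covers the ciphertext, a receiver can reject a forged or corrupted packet without decrypting it, which is the main reason for choosing this ordering.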
As of today we do not implement forward secrecy or other stateful cryptographic features in VL1. We omit them for the sake of simplicity, reliability, and code footprint, and because frequently changing state makes features like clustering and fail-over much harder to implement. See our discussion on GitHub.
For those who have very high security needs and want forward secrecy, we currently recommend the use of encrypted protocols such as SSH and SSL over ZeroTier. Not only do these provide forward secrecy, the use of multiple layers of encryption in this way provides excellent defense in depth. The computational cost of this additional crypto is typically small, and the benefit can potentially be large. All software can contain bugs, but multiple layers of protection means that discovery of a catastrophic bug in any one layer does not result in compromise of your entire system. We recommend the same for authentication. While ZeroTier VL2 provides certificate-based network boundary enforcement, we do not recommend that users rely solely on this for access control to critical systems. It is always good to use more than one security measure whenever practical.
2.1.4. Trusted Paths for Fast Local SDN
To support the use of ZeroTier as a high performance SDN/NFV protocol over physically secure networks, the protocol supports a feature called trusted paths. It is possible to configure all ZeroTier devices on a given network to skip encryption and authentication for traffic over a designated physical path. This can cut CPU use noticeably in high traffic scenarios, but at the expense of effectively all transport security over the configured trusted backplane.
Trusted paths do not prevent communication with devices elsewhere, since traffic over other paths will be encrypted and authenticated normally.
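The per-path decision described above can be sketched as a check of the remote physical address against a list of designated networks. This is a conceptual model only: the subnet, the `TRUSTED_NETWORKS` list, and the function names are hypothetical and do not represent ZeroTier's configuration format.

```python
import ipaddress

# Hypothetical physically secure backplane designated as a trusted path.
TRUSTED_NETWORKS = [ipaddress.ip_network("10.0.0.0/24")]


def path_is_trusted(remote_ip: str) -> bool:
    """True if the remote physical address lies inside a trusted network."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in TRUSTED_NETWORKS)


def transport_mode(remote_ip: str) -> str:
    """Skip encryption and authentication only on trusted paths."""
    return "plaintext" if path_is_trusted(remote_ip) else "encrypted"
```

Here `transport_mode("10.0.0.7")` returns `"plaintext"`, while any address outside the designated network, such as `"203.0.113.5"`, returns `"encrypted"`.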
We don't recommend the use of this feature unless you really need the performance and you know what you're doing. Extra security is never a bad thing. We also recommend thinking carefully before disabling transport security on a cloud private network. Larger cloud providers such as Amazon and Azure tend to provide good network segregation but many less costly providers offer private networks that are "party lines." For these the encryption and authentication provided by ZeroTier is very desirable. In fact, we have a few users using ZeroTier exactly for this reason.