This is the version 2 of the libp2p Circuit Relay protocol.
Lifecycle Stage | Maturity | Status | Latest Revision |
---|---|---|---|
1A | DRAFT | Active | r1, 2021-12-17 |
Authors: @vyzo
Interest Group: @mxinden, @stebalien, @raulk
See the lifecycle document for context about maturity level and spec status.
This is the specification of v2 of the p2p-circuit relay protocol.
Compared to the first version of the protocol, there are some significant departures:
- The protocol has been split into two subprotocols,
hop
andstop
- The
hop
protocol is client-initiated, and is used when clients send commands to relays; it is used for reserving resources in the relay and opening a switched connection to a peer through the relay. - The
stop
protocol governs the endpoints of circuit switched connections.
- The
- The concept of resource reservation has been introduced, whereby peers wishing to use a relay explicitly reserve resources and obtain reservation vouchers which can be distributed to their peers for routing purposes.
- The concept of limited relaying has been introduced, whereby relays provide switched connectivity with a limited duration and data cap.
The evolution of the protocol towards v2 has been influenced by our experience in operating open relays in the wild. The original protocol, while very flexible, has some limitations when it comes to the practicalities of relaying connections.
The main problem is that v1 has no mechanism to reserve resources in the relay, which leads to continuous over-subscription of relays and the necessity of (often ineffective) heuristics for balancing resources. In practice, running a relay proved to be an expensive proposition requiring dedicated hosts with significant hardware and bandwidth costs. In addition, there is ongoing work in Hole Punching coordination for direct connection upgrade through relays, which doesn't require an unlimited relay connection.
In order to address the situation and seamlessly support pervasive hole punching, we have introduced limited relays and slot reservations. This allows relays to effectively manage their resources and provide service at a small scale, thus enabling the deployment of an army of relays for extreme horizontal scaling without excessive bandwidth costs and dedicated hosts.
Furthermore, the original decision to conflate circuit initiation and termination in the same protocol has made it very hard to provide relay service on demand, decoupled with whether client functionality is supported by the host.
In order to address this problem, we have split the protocol into the
hop
and stop
subprotocols. This allows us to always enable the
client-side functionality in a host, while providing the option to
later mount the relay service in public hosts, after the
reachability of the host has been determined through AutoNAT.
The following diagram illustrates the interaction between three peers, A, B, and R, in the course of establishing a relayed connection. Peer A is a private peer, which is not publicly reachable; it utilizes the services of peer R as the relay. Peer B is another peer who wishes to connect to peer A through R.
Instructions to reproduce diagram
Use https://plantuml.com/ and the specification below to reproduce the diagram.
@startuml
participant A
participant R
participant B
skinparam sequenceMessageAlign center
== Reservation ==
A -> R: [hop] RESERVE
R -> A: [hop] STATUS:OK
hnote over A: Reservation timeout approaching.
hnote over A: Refresh.
A -> R: [hop] RESERVE
R -> A: [hop] STATUS:OK
hnote over A: ...
== Circuit Establishment ==
B -> R: [hop] CONNECT to A
R -> A: [stop] CONNECT from B
A -> R: [stop] STATUS:OK
R -> B: [hop] STATUS:OK
B <-> A: Connection
@enduml
The first part of the interaction is A's reservation of a relay slot
in R. This is accomplished by opening a connection to R and
sending a RESERVE
message in the hop
protocol; if the reservation
is successful, the relay responds with a STATUS:OK
message and
provides A with a reservation voucher. A keeps the connection to
R alive for the duration of the reservation, refreshing the
reservation as needed.
The second part of the interaction is the establishment of a circuit
switch connection from B to A through R. It is assumed that B
has obtained a circuit multiaddr for A of the form
/p2p/QmR/p2p-circuit/p2p/QmA
out of band using some peer discovery
service (eg. the DHT or a rendezvous point).
In order to connect to A, B then connects to R, opens a hop
protocol stream and sends a CONNECT
message to the relay. The relay
verifies that it has a reservation and connection for A and opens a
stop
protocol stream to A, sending a CONNECT
message.
Peer A then responds to the relay with a STATUS:OK
message, which
responds to B with a STATUS:OK
message in the open hop
stream
and then proceeds to bridge the two streams into a relayed connection.
The relayed connection flows in the hop
stream between the
connection initiator and the relay and in the stop
stream between
the relay and the connection termination point.
The Hop protocol governs interaction between clients and the relay;
it uses the protocol ID /libp2p/circuit/relay/0.2.0/hop
.
There are two parts of the protocol:
- reservation, by peers that wish to receive relay service
- connection initiation, by peers that wish to connect to a peer through the relay.
In order to make a reservation, a peer opens a connection to the relay and sends a HopMessage
with
type = RESERVE
:
HopMessage {
type = RESERVE
}
The relay responds with a HopMessage
of type = STATUS
, indicating whether the reservation has
been accepted.
If the reservation is accepted, then the message has the following form:
HopMessage {
type = STATUS
status = OK
reservation = Reservation {...}
limit = Limit {...}
}
If the reservation is rejected, the relay responds with a HopMessage
of the form
HopMessage {
type = STATUS
status = ...
}
where the status
field has a value other than OK
. Common rejection status codes are:
PERMISSION_DENIED
if the reservation is rejected because of peer filtering using ACLs.RESERVATION_REFUSED
if the reservation is rejected for some other reason, e.g. because there are too many reservations.
The reservation
field provides information about the reservation itself; the struct has the following fields:
Reservation {
expire = ...
addrs = [...]
voucher = ...
}
- the
expire
field contains the expiration time as a UTC UNIX time. The reservation becomes invalid after this time and it's the responsibility of the client to refresh. - the
addrs
field contains all the public relay addrs, including the peer ID of the relay node but not the trailingp2p-circuit
part; the client can use this list to construct its ownp2p-circuit
relay addrs for advertising by encapsulatingp2p-circuit/p2p/QmPeer
whereQmPeer
is its peer ID. - the
voucher
is the binary representation of the reservation voucher -- see Reservation Vouchers for details.
The limit
field in HopMessage
, if present, provides information about the limits applied by the relay in relayed connection. When omitted, it indicates that the relay does not apply any limits.
The struct has the following fields:
Limit {
duration = ...
data = ...
}
- the
duration
field indicates the maximum duration of a relayed connection in seconds; if 0, there is no limit applied. - the
data
field indicates the maximum number of bytes allowed to be transmitted in each direction; if 0 there is no limit applied.
Note that the reservation remains valid until its expiration, as long as there is an active connection from the peer to the relay. If the peer disconnects, the reservation is no longer valid.
The server may drop a connection according to its connection management policy after all reservations expired. The expectation is that the server will make a best effort attempt to maintain the connection for the duration of any reservations and tag it to prevent accidental termination according to its connection management policy. If a relay server becomes overloaded however, it may still drop a connection with reservations in order to maintain its resource quotas.
Note: Implementations should not accept reservations over already relayed connections.
In order to initiate a connection to a peer through a relay, the initiator opens a connection and sends a HopMessage
of type = CONNECT
:
HopMessage {
type = CONNECT
peer = Peer {...}
}
The peer
field contains the peer ID
of the target peer and optionally the address of that peer for the case of active relay:
Peer {
id = ...
addrs = [...]
}
Note: Active relay functionality is considered deprecated for security reasons, at least in public relays.
The protocol reserves the field nonetheless to support the functionality for the rare cases where it is actually desirable to use active relay functionality in a controlled environment.
If the relay has a reservation (and thus an active connection) from the peer, then it opens the second hop of the connection using the stop
protocol; the details are not relevant for the hop
protocol and the only thing that matters is whether it succeeds in opening the relay connection or not.
If the relayed connection is successfully established, then the relay responds with HopMessage
with type = STATUS
and status = OK
:
HopMessage {
type = STATUS
status = OK
limit = Limit {...}
}
At this point the original hop
stream becomes the relayed connection.
The limit
field, if present, communicates to the initiator the
limits applied to the relayed connection with the semantics described
above.
If the relayed connection cannot be established, then the relay responds with a HopMessage
of type = STATUS
and the status
field having a value other than OK
.
Common failure status codes are:
PERMISSION_DENIED
if the connection is rejected because of peer filtering using ACLs.NO_RESERVATION
if there is no active reservation for the target peerRESOURCE_LIMIT_EXCEEDED
if there are too many relayed connections from the initiator or to the target peer.CONNECTION_FAILED
if the relay failed to terminate the connection to the target peer.
Note: Implementations should not accept connection initiations over already relayed connections.
The Stop protocol governs connection termination between the relay and the target peer;
it uses the protocol ID /libp2p/circuit/relay/0.2.0/stop
.
In order to terminate a relayed connection, the relay opens a stream using an existing connection to the target peer. If there is no existing connection, an active relay may attempt to open one using the initiator supplied address, but as discussed in the previous section this functionality is generally deprecated.
The relay sends a StopMessage
with type = CONNECT
and the following form:
StopMessage {
type = CONNECT
peer = Peer { ID = ...}
limit = Limit { ...}
}
- the
peer
field contains aPeer
struct with the peerID
of the connection initiator. - the
limit
field, if present, conveys the limits applied to the relayed connection with the semantics described above.
If the target peer accepts the connection it responds to the relay with a StopMessage
of type = STATUS
and status = OK
:
StopMessage {
type = STATUS
status = OK
}
At this point the original stop
stream becomes the relayed connection.
If the target fails to terminate the connection for some reason, then it responds to the relay with a StopMessage
of type = STATUS
and the status
code set to something other than OK
.
Common failure status codes are:
CONNECTION_FAILED
if the target internally failed to create the relayed connection for some reason.
Successful relay slot reservations should come with Reservation Vouchers. These are cryptographic certificates signed by the relay that testify that it is willing to provide service to the reserving peer. The intention is to eventually require the use of reservation vouchers for dialing relay addresses, but this is not currently enforced so the vouchers are only advisory.
The voucher itself is a Signed Envelope.
The envelope domain is libp2p-relay-rsvp
and uses the multicodec code 0x0302
.
The payload of the envelope has the following form, in canonicalized protobuf format:
message Voucher {
required bytes relay = 1;
required bytes peer = 2;
required uint64 expiration = 3;
}
- the
relay
field is the peer ID of the relay. - the
peer
field is the peer ID of the reserving peer. - the
expiration
field is the UNIX UTC expiration time for the reservation.
The wire representation is canonicalized, where elements of the message are written in field id order, with no unknown fields.
message HopMessage {
enum Type {
RESERVE = 0;
CONNECT = 1;
STATUS = 2;
}
required Type type = 1;
optional Peer peer = 2;
optional Reservation reservation = 3;
optional Limit limit = 4;
optional Status status = 5;
}
message StopMessage {
enum Type {
CONNECT = 0;
STATUS = 1;
}
required Type type = 1;
optional Peer peer = 2;
optional Limit limit = 3;
optional Status status = 4;
}
message Peer {
required bytes id = 1;
repeated bytes addrs = 2;
}
message Reservation {
required uint64 expire = 1; // Unix expiration time (UTC)
repeated bytes addrs = 2; // relay addrs for reserving peer
optional bytes voucher = 3; // reservation voucher
}
message Limit {
optional uint32 duration = 1; // seconds
optional uint64 data = 2; // bytes
}
enum Status {
OK = 100;
RESERVATION_REFUSED = 200;
RESOURCE_LIMIT_EXCEEDED = 201;
PERMISSION_DENIED = 202;
CONNECTION_FAILED = 203;
NO_RESERVATION = 204;
MALFORMED_MESSAGE = 400;
UNEXPECTED_MESSAGE = 401;
}