DevelopersTechnologyP2p
Technology

Peer-to-peer

Our peer-to-peer technology works at the heart of Spacedrive allowing all of your devices to seamlessly communicate and share data. This documentation outlines

Implementing features with P2P

TODO:

  • From frontend, to backend
  • Including authentication
  • Versioning/making breaking change
  • Show using sd_p2p_tunnel

Underlying technology

Terminology

  • Node: An application running Spacedrive's network stack.
    • This could be the Spacedrive app or the P2P relay.
    • If you have multiple Spacedrive installations open on your computer, each one is an independant node.
  • Library: A logical collection of your data within Spacedrive.
    • From a theorical perspective, a library is just the conflict resolved state of one or more instances although a lot of the time we don't stricly treat it that way.
  • Instance: An instance of a library running on a particular node.
    • An instance correlates directly to each SQLite file.
    • You could technically have more than one instance for a library on a single node, although our core would fall apart as we identify traffic by library.
  • Identity - A public/private keypair which represents the library or node.
  • RemoteIdentity - A public key which represents the library or node.
  • PeerId - The identifier libp2p uses. Can be derived from a RemoteIdentity.

sd_p2p crate

The P2P crate was designed from the group up to be modular.

The P2P struct is the core of the system, and suprisingly doesn't actually do any P2P functionality. It's a state manager and event bus along with providing a hook system for other components of the P2P system to register themselves.

This modular design helps with separting the concern which helps with comprehending the entire system and makes it easier for testing.

The sd_p2p crate provides a hook for:

  • Mdns - Local network discovery
  • Quic - Quic transport layer built on top of libp2p

What are hooks?

A hook is very similar to an actor. It's a component which can be registered with the P2P system.

A hook allows for processing events from the P2P system and also ensures when the P2P system shuts down, the hook is also shutdown.

Their are special hooks called listeners. These are implemented as a superset of a regular hook and are able to create and accept connections.

Subcrates:

sd_p2p_proto

This crate provides utilities for implementing asynchronous deserializers and matching synchronous serializers. The goal of these implementations is to really quickly send and receive Rust structs over the network.

This crate allows for creating implementations faster than other common options, at the cost of some developer experience.

Before building this I originally compared the performance of both msgpack and bincode against manual implementations using AsyncRead and I found that over the network using asynchronous deserialization was faster.

This makes logically makes sense as if you want to use a synchronous serializer you will do the following:

  • Send the total length of the message
  • Allocate a buffer for the message
  • Wait asynchronously for the buffer to be filled
  • Synchronously copy from the buffer into each of the struct fields

When using an asynchronous serializer you can skip sending the messages length and allocating the intermediate buffer as we can rely on the known length of each field while decoding and this is a win for performance and memory usage.

This crate provides utilities to make the implementations less error prone, however long term it would be great to replace this with a derive macro similar to how crates like serde work.

From my research no crate exists that meets these requirements. It is also a difficult problem because your juggling lifetimes and async which is rough. I attempted an implementation called binario, however it is still incomplete so we never adopted it. I suspect Rust's recent stablisation of RPITIT would make this much easier.

Local Network Discovery

Our local network discovery uses DNS-Based Service Discovery which itself is built around Multicast DNS (mDNS). This is a really well established technology and is used in Spotify Connect, Apple Airplay and many other services you use every day.

Service Structure

The following is an example of what would be broadcast from a single Spacedrive node:

# {remote_identity_of_self}._sd._udp.local.

name=Oscars Laptop        # Shown to the user to select a device
operating_system=macos    # Used for UI purposes
device_model=MacBook Pro  # Used for UI purposes
version=0.0.1 						# Spacedrive version

# For each library that's active on the Spacedrive node:
# {library_uuid}={remote_identity_of_self}
d66ed0c3-03ac-4f9b-a374-a927830dfd5b=0l9vTOWu+5aJs0cyWxdfJEGtloEepGRAXcEuDeTDRPk

Within sd-core this is defined in two parts. The PeerMetadata struct takes care of the node metadata and libraries are inserted by the libraries_hook.

Modes

note

This section discusses 'Contacts Only' which is not yet fully implemented (refer to ENG-1197).

Within Spacedrive's settings the user is able to choose between three modes for local network discovery:

  • Contacts only: Only devices that are in your contacts list will be able to see your device.
  • Enabled: All devices on the local network will be able to see your device.
  • Disabled: No devices on the local network will be able to see your device.

Enabled and Disabled are implemented by spawning and shutting down the sd_p2p::Mdns service as required within sd-core.

Contacts only the mDNS service will not contain the PeerMetadata fields and instead will contain a hash of the users Spacedrive identifier. If a Spacedrive node detects another node in the local network with a hash in it's contacts, it can make a request to the node and if the remote node also has the current node in it's contacts, it will respond with the full metadata.

Implementation

We make use of the mdns-sd crate.

Manual connection

The user can manually provide a set of SocketAddr's and the P2P system to attempt to connect to.

This feature primarily exists for usage in combination with Docker but it could be useful for working around difficult network setups.

Implementation

TODO - TODO

Problems with Docker

TODO - MDNS daemon TODO - Docker and why it's a pain mDNS. Explain the current stuff i've done with it.

Transport layer

TODO - Quic

TODO - Explain authentication

Relay

TODO

Direction Connect via Relay

TODO

Authentication

TODO - How we gonna restrict this???

Billing

TODO - How we gonna bill for this???

Design Decisions

TODO

Things I would do differently?

TODO

Crates

TODO

Security

Threat model

TODO - Risks of sharing IP's using discovery, risks of compromised relay, risks of compromised local network during pairing

Authentication

TODO

Authorization

TODO

Tracking

TODO - Link to Apple stuff

Version compatibility and breaking changes

TODO - Compatibility across versions of Spacedrive

libp2p

TODO - Why libp2p fork?, Why libp2p can be problematic for what we do TODO - How we transpose our certificates to libp2p certificates

Major issues

TODO - mDNS issues on Linux TODO - The double up of service discovery when using local and relay

TODO - Question? Why does remote_identity_of_self show up in metadata and the mDNS record itself.

TODO - Request flow. Eg. incoming goes from Quic to mpsc to the users code, to the handlers. TODO - Resumable uploads/transfers