Milestones for XMTP Network

Created: 2022-07-11
Author(s): @pranay
Status: Draft

Overview​

This document outlines a development plan for XMTP Network.

Background & Motivation​

XMTP Network provides infrastructure that allows users to send and receive messages on behalf of their blockchain wallets. The protocols used in this network are implemented by node software. Initially, the node software and the underlying network are maintained by XMTP Labs. Over time, the network will become completely decentralized, i.e., the community will own the network and there will be multiple software implementations of the network protocols.

Why Decentralization​

Decentralization is neither a precisely defined term nor one specific to cryptocurrencies and blockchains. It has different meanings in different academic fields and contexts. The working definition used here comes from Vitalik Buterin's article "The Meaning of Decentralization". Vitalik presents three types of decentralization. In the context of XMTP Network, these can be interpreted as:

  1. Architectural (de)centralization: How many nodes are in the XMTP Network?
  2. Political (de)centralization: How many individuals or organizations control the development of the XMTP Network?
  3. Logical (de)centralization: If the network is cut in half, including both nodes and users, will both halves continue to fully operate as independent units?

Achieving decentralization along these three axes yields several useful properties, including:

  • Fault resistance
  • Attack resistance
  • Collusion resistance

These are all desirable properties to achieve in the XMTP Network. Therefore, the end goal for the XMTP Network and nodes is to achieve decentralization along all three axes. In this document, the focus is mainly on Architectural decentralization and Logical decentralization.

Terminology​

In the rest of the document, we use the following terms:

  • Network - XMTP Network protocols
  • Node - Both the software implementation of network protocols and the hardware on which it's run
  • Client - Libraries and apps that connect to nodes
  • Users - End users of the XMTP network. Users connect to the network via clients

Goals / Non-goals​

Goals​

  • Identify the stages of development for the network
  • Outline the solution space for each stage
  • Avoid status quo bias and move toward optimal solutions

Non-goals​

  • Timeline for the stages
  • Governance structure
  • Token economics

Summary​

The development of the XMTP network can be broken down into five stages:

  1. Current: Status quo.
  2. Benevolent third parties: Allow external parties to run nodes.
  3. Sharded message data store: Shard the message data for better scalability.
  4. Referred nodes: Non-trusted third parties can run nodes.
  5. Incentivized node operation: Enable token incentives for node operation.

Milestones​

Current (2022-07-11)

The network protocol is built on top of Waku's relay protocol, which implements pub/sub over libp2p. The Waku protocol trades message integrity and authentication for message unlinkability. This opens the network to many types of attacks, including spam. We believe that by achieving decentralization, the network will become more resilient to faults and attacks.

Currently, XMTP Labs develops and maintains both the network protocols and the node software.

Benevolent third parties​

In this future state, XMTP Labs will invite external parties to maintain XMTP nodes and contribute to the development of node software. This new set of network actors is selected for its commitment to act in the protocol's best interest. At this stage, the network protocols use a more complex message replication strategy (for example, CRDTs) so that all nodes have a consistent view of the network state. The network is also more resilient to node churn and to nodes with diverse computing and networking capacities.
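To make the CRDT idea concrete, here is a minimal sketch of a grow-only set (G-Set), one of the simplest CRDTs. It is only an illustration of how replicas can converge without coordination, not the replication design chosen for XMTP nodes; the class name `GSetReplica` and the use of message IDs as set elements are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class GSetReplica:
    """Hypothetical node replica holding the message IDs it has seen (a G-Set CRDT)."""

    messages: set[str] = field(default_factory=set)

    def add(self, message_id: str) -> None:
        # Local update: elements are only ever added, never removed.
        self.messages.add(message_id)

    def merge(self, other: "GSetReplica") -> None:
        # Set union is commutative, associative, and idempotent, so replicas
        # converge regardless of the order or number of merges.
        self.messages |= other.messages


node_a, node_b = GSetReplica(), GSetReplica()
node_a.add("msg-1")
node_b.add("msg-2")

# Exchange state in both directions; afterwards both replicas agree.
node_a.merge(node_b)
node_b.merge(node_a)
```

After the two merges, both replicas hold the same set of message IDs even though each node received a different message first.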

Sharded message data store​

Use cases for the XMTP network go beyond simple text-based messaging between users. These include multiple message types and real-time messaging, among others. In this stage, the network supports these advanced use cases in a scalable way.

Sharding is a partitioning technique used in scalable systems. In most cases, the data is divided into smaller sets (based on a shard key) that are stored and processed on separate machines or servers. Logical sharding divides the data according to a logical criterion rather than a physical one. A combination of these techniques can be used to support the network.
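As a rough illustration of key-based sharding, the sketch below maps a shard key to a shard index by hashing it. The function name `shard_for` and the choice of a conversation topic as the shard key are assumptions for this example; the document does not specify which key the network would use.

```python
import hashlib


def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard key to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # A cryptographic hash spreads keys evenly and is stable across processes,
    # unlike Python's built-in hash(), which is randomized per run for strings.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical shard key: a conversation topic shared by two users.
topic = "conversation/alice-bob"
shard = shard_for(topic, num_shards=16)
```

A plain modulo mapping like this reshuffles most keys when `num_shards` changes; schemes such as consistent hashing reduce that movement at the cost of extra complexity.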

The main challenge in implementing a sharding solution is finding a sharding key that minimizes the dependencies between data stored in different shards. Ideally, there are no such dependencies. More realistically, some dependencies between shards are unavoidable. In this case, nodes on the network might store the entire data for one shard but only a partial view of a dependent shard. Authenticated data structures play an important role in ensuring that nodes with only a partial view still get cryptographic guarantees about the data.
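A Merkle tree is one common authenticated data structure. The sketch below shows how a node holding only one item and a short proof can check that the item belongs to a data set summarized by a root hash. It is a simplified illustration (non-empty input, last node duplicated on odd levels), and the function names are assumptions for this example rather than part of any XMTP protocol.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list[bytes], index: int):
    """Return (root, proof) for leaves[index]; leaves must be non-empty."""
    # Domain-separate leaf hashes (0x00) from interior hashes (0x01).
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1
        # Record the sibling hash and whether it sits to the left of our node.
        proof.append((level[sibling], sibling < index))
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from the leaf to the root using only the proof."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = _h(b"\x01" + sibling + node)
        else:
            node = _h(b"\x01" + node + sibling)
    return node == root


messages = [b"msg-a", b"msg-b", b"msg-c"]
root, proof = merkle_root_and_proof(messages, index=2)
ok = verify(b"msg-c", proof, root)
```

The verifier needs only the root and a proof whose length grows logarithmically with the number of items, which is what makes a partial view of a dependent shard workable.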

Referred nodes​

In this state, the network is open to more participants (via referrals from existing nodes) but is not fully public. As a result, we expect failures from a wide range of causes, including malicious participants and network failures. Therefore, the network must be Byzantine Fault Tolerant (BFT) at this stage, so that correct nodes keep operating despite faulty or malicious ones. BFT consensus mechanisms are used in both centralized distributed databases and public blockchain databases. The main difference between these practical deployments is that centralized networks are optimized for throughput among a small number of nodes (often tens), whereas decentralized networks sacrifice throughput to support a large number of nodes (often hundreds or thousands). As XMTP Labs still leads the development of network protocols, we will pick a consensus mechanism appropriate for the number of participants and other requirements at the time. The following is a summary of consensus mechanisms used in the top 30 blockchains (by market cap as of Aug 2022).

  • Proof-of-work: Bitcoin, Ethereum, Dogecoin, Litecoin, Ethereum Classic, Monero, Bitcoin Cash
  • BFT proof-of-stake: BNB, Ripple, Polygon, Cronos, Cosmos
  • Nakamoto proof-of-stake: Cardano, Solana, Polkadot, Tron, Near
  • Bespoke proof-of-stake: Avalanche, Stellar, Algorand

Even though the network doesn't necessarily have a token at this stage, we need to use or build protocols that will support the decentralization vision.
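To give a sense of how the number of participants constrains a BFT design, the sketch below applies the classical bound used by protocols in the PBFT family: a network of n nodes tolerates at most f Byzantine nodes when n ≥ 3f + 1. The function names are assumptions for this illustration, and the consensus mechanism eventually chosen may use different thresholds.

```python
def max_faulty(n: int) -> int:
    """Largest number of Byzantine nodes tolerated under n >= 3f + 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Smallest quorum such that any two quorums share a correct node."""
    f = max_faulty(n)
    # Two quorums of size q overlap in at least 2q - n nodes; requiring
    # 2q - n > f gives q > (n + f) / 2.
    return (n + f) // 2 + 1


for n in (4, 10, 100):
    print(n, max_faulty(n), quorum_size(n))
```

For n = 3f + 1 this reduces to the familiar quorum of 2f + 1, for example 3 out of 4 nodes or 7 out of 10.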

Storage​

Over time, the storage required for the state of the network may grow too large to manage reliably. In this case, nodes could outsource part of the state to long-term storage services. This provides several benefits, including:

  1. Nodes can snapshot the historical state and focus on the active state
  2. New nodes can join the network more easily
  3. Different incentives can be offered based on the type of service provided

These services could either be part of the network or be completely separate from it. Irrespective of how they are implemented, we need integrity of the outsourced data, i.e., assurance that the services continue to store it. Ensuring that these services store the data faithfully is a challenging problem. There are three classes of solutions for reliable storage:

  1. Economic solutions: Barring altruistic service providers, price is a good signal of the QoS offered by a service provider. A service provider that charges too little might not be storing the data faithfully. Appropriate economic incentive schemes could ensure the integrity of the data.
  2. Reputation-based solutions: Nodes may rely on the reputation of large cloud companies.
  3. Cryptographic solutions: Utilize cryptographic primitives to ensure that the data can be retrieved. These solutions have been used in many existing decentralized storage networks (DSNs), such as Filecoin, Storj, and Sia. At the core of these DSNs, there are users who want to store data and providers who offer storage for a fee. Protocols like Permacoin and Retricoin repurposed otherwise wasteful proof-of-work mining toward the more useful goal of storing data. The specific cryptographic solution for the storage problem depends on the answers to the following questions:
  • Public verifiability vs private verifiability: Who can verify data integrity?
  • Data retrievability: Can the clients of the service retrieve the data from the servers that pass the audit test?
  • Client storage and processing overhead: How much information does the client need to store? How much preprocessing/postprocessing is required on the client's side?
  • Server storage and processing overhead: How much additional information is required to support audits? How many audits can the server support per file?
  • Tolerance for data corruption: What fraction of the data can be corrupted before recovery becomes impossible? In coding theory, this tolerance is determined by the code rate of the scheme.

There are impossibility results for some combinations of the above options. At the same time, there are protocols for the most reasonable combinations of options.
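As one point in this design space, the sketch below shows a simple privately verifiable spot-check audit: the client keeps only a secret key, tags each block with an HMAC before outsourcing, and later challenges the provider on randomly sampled blocks. This is an illustrative simplification and not the scheme used by any of the networks named above; `StorageProvider`, `tag_block`, `audit`, and the tiny block size are all assumptions for this example.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 4  # unrealistically small, to keep the example readable


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def tag_block(key: bytes, index: int, block: bytes) -> bytes:
    # Binding the index into the tag stops the provider from answering a
    # challenge for block i with a different, still-intact block.
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


class StorageProvider:
    """Hypothetical untrusted service holding the blocks and their tags."""

    def __init__(self, blocks: list[bytes], tags: list[bytes]):
        self.blocks = list(blocks)
        self.tags = list(tags)

    def respond(self, index: int) -> tuple[bytes, bytes]:
        return self.blocks[index], self.tags[index]


def audit(key: bytes, provider: StorageProvider, num_blocks: int, samples: int) -> bool:
    """Challenge `samples` distinct random blocks; only the key holder can verify."""
    rng = secrets.SystemRandom()
    for index in rng.sample(range(num_blocks), samples):
        block, tag = provider.respond(index)
        if not hmac.compare_digest(tag, tag_block(key, index, block)):
            return False
    return True


key = secrets.token_bytes(32)
blocks = split_blocks(b"archived network state")
tags = [tag_block(key, i, b) for i, b in enumerate(blocks)]
provider = StorageProvider(blocks, tags)
passed = audit(key, provider, len(blocks), samples=3)
```

In terms of the questions above, this sketch is privately verifiable, keeps client storage to a single key, and adds one tag per block on the server. Sampling fewer blocks makes audits cheaper but lowers the chance of catching a small amount of corruption.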

Incentivized node operation​

At this stage, responsibility for network operation and node software development belongs to the community. Nodes that participate in incentivization will be held responsible for their actions: they are rewarded for desirable actions and disincentivized from taking undesirable ones. Additionally, XMTP Labs will turn over control of the protocol to the community via decentralized governance. Some existing examples of the transfer of governance to communities include:

References​