
Message Storage Considerations

Status: Draft
Type: Informational
Date: 2022/03/16
Authors: Martin Kobetic

Background

Message storage is a large and complex topic, likely one of the hardest problems we will need to address in the context of the development of our protocol. This RFC is an attempt to summarize our current thinking on this topic, to serve as a starting point for our future efforts.

Message Storage

Message delivery requires storage because the recipient may not be available at the time the message is sent. The message may also travel through multiple nodes to reach its destination and the network must make sure it doesn't get lost while in transit.

Once a message reaches the recipient, it needs to be stored to allow the user to see past conversations. At that point the message doesn't need to reach another party; it just needs to remain available to the same one for as long as desired.

It seems there are relevant differences between the pre-delivery and post-delivery storage requirements that might drive each need towards a particular type of solution.

Both types of storage should be decentralized, with sufficient redundancy to guarantee the safety of the messages (in the sense of messages not getting lost unintentionally). The decentralization will likely need to be incentivized, but the incentives could be different in each case as well.

Pre-delivery Storage

Pre-delivery storage supports the delivery process. There is no need to support arbitrary querying, and the speed of retrieval may be fairly relaxed, as the user expects new messages to trickle in as they make it through the network. Obviously the delay cannot be arbitrarily long, but the expectation is not the same as for messages that have already been received. For any given user there would, on average, be many more received messages than messages in transit, and it is usually acceptable for messages in transit to show up with some delay, whereas received messages should be available promptly.

Pre-delivery storage will likely need to be shared across the network as supporting the delivery function is its primary purpose. Messages could be opportunistically deleted once delivered, which could help with managing the overall storage capacity and reduce the exposure to potential future compromise.
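The requirements above (no arbitrary querying, opportunistic deletion once delivered) can be illustrated with a minimal sketch. This is an in-memory stand-in only, not a proposed design: the names `Envelope`, `PreDeliveryStore`, `put`, `fetch` and `acknowledge` are hypothetical, and a real store would be shared and replicated across the network.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical unit of pre-delivery storage."""
    message_id: str
    recipient: str
    ciphertext: bytes  # assumed opaque to the storage node


class PreDeliveryStore:
    """Hypothetical in-memory stand-in for a shared pre-delivery store."""

    def __init__(self):
        # recipient -> {message_id: Envelope}
        self._pending = defaultdict(dict)

    def put(self, envelope):
        """Hold a message until the recipient picks it up."""
        self._pending[envelope.recipient][envelope.message_id] = envelope

    def fetch(self, recipient):
        """The only supported query: everything waiting for one recipient."""
        return list(self._pending.get(recipient, {}).values())

    def acknowledge(self, recipient, message_id):
        """Delete a message once delivery is confirmed.

        Returns True if the message was still held, False otherwise.
        """
        removed = self._pending.get(recipient, {}).pop(message_id, None)
        return removed is not None
```

The store deliberately exposes a single lookup path (by recipient) and drops a message as soon as it is acknowledged, which keeps the held volume small and limits what a later compromise could expose.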

Besides privacy, the pre-delivery security considerations include authentication, obscuring communication patterns, deniability, etc.

Post-delivery Storage

Post-delivery storage provides a historical view of past and current conversations for a given participant. Messages are "at rest": they don't need to get anywhere else, beyond being available on any device that can host the participant's wallet. They need to be available fairly promptly, and they should be the first ones to fill the Inbox view, usually in some sort of chronological order. As the body of conversations grows, some sort of pagination capability will be needed to manage the volume. A good filtering capability based on a wide range of criteria would likely be desirable as well.
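As an illustration of the pagination and filtering needs described above, here is a minimal sketch of cursor-based paging over a participant's received messages, newest first. The `StoredMessage` fields and the `page` function are assumptions made for the example, not part of any agreed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredMessage:
    """Hypothetical record for a delivered message."""
    message_id: str
    conversation: str
    sender: str
    timestamp: int  # assumed: seconds since the epoch
    body: str


def page(messages, limit, before=None, predicate=None):
    """Return (items, next_cursor) with up to `limit` messages, newest first.

    `before` is a (timestamp, message_id) cursor; only strictly older
    messages are returned. `predicate` is an optional filter function.
    `next_cursor` is None when fewer than `limit` items were found.
    """
    ordered = sorted(
        messages, key=lambda m: (m.timestamp, m.message_id), reverse=True
    )
    selected = []
    for m in ordered:
        if before is not None and (m.timestamp, m.message_id) >= before:
            continue  # at or after the cursor, already shown
        if predicate is not None and not predicate(m):
            continue  # excluded by the caller's filter
        selected.append(m)
        if len(selected) == limit:
            break
    if len(selected) == limit:
        last = selected[-1]
        return selected, (last.timestamp, last.message_id)
    return selected, None
```

A client would call `page(messages, 20)` to fill the Inbox view and pass the returned cursor back as `before` to load older messages. A filter such as `lambda m: m.conversation == "c1"` narrows the view to one conversation. A real store would index by timestamp rather than sort on every call.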

To support the speed of retrieval, the storage will likely need to be "close" to the clients that need it, close in the sense of having a direct connection, not one mediated through many intermediaries. It doesn't necessarily have to be shared across the network; participants could in theory choose different storage options based on their preferences (cost, security, flexibility, control, etc.). Users would likely expect the storage to be permanent by default, although they may want control over retention to manage cost.

The primary security concern post-delivery is privacy; the other pre-delivery concerns are not relevant at this stage.

Conclusions

It seems prudent to evaluate solutions in light of the specific requirements of these two stages. We may end up with a different solution for each, but not necessarily. The cost and attack surface of two separate solutions need to be balanced against the value of the objectives. Even if we end up with the same solution for both as the default, it might be desirable to allow individual users to diverge, whether through partners offering alternatives or by letting users "bring your own storage".
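One way to picture the "bring your own storage" option is a client written against a small storage interface, with the default and user-supplied backends both satisfying it. The sketch below is only an illustration under that assumption; `MessageStorage`, `DefaultNetworkStorage`, `UserSuppliedStorage` and `Client` are hypothetical names, and both backends are in-memory placeholders.

```python
from typing import Dict, Optional, Protocol


class MessageStorage(Protocol):
    """Hypothetical interface that any post-delivery backend would satisfy."""

    def save(self, message_id: str, ciphertext: bytes) -> None: ...

    def load(self, message_id: str) -> Optional[bytes]: ...


class DefaultNetworkStorage:
    """Placeholder for whatever default the network provides."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def save(self, message_id: str, ciphertext: bytes) -> None:
        self._blobs[message_id] = ciphertext

    def load(self, message_id: str) -> Optional[bytes]:
        return self._blobs.get(message_id)


class UserSuppliedStorage:
    """Placeholder for a backend the user controls (here, a dict they own)."""

    def __init__(self, backing: Dict[str, bytes]) -> None:
        self._backing = backing

    def save(self, message_id: str, ciphertext: bytes) -> None:
        self._backing[message_id] = ciphertext

    def load(self, message_id: str) -> Optional[bytes]:
        return self._backing.get(message_id)


class Client:
    """Depends only on the interface, so the backend can be swapped."""

    def __init__(self, storage: MessageStorage) -> None:
        self._storage = storage

    def archive(self, message_id: str, ciphertext: bytes) -> None:
        self._storage.save(message_id, ciphertext)

    def read(self, message_id: str) -> Optional[bytes]:
        return self._storage.load(message_id)
```

With this shape, `Client(DefaultNetworkStorage())` and `Client(UserSuppliedStorage(my_dict))` behave identically from the client's point of view. The trade-off noted above still applies: every additional backend adds cost and attack surface.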