Motivations and Goals

Several motivations shaped the design of Converge, and especially its core data model. I wanted a data synchronization tool, and there were a number of different things I wanted to be able to do with it.

Small Core

The core requirements to run a node should be small enough to fit in a moderately capable embedded device. Such a node could be used to log sensor data or distribute configuration (or anything short of real-time control).

Transport Agnostic

Peers should be able to talk to each other over any point-to-point link. If a lower layer provides reliability, sure, don't add that. If a lower layer provides (sufficiently large) message framing, sure, don't add that either. But if all you've got is an unreliable bidirectional data stream, you should be able to run over it.

Indistinguishable from Random Transport-Adapter

The bits sent over that unreliable point-to-point link should be indistinguishable from random to anyone who does not know some shared key.

Forward and Post-Compromise Secure Transport-Adapter

If someone records the bits sent over the point-to-point link, and then later compromises the state of one or both parties, they should not be able to recover past messages and should shortly become unable to recover future messages.

Topology Agnostic

It should be flexible enough for those point-to-point links to be organized in any topology: a highly connected DHT/BitTorrent-style swarm, prearranged trusted connections only, a hierarchical content distribution network, or extremely sparsely connected opportunistic networking.

Concurrent Modification and Disconnected Operation

Nodes should be able to modify objects without locking or even being connected to other nodes with that object.

End-to-End Encryption

In addition to the transport being encrypted, the data sent over that transport should be encrypted. Intermediate nodes shouldn't be able to read it, and storage nodes shouldn't be able to read it. Only the endpoint nodes that have the appropriate keys should be able to read it.

Public Versioning

The encrypted form of the objects should have sufficient information to determine the latest versions.

Public Dependencies

The encrypted form of the objects should have sufficient information to perform garbage collection and optimistic delivery of other related objects.
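
As a minimal sketch of what public versioning and public dependencies could allow, assume each encrypted object version travels in an envelope that exposes its parent versions and its dependencies in the clear. The `Envelope` type, its field names, and both helper functions are hypothetical and not part of Converge. A node without any keys could then find the latest versions and decide what a retained object still needs:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical public wrapper around one encrypted object version."""
    version_id: str
    parents: tuple = ()        # versions this one supersedes (public)
    dependencies: tuple = ()   # other versions this one refers to (public)
    ciphertext: bytes = b""    # opaque to nodes without the keys


def latest_versions(envelopes):
    """Return the ids of versions that no other known version supersedes."""
    superseded = {p for e in envelopes for p in e.parents}
    return sorted(e.version_id for e in envelopes if e.version_id not in superseded)


def reachable(envelopes, roots):
    """Return the ids reachable from roots through public dependency links."""
    by_id = {e.version_id: e for e in envelopes}
    seen, stack = set(), list(roots)
    while stack:
        vid = stack.pop()
        if vid in seen or vid not in by_id:
            continue
        seen.add(vid)
        stack.extend(by_id[vid].dependencies)
    return seen
```

Anything outside the reachable set of the retained roots is a candidate for garbage collection, and the same dependency links tell a sender which related objects to deliver optimistically. Neither step touches the ciphertext.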

Eventual Consistency and Determinism

Two endpoints that have the same information and make the same update should be able to easily create identical updates, with identical ciphertext. This is important for avoiding synchronization loops, where two nodes each merge the other's empty merge, producing a new pair of identical versions, which they then try to merge again once the new versions have propagated.
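
One way to get identical ciphertext from identical updates is to remove every source of variation: serialize the update canonically and derive the nonce from the content instead of drawing it at random. The sketch below is an illustration under those assumptions, not Converge's actual format. The function names are hypothetical, and the SHAKE-256 keystream is a toy stand-in for a real cipher, not a vetted construction.

```python
import hashlib
import hmac
import json


def canonical_bytes(parents, payload):
    """Serialize an update so parent order and key order cannot vary."""
    body = {"parents": sorted(parents), "payload": payload}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def deterministic_encrypt(key, parents, payload):
    """Encrypt an update so that equal inputs give equal ciphertext."""
    plaintext = canonical_bytes(parents, payload)
    # The nonce is derived from the key and plaintext, not from randomness.
    nonce = hmac.new(key, plaintext, hashlib.sha256).digest()[:16]
    # Toy keystream for illustration only; a real design would use a proper cipher.
    keystream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    body = bytes(p ^ k for p, k in zip(plaintext, keystream))
    return nonce + body
```

Two nodes that merge the same parents, even if they list them in a different order, produce byte-identical output, so the result of the merge is one version and not two competing ones.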

General Purpose

A lot of existing tools are dedicated to being a particular application - the most general being a file system. This should be very general purpose, such that distributed file systems and databases can be created with it, as well as other more specialized applications.

Access Authorized

It should be possible to restrict fetching objects from a node to authorized users, even though the objects are encrypted. This is less relevant to the "swarm mode", but ideally swarm mode is designed to limit what is available there.

Liveness

Updates to objects should be propagated quickly through the network to interested parties.

Retention

Operators should be able to mark objects (and trees of objects) to be kept longer term. That is to say, pinning, backups, et cetera.