Introduction

Personal computing began as scattered machines, data ferried from one to the other by hand on floppy disks or tapes. Interconnection between computers was slow, and grew slower with distance. These constraints ensured that applications and data remained in the operator's control and could not be easily taken away by application providers. Though many providers tried, that power ultimately rested with the operators.

The rise of the Internet changed those constraints, and likewise the ecosystem of software within it. An enormous amount of the software now runs on computers in a datacenter somewhere, leaving us to interact with but a thin skin. The power, for the most part, has shifted to application providers.

Operators gave up control of their data and applications for many reasons, but the most significant one is the ease of collaboration. It is orders of magnitude easier to work on a centralized document than to send different versions of that document around and keep track of changes. Distributed version control systems provide a theoretical solution, but in practice their use is mostly restricted to software development and fails to provide anything like real-time collaboration.

Converge is a distributed data network for collaborative local applications.

  • Data is always encrypted, providing privacy regardless of where it is stored or how it is sent.

  • Just enough metadata is left unencrypted to efficiently synchronize the encrypted data.

  • Multiple people can edit the same data, then reconcile any conflicts after synchronization.

  • Synchronization can work over both network connections and storage media.

It is designed to carry all sorts of data, from short text messages to giant videos, from rapidly changing multiple-access databases to append-only archives. It is designed to work with existing relationships, between friends and family, colleagues, and collaborators, rather than only with anonymous people on the Internet (though it works for that too).

[diagram: user apps on Alice's and Bob's devices, each with a local cache and file system, synchronizing through a web server and a cloud store]

This manual is intended to be a complete guide to the core components of Converge. It is divided into four sections: operating Converge and using the file system, developing and building applications with the framework, implementing the framework and protocols, and finally the cryptography and how the security and privacy guarantees are met.

A Brief Tour

When you first install Converge on your phone, the primary change you'll see is a new shared drive that you can add files and directories to. Of course, until you establish peering with another device you can't share anything! So you install Converge on your laptop, and select peer on both devices.

Peering will ask you whether you're peering your own device or someone else's, prompt you for access to your secret storage, and then display a QR code. You select your own device, authenticate the access, scan both QR codes, and are now peered. Any directories or files you added to the shared storage on one device will now show up on the other, though the actual data won't copy until it's requested.

So you make a directory for pictures, select it, and pin it to both the phone and the laptop. This automatically keeps a copy of the directory and all its subdirectories on both devices, preemptively copying everything. You decide that you don't need to keep a copy of every photo on your phone, so you select a couple of subdirectories and unpin them; most of the directory tree remains pinned, but those subdirectories are not.

Then you decide to add your desktop. Peering your desktop goes much the same as your phone and laptop, except for one difference - your desktop doesn't have a camera. You scan the QR code from your desktop with your phone, and then your phone shows you a bunch of words to type in: let food note buy ring album time fire. You type them in and now both your phone and laptop are peered to your desktop.

Secure Backup

You want to keep an offsite backup of all your data in Converge, so you sign up with a cloud provider that offers storage. They provide you a link, and when you click on it Converge opens. Converge informs you which capabilities they've given you (probably fetch, write, subscribe, and pin), asks whether you want to accept them, whether you want to reveal your IP address (and potentially your location) to them, and what name you want to give your new peer.

You'll then be able to pin files, directories, and other objects to them, backing up your data without revealing the contents of those objects.

Friends, Family, Colleagues, Collaborators

Philosophy and Approach

Some aspects of Converge work a little differently than you may be used to, in particular in the realm of sharing. There are principles at work here, and in this chapter we explore those principles and their implications.

  • the wishes of a device's operator override any others' wishes for what happens to that device
  • avoid functionality that cannot be guaranteed or verified
  • present the full functionality of the underlying system to the operator

Operator Control

We have firmly transitioned to the world where almost all devices have a primary operator, typically the person who would say "that's my device". The device may also have occasional other people who use it, but when we speak of the wishes of the operator overriding all others, it is that primary operator of which we speak.

There is some difficulty in determining who that operator is - but assuming we can, Converge should perform the requests of that operator to the best of its abilities. Some would assert that others' goals for that device should also be taken into account: the manufacturer's, the cell service provider's, the government's, their spouse's, or even mine, as the author of Converge running on their device. We heartily reject that assertion. Tools must serve their operators first, and only those of their operator's choosing after.

Verifiable Functionality

Converge does not implement any functionality whose effect cannot be verified by the actor requesting it. Most importantly, there is no mechanism to unshare what has already been shared, or to delete data from someone else's system. Sure, you can have your device refuse to hand over the data (but they still have the decryption key), and if they give you access to write to their files you can delete a file from a file tree (but it is still there in the history of the file tree).

Complete Operator Interfaces

If there's functionality in the underlying system, that functionality should be presented to the operator, and functionality that is not in the underlying system should not be presented to the operator.

Likely the most startling implication of this is the Peer History functionality - here you see a compilation of all the information your peer has leaked about itself to you, and that you have leaked to your peer about yourself.

Peer Privacy

When computers talk to each other across the Internet on our behalf, they usually reveal unintended information. Most importantly, exchanging messages with someone else's computer reveals your computer's Internet address, which conveys some information about your physical location, to that computer and to all the computers along the way.

However, some servers on the Internet will exchange messages on behalf of your computer. A single such server, often known as a proxy, will hide your location, albeit not from the proxy itself. But with a series of proxies, typically at least three, each proxy knows only the addresses of the previous and next computers. This disguises who is exchanging messages with whom much more strongly. This is generally known as private routing, and it can take a number of forms.

During peering you set how private you want your location to be from this peer:

  1. No location privacy, allow the peer to connect directly to your device.
  2. Basic location privacy, use private routing when connecting over the Internet, but allow direct connections when both peers are on the same network.
  3. Complete location privacy, always use private routing with this peer.

The communication method actually used will be the more private of the two peers' settings for each other. However, the location information exposed to a peer is always limited to the level set for that peer.

Some Notes on Pinning

Objects are usually pinned to one or more devices, starting out with the device that the object was created on. This means that the device is responsible for keeping a copy of the latest versions of that object available.

Most pinning applies recursively - if you pin a directory to a device, all the files and directories within will also get implicitly pinned to that device. Implicit pinning can be interrupted by unpinning, and light pinning doesn't apply recursively at all.

Unpinning

Unpinning prevents an explicit pin from applying to a particular object. For example, if you have your music directory pinned to your phone, you might unpin directories for artists or albums that you don't want access to offline.

Light Pinning

Light pinning keeps a copy of the latest versions of an object, but doesn't apply it recursively. This is useful for keeping a directory listing up to date on a device, without automatically pulling everything inside that directory.

Light pinning is used by default on the root directory that is shared between all your devices, and the peer directories that you share with a connection.

Backup

While a pin makes a device responsible for keeping copies of the latest versions, a backup additionally makes a device responsible for keeping copies of historical versions.

Backups may either be partial or complete. Complete backups are what they sound like: keep every single version. Partial backups are more application-dependent. For the file system, it typically means keeping a version for each day.

Number of Copies

Pinning can specify how many copies you want stored on a device. Many devices only store one copy, because they only have one physical drive to store it on. However, your phone might have both its internal storage and an SD card, so it could store two copies, and a server likely has many hard drives, and could in theory store that many copies. This protects against losing the data to a drive failure.

In many cases, objects are pinned to multiple devices anyway, which protects against much more catastrophic failures, so pinning multiple copies isn't always beneficial.

However, pinning multiple copies can be very useful for backups, as you're less likely to keep historical versions on multiple devices. The other place it shines is with storage providers: a Converge-aware storage provider should distribute multi-copy pins over multiple data centers, which can help protect against regional disasters. Of course, pinning one copy to a couple of distant friends' servers accomplishes the same thing, with fewer centralized failure scenarios.

References and Acknowledgements

The pinning and backup functionality is built on the pin and repin capabilities in the protocol.

The term pinning in this sense was borrowed from IPFS.

Paradigm

There are four broad problems in computing:

  • Communication: getting data from one computer to another.
  • Interface: getting data into and out of a computer.
  • Storage: holding on to some data for a while.
  • Compute: turning some data into different data.

Converge works at the junction of storage and communication. These have traditionally been considered quite distinct. Even network file systems don't quite bridge the gap; they provide access to files on other computers, and as such are much more communication than storage.

With Converge, which computer in the network holds the data is incidental, and requests for the data do not mention it. There are implications of the data being held on one computer or another, and mechanisms to keep a copy of some data on a particular computer, but every computer is an equally valid and authoritative host for any piece of data.

Yet Converge can be used to communicate - updates are propagated through the network from interested host to interested host. This enables tremendous fan-out for multicast-like communications, while also providing a level of anonymity for both readers and writers. Perhaps the upstream computer originated the data, or perhaps it just got it from elsewhere; perhaps the downstream computer will present the data to its user, or perhaps it will send it on to another computer.

Motivations and Goals

There were numerous motivations that went into the design of Converge, and especially the core data model. I wanted a data synchronization tool, and there were a bunch of different things I wanted to be able to do with it.

Small Core

The core requirements to run a node should be small enough to fit in a moderately capable embedded device. It could be used to log sensor data or distribute configuration (or anything short of real-time control).

Transport Agnostic

Peers should be able to talk to each other over any point to point link. If a lower layer provides reliability, sure, don't add that. If a lower layer provides (sufficiently large) message framing, sure, don't add that either. But if all you've got is an unreliable bidirectional data stream, you should be able to run over it.

Indistinguishable from Random Transport-Adapter

The bits sent over that unreliable point-to-point link should be indistinguishable from random without knowledge of some shared key.

Forward and Post-Compromise Secure Transport-Adapter

If someone records the bits sent over the point-to-point link, and then later compromises the state of one or both parties, they should not be able to recover past messages, and should soon become unable to recover future messages.

Topology Agnostic

Should be flexible enough for those point-to-point links to be organized in any topology: a highly connected DHT/BitTorrent-style swarm, prearranged trusted connections only, a hierarchical content distribution network, or extremely sparsely connected opportunistic networking.

Concurrent Modification and Disconnected Operation

Nodes should be able to modify objects without locking, or even being connected to, other nodes that hold that object.

End-to-End Encryption

In addition to the transport being encrypted, the data sent over that transport should be encrypted. Intermediate nodes shouldn't be able to read it, storage nodes shouldn't be able to read it; only the endpoint nodes that hold the appropriate keys should be able to read it.

Public Versioning

The encrypted form of the objects should have sufficient information to determine the latest versions.

Public Dependencies

The encrypted form of the objects should have sufficient information to perform garbage collection and optimistic delivery of other related objects.

Eventual Consistency and Determinism

Two endpoints with the same information, making the same update, can easily create identical updates with identical ciphertext. This is important for avoiding synchronization loops, where two nodes each merge the other's empty merge, producing a new pair of identical versions, which they then try to merge again once those versions propagate.

General Purpose

A lot of existing tools are dedicated to being a particular application - the most general being a file system. This should be very general purpose, such that distributed file systems and databases can be created with it, as well as other more specialized applications.

Access Authorized

It should be possible to restrict fetching objects from a node to authorized users, even though the objects are encrypted. This is less relevant to "swarm mode", but ideally swarm mode is designed to limit what is available there.

Liveness

Updates to objects should be propagated quickly through the network to interested parties.

Retention

Operators should be able to mark objects (and trees of objects) to be kept longer term. That is to say, pinning, backups, et cetera.

Blobs

Blobs provide read-only content addressed encrypted storage.

They contain some ciphertext and a set of references to other objects.

struct Blob {
  data: Ciphertext,
  refs: BTreeSet<Reference>,
}

Serialization

Blobs are serialized using atlv, as an array of those two items, using the tag 0. The references must be listed in lexicographic order. As an example, a 128 byte ciphertext with a single reference to another blob:

offset  bytes  description
00:     03     the tag (0) and type (array) of the blob
01:     02     the number of elements in the array
02:     01     the tag (0) and type (binary) of the ciphertext
03:     01 00  the length of the ciphertext (128 bytes)
05…84:  ...    the ciphertext
85:     23     the tag (8) and type (array) of the reference vector
86:     01     the length of the reference vector
87:     42     the tag (16) and type (union) of a Blob Hash reference
88:     01 20  the tag (0) and type (binary) and length (32 bytes) of a blake3 hash
8a…a9:  ...    the content of the blake3 hash

Versions

Versions provide read-write storage.

They contain some ciphertext, a set of references to other Objects, and an ordered set of references to parent Versions.

struct Version {
  data: Ciphertext,
  refs: BTreeSet<Reference>,
  vers: Vec<VersionSignature>,
}

Version Serialization

Versions are serialized using atlv, as an array of those three fields, using the tag 1. The references must be listed in lexicographic order. As an example, a 128 byte ciphertext with a single reference to another braid and a single parent:

offset  bytes  description
00:     07     the tag (1) and type (array) of the version
01:     03     the number of elements in the array
02:     01     the tag (0) and type (binary) of the ciphertext
03:     01 00  the length of the ciphertext (128 bytes)
05…84:  ...    the ciphertext
85:     23     the tag (8) and type (array) of the reference vector
86:     01     the length of the reference vector
87:     4a     the tag (18) and type (union) of a Braid public key reference
88:     01 20  the tag (0) and type (binary) and length (32 bytes) of a blake3-schnorr-ristretto255 public key
8a…a9:  ...    the content of the blake3-schnorr-ristretto255 public key
aa:     47     the tag (17) and type (array) of the parents vector
ab:     01     the length of the parents vector
ac:     01 40  the tag (0) and type (binary) and length (64 bytes) of a blake3-schnorr-ristretto255 signature
ae…ed:  ...    the content of the blake3-schnorr-ristretto255 signature

Braids

Braids are the abstract objects made of a braided stream of versions. Cryptographically speaking, braids are the set of versions sharing the same verification key; from an algorithmic perspective, a braid is the ever-growing directed acyclic graph with multiple entry points formed by those versions.

This provides the illusion of a concurrently mutable data structure built on immutable data. The set of entry points into this graph marks the current versions. That is, d and e are the current versions in the graph below:

[diagram: versions a through e forming a directed acyclic graph, with d and e as the current versions]
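
As an illustration, the current versions of a braid can be computed as the versions that no other known version lists as a parent. A minimal sketch in Rust, assuming references are simple strings and each version carries its list of parents (the names and types here are illustrative, not part of Converge):

use std::collections::HashSet;

// Illustrative only: a version identified by its reference, listing its parents.
struct Node {
    reference: String,
    parents: Vec<String>,
}

// The current versions are the entry points of the DAG: every known version
// that is not listed as a parent by any other known version.
fn current_versions(versions: &[Node]) -> HashSet<&str> {
    let mut current: HashSet<&str> =
        versions.iter().map(|v| v.reference.as_str()).collect();
    for v in versions {
        for p in &v.parents {
            current.remove(p.as_str());
        }
    }
    current
}

fn main() {
    // One plausible shape for the example graph: a chain a <- b <- c, then two
    // concurrent versions d and e both built on c.
    let versions = vec![
        Node { reference: "a".into(), parents: vec![] },
        Node { reference: "b".into(), parents: vec!["a".into()] },
        Node { reference: "c".into(), parents: vec!["b".into()] },
        Node { reference: "d".into(), parents: vec!["c".into()] },
        Node { reference: "e".into(), parents: vec!["c".into()] },
    ];
    assert_eq!(current_versions(&versions), HashSet::from(["d", "e"]));
}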

Distribution Protocol

The Converge Distribution Protocol is a peer-to-peer, message-based protocol that all nodes use to exchange data with each other. It can run over any transport that can reliably pass sufficiently large datagrams.

Operations

The protocol is based on granting and invoking capabilities. These capabilities are represented by byte strings that are meaningful to the recipient, but opaque to any callers. The only requirement is that they be unforgeable, so they may be anything from an encrypted and authenticated payload to random bytes indexing into a local database.

Give

The give message provides the peer with a capability and an explanation as to how the capability may be used. It does not request any response.

Explain

The explain message requests an explanation as to how a set of capabilities may be used with this peer. This explanation takes the form of replies of give messages.

Note that the peer should always return a give message, even if it just explains that the capability is meaningless.

Delegate

Requests a capability that has the functionality of the greatest lower bound of the provided capability and explanation.

For example, a capability that can fetch anything could be delegated into one that can only fetch a specific reference.

Select

FIXME: not sure what i think of including this one

The select message requests a list of capabilities the peer grants the requestor that match a particular explanation. This does not list all the capabilities that are valid for the requestor to use, merely those specifically recorded as granted to that requestor. These capabilities are returned in give messages.

This enables peers to host public content without distributing capabilities beforehand.

Invoke

Invokes this capability, providing any required parameters. Invocation also includes any capabilities needed to respond, along with their explanations.
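
A rough sketch of how these five message types might be represented. This is purely illustrative: the field names, the Explanation placeholder, and the use of raw byte strings are assumptions, not part of the protocol.

// Capabilities are opaque byte strings; explanations describe how a capability
// may be used. Their internal structure is not specified here.
type Capability = Vec<u8>;
struct Explanation(Vec<u8>);

// Hypothetical representation of the five protocol operations described above.
enum Message {
    // Hand the peer a capability along with how it may be used; no response expected.
    Give { capability: Capability, explanation: Explanation },
    // Ask how these capabilities may be used; answered with Give messages.
    Explain { capabilities: Vec<Capability> },
    // Ask for a narrowed capability: the greatest lower bound of the given
    // capability and explanation.
    Delegate { capability: Capability, explanation: Explanation },
    // Ask which capabilities matching this explanation have been granted to us.
    Select { explanation: Explanation },
    // Invoke a capability with its parameters, plus capabilities the peer may
    // use to respond.
    Invoke {
        capability: Capability,
        parameters: Vec<u8>,
        reply: Vec<(Capability, Explanation)>,
    },
}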

Capabilities

The actual distribution and synchronization of data takes place over invoke messages.

Not all nodes must support every capability, but they must not give out capabilities that they do not support. This generally means that they must support fetch and write. They also may support additional capabilities not listed here!

Fetch

The fetch capability requests that the peer find a particular object and send its contents back. It has two parameters:

  • the set of references to objects that can be requested
  • the probability of forwarding the request to other peers

Each invocation of a fetch capability provides a write capability, scoped to enable the peer to reply with the requested object.

Write

The write capability provides the contents of an object and some of its children to a peer. It has just one parameter:

  • the set of references to objects that can be sent

Each invocation of a write capability includes the contents of the object and its children, and provides a fetch capability, scoped to allow the peer to fetch any children or parents that are listed on the included objects but not themselves included.

Subscribe

The subscribe capability requests that a peer proactively inform this node of any new versions of a particular braid. It has two parameters:

  • the set of braid references that can be subscribed to
  • whether the peer will subscribe to this braid on other peers

Each invocation of a subscribe capability provides a write capability, scoped to enable the peer to reply with the requested object.

Pin

The pin capability creates a new pin, requesting that a peer maintain one or more copies of a set of objects and objects related to them. It has four parameters:

  • the set of references to objects to be pinned
  • the set of references to objects to be excluded from this pin
  • the pin filter, describing which related objects to implicitly pin
  • how many copies to pin

Each invocation of a pin capability provides either a subscribe or fetch capability, scoped to enable the peer to subscribe to changes on, or request, the pinned objects. The pin capability responds with a repin capability, to change that pin.
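
To make the parameter lists above concrete, here is one possible shape for them in Rust. The type names, field names, and the PinFilter placeholder are illustrative readings of the bullet lists, not the wire format; pin filters are discussed in the next section.

use std::collections::BTreeSet;

// Placeholders: references are defined by the data model, and pin filters are
// discussed below.
type Reference = Vec<u8>;
type PinFilter = Vec<u8>;

struct Fetch {
    requestable: BTreeSet<Reference>, // references that may be requested
    forward_probability: f64,         // probability of forwarding to other peers
}

struct Write {
    sendable: BTreeSet<Reference>,    // references that may be sent
}

struct Subscribe {
    braids: BTreeSet<Reference>,      // braid references that may be subscribed to
    forwards: bool,                    // whether the peer subscribes on other peers in turn
}

struct Pin {
    pinned: BTreeSet<Reference>,      // objects to be pinned
    excluded: BTreeSet<Reference>,    // objects excluded from this pin
    filter: PinFilter,                // which related objects to implicitly pin
    copies: u8,                       // how many copies to keep
}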

Pin Filters

The types of recursive pinning that might be constructed are unbounded, which entails an equally unbounded complexity in applying those pins and explaining them to the user. Following the notes on pinning in the operator section, we must support at least:

  • pinning only the latest versions of an object
  • pinning the latest versions and all objects they reference
  • pinning historical versions (following first parents) and all objects they reference
  • pinning all versions and all objects they reference

There are two approaches here: support those cases directly, or create a tiny DSL that can express those cases and others. Something like:

data Pin
   = Pin Word8 -- pin this number of copies
   | Repeat (Maybe Word) Pin -- apply the pin, when it's done, apply it again, if a count is supplied, limit to that many times
   | Branch [Pin] -- apply the pins, all starting from the same location
   | FirstParent Pin -- go to the first parent, and apply the contained pin
   | Parents Pin -- go to each parent, and apply the pin
   | References Pin -- go to each reference, and apply the pin
   | IsBlob Pin -- if the current reference is a blob, apply the contained pin, otherwise do nothing
   | IsVersion Pin -- if the current reference is a version, apply the contained pin, otherwise do nothing
   | IsBraid Pin -- if the current reference is a braid, apply the contained pin, otherwise do nothing

This is much more flexible, but seems unnecessarily complex.

Repin

The repin capability modifies a pin. It has two parameters:

  • the set of objects to add to the pin
  • the set of objects to exclude from the pin

Explanations

Introduction Protocol

The Transport Protocol requires that the peers have two ephemeral shared keys, one for authentication and one for encryption, and each have an ephemeral public-private keypair to ratchet the connection.

This protocol provides the means to establish these.

Persistent Fully-Encrypted Transport Version 0

PFETv0 provides a persistent multiplexed stream transport between two peers. While it is broadly inspired by QUIC, it differs from it in several important ways:

  1. It may be encapsulated by both datagram and stream-oriented protocols.

  2. Every payload sent is indistinguishable from random.

  3. Rather than between a client and a server, it is between two peers.

  4. A connection persists indefinitely between those peers. Both ends may change addresses and encapsulations without disconnecting. The peers must have access to non-volatile storage to persist keys.

  5. There is no mechanism within the protocol to start a new connection. Peers must be introduced to each other outside the transport protocol in order to communicate.

Outer Packet Format

Bytes  Description
32     Authentication code.
rest   Ciphertext.

How the length of a packet is determined varies by the type of encapsulation.

  • In datagram-oriented protocols, a packet fills the entire datagram.
  • In a stream-oriented protocol, the size of each packet is predetermined. At the initialization of each stream it is set to 256 bytes, and then may be changed with a Set Packet Length frame.

The authentication code must be verified before parsing any packet.

Ciphertext

Bytes  Description
24     Random Nonce.
rest   XChaCha8 ciphertext.
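
A sketch of how a receiver might process an outer packet, using the Rust blake3 and chacha20 crates. Only the layout above is specified; the assumption that the authentication code is a keyed BLAKE3 MAC over the ciphertext, and the key names, are illustrative.

use chacha20::cipher::{KeyIvInit, StreamCipher};
use chacha20::{Key, XChaCha8, XNonce};

// Illustrative sketch only: assumes the authentication code is a keyed BLAKE3
// MAC over the ciphertext, and that the two shared keys come from the
// introduction protocol. Constant-time comparison is elided for brevity.
fn open_packet(auth_key: &[u8; 32], enc_key: &[u8; 32], packet: &[u8]) -> Option<Vec<u8>> {
    if packet.len() < 32 + 24 {
        return None;
    }
    let (code, ciphertext) = packet.split_at(32);
    // The authentication code must pass before anything else is parsed.
    if blake3::keyed_hash(auth_key, ciphertext).as_bytes().as_slice() != code {
        return None;
    }
    let (nonce, body) = ciphertext.split_at(24);
    let mut plaintext = body.to_vec();
    let mut cipher = XChaCha8::new(Key::from_slice(enc_key), XNonce::from_slice(nonce));
    cipher.apply_keystream(&mut plaintext);
    Some(plaintext) // the inner packet: a 3-byte packet number followed by frames
}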

Inner Packet Format

The inner packet, the plaintext, is made up of a header and one or more frames. The first byte of each frame indicates the frame type, followed by a variable length quantity indicating the length of that frame. Frames are contained within a single packet.

Bytes  Description
3      Packet Number
rest   Frames

Frame Types

Set Packet Length

Set the length of packets in a stream encapsulation. In a multi-stream encapsulation, it applies only to the stream the packet is in. In a datagram encapsulation the presence of this frame type invalidates the packet.

Bytes  Description
1      0, Frame Type
vlq    Future Packet Length

Acknowledge

Acknowledgement frames sent by Alice to Bob indicate which of Bob's packets were received. Bob will then retransmit, in new packets, any stream or cryptographic data held in packets that were not acknowledged.

Bytes  Description
1      1, Frame type.
3      Latest Acknowledged
vlq    Acknowledgement Delay
vlq    n, Count of Counts
1+2n   Acknowledgement Counts

Latest Acknowledged

The most recent packet number being acknowledged.

Acknowledgement Delay

The time in microseconds between receiving the latest acknowledged packet and sending this packet. This value must only include delay introduced by the endpoint application.

Acknowledgement Counts

Alternating counts of acknowledgement and gaps, each 1 byte long.

The first number includes the latest acknowledgement, so must be greater than 0.

Acknowledgements may only be set, not cleared, so gaps may include packets that have previously been acknowledged.

In a multi-stream encapsulation packets may be acknowledged over any stream.
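
A sketch of how a receiver of this frame might expand the counts back into packet numbers, assuming the counts run backwards from Latest Acknowledged and alternate between runs of acknowledged packets and gaps (the function and variable names are illustrative):

// Expand an Acknowledge frame into the packet numbers it acknowledges.
// `latest` is the Latest Acknowledged field; `counts` is the Acknowledgement
// Counts field, alternating acknowledged-run and gap lengths, newest first.
fn acknowledged_packets(latest: u64, counts: &[u8]) -> Vec<u64> {
    let mut acked = Vec::new();
    let mut next = latest;     // next packet number to classify, counting downwards
    let mut in_ack_run = true; // the first count is a run of acknowledgements
    for &count in counts {
        for _ in 0..count {
            if in_ack_run {
                acked.push(next);
            }
            next = next.saturating_sub(1);
        }
        in_ack_run = !in_ack_run;
    }
    acked
}

fn main() {
    // Counts [3, 2, 1]: packets 10, 9, 8 acknowledged, 7 and 6 not, 5 acknowledged.
    assert_eq!(acknowledged_packets(10, &[3, 2, 1]), vec![10, 9, 8, 5]);
}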

Key Update

A key update sends a partial update of this peer's KEM or Diffie-Hellman key.

Bytes  Description
1      16 | flags
vlq    Public Key Data Offset
vlq    Length of Frame
rest   Public Key Data

Flags

Bit  Description
0    Initial Update Packet
1    Final Update Packet
2    Type of Update

The initial bit is set when this packet contains the first byte in a key update. The final bit is set when this packet contains the last byte in a key update.

Type  Description
0     Key Encapsulation Key
1     Diffie-Hellman Key

Public Key Data Offset

Stream Data

Bytes  Condition  Value    Description
1                 240-248  Field Type | last || initial || final
vlq               id       Stream Identifier
vlq               off      Stream Data Offset
vlq    ¬last      n        Length of Data
n                          Stream Data

Flags

Bit  Name     Description
0    final    This frame is the final frame in a stream.
1    initial  This frame is the first frame in a new stream.
2    last     This frame is the last frame in this packet.

Terminate Stream

Bytes  Description
1      252 | yours
vlq    Stream Identifier
vlq    Application Error

This unconditionally terminates a stream, providing an application layer error code.

yours is a one bit value indicating whether the identifier refers to the receiving host's streams. That is, frames of type 252 terminate the sender's streams, and frames of type 253 terminate streams that the receiver is sending.

Terminate Connection

Bytes  Condition  Value  Description
1                 254    Field Type
vlq                      Application Error

This unconditionally terminates the connection, providing an application-specific error as explanation. Both sides must delete all keys, including from non-volatile storage, and no further communication on this connection is valid.

The sender should wait for their peer to acknowledge a packet containing this frame, resending as necessary. After sending this frame, however, the sender must not send any frames other than padding. If the peer continues to send packets and does not acknowledge any packet containing this frame, the sender should then proceed to fully terminate the connection, though it may keep the MAC key to detect further packets of this connection.

Padding

Bytes  Description
1      255, Frame type.
rest   ignored

Padding may be used to reach a minimum packet size or fill out a packet to a fixed size to obscure length analysis. Padding always consumes the rest of the packet.

Databases

Converge is designed to be able to support a secure, distributed, eventually consistent database as a library.

Overview

There are a lot of types of databases. Here we are concerned with what you might call indexed value stores. We have a value, and we associate it with keys on one or more indexes. Then we can look up either a key or a range of keys on an index, and return the value or values matching that key or range.

The keys may be derived from the value in some way, but often they are arbitrary. When we're layering a database on Converge this distinction matters, because we cannot use braids to reference any part of a value that a key is derived from. If we do, the value could change without notice.

B-tree Indexes

B-trees are self-balancing tree structures with a large fan-out. Variants are heavily used in databases and file systems, including copy-on-write file systems, which anything based on Converge must be.

Encoding a B-tree in Converge is relatively straightforward. The root node is a version and each other node is a blob. Each node holds a list of (first key, shared key, reference) or (key, value) tuples, where the value may itself contain a shared key and reference.

This may be compacted by replacing each reference in the object's plaintext with its index into the object's reference list. While it may be tempting to sort the object's references into the order of the entries, this would leak significant information about which values are associated with which keys!

Another potential compaction is, for variable length keys, to include the common prefix of all the keys in a header. This provides some of the benefit of radix trees while preserving the log(n) search time of B-trees and leaking less information. However, it does open some possibility of leaking the rough number of keys with a given prefix if an attacker can insert or remove entries.
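
A sketch of the node layout described above, with the index-compaction applied. The type and field names, and the choice of byte strings for keys and values, are illustrative; this is not a normative encoding.

// Placeholder types; real references and shared keys are defined by the data model.
type Key = Vec<u8>;
type SharedKey = [u8; 32];
type ReferenceIndex = u32; // index into the containing object's reference list

// An internal node maps the first key of each child to the shared key and
// reference needed to fetch and decrypt that child.
struct InternalEntry {
    first_key: Key,
    child_key: SharedKey,
    child: ReferenceIndex,
}

// A leaf node maps keys to values; a value may itself carry a shared key and
// reference to an external object.
struct LeafEntry {
    key: Key,
    value: Vec<u8>,
}

enum BTreeNode {
    // The root is stored as a version, every other node as a blob.
    Internal(Vec<InternalEntry>),
    Leaf(Vec<LeafEntry>),
}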

Security

Observation

This is the typical case for a backed up database, or a node participating in synchronization. The attacker is assumed to be able to see the entire history of the database, but has no control over its contents.

Given the relatively large and consistent fan-out, the attacker can likely distinguish a B-tree from other types of data. With knowledge of the application, this leaks the approximate number of keys, by the number and size of leaf nodes. The quality of this estimate may be reduced by values that reference blobs that also have similar fan-out.

While this attacker cannot associate a particular node with a set of keys (even relative to other keys), they can watch changes to similar areas over time, especially if there's a commit after every change. Batching changes muddies this leakage: one-to-one mappings become n-to-n. If there are no external references, a full rebalance (assuming it changes every node) will break continuity entirely.

Entry Modification

If the attacker can update the non-indexed part of a value and knows the key to that value, or if they can insert a value with a known key, they can locate the area of the key.

The more values an attacker can control this way, the more insight they can gain into the structure of the data. With arbitrary insertion or modification an attacker can efficiently find the approximate boundaries of each node.

This is likewise mitigated by batching changes, though not eliminated if the attacker knows which changes contained updates to the same key. Rebalancing the tree breaks continuity (modulo external references).

External Reference Knowledge

If the database values include external references, related values may be determined by having nearby keys. This is especially relevant to file system use cases, as similar files are likely to be stored together, and files are likely to be external values.

Cryptographic Requirements

All of Converge's cryptographic functions are parameterized by a domain string - a statically defined byte string that expresses how the primitive is being used. The general principle is that results of these functions in one domain must not be valid, or transformable to be valid, in other domains.

Public Key Scheme

Public key cryptographic schemes are the class of schemes that utilize a secret key and a public key.

-- Key Derivation
from_master :: Domain -> MasterKey -> SecretKey
-- Public Key Derivation
to_public :: SecretKey -> PublicKey

Self-Certifying Public Key Scheme

Some public key cryptographic schemes can produce self-certifying certificates; that is, they provide these additional functions:

-- Key Derivation
from_master_certifiable :: Domain -> MasterKey -> SecretKey
-- Certificate Production
issue_certificate :: Domain -> MasterKey -> T -> Certificate T
-- Certificate Verification
verify_certificate :: PublicKey -> Certificate T -> Either Error T

Where the certificates cannot be created from secret keys, but require the master key.

Hash Functions

Converge requires a hash function that takes a domain and message and produces a fixed length output, the digest.

-- Hashing
hash :: Domain -> Message -> Digest

It must be infeasible to:

  • find two domain-message pairs that hash to the same digest - collision resistance.

  • find a domain-message pair that hashes to a particular digest - preimage resistance.

  • transform a digest of one domain-message pair into that of another domain-message pair without access to the target domain and message.

Deterministic Signatures

Converge requires a signature scheme that is a self-certifying public key scheme. Deterministic signature schemes have two functions, beyond those provided by a public key scheme.

-- Signing
sign :: Domain -> SecretKey -> Message -> Signature
-- Verifying
verify :: Domain -> PublicKey -> Message -> Signature -> Either Error ()

It must be infeasible to:

  • derive information about a secret key for any other domain and master key from a secret key

  • create a secret key that matches a given public key, even given a vast number of signatures produced with that secret key

  • find two domain-message-key tuples with the same signature

  • find a domain-message-key tuple that results in a particular signature

  • transform a signature on any message into one that's valid on a chosen domain and message with the same key

  • produce a signature for a particular public key without access to the corresponding secret key

    Note that this does not exclude all malleability. In particular, enabling threshold, one-of-many, and all-of-many signatures is encouraged.

Deterministic Authenticated Encryption with Associated Data

Symmetric encryption obscures a plaintext as a ciphertext with a key, and provides a function to turn the ciphertext back into a plaintext with that key. Authenticated Encryption additionally provides strong evidence that the ciphertext was not modified in transit. Authenticated Encryption with Associated Data further provides strong evidence that additional unencrypted data was not modified in transit. Deterministic Authenticated Encryption with Associated Data guarantees a one-to-one relationship between plaintexts and ciphertexts for a given shared key and associated data.

Converge requires Deterministic Authenticated Encryption with Associated Data with four functions:

-- Key Derivation
from_master :: Domain -> MasterKey -> SharedKey
-- Convergent Key Derivation
from_plaintext :: Domain -> ConvergenceDomain -> Plaintext -> AssociatedData -> SharedKey
-- Encryption
encrypt :: Domain -> SharedKey -> Plaintext -> AssociatedData -> Ciphertext
-- Decryption
decrypt :: Domain -> SharedKey -> Ciphertext -> AssociatedData -> Either Error Plaintext

Blobs

Blobs provide read-only, content-addressed, encrypted storage. As such, they are vulnerable to chosen-plaintext attacks, including the confirmation-of-a-file and learn-the-remaining-information attacks described by Tahoe-LAFS. Converge employs a mitigation strategy similar to Tahoe-LAFS's: the key derivation process for each blob has a secret convergence domain added, so blobs are only deduplicated within that convergence domain.

References

Blob references are generated by hashing the entire serialization of the Blob. The hash, no matter which primitive, uses the domain "Converge Blob Reference".
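
For example, with the blake3 hash defined under Defined Hashes, and assuming the domain maps onto blake3's derive_key context (an assumption for this sketch), a blob reference could be computed as:

// Hash the entire serialized blob under the "Converge Blob Reference" domain.
// Assumes the Rust blake3 crate and that domains map to derive_key contexts.
fn blob_reference(serialized_blob: &[u8]) -> [u8; 32] {
    let mut hasher = blake3::Hasher::new_derive_key("Converge Blob Reference");
    hasher.update(serialized_blob);
    *hasher.finalize().as_bytes()
}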

Encryption and Decryption

The ciphertext held within a Blob is produced using deterministic authenticated encryption with associated data, with a domain of "Converge Blob Encryption" and associated data of the serialized set of references (byte 0x85 to the end in the example above). The convergence domain is determined by the calling application, and may be empty.

Applications must treat data stored in blobs with convergence domains under 96 bits as less private!

Versions

Versions provide read-only deterministically encrypted storage, such that multiple versions can be interpreted as a read-write Braid. While deterministic encryption is usually subject to chosen-plaintext attacks, the structure of versions provides significant mitigation in practice.

Key Derivation

The Braid that the Version is to be part of determines which signing and encryption keys it uses. See Braids for details on their key derivation.

References

References to a particular Version are generated by signing the entire serialization of the Version. The signing and verifying functions, no matter which scheme, use the domain "Converge Version Reference".

Encryption and Decryption

The ciphertext held within a Version is produced using deterministic authenticated encryption with associated data, with a domain of "Converge Version Encryption" and associated data of:

  1. the public key of the braid
  2. the serialized set of references
  3. the serialized list of parent versions

All use the atlv encoding, so the latter two are similar to the end of the example above starting at byte 0x85.

Chosen Plaintext Attacks

Chosen Plaintext Attacks on deterministic authenticated encryption with associated data schemes depend on the attacker having control of the associated data. Including the parent versions in the associated data introduces significant mitigation. Even if an attacker controls the content (plaintext and references) of a version that a node encrypts, that node will list different parent versions each time.

Braids

Braids are abstract objects defined as the set of versions that share the same signature keys.

Desired Properties

Braids provide the illusion of a concurrently mutable data structure built on immutable data. There are thus four types of actors working with that data structure:

  • fetchers can fetch the versions that make up a braid, verify that those versions were made by writers, and see the braid's public data
  • readers can fetch, and read the versions' secret data
  • writers can create new versions
  • administrators can end the braid and rotate the keys, forming a new braid to replace it

Key Derivation

Both the shared encryption key and secret signing key are derived from a master key, which is generated from a high-entropy source.

The shared key is derived with the domain "Converge Braid Shared Key".

The secret signing key is derived with the domain "Converge Braid Signing Key", and should be derived in a way that can provide a rotation certificate, but may be derived directly from the master key if the braid is short lived.
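
As a sketch (not normative), assuming domains map onto blake3's derive_key contexts as described under Defined Hashes, the key derivation for a short-lived braid might look like:

// Derive a braid's keys from its master key. The signing-key case shown here is
// the simple, short-lived variant derived directly from the master key; the
// output would still need to be mapped to the signature scheme's secret key.
fn braid_shared_key(master: &[u8]) -> [u8; 32] {
    blake3::derive_key("Converge Braid Shared Key", master)
}

fn braid_signing_key_material(master: &[u8]) -> [u8; 32] {
    blake3::derive_key("Converge Braid Signing Key", master)
}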

References

A braid is referenced by the public key used to verify its versions. Such a reference is typically taken to mean the latest versions of the braid, but may refer to the entire braid.

Selected Functions

While it would be simpler and more robust in the short term to define a fixed set of cryptographic functions, Converge must be able to maintain backwards compatibility to read old data during a transition to new functions. So instead, each data type for each function is tagged to indicate which cryptographic system it is for.

That said, there is currently only one system defined to serve each function.

Sets of cryptographic functions should be defined to minimize redundant functionality; in particular, the signature schemes and DAEADs should employ the most current hash function defined. If this means re-parameterizing a signature scheme or DAEAD to minimize code bloat, so be it. The initial set of blake3, schnorr-blake3-ristretto255, and blake3-xchacha8 exemplifies this, as the latter two are based on blake3.

Defined Hashes

blake3

Blake3 is an extremely fast cryptographic hash function based on a Merkle tree. It is secure against length extension and leaf substitution attacks, and is safe to use as a PRF. Converge uses the derive_key and keyed_hash modes to implement the initial domain separation and any necessary subdomain separation.

The output is 32 bytes.

The tag for blake3 is 0.

Defined Signatures

schnorr-blake3-ristretto255

r255b3 is a fast and compact signature scheme based on Schnorr signatures in their original formulation and Ristretto255. It features built-in and required domain separation, and optionally deterministic signatures. The secret key is an integer modulo the order of Ristretto255, while the public key is a member of the Ristretto255 group.

Key rotation is provided for by generating committed keys for self-evident certificates.

The secret and public keys are 32 bytes, and the signatures are 48 bytes.

The tags for secret keys, public keys, and signatures are all 0.

Defined DAEADs

blake3-xchacha8

This is a MAC and Encrypt scheme where the MAC includes the domain, plaintext, associated data, MAC Context, and shared key, and is used as the initialization vector for the encryption. The chosen plaintext attack on MAC and Encrypt is minimized to the extent possible for a deterministic encryption scheme - if any of the inputs change, the MAC changes unpredictably, as does the entire ciphertext.

The shared key can be derived from the plaintext with sk = Blake3(domain, len(pt) | pt | len(data) | data | "Blake3-XChaCha8 Key Derivation" | convergence_domain), where len(x) is the length of x in bytes as a 64-bit little endian unsigned integer. Or it can be derived from a master key with sk = Blake3(domain, master_key).

Encryption first proceeds with deriving the initialization vector, iv = Blake3(domain, len(pt) | pt | len(data) | data | "Blake3-XChaCha8 Nonce Derivation" | sk)[0..24]. Note that the initial portion of the hash input is the same, so a key and nonce can be created in a single pass over the plaintext and data. Then the plaintext is encrypted with the key and iv, ct = iv | XChaCha8(sk, iv, pt).

Decryption begins with splitting the first 24 bytes from the ciphertext to get the iv and ct'. The plaintext is then decrypted pt = XChaCha8(sk, iv, ct'). Finally, the mac/iv is recalculated iv' = Blake3(domain, len(pt) | pt | len(data) | data | "Blake3-XChaCha8 Nonce Derivation" | sk)[0..24], and compared to the actual initialization vector iv == iv'. If they are equal the plaintext is returned, otherwise an error is returned.
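
A sketch of this construction in Rust, using the blake3 and chacha20 crates. The helper names are illustrative, the domain is treated as blake3's derive_key context, and error handling is reduced to an Option; none of this is normative.

use chacha20::cipher::{KeyIvInit, StreamCipher};
use chacha20::{Key, XChaCha8, XNonce};

// Hash len(x) | x pieces under a derive_key domain, then a suffix and tail,
// as in the formulas above.
fn domain_hash(domain: &str, pieces: &[&[u8]], suffix: &[u8], tail: &[u8]) -> [u8; 32] {
    let mut hasher = blake3::Hasher::new_derive_key(domain);
    for piece in pieces {
        hasher.update(&(piece.len() as u64).to_le_bytes());
        hasher.update(piece);
    }
    hasher.update(suffix);
    hasher.update(tail);
    *hasher.finalize().as_bytes()
}

// sk = Blake3(domain, len(pt) | pt | len(data) | data | "Blake3-XChaCha8 Key Derivation" | convergence_domain)
fn key_from_plaintext(domain: &str, convergence: &[u8], pt: &[u8], data: &[u8]) -> [u8; 32] {
    domain_hash(domain, &[pt, data], b"Blake3-XChaCha8 Key Derivation", convergence)
}

// iv = Blake3(domain, len(pt) | pt | len(data) | data | "Blake3-XChaCha8 Nonce Derivation" | sk)[0..24]
fn derive_iv(domain: &str, sk: &[u8; 32], pt: &[u8], data: &[u8]) -> [u8; 24] {
    let full = domain_hash(domain, &[pt, data], b"Blake3-XChaCha8 Nonce Derivation", sk);
    let mut iv = [0u8; 24];
    iv.copy_from_slice(&full[..24]);
    iv
}

// ct = iv | XChaCha8(sk, iv, pt)
fn encrypt(domain: &str, sk: &[u8; 32], pt: &[u8], data: &[u8]) -> Vec<u8> {
    let iv = derive_iv(domain, sk, pt, data);
    let mut body = pt.to_vec();
    XChaCha8::new(Key::from_slice(sk), XNonce::from_slice(&iv)).apply_keystream(&mut body);
    let mut ct = iv.to_vec();
    ct.extend_from_slice(&body);
    ct
}

// Split off the iv, decrypt, then recompute and check the iv before returning.
fn decrypt(domain: &str, sk: &[u8; 32], ct: &[u8], data: &[u8]) -> Option<Vec<u8>> {
    if ct.len() < 24 {
        return None;
    }
    let (iv, body) = ct.split_at(24);
    let mut pt = body.to_vec();
    XChaCha8::new(Key::from_slice(sk), XNonce::from_slice(iv)).apply_keystream(&mut pt);
    let expected = derive_iv(domain, sk, &pt, data);
    // A real implementation should compare these in constant time.
    if expected[..] == iv[..] { Some(pt) } else { None }
}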

Given the 24 byte nonce, there is approximately a 1 in 2⁶⁵ chance of nonce collision after 2⁶⁴ messages.

The tag for the shared key is 0.