NIPs nostr improvement proposals

NIP-77 - Negentropy Syncing

Table of Contents

Negentropy Syncing

draft optional

This document describes a protocol extension for syncing events. It works for both client-relay and relay-relay scenarios. If both sides of the sync have events in common, then this protocol will use less bandwidth than transferring the full set of events (or even just their IDs).

It is a Nostr-friendly wrapper around the Negentropy protocol, which uses a technique called Range-Based Set Reconciliation.

Since Negentropy is a binary protocol, this wrapper hex-encodes its messages. The specification for Negentropy Protocol V1 is attached as an appendix to this NIP below.

High-Level Protocol Description

We're going to call the two sides engaged in the sync the client and the relay (even though the initiator could be another relay instead of a client).

The above protocol only results in the client learning about IDs it has/needs, and does not actually transfer events. Given these IDs, the client can upload events it has with EVENT, and/or download events it needs with REQ. This can be performed over the same websocket connection in parallel with subsequent NEG-MSG messages. If a client is only interested in determining the number of unique events (ie, reaction counts), it may choose to not download/upload at all.

Nostr Messages

Initial message (client to relay):

[
"NEG-OPEN",
<subscription ID string>,
<filter>,
<initialMessage, hex-encoded>
]

Error message (relay to client):

If a request cannot be serviced by the relay, an error is returned to the client:

[
"NEG-ERR",
<subscription ID string>,
<reason code string>
]

Error reasons are the same format as in NIP-01. They should begin with a machine-readable single-word prefix, followed by a : and then a human-readable message with more information.

The current suggested error reasons are

After a NEG-ERR is issued, the subscription is considered to be closed.

Subsequent messages (bidirectional):

Relay and client alternate sending each other NEG-MSGs:

[
"NEG-MSG",
<subscription ID string>,
<message, hex-encoded>
]

Close message (client to relay):

When finished, the client should tell the relay it can release its resources with a NEG-CLOSE:

[
"NEG-CLOSE",
<subscription ID string>
]

Appendix: Negentropy Protocol V1

Preparation

There are two protocol participants: Client and server. The client creates an initial message and transmits it to the server, which replies with its own message in response. The client continues querying the server until it is satisifed, and then terminates the protocol. Messages in either direction have the same format.

Each participant has a collection of records. A records consists of a 64-bit numeric timestamp and a 256-bit ID. Each participant starts by sorting their items according to timestamp, ascending. If two timestamps are equal then items are sorted lexically by ID, ascending by first differing byte. Items may not use the max uint64 value (2**64 - 1) as a timestamp since this is reserved as a special "infinity" value.

The goal of the protocol is for the client to learn the set of IDs that it has and the server does not, and the set of items that the server has and it does not.

Varint

Varints (variable-sized unsigned integers) are represented as base-128 digits, most significant digit first, with as few digits as possible. Bit eight (the high bit) is set on each byte except the last.

Varint := <Digit+128>* <Digit>

Id

IDs are represented as byte-strings of length 32:

Id := Byte{32}

Message

A reconciliation message is a protocol version byte followed by an ordered list of ranges:

Message := <protocolVersion (Byte)> <Range>*

The current protocol version is 1, represented by the byte 0x61. Protocol version 2 will be 0x62, and so forth. If a server receives a message with a protocol version that it cannot handle, it should reply with a single byte containing the highest protocol version it supports, allowing the client to downgrade and retry its message.

Each Range corresponds to a contiguous section of the timestamp/ID space. The first Range starts at timestamp 0 and an ID of 0 bytes. Ranges are always adjacent (no gaps). If the last Range doesn't end at the special infinity value, an implicit Skip to infinity Range is appended. This means that the list of Ranges always covers the full timestamp/ID space.

Range

A Range consists of an upper bound, a mode, and a payload:

Range := <upperBound (Bound)> <mode (Varint)> <payload (Skip | Fingerprint | IdList)>

The contents of the payload is determined by mode:

Bound

Each Range is specified by an inclusive lower bound and an exclusive upper bound. As defined above, each Range only includes an upper bound: the lower bound of a Range is the upper bound of the previous Range, or 0 timestamp/0 ID for the first Range.

A Bound consists of an encoded timestamp and a variable-length disambiguating prefix of an ID (in case multiple items have the same timestamp):

Bound := <encodedTimestamp (Varint)> <length (Varint)> <idPrefix (Byte)>*

Fingerprint Algorithm

The fingerprint of a Range is computed with the following algorithm: