Partially Ordered Dataset

pod is a new layer-1 primitive that takes transactions as input and produces a log (a sequence of transactions) as output. It is a service that receives unordered transactions submitted to the mempool and orders them into a log. Unlike blockchains and other consensus protocols, pod does not provide a persistent total order of transactions. It does provide some order, but that order is subject to change over time: transactions “wiggle” around their positions in a somewhat predictable fashion. Accepting this wiggle room allows for a very performant system. In particular, pod is latency-optimal and throughput-optimal. To accomplish this, we eliminate inter-validator communication entirely. Instead, clients send transactions directly to all validators; each validator independently processes transactions and appends them to its own log. Clients then receive these logs and extract information from them. Check out the pod-core paper for detailed pseudo-code and formal analysis.

Figure 1 (consensus vs pod): Transactions in pod can wiggle but their bounds are known.

This post provides a high-level overview of pod’s key components and design choices. As we roll out the protocol, we anticipate these systems will evolve (or change completely). We will release blog posts that explain the concepts in depth and papers that formally analyze each component.

Key design principles:

  • Optimal Latency. Transactions are confirmed in one network round trip (~200ms). This confirmation latency is optimal: one round trip is the physical (speed-of-light) lower bound. The latency is the same as what you would expect when interacting with a traditional Web2 (client ↔ server) architecture. Formally, we achieve liveness u equal to the physical round-trip network latency, u = 2δ, in synchrony, and eventual liveness in asynchrony.
  • Streaming. All aspects of our system are push rather than pull. Each node (full or light) connecting to a validator subscribes to a channel, indicating which streams of data it is interested in. The validator then sends the node the relevant data as soon as it becomes available. The same principle applies to the connection between a secondary and its validator, or between a client and a secondary (see below). This follows the classical “publish/subscribe” design pattern and is realized in practice using (web)sockets; see the sketch after this list. Streaming dictates a large part of our design, giving rise to ideas such as blocklessness. When a blockchain has blocks, you must wait for a block to appear in order to receive a transaction confirmation, which adds artificial delay. Our streaming design allows us to confirm transactions as soon as they have collected sufficient signatures.
  • Simplicity. pod-core uses a radically straightforward design, making it easy to implement, audit, and formally analyze. While it has many extensions enabling advanced behaviors, pod’s “consensus” core is just a few hundred lines of Rust code. The dependencies used in pod’s core are just hash functions and signatures—no zero knowledge, multiparty computation, or other moon math. pod’s extensions employ advanced cryptographic techniques to enable more powerful features on top, but the heart of the construction is very simple.
  • Scalability. pod borrows its design from the traditional relational DBMSs of the 80s, 90s, 00s, and 10s, before the Web3 era. These techniques are battle-tested but were not designed to be Byzantine-resilient. pod hardens these techniques and leverages them to operate at Internet scale. They include separating write from read validators (the primary–secondary paradigm), efficient caching and indexing, load balancing, and hot swapping. We have also borrowed techniques (such as Merkle Mountain Ranges and the absence of validator-to-validator communication) from Certificate Transparency, a backbone of X.509/HTTPS security that processes Internet-scale traffic.
  • Flexibility. A flexible design provides different guarantees to people with different needs. Some clients may favor safety over liveness, some may be on-chain, while others are off-chain. A small minority may believe in the security of TEEs. Our design is not tailored to a particular belief, but we allow these clients to interoperate with each other, each reaping their preferred benefits (as long as their beliefs correspond to reality).
  • Modularity. While pod’s design is simple (see above), it is a feature-rich system. To achieve this richness without sacrificing simplicity, each component of pod is designed to be stand-alone and to provide a clean interface to the components around it. One such example, pioneered by Celestia, is the separation of “consensus” from execution: consensus confirms transactions, whereas execution settles state.
  • Accountability. Every validator claim in pod is accountable — from the confirmation of a single transaction to the response to a light client query about a particular smart contract to the full log report provided to full nodes. This enables faulty validators to be slashed, giving rise to economic security.
  • Censorship Resistance. While liveness guarantees that every honest transaction is confirmed promptly, censorship resistance imposes a shorter confirmation time frame within which honest transactions cannot be selectively censored; any censorship attack is forced either to stall the whole system or to confirm all honest transactions. Because pod is leaderless and blockless, the stalling case never takes place, and we can guarantee censorship resistance matching liveness.
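To make the streaming principle concrete, here is a minimal publish/subscribe sketch in Rust (the language of pod-core). The names (`Hub`, `subscribe`, `publish`) and the use of in-process channels are illustrative assumptions on our part; in the real system, the channel would be a (web)socket held open between the validator and the subscribed node.

```rust
// Minimal publish/subscribe sketch. In-process channels stand in for the
// (web)sockets a validator would hold open to its subscribers.
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

struct Hub<T: Clone> {
    // For each stream (topic), the set of open subscriber channels.
    subscribers: HashMap<String, Vec<Sender<T>>>,
}

impl<T: Clone> Hub<T> {
    fn new() -> Self {
        Hub { subscribers: HashMap::new() }
    }

    // A node subscribes to a stream and receives a channel end it can
    // block on; no polling is ever needed.
    fn subscribe(&mut self, stream: &str) -> Receiver<T> {
        let (tx, rx) = channel();
        self.subscribers.entry(stream.to_string()).or_default().push(tx);
        rx
    }

    // The publisher pushes new data to every subscriber as soon as it is
    // available, dropping subscribers whose channel has closed.
    fn publish(&mut self, stream: &str, item: T) {
        if let Some(subs) = self.subscribers.get_mut(stream) {
            subs.retain(|tx| tx.send(item.clone()).is_ok());
        }
    }
}

fn main() {
    let mut hub = Hub::new();
    let rx = hub.subscribe("confirmed-transactions");
    hub.publish("confirmed-transactions", String::from("tx confirmed"));
    println!("{}", rx.recv().unwrap());
}
```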

pod-core

(check out the paper for pseudo-code and formal analysis)

pod's core construction is simple and can be explained in just a couple of minutes. Its infrastructure consists of a set of active validators whose job is to record transactions. Validators do not talk to each other directly; this is exactly what makes pod so fast. The active validator set is known to the clients. Clients connect to these validators and send them transactions, which are eventually confirmed. Clients can then query the validators' logs to discover the confirmed transactions and their wiggle room.

Each validator maintains a local, totally ordered temporal log. A temporal log is a sequence of transactions, each associated with a timestamp. The timestamps have millisecond precision and must be non-decreasing. Each validator’s temporal log is its own local data structure and is unaffected by other validators’ logs. A client that wishes to write a transaction connects to all validators and sends them the transaction of interest. Whenever a validator receives a new transaction from a client, it appends it to its local log, together with the current timestamp based on its local clock. It then signs the transaction with the timestamp and hands it back to the client. The client receives the signed transaction and timestamp and validates the validator’s signature using its known public key. As soon as the client has collected a certain number of signatures from the validators (e.g., α = 2/3 of the validators), the client considers the transaction confirmed. The client associates the transaction with a timestamp, too: the median among the timestamps signed by the validators.
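As a sketch of this client-side confirmation rule (the types and names below are ours, and signature verification is elided), confirmation reduces to counting distinct validator signatures and taking a median:

```rust
// Hedged sketch of the client-side confirmation rule. We assume the caller
// keeps at most one vote per validator and has already verified signatures.
type Millis = u64;

struct Vote {
    validator_id: u32, // which validator signed
    timestamp: Millis, // the timestamp it signed alongside the transaction
}

/// Returns the confirmed (median) timestamp once at least `alpha` validators
/// have signed the transaction, and `None` otherwise.
fn try_confirm(votes: &[Vote], alpha: usize) -> Option<Millis> {
    if votes.len() < alpha {
        return None; // not enough signatures yet; keep listening
    }
    let mut timestamps: Vec<Millis> = votes.iter().map(|v| v.timestamp).collect();
    timestamps.sort_unstable();
    Some(timestamps[timestamps.len() / 2]) // median of the signed timestamps
}
```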

A full node that wishes to obtain a full picture of the confirmed transactions reaches out to all the validators and requests their full logs. Upon such a request, a validator returns its full log of timestamped transactions, together with the current timestamp, all signed by the validator's key. The node then considers any transaction with at least α signatures confirmed and assigns it the median timestamp. The node orders the transactions by these median timestamps, and the result is the node's log. Because different nodes may receive responses from different subsets of validators, they may not arrive at the same timestamp for a given transaction. This is what gives rise to a transaction's wiggle room. However, a node can compute upper and lower bounds on that wiggle room, which we call a transaction's minimum and maximum timestamps; the gap between them is the transaction's temperature.

Figure 2 (transaction flow): The transaction flows from the client to the set of validators and back to the client, requiring one network round trip end-to-end.

In principle, you can already see why this system is capable of achieving optimal latency and throughput. To confirm a transaction, the client requires one round trip to the validators, achieving optimal latency. The number of transactions a validator can process per unit of time is bounded only by the bandwidth of the channel connecting the clients to the validators, achieving optimal throughput. It is straightforward to see that no better performance can be achieved by any system that requires all validators to see all transactions (such as a blockchain system), because pod already matches the physical limits of the validators' capacity. (If the requirement that all validators see all transactions is removed, better bandwidth utilization becomes possible.)

Execution

Blockchain systems take the ordered set of confirmed transactions and deduce a state by applying them one on top of the other in the given order. This is referred to as state-machine replication. Current systems apply a global lock to the state machine: the whole system must wait for a transaction to be applied before any future transaction can be processed, similar to a table-level lock in a traditional DBMS. In pod, we can settle non-conflicting transactions more quickly. Each transaction locks only the part of the state it touches, similar to a row-level lock in a traditional DBMS. In a nutshell, two (or more) transactions whose mutual order has not yet been decided can all be applied if they commute, i.e., their effect on the system's state is independent of the order in which they end up being confirmed. For applications that need ordering, pod allows custom ordering (sequencing) gadgets to be built on top that inherit the security of pod. This enables ordering-sensitive applications to decide how they handle their application's MEV while maintaining fast composability with the base layer. pod supports EVMx, a backward-compatible extension of the EVM designed to minimize the effort application developers need to leverage pod's fast path. More on this is coming soon.
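To illustrate the commutativity condition, here is a hedged sketch assuming each transaction declares the state keys it reads and writes (pod's actual conflict detection is not specified here). Two transactions can settle in either order when neither writes a key the other touches, the analogue of disjoint row-level locks:

```rust
use std::collections::HashSet;

// Illustrative transaction with declared read and write sets over state keys.
struct Tx {
    reads: HashSet<String>,
    writes: HashSet<String>,
}

/// Two transactions commute when neither writes a key the other reads or
/// writes; they can then be applied in either order with identical effect.
fn commute(a: &Tx, b: &Tx) -> bool {
    a.writes.is_disjoint(&b.reads)
        && a.writes.is_disjoint(&b.writes)
        && b.writes.is_disjoint(&a.reads)
}

fn main() {
    // Hypothetical example: payments from two different accounts commute.
    let t1 = Tx {
        reads: ["alice"].iter().map(|s| s.to_string()).collect(),
        writes: ["alice", "carol"].iter().map(|s| s.to_string()).collect(),
    };
    let t2 = Tx {
        reads: ["bob"].iter().map(|s| s.to_string()).collect(),
        writes: ["bob", "dave"].iter().map(|s| s.to_string()).collect(),
    };
    assert!(commute(&t1, &t2));
    // Two payments spending from the same account would not commute.
}
```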

Extensions

pod has a radically simple core (see above). pod employs several extensions on top that use cryptographic schemes or techniques from traditional databases to allow for additional features and optimizations. These extensions are designed in a trust-minimized fashion such that the security of the pod network relies only on the security of pod-core. Here are some such extensions:

Secondaries. We separate the computers that handle write instructions from those that handle read instructions. Secondaries are untrusted, read-only nodes that offload the burden of serving frequent read requests from the validators, which then handle only write instructions. Each validator signs and forwards new transactions to its secondaries, which cache these signed updates and forward them to the relevant subscribed nodes without overloading the validator. Because secondaries do not sign responses, they require no additional trust; the only harm they can do is stop responding, in which case a user simply switches to another secondary of the same validator.

Figure 3 (secondaries of a validator): Read operations are much more common than write operations in a blockchain system. More secondaries can be added to a validator to scale the read operations as much as needed.

Gateways. Even though we have sharded the read instructions (i.e., not every read instruction hits every validator), having each client connect to all validators for write purposes is uneconomical. Instead, we operate helpful but untrusted gateways. A gateway is a node that maintains an open connection to all validators. Whenever a client wishes to write a transaction, all it needs to do is reach out to a gateway and send the relevant write instruction. The gateway then forwards this write to all validators, receives their signatures, assembles a confirmation certificate consisting of α signatures, and forwards it back to the client. If a client is unsatisfied with the performance of a gateway, it can switch to a different one. Like secondaries, gateways do not sign their responses and therefore do not need to be trusted.

Figure 4 (the gateway architecture): Clients can avoid connecting to all validators by using a gateway, which maintains open connections to the current validator set.
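Here is a hedged sketch of the gateway's assembly step, under the assumption that a confirmation certificate is simply any α validator signatures on the same transaction (the structures and names are illustrative, not pod's actual interfaces):

```rust
// Illustrative types; real signatures and hashes would come from the
// validators' actual signing scheme.
struct ValidatorSig {
    validator_id: u32,
    sig_bytes: Vec<u8>,
}

struct Certificate {
    tx_hash: [u8; 32],
    signatures: Vec<ValidatorSig>, // at least alpha of them
}

/// Once the gateway has collected `alpha` signatures for a transaction, it
/// assembles them into a certificate to return to the client. Until then,
/// it hands the partial set back and keeps waiting for validator responses.
fn assemble_certificate(
    tx_hash: [u8; 32],
    collected: Vec<ValidatorSig>,
    alpha: usize,
) -> Result<Certificate, Vec<ValidatorSig>> {
    if collected.len() >= alpha {
        Ok(Certificate { tx_hash, signatures: collected })
    } else {
        Err(collected) // not yet a certificate
    }
}
```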

Thin validators. To further improve the network's decentralization, we can reduce the storage required of active validators by not requiring them to store past logs. This can be done using a Merkle Mountain Range (MMR), where each leaf of the tree is a pair of a transaction and its corresponding timestamp. Instead of storing the complete historical log, validators now only maintain the latest peaks of the MMR. Whenever a validator wishes to add a new transaction to the log, it updates this MMR accordingly and sends its secondaries the attested root together with the timestamp.

Figure 5 (a validator’s MMR): The validator only maintains the right-most slope of the MMR of logarithmic size. Whenever a new transaction arrives, the slope is sufficient to calculate the new MMR slope and its root.
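Here is a hedged sketch of the thin-validator bookkeeping: only the MMR peaks (the right-most slope) are kept, and each append merges equal-height peaks. The hash function below is a non-cryptographic stand-in, and pod's actual leaf encoding of (transaction, timestamp) pairs is not specified here:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type Digest = u64; // stand-in for a real cryptographic hash

// Non-cryptographic stand-in for the inner-node hash.
fn hash_pair(left: Digest, right: Digest) -> Digest {
    let mut h = DefaultHasher::new();
    (left, right).hash(&mut h);
    h.finish()
}

/// The MMR "slope": its peaks, stored left-to-right as (height, digest).
/// This is all a thin validator keeps; it is logarithmic in the log size.
struct Mmr {
    peaks: Vec<(u32, Digest)>,
}

impl Mmr {
    fn new() -> Self {
        Mmr { peaks: Vec::new() }
    }

    /// Append a leaf (e.g., the hash of a (transaction, timestamp) pair):
    /// push it as a height-0 peak, then merge the two right-most peaks
    /// while they have equal height.
    fn append(&mut self, leaf: Digest) {
        self.peaks.push((0, leaf));
        while self.peaks.len() >= 2
            && self.peaks[self.peaks.len() - 2].0 == self.peaks[self.peaks.len() - 1].0
        {
            let (h, right) = self.peaks.pop().unwrap();
            let (_, left) = self.peaks.pop().unwrap();
            self.peaks.push((h + 1, hash_pair(left, right)));
        }
    }

    /// "Bag" the peaks right-to-left into the single root the validator
    /// attests and streams to its secondaries.
    fn root(&self) -> Option<Digest> {
        self.peaks
            .iter()
            .rev()
            .map(|&(_, d)| d)
            .reduce(|acc, d| hash_pair(d, acc))
    }
}
```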

Light clients. pod has built-in light client support based on an elegant and simple, yet efficient, data structure called a Merkle Segment Mountain Range that uses bloom filters. The structure combines Merkle Trees with Segment Trees to allow for accountable light clients. Light clients can not only verifiably obtain information about the smart contracts they’re interested in but can also verify, in an accountable fashion, that no information has been omitted. The construction does not require the light client to trust any server intermediaries. Our light client data structure borrows and extends older, well-understood protocols such as Google’s Certificate Transparency.
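The Merkle Segment Mountain Range itself is beyond the scope of this post, but the bloom-filter idea it builds on is easy to illustrate. In the hedged sketch below (parameters and hashing are arbitrary stand-ins, not pod's), a filter answers "possibly relevant" or "definitely not relevant" for a batch of log data, letting a light client skip data that cannot concern the contracts it watches:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const M: usize = 1024; // filter size in bits (arbitrary for illustration)
const K: u64 = 3;      // number of hash functions (arbitrary)

struct Bloom {
    bits: [bool; M],
}

impl Bloom {
    fn new() -> Self {
        Bloom { bits: [false; M] }
    }

    // Derive the i-th bit position for an item (stand-in hashing).
    fn index(item: &str, i: u64) -> usize {
        let mut h = DefaultHasher::new();
        (item, i).hash(&mut h);
        (h.finish() as usize) % M
    }

    fn insert(&mut self, item: &str) {
        for i in 0..K {
            self.bits[Self::index(item, i)] = true;
        }
    }

    /// `false` means "definitely absent"; `true` means "possibly present".
    /// Bloom filters admit false positives but never false negatives, so
    /// relevant data can never be silently skipped.
    fn may_contain(&self, item: &str) -> bool {
        (0..K).all(|i| self.bits[Self::index(item, i)])
    }
}

fn main() {
    let mut filter = Bloom::new();
    filter.insert("contract:0xabc");
    assert!(filter.may_contain("contract:0xabc"));
}
```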

Security Analysis

This is a high-level overview of pod-core's security analysis. For the complete analysis and findings, please refer to the pod-core paper.

The security analysis of pod-core rests on two critical parameters: the quorum size α, corresponding to a liveness resilience of n - α, and the safety resilience β. Classical quorum-based consensus systems set n - α = β = n/3. In pod, these parameters are configurable.

When a client observes a transaction signed by γ ≥ α of the validators (a certificate), the median of the timestamps signed by those validators is taken as the transaction's confirmed timestamp. The minimum and maximum timestamp bounds for a given transaction are computed as follows. First, the timestamps associated with the transaction by the validators are collected and sorted. A validator that has not yet included the transaction in its log is counted as having included it with the timestamp of its last good response (if a validator has never responded, its last good response has a timestamp of 0). For the pessimistic minimum, the highest β of these timestamps are set to their worst-case value of 0, and the median among the lowest α is taken. For the pessimistic maximum, the timestamps are similarly collected and sorted, the lowest β of them are set to their worst-case value of positive infinity, and the median among the highest α is taken. Naturally, if fewer than α signatures have been received for a transaction (conditioned on α > 2β), its minimum is set to 0 and its maximum to positive infinity, for now.
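A hedged sketch of these two computations follows. The median convention for even α and all names are our simplifications, and the input is assumed to already contain one timestamp per validator, with last-good-response timestamps substituted as described above:

```rust
// `ts` holds one timestamp per validator: the timestamp it signed for the
// transaction, or its last good response's timestamp if it has not included
// the transaction yet (0 if it never responded).

/// Pessimistic minimum: zero out the highest `beta` timestamps (their
/// worst case), then take the median among the lowest `alpha`.
fn pessimistic_min(mut ts: Vec<u64>, alpha: usize, beta: usize) -> u64 {
    ts.sort_unstable();
    let n = ts.len();
    for t in ts[n - beta..].iter_mut() {
        *t = 0;
    }
    ts.sort_unstable();
    ts[..alpha][alpha / 2]
}

/// Pessimistic maximum: push the lowest `beta` timestamps to infinity
/// (their worst case), then take the median among the highest `alpha`.
fn pessimistic_max(mut ts: Vec<u64>, alpha: usize, beta: usize) -> u64 {
    ts.sort_unstable();
    for t in ts[..beta].iter_mut() {
        *t = u64::MAX; // stand-in for positive infinity
    }
    ts.sort_unstable();
    let n = ts.len();
    ts[n - alpha..][alpha / 2]
}
```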

Additionally, a client maintains a perfect timestamp. For any transaction yet to be observed, its confirmed timestamp is guaranteed to be greater than the client's perfect timestamp. This essentially ensures that no new transaction can be confirmed with a past timestamp. This perfect timestamp is calculated by collecting the timestamp of the last good response by each of the validators, sorting these timestamps, setting the highest β of these to 0, and taking the median of the lowest α.
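The same masking pattern yields the perfect timestamp; again a hedged sketch with our own naming:

```rust
/// Perfect timestamp: take each validator's last-good-response timestamp,
/// zero out the highest `beta` (potentially adversarial), and return the
/// median among the lowest `alpha`. No transaction yet unseen by the client
/// can later be confirmed with a timestamp before this value.
fn perfect_timestamp(mut last_good: Vec<u64>, alpha: usize, beta: usize) -> u64 {
    last_good.sort_unstable();
    let n = last_good.len();
    for t in last_good[n - beta..].iter_mut() {
        *t = 0;
    }
    last_good.sort_unstable();
    last_good[..alpha][alpha / 2]
}
```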

More formally, pod provides the following guarantees. In the following, α is the quorum size, n - α is the liveness resilience, β is the safety resilience, Δ is the upper bound on the network delay after GST, and δ is the actual network delay.

  • Accountable Liveness: As long as the adversary controls fewer than n - α validators, in partial synchrony, after GST, any honest transaction is confirmed within 2δ, and with a confirmation timestamp within δ, of the moment the honest transaction is sent to the network. Additionally, in asynchrony, transactions are eventually confirmed. Furthermore, after GST, an adversary who controls fewer than min(α, n - α) validators can be held accountable for not confirming transactions within Δ. Moreover, after GST, if the adversary controls f < α/2 validators, a transaction’s temperature (max - min) in the view of all honest parties will be at most δ for all transactions, whether honest or adversarial.
  • Accountable Safety: As long as the adversary controls fewer than β validators, if a transaction is marked as confirmed with confirmed timestamp t1 by one honest party, then this confirmed timestamp will fall between the min and max reported by every honest party, whether earlier or later. Additionally, any transaction that was not seen by an honest party by a particular point in time will never appear confirmed with a timestamp prior to the perfect timestamp reported by that party at that point in time. If either of these properties fails, at least β validators can be identified as adversarial and held accountable.

Gratitude and credits for past work

This work wouldn’t be possible without decades of amazing work done by scientists and engineers in the blockchain space. We especially want to thank the Ethereum community, Mysten Labs (SUI), and Flashbots. Check out the pod-core paper for related works and a complete list of references.