Teranode Data Model - Subtrees
Subtrees are an innovation aimed at improving the scalability and real-time processing capabilities of the blockchain system.
Structure
The concept of subtrees is a distinct feature not found in the BTC design.
- A subtree acts as an intermediate data structure to hold batches of transaction IDs (including metadata) and their corresponding Merkle root.
- The size of a subtree can be any number of transactions, as long as it is a power of 2 (16, 32, 64, etc.). The only requirement is that all subtrees in a block must be the same size. At peak throughput, subtrees will contain millions of transaction IDs.
- Each subtree computes its own Merkle root, which is a single hash representing the entire set of transactions within that subtree.
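To make the Merkle root idea concrete, here is a minimal sketch of folding a power-of-two batch of transaction IDs into a single root. It assumes Bitcoin-style double SHA-256 pairing, which this document does not specify; `Hash`, `hashPair`, and `SubtreeRoot` are hypothetical names used only for illustration.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// Hash is a 32-byte value standing in for a transaction ID or Merkle node.
type Hash [32]byte

// hashPair combines two nodes into their parent.
// Assumption: double SHA-256 over the concatenation, as in Bitcoin.
func hashPair(left, right Hash) Hash {
	buf := make([]byte, 0, 64)
	buf = append(buf, left[:]...)
	buf = append(buf, right[:]...)
	first := sha256.Sum256(buf)
	return sha256.Sum256(first[:])
}

// isPowerOfTwo reports whether n is a positive power of two.
func isPowerOfTwo(n int) bool {
	return n > 0 && n&(n-1) == 0
}

// SubtreeRoot folds the transaction IDs pairwise, level by level,
// until a single root remains.
func SubtreeRoot(txIDs []Hash) (Hash, error) {
	if !isPowerOfTwo(len(txIDs)) {
		return Hash{}, errors.New("subtree size must be a power of 2")
	}
	level := make([]Hash, len(txIDs))
	copy(level, txIDs)
	for len(level) > 1 {
		next := make([]Hash, len(level)/2)
		for i := range next {
			next[i] = hashPair(level[2*i], level[2*i+1])
		}
		level = next
	}
	return level[0], nil
}

func main() {
	// Sixteen placeholder transaction IDs.
	txIDs := make([]Hash, 16)
	for i := range txIDs {
		txIDs[i] = sha256.Sum256([]byte{byte(i)})
	}
	root, err := SubtreeRoot(txIDs)
	if err != nil {
		panic(err)
	}
	fmt.Printf("subtree root: %x\n", root)
}
```

Because the size is a power of 2, every level pairs up evenly and no leaf needs to be duplicated to complete the tree.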
Here's a table documenting the structure of the `Subtree` type:
Field | Type | Description |
---|---|---|
Height | int | The height of the subtree within the blockchain. |
Fees | uint64 | Total fees associated with the transactions in the subtree. |
SizeInBytes | uint64 | The size of the subtree in bytes. |
FeeHash | chainhash.Hash | Hash representing the combined fees of the subtree. |
Nodes | []SubtreeNode | An array of SubtreeNode objects, representing individual "nodes" within the subtree. |
ConflictingNodes | []chainhash.Hash | List of hashes representing nodes that conflict, requiring checks during block assembly. |
Here, a `SubtreeNode` is a data structure representing a transaction hash, a fee, and the size in bytes of that transaction.
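The table and the `SubtreeNode` description can be read as the following Go declarations. This is a sketch, not the actual Teranode source: `Hash` is a local stand-in for `chainhash.Hash`, the `SubtreeNode` field names are guesses, and `AddNode` is a hypothetical helper that assumes `Fees` and `SizeInBytes` are running totals over the nodes.

```go
package main

import "fmt"

// Hash stands in for chainhash.Hash (a 32-byte array) so that the
// sketch has no external dependencies.
type Hash [32]byte

// SubtreeNode holds a transaction hash, its fee, and its size in bytes.
// Field names are assumptions; the document only lists the contents.
type SubtreeNode struct {
	Hash        Hash
	Fee         uint64
	SizeInBytes uint64
}

// Subtree mirrors the fields listed in the table.
type Subtree struct {
	Height           int
	Fees             uint64
	SizeInBytes      uint64
	FeeHash          Hash
	Nodes            []SubtreeNode
	ConflictingNodes []Hash
}

// AddNode is a hypothetical helper: it appends a node and updates the
// totals, assuming they are sums over the per-node values.
func (s *Subtree) AddNode(n SubtreeNode) {
	s.Nodes = append(s.Nodes, n)
	s.Fees += n.Fee
	s.SizeInBytes += n.SizeInBytes
}

func main() {
	st := &Subtree{}
	st.AddNode(SubtreeNode{Hash: Hash{1}, Fee: 100, SizeInBytes: 250})
	st.AddNode(SubtreeNode{Hash: Hash{2}, Fee: 200, SizeInBytes: 200})
	fmt.Println(st.Fees, st.SizeInBytes, len(st.Nodes)) // 300 450 2
}
```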
Note - For subtree files in the subtree-store S3 buckets, each subtree has a size of 48MB (1024 * 1024 nodes at 48 bytes per node, plus a small fixed header).
Subtree Composition
Each subtree consists of:
- root hash: 32 bytes
- fees: 8 bytes (uint64)
- sizeInBytes: 8 bytes (uint64)
- numberOfNodes: 8 bytes (uint64)
- nodes: 48 bytes per node (hash:32 + fee:8 + size:8)
- numberOfConflictingNodes: 8 bytes (uint64)
- conflictingNodes: 32 bytes per conflicting node
Calculation:
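- Fixed fields (root hash, fees, sizeInBytes, numberOfNodes, numberOfConflictingNodes): 32 + 8 + 8 + 8 + 8 = 64 bytes
- Per transaction node: an additional 48 bytes
- Per conflicting node: an additional 32 bytes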
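The byte layout listed above can be sketched as a small serializer that also checks the size arithmetic. The field order follows the list; the little-endian encoding of the `uint64` fields is an assumption, since the document does not state the byte order, and `Node`, `SerializedSize`, and `Serialize` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Hash is a 32-byte value standing in for chainhash.Hash.
type Hash [32]byte

// Node is one subtree entry: hash (32) + fee (8) + size (8) = 48 bytes.
type Node struct {
	Hash Hash
	Fee  uint64
	Size uint64
}

const (
	fixedBytes    = 32 + 8 + 8 + 8 + 8 // root, fees, size, node count, conflict count
	nodeBytes     = 32 + 8 + 8
	conflictBytes = 32
)

// SerializedSize returns the byte length of a subtree with the given counts.
func SerializedSize(numNodes, numConflicts int) int {
	return fixedBytes + nodeBytes*numNodes + conflictBytes*numConflicts
}

// Serialize writes the fields in the order listed above.
// Assumption: uint64 values are little-endian.
func Serialize(root Hash, fees, sizeInBytes uint64, nodes []Node, conflicts []Hash) []byte {
	var buf bytes.Buffer
	buf.Grow(SerializedSize(len(nodes), len(conflicts)))
	var scratch [8]byte
	putU64 := func(v uint64) {
		binary.LittleEndian.PutUint64(scratch[:], v)
		buf.Write(scratch[:])
	}
	buf.Write(root[:])
	putU64(fees)
	putU64(sizeInBytes)
	putU64(uint64(len(nodes)))
	for _, n := range nodes {
		buf.Write(n.Hash[:])
		putU64(n.Fee)
		putU64(n.Size)
	}
	putU64(uint64(len(conflicts)))
	for _, c := range conflicts {
		buf.Write(c[:])
	}
	return buf.Bytes()
}

func main() {
	// A subtree of 1024 * 1024 nodes with no conflicting nodes.
	fmt.Println(SerializedSize(1024*1024, 0)) // 50331712 bytes, about 48MB
}
```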
Data Transfer Between Nodes
Although a stored subtree is 48MB, only 32MB is transferred between the nodes, because each subtree transfer includes only the hash of each transaction:
- hash: 32 bytes per node
- 1024 * 1024 * 32 bytes = 32MB
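As an illustration of why the transfer is smaller than the stored file, the sketch below drops the fee and size fields and keeps only the 32-byte hashes. `Node` and `TransferPayload` are hypothetical names, and the actual wire format may differ.

```go
package main

import "fmt"

// Hash is a 32-byte value standing in for chainhash.Hash.
type Hash [32]byte

// Node is one stored subtree entry (48 bytes when serialized).
type Node struct {
	Hash Hash
	Fee  uint64
	Size uint64
}

// TransferPayload concatenates only the hashes, 32 bytes per node.
func TransferPayload(nodes []Node) []byte {
	out := make([]byte, 0, len(nodes)*32)
	for i := range nodes {
		out = append(out, nodes[i].Hash[:]...)
	}
	return out
}

func main() {
	nodes := make([]Node, 1024*1024)
	payload := TransferPayload(nodes)
	fmt.Println(len(payload)) // 33554432 bytes = 32MB
}
```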
Efficiency
Subtrees are broadcast every second (assuming a baseline throughput of 1M transactions per second), making data propagation continuous rather than batched every 10 minutes. Blocks are still created every 10 minutes.
- Broadcasting subtrees at this high frequency allows receiving nodes to validate batches quickly and continuously, essentially "pre-approving" them for inclusion in a block.
- This contrasts with the BTC design, where a new block and its transactions are broadcast approximately every 10 minutes after being confirmed by miners.
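For a rough sense of scale, the arithmetic behind these figures can be sketched as follows. The subtree size of 1024 * 1024 transactions is an assumption taken from the transfer calculation earlier in this document, and `SubtreesPerBlock` is a hypothetical helper.

```go
package main

import "fmt"

// SubtreesPerBlock returns how many subtrees of subtreeSize are needed to
// hold every transaction produced during one block interval (rounded up).
// It returns 0 if subtreeSize is 0.
func SubtreesPerBlock(txPerSecond, blockIntervalSeconds, subtreeSize uint64) uint64 {
	if subtreeSize == 0 {
		return 0
	}
	totalTx := txPerSecond * blockIntervalSeconds
	return (totalTx + subtreeSize - 1) / subtreeSize
}

func main() {
	// 1M transactions per second over a 10-minute (600 s) block interval.
	fmt.Println(SubtreesPerBlock(1_000_000, 600, 1024*1024)) // 573
}
```

Under these assumptions a receiving node has already validated several hundred subtrees by the time the block itself arrives.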
Lightweight
Subtrees only include transaction IDs, not full transaction data, since all nodes already possess the transactions, reducing the size of the propagated data.
- All network nodes are assumed to already have the full transaction data (which they receive and store as transactions are created and spread through the network). Therefore, it's unnecessary to rebroadcast full details with each subtree.
- Subtrees allow nodes to confirm they have all relevant transactions and update their state accordingly without having to process large amounts of data repeatedly.
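The check described above, confirming that every transaction referenced by a subtree is already held locally, might look like the following sketch. The in-memory `map` stands in for whatever transaction store a node actually uses, and `MissingTxIDs` is a hypothetical name.

```go
package main

import "fmt"

// Hash is a 32-byte transaction ID.
type Hash [32]byte

// MissingTxIDs returns the IDs listed in a received subtree that are not
// present in the local set of known transactions, in subtree order.
func MissingTxIDs(subtreeTxIDs []Hash, known map[Hash]struct{}) []Hash {
	var missing []Hash
	for _, id := range subtreeTxIDs {
		if _, ok := known[id]; !ok {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	known := map[Hash]struct{}{
		{1}: {},
		{2}: {},
	}
	subtree := []Hash{{1}, {2}, {3}}
	missing := MissingTxIDs(subtree, known)
	fmt.Println(len(missing)) // 1
}
```

Only the missing transactions, if any, would need to be fetched; the rest of the subtree is confirmed from the IDs alone.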