Implementation of Archive Node on TRON #6289

Open
halibobo1205 opened this issue Apr 16, 2025 · 41 comments
@halibobo1205
Contributor

Background

Currently, Java-tron does not support Archive Node. An archive node can query historical state data from the blockchain. Unlike a fullnode, an archive node not only saves the most recent blockchain state data but also completely records all historical state data from the genesis block to the latest block. This allows users to directly query state data such as account balances and contract storage at any block height without needing to recalculate from the genesis block (replay transactions), greatly improving the efficiency of historical state data queries.

Rationale

Why should this feature exist?

Fullnodes only store the current latest block state (used to verify new blocks), while archive nodes additionally store state snapshots of each historical block (e.g., balances for each address, contract storage data, etc.). Archive nodes can support historical state queries. For example, querying the balance of an address at block height 10,000 cannot be directly provided by a fullnode, but an archive node can do this.

What are the use-cases?

  • DApp developers: testing and verifying smart contract behavior under different historical states
  • Data analysis services: providing advanced analytical functions based on historical states
  • Researchers: analyzing historical transaction patterns, state changes, and network behavior

Specification

Support Ethereum-compatible interfaces, including but not limited to:

  • eth_getBalance: query an account's balance at a specific block height
  • eth_getCode: query contract bytecode at a specific block height
  • eth_getStorageAt: query the value at a specific position in contract storage at a specific block height
  • eth_call: execute read-only contract calls on the state at a specific block height
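
To make the interface list concrete, here is a minimal sketch of what a historical balance request could look like on the wire. It only builds the JSON-RPC body; the class name, the placeholder address, and the request id are assumptions for illustration, and the block parameter follows the usual Ethereum convention of a hex-encoded quantity.

```java
import java.util.Locale;

// Hypothetical helper: builds an eth_getBalance request for a given block height.
class HistoricalBalanceRequest {

    /** Returns the JSON-RPC body; the block height is sent as a hex quantity such as "0x2710". */
    static String build(String address, long blockHeight, int requestId) {
        String blockTag = "0x" + Long.toHexString(blockHeight);
        return String.format(Locale.ROOT,
            "{\"jsonrpc\":\"2.0\",\"method\":\"eth_getBalance\",\"params\":[\"%s\",\"%s\"],\"id\":%d}",
            address, blockTag, requestId);
    }

    public static void main(String[] args) {
        // Placeholder address, not a real account.
        System.out.println(build("0x0000000000000000000000000000000000000000", 10_000L, 1));
    }
}
```

A fullnode that keeps only the latest state cannot answer such a request for an old height directly, which is the gap the archive node is meant to close.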

Implementation

Archive nodes need a world state trie, which TRON needs to implement.

Ethereum state trie implementation

Ethereum supported a state trie in its initial version, implementing core state trie logic:

  1. The block header contains three roots: transactionRoot, receiptRoot, and stateRoot
  2. Every block state data change is reflected in the stateRoot
  3. Account data, including contracts (nonce, balance, storageRoot, codeHash), is all uniformly encapsulated in the account
  4. Uses MPT (Merkle Patricia Trie) for state data storage (including CRUD operations)

Image

TRON's current situation

  1. The block header contains two roots: txTrieRoot and accountStateRoot. txTrieRoot corresponds to Ethereum's transactionRoot, but accountStateRoot covers only balances (and the proposal that enables it has not been activated); it does not include contract data (storageRoot, codeHash) as Ethereum's stateRoot does.

  2. Supports historical account balance queries: After enabling balance.history.lookup, account balances at specific heights can be queried via wallet/getaccountbalance.

  3. State data is dispersed rather than fully encapsulated in accounts (e.g., contract data and TRC10 data are stored in separate DBs). By our count, 25 databases hold state that must be covered, including account data, TRC10 data, contract data, voting data, and delegation data. All of this state needs to be organized and collected to generate the stateRoot.

Implementation considerations for Archive nodes:

  1. stateRoot generation
    TRON's state data is quite dispersed, involving many state databases. Unlike ETH, which generates root directly based on accounts, TRON needs to collect dispersed state data to generate a stateRoot.

  2. Should stateRoot participate in consensus?
    Consensus issues:

    • A proposal is needed to generate stateRoot at a specified height, requiring consideration of how to calculate the initial effective height of stateRoot, whether to generate a genesis stateRoot based on the current state, and if not, how to ensure network-wide data consistency.
    • Historical data before this cannot generate stateRoot, and cannot support historical state queries.
    • Generating and verifying state roots impacts performance, potentially reducing TPS and transaction execution efficiency, affecting SRs and the ordinary fullNode.
  3. Compatibility with all ETH state query interfaces

    • Support for eth_getBalance, eth_getCode, eth_getStorageAt, eth_call
    • Future support for eth_getBlockReceipts (Implement eth_getBlockReceipts method #5910) and debug_traceCall (Support the debug_traceCall API #5778)
    • For discussion: whether to support eth_getTransactionCount (requires adding nonce field to Account, requiring a hard fork proposal) and eth_getProof (requires storageRoot, lite fullnodes, L2 support, etc.)
  4. Performance impact
    Archive nodes significantly impact block execution and transaction execution efficiency, especially when historical state data is large. The implementation must ensure the normal operation of block synchronization and transaction execution.

  5. Storage impact
    TRON's transaction volume is far greater than Ethereum's, so archive node data is expected to be 3-4 times that of ETH, estimated at 80 TB+. That exceeds what a single disk can comfortably hold, so we need to consider splitting the history across multiple segmented archive nodes that together form a complete historical query set.

Based on the above analysis, stateRoot will not participate in consensus to reduce the impact on SRs and the ordinary fullnodes during archive node implementation. Segmented multi-disk storage of state data will be used to solve single-disk storage limitations. Additionally, since archive-related functionality implementation involves significant changes affecting many functions, integration into the main branch is not currently considered, and development will first proceed on an independent branch.
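
One way the segmented approach could be fronted is a small router that maps a requested block height to the archive segment that holds it. This is only a sketch under assumptions: the class name, the node identifiers, and the segment boundaries below are made up for illustration and are not part of the proposal.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical router: each segment is registered by the first block height it covers.
class SegmentRouter {
    private final TreeMap<Long, String> segments = new TreeMap<>();

    void addSegment(long firstHeight, String nodeId) {
        segments.put(firstHeight, nodeId);
    }

    /** Returns the node whose segment contains the height, or null if no segment covers it. */
    String nodeFor(long height) {
        Map.Entry<Long, String> entry = segments.floorEntry(height);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        SegmentRouter router = new SegmentRouter();
        router.addSegment(0L, "archive-a");          // illustrative boundary
        router.addSegment(30_000_000L, "archive-b"); // illustrative boundary
        System.out.println(router.nodeFor(10_000L));
        System.out.println(router.nodeFor(45_000_000L));
    }
}
```

The sorted map keeps lookups logarithmic in the number of segments, and adding a new segment when the current one fills up does not disturb existing entries.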

@waynercheung
Contributor

Hi @halibobo1205, it's good news that TRON will implement an archive node to offer historical information.

Regarding stateRoot generation, you mentioned that it differs from ETH. Can you provide more details?
For example, is there also an MPT for accounts?

@lily309

lily309 commented Apr 18, 2025

As I understand it, java-tron uses protobuf as the format for both database storage and API interaction, and in the TRON protocol definition the account structure is fairly complex. So what form will the archive node use to store these historical account states?

A simple example may make it easier to describe the problem, for example, the structure of account A has balance/stake information/vote information, etc. If in a certain block, account A initiates a transfer, then only one balance field in the information of account A has changed. In the archive node implementation, does the account A corresponding to the height of the block where the transaction occurred still store all the complete account information? Or is there some special storage structure that reduces the storage overhead?

@halibobo1205
Contributor Author

@lily309 It still stores the complete account information. Storage grows larger this way, but queries become simpler; on the other hand, as the database grows, queries may also take longer.
Storing only the changed fields is good for storage, but it complicates queries. Are there any known optimizations?
BTW, if Tron only implements historical queries, could we use a flat KV store (such as Erigon's flat KV) instead of MPTs?

@halibobo1205
Contributor Author

stateRoot generation
ETH builds its MPT over accounts; Tron's state data is split across databases and not centralized in the account. There are roughly two candidate MPT designs:

  1. Each type of data has its own MPT
Image
  2. All data forms a single MPT
Image

@waynercheung, I favor the second one, which I'll explain in more detail later.

@lily309

lily309 commented Apr 18, 2025

Understood.

Of course, I'm also quite interested in erigon, who have implemented an archive node for Ethereum with only a tiny disk space.

I may not know much about the exact technical details yet, but it's nice to see that the TRON development community is considering a more efficient storage structure to solve the huge disk footprint problem posed by archive nodes, so if you guys have a follow-up update, please let me know.

@waynercheung
Contributor

Hi @halibobo1205 , where is the storage of contract data?
Does the "Storage Keys" in the above image include the contract data?

For ETH, each account has a storage root, as "Storage trie is where all contract data lives. There is a separate storage trie for each account." mentioned by storage-trie.

@waynercheung
Contributor

@waynercheung, I favor the second one, which I'll explain in more detail later.

Hi @halibobo1205 , I have another question about this, to my knowledge, java-tron has an important concept called 'proposal'. Is this data included in the MPT tree?

@Federico2014
Contributor

Ethereum will use the Verkle tree to replace Merkle-Patricia Trie. Will you consider adopting the Verkle tree directly to improve performance?

@317787106
Contributor

317787106 commented Apr 21, 2025

@halibobo1205 The stateRoot can be compared with other nodes to ensure consistency, but how is the balance at a specified block height queried in eth_getBalance? By accumulating changes?

@317787106
Contributor

There are 25 databases included, so what are the selection criteria? ETH's receiptRoot has no counterpart here; TRON's equivalent data is transactioninfo, so why is it not included in the stateRoot?

@halibobo1205
Contributor Author

halibobo1205 commented Apr 21, 2025

Does the "Storage Keys" in the above image include the contract data?

Yes, it includes contract data. In Tron, contract data is currently spread across four databases: code, contract, storage-row and contract-state; with a separate root per database, there would be four roots.
cc @yanghang8612

@waynercheung
Contributor

Yes, it includes contract data. In Tron, contract data is currently spread across four databases: code, contract, storage-row and contract-state; with a separate root per database, there would be four roots. cc @yanghang8612

What does "Storage Row Data" mean?
Does it include only contract data, or other data as well?

Besides that, for Tron, is there only one MPT and just one stateRoot?
For ETH there are four MPTs (State Trie, Storage Trie, Transactions Trie and Receipts Trie) and three roots: stateRoot, transactionsRoot and receiptsRoot.

@bladehan1
Contributor

Ethereum has implemented the path-based-storage-model to solve the problem of dynamic pruning of the world state tree for ordinary full nodes. In this change (or considering the future), ordinary full nodes also generate StateRoot. Should we consider using the path-based-storage-model?

@fortuna502

There are 25 databases included, so what are the selection criteria? ETH's receiptRoot has no counterpart here; TRON's equivalent data is transactioninfo, so why is it not included in the stateRoot?

@halibobo1205 Can you share more details about the 25 databases? What are they?

@fortuna502

How can I enable state if I want to run an archive node from a fullnode that has already been syncing for a while? For example, if the current max block number is 100 and I enable state, does block 100 get a stateRoot? How is that stateRoot generated?

If not, it means that block 101 will be the first to have a stateRoot; how is it generated?

@waynercheung
Contributor

BTW, if Tron only implements historical queries, could we use a flat KV store (such as Erigon's flat KV) instead of MPTs?

Hi @halibobo1205 , so you mean that maybe stateRoot doesn't participate in consensus, just implemented for historical queries, right?

@shiziwen

Ethereum will use the Verkle tree to replace Merkle-Patricia Trie. Will you consider adopting the Verkle tree directly to improve performance?

Hi @Federico2014 , is there any schedule about the progress?

As I know, Geth is actively working on replacing the Merkle-Patricia Trie with the Verkle Tree. Testnet support is already in place, and the mainnet upgrade is anticipated to occur with the Osaka upgrade in late 2025 or 2026.

I mean, using a Verkle tree might be a long-term solution, right?

@yanghang8612
Contributor

My view is that the implementation of archive node on TRON should only serve historical data query, and should build on that to implement interfaces that were previously difficult to support, such as eth_call and debug_traceCall.

Based on this, we should focus on the overall performance and stability of the archive node.

@waynercheung
Contributor

the implementation of archive node on TRON should only serve historical data query

From this point of view, historical data query is the first target of archive node, and flat KV mentioned by @halibobo1205 may be a good idea.

@waynercheung
Contributor

Ethereum has implemented the path-based-storage-model to solve the problem of dynamic pruning of the world state tree for ordinary full nodes. In this change (or considering the future), ordinary full nodes also generate StateRoot. Should we consider using the path-based-storage-model?

Hi @halibobo1205 , what is the key for MPT in Tron?

path-based-storage-model shifts from storing trie nodes by their hash to using encoded paths and specific key prefixes, such as account address hashes for account tries and storage address hashes for storage tries. This change facilitates inline pruning and reduces data redundancy, addressing the unsustainable growth of state data for full nodes.
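
The difference between the two keying schemes can be seen in a toy model. In the sketch below, the class name, the node contents, and the path string are all made up, and SHA-256 stands in for the hash function only because it ships with the JDK. Writing two versions of the node at the same trie path leaves two entries in the hash-keyed store but one in the path-keyed store, which is why path-based storage lends itself to inline pruning.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Toy comparison of hash-keyed and path-keyed trie node storage.
class NodeKeyingDemo {
    final Map<String, String> hashKeyed = new HashMap<>();
    final Map<String, String> pathKeyed = new HashMap<>();

    // SHA-256 is a stand-in here; it is not the hash used by any particular chain.
    static String sha256Hex(String node) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(node.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest).toString(16);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    void write(String path, String node) {
        hashKeyed.put(sha256Hex(node), node); // new content gets a new key; the old entry stays
        pathKeyed.put(path, node);            // same path, so the old entry is overwritten
    }

    public static void main(String[] args) {
        NodeKeyingDemo demo = new NodeKeyingDemo();
        demo.write("0a1b", "balance=10");
        demo.write("0a1b", "balance=8");
        System.out.println("hash-keyed entries: " + demo.hashKeyed.size());
        System.out.println("path-keyed entries: " + demo.pathKeyed.size());
    }
}
```

For an archive node the stale versions are exactly what must be kept, so the path-based model mainly benefits ordinary full nodes that want to drop them.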

@halibobo1205
Contributor Author

What is the key for MPT in Tron?
StateType + original key : 0x01 + address for Account for example
@waynercheung

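import com.google.common.primitives.Bytes;
import lombok.Getter;

// Each state database gets a one-byte type prefix for its keys in the MPT.
public enum StateType {

  UNDEFINED((byte) 0x00, "undefined"),

  Account((byte) 0x01, "account"),
  Code((byte) 0x02, "code"),
  Contract((byte) 0x03, "contract"),
  StorageRow((byte) 0x04, "storage-row"),
  // .... (remaining state types are elided in the original comment)
  ;

  // One-byte prefix that identifies the state database.
  private final byte value;
  @Getter
  private final String name;

  StateType(byte value, String name) {
    this.value = value;
    this.name = name;
  }

  public byte value() {
    return this.value;
  }

  // MPT key = type prefix + original database key, e.g. 0x01 + address for an account.
  public static byte[] encodeKey(StateType type, byte[] key) {
    byte[] p = new byte[]{type.value};
    return Bytes.concat(p, key);
  }

}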
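
As a dependency-free illustration of the same prefixing idea, the sketch below encodes and decodes a prefixed key with plain arrays. The class and method names are hypothetical and the key bytes are placeholders; the point is only that the same raw key under two state types yields two distinct trie keys.

```java
import java.util.Arrays;

// Hypothetical helper mirroring "StateType + original key" without Guava or Lombok.
class PrefixedKey {

    /** Prepends the one-byte state type to the original database key. */
    static byte[] encode(byte typePrefix, byte[] key) {
        byte[] out = new byte[key.length + 1];
        out[0] = typePrefix;
        System.arraycopy(key, 0, out, 1, key.length);
        return out;
    }

    /** Recovers the state type from an encoded key. */
    static byte typeOf(byte[] encoded) {
        return encoded[0];
    }

    /** Recovers the original database key from an encoded key. */
    static byte[] originalKey(byte[] encoded) {
        return Arrays.copyOfRange(encoded, 1, encoded.length);
    }

    public static void main(String[] args) {
        byte[] raw = {0x41, 0x02};                  // placeholder key bytes
        byte[] asAccount = encode((byte) 0x01, raw);
        byte[] asCode = encode((byte) 0x02, raw);
        System.out.println(Arrays.equals(asAccount, asCode)); // distinct keys for distinct types
    }
}
```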

@halibobo1205
Contributor Author

@halibobo1205 The stateRoot can be compared with other nodes to ensure consistency, but how is the balance at a specified block height queried in eth_getBalance? By accumulating changes?

Historical states store all accumulated changes.

@waynercheung
Contributor

path-based-storage-model
use a flat KV store (such as Erigon's flat KV) instead of MPTs?

Hi @halibobo1205 @bladehan1 , both the path-based storage model and flat key-value storage replace the Merkle Patricia Trie (MPT) with a key-value approach. They may share many similarities.

We can conduct further research to determine if we can draw inspiration from them to enhance the implementation of MPT in Tron.

If my understanding is incorrect, please let me know.

@waynercheung
Contributor

@halibobo1205 The stateRoot can be compared with other nodes to ensure consistency, but how is the balance at a specified block height queried in eth_getBalance? By accumulating changes?

Historical states store all accumulated changes.

Hi @halibobo1205 , about the accumulated changes, can you give an example to explain it?

For example, we enable the state feature from block 100, when account_A has a balance of 10 and account_B has a balance of 5.
Then in block 101, there is a transfer transaction in which account_A transfers 2 TRX to account_B.
Then in block 102, there are no changes to account_A or account_B.

So for blocks 100, 101 and 102, how is the stateRoot calculated?

@fortuna502

We can conduct further research to determine if we can draw inspiration from them to enhance the implementation of MPT in Tron.

Good idea, I think we can do some more research about the path-based storage model, references like [Ethereum Path-Based Storage Model and Newly Inline State Prune](https://ethresear.ch/t/ethereum-path-based-storage-model-and-newly-inline-state-prune/14932) and Optimizing Blockchain Storage: BSC's Path-Based Storage System, maybe it will help to optimize the storage.

@waynercheung
Contributor

Performance impact
Archive nodes significantly impact block execution and transaction execution efficiency, especially when historical state data is large. The implementation must ensure the normal operation of block synchronization and transaction execution.

Hi @halibobo1205, you mentioned that there may be a performance impact for archive nodes. Have you run any tests on it?

@waynercheung
Contributor

StateType + original key : 0x01 + address for Account for example

Hi @halibobo1205 , what is the key for the trie node?

For ETH, the traditional MPT is using the hash of trie node as key and the collapsed trie node as value.
How about TRON?

@halibobo1205
Contributor Author

StateType + original key : 0x01 + address for Account for example

Hi @halibobo1205 , what is the key for the trie node?

For ETH, the traditional MPT is using the hash of trie node as key and the collapsed trie node as value. How about TRON?

They're the same.

@halibobo1205
Contributor Author

performance impact

The eth_call interface is a typical state-querying interface used in blockchain applications, particularly for Ethereum nodes. We conducted performance tests on an archive node to evaluate how it handles queries against blocks of different ages, focusing on Queries Per Second (QPS), latency, and error rates.

Test Setup
The tests were performed on an archive node (16C 32G) with p2p-disable = true, ensuring no peer-to-peer network interference. The archive data is 1.8 T, and the server was restarted with the cache cleared before each test to maintain consistency.

Performance Results
Below is a detailed table summarizing the performance metrics for the eth_call interface across different scenarios. Note that each of the first three scenarios queries one specific block chosen from the given time range:

| Scenario | QPS | Avg Latency (ms) | Min Latency (ms) | Max Latency (ms) | Error Rate (%) |
|---|---|---|---|---|---|
| Latest Block | 331.0 | 1 | 0 | 181 | 0.00 |
| One block in recent 100 Blocks | 327.2 | 5 | 1 | 281 | 0.00 |
| One block 1 Day Before | 327.8 | 2 | 1 | 243 | 0.00 |
| Random Blocks | ~50 | N/A | N/A | N/A | 0.73 |

Scenario-Specific Observations

  • Latest Block: This scenario, representing queries against the most recent block, achieved a QPS of 331.0, with an average latency of 1ms, minimum latency of 0ms, and maximum latency of 181ms. Notably, there were no TVM query timeouts, indicating stable and efficient performance under high load for the latest state queries.

  • One block in recent 100 Blocks: Queries one specific block within the last 100 blocks showed a QPS of 327.2, with an average latency of 5ms, minimum latency of 1ms, and maximum latency of 281ms. The performance was minimally changed from the latest block scenario, with no errors recorded, suggesting robustness for recent historical data access.

  • One block 1 Day Before: For one block approximately one day prior, the QPS was 327.8, with an average latency of 2ms, minimum latency of 1ms, and maximum latency of 243ms. This scenario also exhibited consistent performance with no TVM query timeouts, reinforcing the interface's reliability for slightly older but still recent data.

  • Random Blocks: For random block queries, QPS dropped to about 50 with an error rate of 0.73%, indicating that even at reduced query volumes the interface is sensitive to which blocks are queried. Latency details were not available for this scenario (marked "N/A" in the table), but the presence of timeouts suggests potential issues with query handling under specific conditions.

Key Findings and Implications
For queries against a single recent block (the latest block, a block within the last 100 blocks, or a block from one day earlier), the eth_call interface demonstrates high efficiency, with QPS consistently around 327-331 and no errors recorded.

Conversely, performance drops significantly for random blocks: QPS is about 50, with a 0.73% error rate, which underscores the interface's sensitivity to query types rather than just load.

The current implementation is not very performant. As of today, Tron's trading volume is about 4 times that of ETH. The archive node for ETH is about 21 T. It is estimated that Tron's archive node will reach 84 T. What needs to be addressed now is the extreme data bloat and the performance issues that come with it. @waynercheung

@waynercheung
Contributor

It is estimated that Tron's archive node will reach 84 T. What needs to be addressed now is the extreme data bloat and the performance issues that come with it.

Hi @halibobo1205 , the low QPS might be barely acceptable.

As disk usage continues to grow, could this impact transaction and block processing? Has there been any related testing?

@halibobo1205
Contributor Author

halibobo1205 commented May 8, 2025

@waynercheung
The MPT-based archive nodes we built previously have seen rapid state data growth, with the total data volume now reaching 78 T. When the state data on a single node exceeds 500G, block processing performance degrades significantly. Each transaction involves approximately 20 state queries, each taking about 0.15ms, resulting in around 3ms per transaction. Currently, blocks contain between 200 and 400 transactions, causing block processing times to range from 800ms to 1.6s, occasionally exceeding 2-3 seconds.
Note: Once transaction volume exceeds 700 per block, processing time surpasses 3 seconds, forcing the system into synchronization mode and temporarily suspending transaction broadcasting capabilities.
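
The per-transaction figure can be turned into a rough per-block estimate. The sketch below multiplies only the quoted numbers (about 20 state queries per transaction at about 0.15ms each); the class and method names are made up. It gives roughly 600ms, 1.2s and 2.1s for 200, 400 and 700 transactions, which is below the observed block times, so the remainder presumably comes from work other than state lookups. That last point is an inference from the arithmetic, not something measured in this thread.

```java
// Rough estimate of time spent on state lookups per block, using the figures quoted above.
class StateLookupCost {

    /** Milliseconds spent on state lookups for one block. */
    static double lookupMillisPerBlock(int txCount, int lookupsPerTx, double millisPerLookup) {
        return txCount * lookupsPerTx * millisPerLookup;
    }

    public static void main(String[] args) {
        int lookupsPerTx = 20;          // quoted: ~20 state queries per transaction
        double millisPerLookup = 0.15;  // quoted: ~0.15 ms per query
        for (int txCount : new int[]{200, 400, 700}) {
            System.out.printf("%d tx -> %.0f ms of state lookups%n",
                txCount, lookupMillisPerBlock(txCount, lookupsPerTx, millisPerLookup));
        }
    }
}
```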

@halibobo1205
Contributor Author

Facing significant scaling and performance challenges:

Current State

  • Transaction Volume: TRON processes approximately three times the transaction volume of Ethereum
  • Account Complexity: TRON's account structure is more complex than Ethereum's
  • Storage Implementation: The archive node implementation is based on Ethereum's Merkle Patricia Trie (MPT)

Key Technical Challenges

  1. Massive State Data Growth

    • Current state data has reached 75TB, which is extremely large and difficult to manage
  2. Performance Degradation with Scale

    • Even with segmented state data generation, block processing time increases significantly when state data exceeds 500GB
    • When blocks contain more than 700 transactions, processing time exceeds 3 seconds per block
  3. Poor Query Performance

    • JSON-RPC interfaces that support state queries demonstrate inadequate QPS (Queries Per Second) performance

These challenges stem from the combination of high transaction throughput, complex account structures, and the limitations of the current state storage architecture.

@waynercheung
Contributor

When the state data on a single node exceeds 500G, block processing performance degrades significantly.

@halibobo1205, so if we want to run an archive node, do we need to start a new node once the state data exceeds 500G?
Do you have a solution for this?

@halibobo1205
Contributor Author

@waynercheung Based on the MPT implementation approach, creating a new node when the state data exceeds 500GB is a balanced compromise between block processing and API query performance. Further exploration and optimization of new solutions will be required moving forward.

@halibobo1205
Contributor Author

halibobo1205 commented May 15, 2025

Erigon represents a fundamental rethinking of Ethereum's core infrastructure, not only addressing current technical challenges but also laying the foundation for future expansion. Through its flat storage model and Change Sets architecture, Erigon significantly improves node performance and efficiency while reducing the hardware requirements for running an archive node.
These solutions have direct reference value for the challenges faced by TRON (75TB state data, performance bottlenecks, and JSON-RPC query performance) because both networks face similar infrastructure scaling issues, though TRON's scale is larger, making the need for solutions more urgent.

Erigon Storage Design: Store state in plain KV DB MDBX.

| Table | Key | Value |
|---|---|---|
| Account History Index | block number | Set of block numbers that changes the account. |
| Account ChangeSets | (address, block number) | Store the prior value for the accounts that are changed in the current block. |
| Storage History Index | block number | A Set of block numbers that changes the contract storage slot. |
| Storage ChangeSets | (address, slot, block number) | Store the prior value for the contract storage slots that have changed in the current block. |
| Plain States | address or (address, slot) | The latest states, just like the current state of Java-tron. |
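
A historical read against this layout can be sketched as follows: find the first block after the requested height that changed the key and return the prior value recorded for that block; if there is no such block, the latest (plain) state is the answer. The sketch is a simplification under assumptions: it folds the history index and the change sets into one sorted map per address, uses made-up class and method names, and models balances as plain longs. It reuses the transfer example from earlier in the thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Simplified change-set store: latest values plus, per address, the prior value at each changing block.
class ChangeSetStore {
    private final Map<String, Long> plainState = new HashMap<>();
    // address -> (block that changed it -> value before that block executed)
    private final Map<String, TreeMap<Long, Long>> changeSets = new HashMap<>();

    void applyChange(String address, long block, long newValue) {
        long prior = plainState.getOrDefault(address, 0L);
        // Keep only the first prior value per block, so it reflects the state before the block.
        changeSets.computeIfAbsent(address, a -> new TreeMap<>()).putIfAbsent(block, prior);
        plainState.put(address, newValue);
    }

    /** Value of the address after the block at the given height has executed. */
    long valueAt(String address, long height) {
        TreeMap<Long, Long> history = changeSets.get(address);
        if (history != null) {
            Map.Entry<Long, Long> nextChange = history.higherEntry(height);
            if (nextChange != null) {
                return nextChange.getValue(); // value just before the next change
            }
        }
        return plainState.getOrDefault(address, 0L); // unchanged since then
    }

    public static void main(String[] args) {
        ChangeSetStore store = new ChangeSetStore();
        store.applyChange("account_A", 100, 10);
        store.applyChange("account_B", 100, 5);
        store.applyChange("account_A", 101, 8); // account_A transfers 2 to account_B
        store.applyChange("account_B", 101, 7);
        System.out.println(store.valueAt("account_A", 100));
        System.out.println(store.valueAt("account_A", 102));
    }
}
```

Block 102 adds no entries at all for either account, which is where the storage saving over one full snapshot per block comes from.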

Erigon Disk-space

Image

reth Database design

erDiagram
CanonicalHeaders {
    u64 BlockNumber "PK"
    B256 HeaderHash "Value for CanonicalHeaders"
}
HeaderNumbers {
    B256 BlockHash "PK"
    u64 BlockNumber
}
Headers {
    u64 BlockNumber "PK"
    Header Data
}
BlockBodyIndices {
    u64 BlockNumber "PK"
    u64 first_tx_num
    u64 tx_count
}
BlockOmmers {
    u64 BlockNumber "PK"
    Header[] Ommers
}
BlockWithdrawals {
    u64 BlockNumber "PK"
    Withdrawal[] Withdrawals
}
Transactions {
    u64 TxNumber "PK"
    TransactionSigned Data
}
TransactionHashNumbers {
    B256 TxHash "PK"
    u64 TxNumber
}
TransactionBlocks {
    u64 MaxTxNumber "PK"
    u64 BlockNumber
}
Receipts {
    u64 TxNumber "PK"
    Receipt Data
}
Bytecodes {
    B256 CodeHash "PK"
    Bytes Code
}
PlainAccountState {
    Address Account "PK"
    Account Data
}
PlainStorageState {
    Address Account "PK"
    B256 StorageKey "PK"
    U256 StorageValue
}
AccountsHistoryIndex {
    B256 Account "PK"
    BlockNumberList BlockNumberList "List of transitions where account was changed"
}
StoragesHistoryIndex {
    B256 Account "PK"
    B256 StorageKey "PK"
    BlockNumberList BlockNumberList "List of transitions where account storage entry was changed"
}
AccountChangeSets {
    u64 BlockNumber "PK"
    B256 Account "PK"
    ChangeSet AccountChangeSets "Account before transition"
}
StorageChangeSets {
    u64 BlockNumber "PK"
    B256 Account "PK"
    B256 StorageKey "PK"
    ChangeSet StorageChangeSets "Storage entry before transition"
}
HashedAccounts {
    B256 HashedAddress "PK"
    Account Data
}
HashedStorages {
    B256 HashedAddress "PK"
    B256 HashedStorageKey "PK"
    U256 StorageValue
}
AccountsTrie {
    StoredNibbles Nibbles "PK"
    BranchNodeCompact Node
}
StoragesTrie {
    B256 HashedAddress "PK"
    StoredNibblesSubKey NibblesSubKey "PK"
    StorageTrieEntry Node
}
TransactionSenders {
    u64 TxNumber "PK"
    Address Sender
}
TransactionHashNumbers ||--|| Transactions : "hash -> tx id"
TransactionBlocks ||--|{ Transactions : "tx id -> block number"
BlockBodyIndices ||--o{ Transactions : "block number -> tx ids"
Headers ||--o{ AccountChangeSets : "each block has zero or more changesets"
Headers ||--o{ StorageChangeSets : "each block has zero or more changesets"
AccountsHistoryIndex }|--|{ AccountChangeSets : index
AccountsHistoryIndex }|--|{ StorageChangeSets : index
Headers ||--o| BlockOmmers : "each block has 0 or more ommers"
BlockBodyIndices ||--|| Headers : "index"
HeaderNumbers |o--|| Headers : "block hash -> block number"
CanonicalHeaders |o--|| Headers : "canonical chain block number -> block hash"
Transactions ||--|| Receipts : "each tx has a receipt"
PlainAccountState }o--o| Bytecodes : "an account can have a bytecode"
PlainAccountState ||--o{ PlainStorageState : "an account has 0 or more storage slots"
Transactions ||--|| TransactionSenders : "a tx has exactly 1 sender"

PlainAccountState ||--|| HashedAccounts : "hashed representation"
PlainStorageState ||--|| HashedStorages : "hashed representation"

@halibobo1205
Contributor Author

halibobo1205 commented May 15, 2025

TRON's database architecture is distributed across multiple specialized databases. When writing to these various databases, any data changes caused by a block are pre-written to a WAL (Write-Ahead Logging) database, commonly referred to as a checkpoint database. In this system, keys are composed of the database name plus the original key, while values consist of the operation type plus the original value.
This architecture means that Java-TRON inherently possesses ChangeSet capabilities as part of its design.
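
The key and value layout described above can be sketched as below. The exact byte layout is an assumption of this sketch, not taken from the java-tron source: the database name is length-prefixed so that the original key can be split off unambiguously, and the operation codes are hypothetical constants.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical checkpoint entry encoding: key = db name + original key, value = op type + original value.
class CheckpointEntry {
    static final byte OP_DELETE = 0; // hypothetical operation codes
    static final byte OP_PUT = 1;

    /** 4-byte name length, then the database name, then the original key. */
    static byte[] key(String dbName, byte[] originalKey) {
        byte[] name = dbName.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + name.length + originalKey.length)
            .putInt(name.length)
            .put(name)
            .put(originalKey)
            .array();
    }

    /** One operation byte followed by the original value. */
    static byte[] value(byte op, byte[] originalValue) {
        return ByteBuffer.allocate(1 + originalValue.length)
            .put(op)
            .put(originalValue)
            .array();
    }

    public static void main(String[] args) {
        byte[] k = key("account", new byte[]{0x41});
        byte[] v = value(OP_PUT, new byte[]{0x09});
        System.out.println(k.length + " key bytes, " + v.length + " value bytes");
    }
}
```

Because every write already passes through such a per-block batch, collecting the prior values for a ChangeSet table would mostly mean reading the old value at the moment the batch is built.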

Current Tron State Database

We have identified approximately 20 state databases that could leverage the ChangeSet approach. This classification may be subject to future changes.

DB Prefix
account 0x01
account-asset 0x02
account-index 0x03
accountid-index 0x04
asset-issue-v2 0x05
code 0x06
contract 0x07
delegation 0x08
DelegatedResource 0x09
DelegatedResourceAccountIndex 0x0a
exchange 0x0b
exchange-v2 0x0c
IncrementalMerkleTree 0x0d
market_account 0x0e
market_order 0x0f
market_pair_price_to_order 0x10
market_pair_to_price 0x11
nullifier 0x12
properties 0x13
proposal 0x14
storage-row 0x15
votes 0x16
witness 0x17
witness_schedule 0x18
contract-state 0x19
... ... ... ...

ChangeSet Simulation Results (Original Data Size)

| Block Height | Transaction Count | StateChangeSet Records | StateChangeSet Size |
|---|---|---|---|
| 72169692 | 262 | 962 | 144,328 bytes |
| 72216532 | 489 | 1,425 | 215,634 bytes |

Storage Efficiency Projection

The current total transaction volume is 10,372,202,496 (2025-5-15, from tronscan).
Estimated state data size using the ChangeSet approach, extrapolating from the first sampled block:
10,372,202,496 ÷ 262 × 144,328 ÷ 1024⁴ ≈ 5.2 TiB (about 5.7 TB in decimal units)
This represents a significant improvement over the previous MPT-based approach, which required 75 TB of storage space - a reduction of approximately 92%.

Note: These are simulation results and may not represent the final production implementation outcomes. Further optimizations and adjustments are expected during actual implementation.
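
The projection can be reproduced for both sampled blocks with a few lines of arithmetic. In the sketch below the class and method names are made up; the inputs are the figures from the table and the tronscan total. Evaluating the expression gives roughly 5.2 TiB when extrapolating from block 72169692 and roughly 4.2 TiB from block 72216532, so the estimate is sensitive to which block is sampled.

```java
import java.util.Locale;

// Extrapolates total ChangeSet size from one sampled block's bytes-per-transaction ratio.
class ChangeSetProjection {
    static final double TIB = 1024.0 * 1024 * 1024 * 1024;

    /** totalTx / sampleTx * sampleBytes, expressed in TiB. */
    static double projectedTiB(long totalTx, long sampleTx, long sampleBytes) {
        return (double) totalTx / sampleTx * sampleBytes / TIB;
    }

    public static void main(String[] args) {
        long totalTx = 10_372_202_496L; // total transactions quoted above
        System.out.printf(Locale.ROOT, "block 72169692: %.2f TiB%n",
            projectedTiB(totalTx, 262, 144_328));
        System.out.printf(Locale.ROOT, "block 72216532: %.2f TiB%n",
            projectedTiB(totalTx, 489, 215_634));
    }
}
```

A larger sample of blocks would give a steadier bytes-per-transaction ratio than either single block.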

@u59149403

Please collect use cases before writing the archive node. Not everybody needs history, and not everybody needs full history. I wrote a trading bot using alloy for Ethereum (and I plan to add TRON later), and in my experience I never needed history older than a week. So, again: ask users what exactly they need, how much history they need, etc.

Moreover, I can tolerate a complete lack of history. The only thing I truly need is basic alloy compatibility, i.e. I should be able to call contracts and send transactions using alloy. As far as I understand, this is currently not possible due to the "data-vs-input" and "pending-vs-latest" problems: alloy-rs/alloy#2371 . So, please, fix these small issues first. Try writing a small alloy application and make sure it works.

(For users, who experience alloy/foundry/TRON problems: here is some workaround: alloy-rs/alloy#2371 . Also, as a quick workaround you can write your own JSON-RPC proxy, it is, in fact, very easy thing to do. I wrote my own JSON-RPC proxy [limited to features I need], which makes TRON more Ethereum-compatible. I can share code, if you want.)

@bladehan1
Contributor

Some typical application scenarios of archive nodes

  1. Block explorers and data indexing services
    • Provide complete queries for transaction history, account states, smart contract invocations, and other on-chain data.
    • Enable users to retrieve on-chain information from any historical timestamp using parameters like block height, transaction hash, or wallet address.
  2. On-chain data analysis and auditing
    • Analyze historical transaction patterns, token flow tracking, and DeFi protocol activities to generate insights or compliance reports.
    • Audit historical state changes of smart contracts or verify the compliance of a project’s on-chain operations.
  3. Backend services for DApps
    • DApps may require access to a user’s historical interactions (e.g., staking, voting, token transfers) to display past activities.
    • Rely on historical data to compute user reputation scores, achievement systems, or loyalty programs.

@halibobo1205
Contributor Author

@u59149403 @bladehan1
Thank you for your attention and feedback!

The Archive node is currently under active discussion. Its primary goals are compatibility with Ethereum (ETH) interfaces and support for historical state queries. As mentioned, we agree it's important to gather more detailed user requirements, including (but not limited to) how long historical data should be retained and which specific interfaces need to be supported.

On the topic of historical data storage, the Ethereum community is also exploring related optimizations (e.g., EIP-4444), which we are closely following and evaluating for potential alignment.

Please feel free to share any specific use cases or expectations — that would be very helpful in shaping the final design.

@u59149403

I implemented JSON-RPC proxy for making TRON more Ethereum-compatible. Currently very few features are supported. Check it here: https://github.com/u59149403/tjrp

@halibobo1205
Contributor Author

I implemented JSON-RPC proxy for making TRON more Ethereum-compatible. Currently very few features are supported. Check it here: https://github.com/u59149403/tjrp

Thank you for being so interested. It's still in the planning stages.

10 participants